| Output $/MTok | $10.00 |
|---|---|
| Input $/MTok | $2.50 |
| Cached input $/MTok | $1.25 |
| Context window | 128,000 |
| Output cap | 16,384 |
| Modalities | in: text, image / out: text |
| Tool use | ✓ |
| Structured output | ✓ |
| Family | GPT-4o |
| Knowledge cutoff | 2023-10-01 |
| Verified | 2026-05-17 |
| Source | provider page ↗ |
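
The per-MTok rates above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, not a provider API: `estimate_cost` is a hypothetical helper, and it assumes that cached tokens are the subset of input tokens billed at the cached rate.

```python
from decimal import Decimal

# Rates copied from the table above, in USD per million tokens (MTok).
INPUT_PER_MTOK = Decimal("2.50")
CACHED_INPUT_PER_MTOK = Decimal("1.25")
OUTPUT_PER_MTOK = Decimal("10.00")
MTOK = Decimal(1_000_000)


def estimate_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of a single request.

    Assumption: cached_tokens is the part of input_tokens billed at the
    cached-input rate; the remainder is billed at the regular input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PER_MTOK
        + cached_tokens * CACHED_INPUT_PER_MTOK
        + output_tokens * OUTPUT_PER_MTOK
    )
    return total / MTOK


if __name__ == "__main__":
    # 10,000 input tokens (4,000 of them cached) and 1,000 output tokens.
    print(estimate_cost(10_000, 4_000, 1_000))
```

For that request, the uncached input contributes $0.015, the cached input $0.005, and the output $0.01, for a total of $0.03. `Decimal` is used so the listed prices are represented exactly.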

| $/1M chars | $30.00 | $25.00 |
|---|---|---|
| Voice quality | neural | neural |
| Voice cloning | — | ✓ included |
| Voice count | — | 217 |
| Languages | en, es, de, fr, nl, it, ja | en, hi, es, mr, kn, ta, bn, gu, te, ml, pa, or |
| SSML support | — | — |
| TTFB | — | 200ms |
| Output formats | wav, mp3, linear16, mulaw, alaw, opus | pcm, wav, mp3, mulaw |
| Verified | 2026-05-19 | 2026-05-19 |
| Source | provider page ↗ | provider page ↗ |
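
The per-character rates in the table above can be compared for a given script length. This is a rough sketch only: the table does not name its two columns, so `column_a` and `column_b` are placeholder labels, `tts_cost` is a hypothetical helper, and it assumes billing is strictly proportional to character count (how each provider counts characters is not stated here).

```python
from decimal import Decimal

# USD per 1M characters, taken from the two columns of the table above.
# "column_a" and "column_b" are placeholder labels, not provider names.
RATE_PER_MILLION_CHARS = {
    "column_a": Decimal("30.00"),
    "column_b": Decimal("25.00"),
}


def tts_cost(num_chars: int, column: str) -> Decimal:
    """Hypothetical helper: estimated USD cost to synthesize num_chars characters."""
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return num_chars * RATE_PER_MILLION_CHARS[column] / Decimal(1_000_000)


if __name__ == "__main__":
    # Compare both columns for a 50,000-character script.
    for column in RATE_PER_MILLION_CHARS:
        print(column, tts_cost(50_000, column))
```

Under these assumptions, a 50,000-character script costs $1.50 at the first column's rate and $1.25 at the second's.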