| | | |
|---|---|---|
| Output $/MTok | $0.40 | $8.00 |
| Input $/MTok | $0.10 | $2.00 |
| Cached input $/MTok | $0.025 | $0.50 |
| Context window | 1,048,576 | 200,000 |
| Output cap | 8,192 | 100,000 |
| Modalities | in: text, image, audio, video / out: text | in: text, image / out: text |
| Tool use | ✓ | ✓ |
| Structured output | ✓ | ✓ |
| Family | Gemini 2.0 | o-series |
| Knowledge cutoff | 2024-08-31 | 2024-06-01 |
| Verified | 2026-07-02 | 2026-07-02 |
| Source | provider page ↗ | provider page ↗ |

| | | | |
|---|---|---|---|
| $/1M chars | $30.00 | $50.00 | $19.50 |
| Voice quality | neural | neural | neural |
| Voice cloning | — | ✓ included | — |
| Voice count | 94 | — | — |
| Languages | en | 31+ | en, hi |
| SSML support | — | — | — |
| TTFB | 37ms | — | 200ms |
| Output formats | — | — | pcm, mp3, wav, ulaw, alaw |
| Verified | 2026-07-02 | 2026-07-02 | 2026-07-02 |
| Source | provider page ↗ | provider page ↗ | provider page ↗ |