xAI Text to Speech

Released: Mar 2026

xAI Text to Speech is a high-fidelity neural voice synthesis model designed for low-latency, expressive audio generation. Developed as part of xAI's media engine, the model is optimized for real-time conversational applications, providing sub-second latency suitable for full-duplex voice agents. It supports more than 25 languages and offers a diverse set of pre-configured vocal profiles, including the voices Ara, Eve, Leo, Rex, and Sal.

The model supports speech tags, which give granular control over the delivery and prosody of the generated audio. Users can insert inline markers such as [pause] and [laugh] to simulate natural human interruptions, or use wrapping tags like <whisper>, <slow>, and <build-intensity> to change the emotional tone and pacing of specific text segments. This lets the model handle complex storytelling and nuanced dialogue rather than plain text recitation.
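A minimal sketch of how tagged input text might be assembled before it is sent for synthesis. The tag names are the ones listed above; the closing-tag form (for example </whisper>) and the helper names inline and wrap are assumptions made for illustration, not a documented xAI interface.

```python
# Tag names come from the description above. The closing-tag syntax and
# these helper functions are hypothetical, used only to show the idea.
INLINE_MARKERS = {"pause", "laugh"}
WRAPPING_TAGS = {"whisper", "slow", "build-intensity"}


def inline(marker: str) -> str:
    """Return an inline marker such as [pause]."""
    if marker not in INLINE_MARKERS:
        raise ValueError(f"unknown inline marker: {marker}")
    return f"[{marker}]"


def wrap(tag: str, text: str) -> str:
    """Wrap a text segment in a tag such as <whisper>...</whisper>."""
    if tag not in WRAPPING_TAGS:
        raise ValueError(f"unknown wrapping tag: {tag}")
    return f"<{tag}>{text}</{tag}>"


script = " ".join([
    "So I opened the door,",
    inline("pause"),
    wrap("whisper", "and nobody was there."),
    inline("laugh"),
])
print(script)
```

Validating tag names up front keeps typos from being read aloud as literal text, which is one plausible failure mode when markup is mixed into the input string.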

Technically, the model is designed to operate within a unified speech-to-speech stack. It integrates directly with xAI's language models and transcription services to reduce the end-to-end latency typical of cascaded AI systems. The engine supports several output formats, including high-fidelity MP3 and WAV, as well as telephony-oriented encodings such as G.711 (μ-law/A-law) and raw PCM, for compatibility across web, mobile, and telecommunication platforms.
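To make the telephony formats concrete, the sketch below shows what G.711 μ-law companding does to a single 16-bit linear PCM sample. It uses the widely used bias-and-segment integer algorithm for μ-law; it is an illustration of the codec, not a description of how xAI's service produces its output, and the function names are chosen here.

```python
BIAS = 0x84   # 132, added before segment search in the standard algorithm
CLIP = 32635  # largest magnitude that still fits after adding the bias


def linear_to_ulaw(sample: int) -> int:
    """Compress one signed 16-bit PCM sample to an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, 0..7.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # G.711 transmits the bitwise complement of the assembled byte.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte: int) -> int:
    """Expand one mu-law byte back to a signed linear PCM value."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


for pcm in (0, 1000, -1000, 32767):
    encoded = linear_to_ulaw(pcm)
    print(pcm, hex(encoded), ulaw_to_linear(encoded))
```

The round trip is lossy by design: each sample shrinks from 16 bits to 8, with finer resolution near zero than at high amplitudes, which suits narrowband telephone speech.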

While the underlying architecture remains proprietary, the model is engineered for high-volume inference and robust text normalization, which lets it accurately convert abbreviations, dates, and specialized terminology into spoken form. The same technology powers the voice features in the Grok assistant and is integrated into broader ecosystems, including Tesla vehicle software and Starlink customer support interfaces.
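Text normalization can be pictured with a toy example. The sketch below is not the model's normalizer, which is proprietary; it is a small rule-based stand-in, with a hypothetical abbreviation table and ISO-date rule, that shows the kind of rewriting involved.

```python
import re

# Hypothetical lookup table; a production normalizer would be far larger
# and context-sensitive.
ABBREVIATIONS = {"Dr.": "Doctor", "approx.": "approximately", "vs.": "versus"}

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]

ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def ordinal(n: int) -> str:
    """Return 1st, 2nd, 3rd, 4th, ..., with 11th to 13th handled."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def normalize(text: str) -> str:
    """Expand known abbreviations and rewrite ISO dates in a spoken order."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)

    def speak_date(match: re.Match) -> str:
        year, month, day = (int(group) for group in match.groups())
        if not 1 <= month <= 12:
            return match.group(0)  # leave implausible dates untouched
        return f"{MONTHS[month - 1]} {ordinal(day)}, {year}"

    return ISO_DATE.sub(speak_date, text)


print(normalize("Dr. Lee arrives 2026-03-05."))
```

Fixed rules like these break down quickly (for instance, "Dr." can also mean "Drive"), which is one reason neural systems handle normalization with learned context rather than lookup tables.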

How xAI Text to Speech ranks

xAI Text to Speech ranks 33rd of the 108 models in the table below.

Elo arena ranking for speech synthesis quality, based on blind pairwise human preference comparisons. Data from Artificial Analysis.

#	Provider	Model	Elo score	Released
1	Alibaba	Qwen-Audio-3.0-TTS-Plus	1,234±16	Jul 2026
2	SpeechifyAI	Simba 3.2	1,230±16	Jul 2026
3	Google	Gemini 3.1 Flash TTS	1,215±13	Apr 2026
4	Inworld	Inworld TTS 1.5 Max	1,212±19	Jan 2026
5	Cartesia	Sonic 3.5	1,208±14	May 2026
6	Alibaba	Fun-Realtime-TTS	1,205±15	May 2026
7	StepFun	StepAudio 2.5 TTS	1,199±16	Apr 2026
8	Smallest.ai	Lightning V3.1 Pro (Jul 2026)	1,198±17	Jul 2026
9	Inworld	Realtime TTS 1.5 Max	1,198±14	Jan 2026
10	Inworld	Realtime TTS-2	1,197±14	May 2026
11	VUI Labs	Luna TTS	1,194±16	Jun 2026
12	Alibaba	Fun-Realtime-TTS-Preview	1,193±15	May 2026
13	MiniMax	Speech 2.8 HD	1,178±12	Feb 2026
14	StepFun	StepAudio 2.5 TTS	1,175±18	Apr 2026
15	ElevenLabs	Eleven v3	1,175±12	Jun 2025
16	async	Async Flash v1.5	1,164±15	May 2026
17	Inworld	Inworld TTS 1 Max	1,161±13	Aug 2025
18	Inworld	Inworld TTS 1.5 Mini	1,159±16	Jan 2026
19	MiniMax	Speech 2.8 Turbo	1,150±12	Jan 2026
20	Smallest.ai	Lightning V3.1 Pro TTS (Jun 2026)	1,146±15	Jun 2026
21	StepFun	Step TTS 2 (Mar 2026)	1,142±14	Mar 2026
22	async	Async Pro v1.0	1,140±15	May 2026
23	Fish Audio	Fish Audio S2.1 Pro	1,138±15	Jun 2026
24	MiniMax	Speech 2.6 HD	1,138±12	Oct 2025
25	Inworld	Realtime TTS 1.5 Mini	1,137±13	Jan 2026
26	MiniMax	Speech 2.6 Turbo	1,128±12	Oct 2025
27	Microsoft Azure	Azure HD 2.5	1,126±13	Nov 2025
28	Speechify	SIMBA 3.0	1,122±14	Feb 2026
29	Inworld	Inworld TTS 1	1,121±12	Aug 2025
30	Fish Audio	Fish Audio S2 Pro	1,120±14	Mar 2026
31	StepFun	Step Audio EditX (Mar 2026)	1,113±14	Mar 2026
32	MiniMax	Speech-02-HD	1,112±12	May 2025
33	SpaceXAI	xAI Text to Speech	1,108±21	Mar 2026
34	StepFun	Step TTS 2 (open weights)	1,105±19	Aug 2025
35	OpenAI	TTS-1 HD	1,103±12	Nov 2023
36	ElevenLabs	Turbo v2.5	1,102±11	Jul 2024
37	ElevenLabs	Multilingual v2	1,102±11	Aug 2023
38	ElevenLabs	ElevenLabs v3 - Alpha	1,095±11	Jun 2025
39	Gradium	Gradium TTS	1,094±16	Mar 2026
40	Smallest.ai	Lightning V3.1 Pro	1,090±14	Mar 2026
41	Maya Research	Maya 2 Flash	1,089±17	Jul 2026
42	Gradium	Gradium TTS (Jun 2026)	1,088±15	Jun 2026
43	ElevenLabs	Flash v2.5	1,086±11	Dec 2024
44	MiniMax	Speech-02-Turbo	1,084±12	Apr 2025
45	Resemble AI	Chatterbox HD	1,083±14	May 2025
46	OpenAI	TTS-1	1,083±11	Nov 2023
47	Mistral	Voxtral TTS	1,075±14	Mar 2026
48	Google	Gemini 2.5 Flash Lite TTS	1,075±13	Jul 2025
49	Google	Studio	1,072±12	Oct 2022
50	Cartesia	Sonic 3	1,070±12	Oct 2025
51	Fish Audio	OpenAudio S1	1,068±12	Jun 2025
52	OpenAI	GPT-Realtime-2	1,066±15	May 2026
53	Xiaomi	MiMo-V2.5-TTS	1,064±14	Apr 2026
54	Speechify	SIMBA 1.6	1,064±14	Nov 2024
55	Amazon	Polly Generative	1,064±12	May 2024
56	StepFun	Step Audio EditX (open weights)	1,061±18	Nov 2025
57	MiniMax	T2A-01-HD	1,061±12	Jan 2025
58	NVIDIA	Magpie-Multilingual 357M (Feb 2026)	1,060±14	Feb 2026
59	Kokoro	Kokoro 82M v1.0	1,057±11	Jan 2025
60	Hume AI	Octave 2	1,056±13	Oct 2025
61	OpenAI	GPT-4o mini TTS	1,056±22	Mar 2025
62	Hithink	Speech 2.6	1,055±15	Jul 2026
63	Google	Chirp 3: HD	1,054±12	Apr 2025
64	Maya Research	Maya 2 Global	1,048±17	Jul 2026
65	Amazon	Polly Long-Form	1,047±14	Nov 2023
66	Maya Research	Maya1	1,046±12	Nov 2025
67	async	Async Flash v1.0	1,046±12	Jul 2025
68	Fish Audio	OpenAudio S1 Mini	1,045±20	Jun 2025
69	Google	Gemini 2.5 Flash TTS	1,044±14	Sep 2025
70	Google	Journey	1,043±14	Dec 2023
71	Rime	Coda	1,042±14	May 2026
72	Cartesia	Sonic English (Oct '24)	1,038±12	Oct 2024
73	Google	Gemini 2.5 Flash TTS (Dec 2025)	1,037±13	Dec 2025
74	Google	Gemini 2.5 Pro TTS	1,036±14	Dec 2025
75	Google	Gemini 2.5 Pro (Dec 2025)	1,035±13	Dec 2025
76	Speechify	Simba	1,035±12	Jun 2024
77	Microsoft Azure	MAI-Voice-1	1,029±14	Apr 2026
78	Microsoft Azure	Azure Neural	1,027±19	Sep 2018
79	Smallest.ai	Lightning v3.1	1,025±14	Mar 2026
80	Hume AI	Octave TTS	1,025±12	Feb 2025
81	MiniMax	T2A-01-Turbo	1,022±12	Jan 2025
82	Xiaomi	MiMo-V2-TTS	1,017±15	Mar 2026
83	Resemble AI	Chatterbox	1,013±12	Jun 2025
84	Fish Audio	Fish Speech 1.5	1,010±12	Dec 2024
85	NVIDIA	Magpie-Multilingual 357M	1,005±12	Aug 2025
86	Rime	Arcana v3	1,003±14	Feb 2026
87	Zyphra	Zonos-v0.1	1,000±0	Feb 2025
88	Murf AI	Murf Speech Gen 2	972±12	Mar 2024
89	LMNT	LMNT	970±13	Sep 2023
90	Microsoft Azure	VibeVoice 1.5B	966±14	Aug 2025
91	StepFun	Step TTS Mini (open weights)	959±10	Feb 2025
92	Microsoft Azure	VibeVoice 7B	957±14	Aug 2025
93	OpenVoice	OpenVoice v2	956±14	Apr 2024
94	NVIDIA	Magpie Multilingual	940±15	Mar 2025
95	Neuphonic	Neuphonic TTS	934±14	Oct 2024
96	Alibaba	Qwen3 TTS Flash	931±15	Sep 2025
97	Alibaba	Qwen3 TTS	917±14	Jan 2026
98	Coqui	XTTS v2	916±15	Oct 2023
99	Google	WaveNet	902±12	Sep 2016
100	StyleTTS	StyleTTS 2	890±16	Jun 2023
101	Rime	Mist V2	885±15	Feb 2025
102	Google	Neural2	883±12	Jun 2022
103	Amazon	Polly Neural	881±14	Jul 2019
104	Google	Standard	876±12	Mar 2018
105	Murf AI	Falcon (Beta)	857±15	Nov 2025
106	Noiz	Noiz TTS	853±17	Nov 2025
107	MetaVoice	MetaVoice v1	835±18	Feb 2024
108	Amazon	Polly Standard	810±16	Nov 2016

Source: Artificial Analysis Arena, updated Jul 26, 2026.
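
As a rough guide to reading the score gaps in the table, the sketch below converts a rating difference into an expected head-to-head preference rate. It assumes the conventional Elo logistic curve with a 400-point scale; Artificial Analysis may fit its arena ratings differently, so treat the result as an approximation. The function name is chosen here for illustration.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A is preferred over B under the classic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Ratings taken from the table: xAI Text to Speech (rank 33) and the
# top-ranked Qwen-Audio-3.0-TTS-Plus (rank 1).
xai_tts = 1108
top_model = 1234

p = expected_win_rate(xai_tts, top_model)
print(f"Expected preference rate for xAI Text to Speech: {p:.1%}")
```

Under this assumption, the 126-point gap to the top entry corresponds to listeners preferring xAI Text to Speech in roughly one of every three blind comparisons. Gaps smaller than the listed ± margins, such as those between neighbouring ranks, should not be read as meaningful differences.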