MiniMax

T2A-01-HD

Released Jan 2025

T2A-01-HD (also known as speech-01-hd) is a text-to-audio model developed by the Chinese AI firm MiniMax. Released as the high-fidelity flagship of the T2A-01 series, the model is designed for realistic speech synthesis and voice cloning. It is optimized to capture complex tonal nuances, timbre similarity, and natural prosody, distinguishing it from the lower-latency T2A-01-Turbo variant.

The model supports zero-shot voice cloning, replicating a speaker's voice from a reference clip of only 6 to 10 seconds. It supports 17 languages, including English (with regional variants), Mandarin, Cantonese, Japanese, Korean, and several European languages. Users can manually adjust parameters such as pitch, speed, and emotional tone to fine-tune the delivery for applications like audiobooks and digital assistants.

While MiniMax has released weights for some of its text and multimodal models, T2A-01-HD is primarily available as a closed-source service. It is integrated into the Hailuo AI platform and offered via the MiniMax open platform API, providing access to a library of over 300 pre-made voices.
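As a rough illustration of how the adjustable parameters (pitch, speed, emotional tone) might be bundled into an API request, the sketch below assembles and validates a JSON payload. Only the model name speech-01-hd comes from the description above; the field names, value ranges, emotion labels, and voice identifier are assumptions made for this example and are not taken from MiniMax's documentation. No network call is made.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical limits chosen for illustration only; the real service may differ.
SPEED_RANGE = (0.5, 2.0)
PITCH_RANGE = (-12, 12)
EMOTIONS = {"neutral", "happy", "sad", "angry"}


@dataclass
class VoiceSettings:
    """Delivery controls; all field names are assumed, not the official schema."""
    voice_id: str
    speed: float = 1.0
    pitch: int = 0
    emotion: str = "neutral"

    def validate(self) -> None:
        # Reject values outside the illustrative ranges defined above.
        if not SPEED_RANGE[0] <= self.speed <= SPEED_RANGE[1]:
            raise ValueError(f"speed must be within {SPEED_RANGE}")
        if not PITCH_RANGE[0] <= self.pitch <= PITCH_RANGE[1]:
            raise ValueError(f"pitch must be within {PITCH_RANGE}")
        if self.emotion not in EMOTIONS:
            raise ValueError(f"emotion must be one of {sorted(EMOTIONS)}")


def build_request(text: str, settings: VoiceSettings) -> dict:
    """Return a request body as a plain dict, ready to be serialized as JSON."""
    if not text.strip():
        raise ValueError("text must not be empty")
    settings.validate()
    return {
        "model": "speech-01-hd",  # model name as given in the description
        "text": text,
        "voice_settings": asdict(settings),
    }


# "example-voice" is a placeholder, not an actual voice from the library.
payload = build_request(
    "Chapter one.",
    VoiceSettings(voice_id="example-voice", speed=0.9, emotion="happy"),
)
print(json.dumps(payload, indent=2))
```

Validating on the client side before sending keeps malformed requests from consuming API quota; the actual limits and accepted values should be taken from the provider's API reference.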

Rankings & Comparison