Amazon

Polly Standard

Released Nov 2016

Amazon Polly Standard is a cloud-based text-to-speech engine that uses concatenative synthesis to convert text into lifelike speech. Introduced in 2016 as the foundation of the Amazon Polly service, the Standard engine produces audio by stringing together short segments of recorded human speech, each corresponding to a phoneme.

The engine supports a broad range of languages and dozens of voices across several dialects. Amazon has since introduced more complex Neural and Generative engines built on deep learning models, but the Standard engine retains lower latency than those engines and continues to support a wide catalog of international voices. Users can control speech output with Speech Synthesis Markup Language (SSML), adjusting phrasing, emphasis, and pronunciation.
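
The SSML controls mentioned above can be illustrated with a small sketch. It is not taken from Amazon's documentation verbatim: the helper names `build_ssml` and `synthesize`, the sample sentence, the voice `Joanna`, and the output path are all assumptions chosen for illustration. The sketch assumes the `boto3` SDK is installed and AWS credentials are configured before `synthesize` is called; `build_ssml` itself uses only the Python standard library.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(intro, stressed, term, ipa, pause_ms=300):
    """Return an SSML document (hypothetical helper, for illustration only).

    - <break> inserts a pause, which affects phrasing
    - <emphasis> stresses a word or phrase
    - <phoneme> overrides the pronunciation of a term using IPA
    """
    return (
        "<speak>"
        f"{escape(intro)} "
        f'<break time="{int(pause_ms)}ms"/> '
        f'<emphasis level="strong">{escape(stressed)}</emphasis> '
        f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(term)}</phoneme>'
        "</speak>"
    )


def synthesize(ssml, voice_id="Joanna", out_path="speech.mp3"):
    """Send SSML to Polly's Standard engine and save the audio.

    Assumes boto3 is installed and credentials are available; the voice
    and file name are placeholder choices, not recommendations.
    """
    import boto3  # third-party; imported lazily so build_ssml works without it

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",      # tell Polly the input is SSML, not plain text
        VoiceId=voice_id,
        OutputFormat="mp3",
        Engine="standard",    # select the Standard (concatenative) engine
    )
    with open(out_path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


if __name__ == "__main__":
    print(build_ssml("I said", "never", "pecan", "pɪˈkɑːn", pause_ms=500))
```

Escaping the text and quoting the attribute keeps the markup well-formed when the input contains characters such as `&` or `<`. Which SSML tags are honored can differ between voices and engines, so the tag set used here should be checked against the voice actually selected.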

Rankings & Comparison