Onstage, Google announced new text-to-speech previews that let developers take advantage of “native audio output” for improved customization. Google says that native audio output, driven by its latest Gemini models, enables more expressive, natural speech, with voices that capture subtle nuances and can seamlessly switch to a whisper.
According to Google, native audio output works in more than 24 languages and can switch between them on the fly. It’s available in the Gemini API starting today.