Speech-to-text, text-to-speech, and audio intelligence API — transcribe, synthesize, and analyze audio and video at scale.