
Use case · Voice building blocks

Best voice building blocks: text-to-speech and infrastructure

These are the engines, not the finished agent: the text-to-speech, voice cloning and low-level infrastructure you assemble into your own product. Pick these when you are building rather than buying.

Editorial preview ordering. The order reflects our provisional 1–10 scores, which were set to get the framework in place and are not yet based on blind test calls. Pricing, voice counts and compliance are sourced and dated on each platform's page. How we test.

What to look for: voice quality and range, bring-your-own-voice cloning, latency you can live with, and pricing that scales by characters or audio rather than per call.
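Per-character and per-minute prices are hard to compare directly. The sketch below shows one rough way to put them on the same footing. It assumes about 900 characters of text per spoken minute (our assumption, roughly 150 words a minute, not a figure from any vendor), and the function names and the example price are hypothetical. Measure your own scripts before relying on it.

```python
# Assumption: ~900 characters of text per minute of speech (about 150 words/min).
# This is an illustrative default, not a vendor figure; measure your own content.
ASSUMED_CHARS_PER_MINUTE = 900.0


def per_minute_from_characters(price_per_1k_chars: float,
                               chars_per_minute: float = ASSUMED_CHARS_PER_MINUTE) -> float:
    """Convert a price per 1,000 characters into an approximate price per spoken minute."""
    return price_per_1k_chars * chars_per_minute / 1000.0


def monthly_cost(price_per_minute: float,
                 minutes_per_call: float,
                 calls_per_month: int) -> float:
    """Estimate monthly spend when billing scales with audio minutes rather than per call."""
    return price_per_minute * minutes_per_call * calls_per_month


if __name__ == "__main__":
    # Hypothetical price of $0.05 per 1,000 characters.
    rate = per_minute_from_characters(0.05)
    print(f"Approximate per-minute rate: ${rate:.3f}")
    # 1,000 three-minute calls a month at that rate.
    print(f"Estimated monthly spend: ${monthly_cost(rate, 3, 1000):.2f}")
```

Because the speaking rate is a guess, treat the result as a range check against the per-minute figures listed here, not as a quote.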

Some links here are affiliate links, and we may earn a commission. How this works.

ElevenLabs Profiled
Best voice quality

The most natural-sounding AI voice we have heard, whether you are voicing a video or putting it on a live phone line.

$0.10–0.30/min all-in ≈ €0.09–0.26 ≈ £0.07–0.22 ≈ ₹9.57–28.71 ≈ R$0.50–1.51 ≈ A$0.14–0.42
Not yet call-tested · Prices 2026-06-03

Cartesia Profiled
Fastest · Budget pick

The speed specialist whose fast, natural speech is what keeps a live phone agent feeling real rather than laggy.

$0.08–0.15/min all-in ≈ €0.07–0.13 ≈ £0.06–0.11 ≈ ₹7.66–14.36 ≈ R$0.40–0.75 ≈ A$0.11–0.21
Not yet call-tested · Prices 2026-06-03

Murf AI Profiled
Easiest to use

Studio-quality voiceover for your video, course or advert, no microphone needed. Built for narration, not live calls.

$0.14–0.18/min all-in ≈ €0.12–0.15 ≈ £0.10–0.13 ≈ ₹13.40–17.23 ≈ R$0.70–0.90 ≈ A$0.20–0.25
Not yet call-tested · Prices 2026-06-03

LiveKit Profiled
Most flexible

The open-source real-time stack that carries voice-agent audio, plus a framework to wire your own STT, LLM and voice.

$0.02–0.20/min all-in ≈ €0.02–0.17 ≈ £0.01–0.15 ≈ ₹1.91–19.14 ≈ R$0.10–1.00 ≈ A$0.03–0.28
Not yet call-tested · Prices 2026-06-03

Hume AI Profiled
Best voice quality

A voice that picks up how the caller is feeling and answers in kind, for warmer and more human conversations.

$0.05–0.13/min all-in ≈ €0.04–0.11 ≈ £0.04–0.10 ≈ ₹4.79–12.44 ≈ R$0.25–0.65 ≈ A$0.07–0.18
Not yet call-tested · Prices 2026-06-03

Pipecat Profiled

Open-source Python framework where you pick every voice-agent part, free to self-host, with Daily's cloud for scaling.

$0.03–0.20/min all-in ≈ €0.02–0.17 ≈ £0.02–0.15 ≈ ₹2.58–19.14 ≈ R$0.14–1.00 ≈ A$0.04–0.28
Not yet call-tested · Prices 2026-06-03

Rime Profiled

Enterprise text-to-speech built for high-stakes phone calls, where a mispronounced name loses the customer.

$0.04–0.10/min all-in ≈ €0.03–0.09 ≈ £0.03–0.07 ≈ ₹3.83–9.57 ≈ R$0.20–0.50 ≈ A$0.06–0.14
Not yet call-tested · Prices 2026-06-03

OpenAI's speech-to-speech model and API for building your own voice agent, billed by audio tokens, not by the minute.

$0.07–0.48/min all-in ≈ €0.06–0.41 ≈ £0.05–0.36 ≈ ₹6.70–45.94 ≈ R$0.35–2.41 ≈ A$0.10–0.67
Not yet call-tested · Prices 2026-06-03

Deepgram Profiled
Budget pick

Fast, accurate speech-to-text to power high-volume voice apps, for teams happy to build on a developer API.

$0.08–0.18/min all-in ≈ €0.07–0.15 ≈ £0.06–0.13 ≈ ₹7.66–17.23 ≈ R$0.40–0.90 ≈ A$0.11–0.25
Not yet call-tested · Prices 2026-06-03

AssemblyAI Profiled
Budget pick

Accurate streaming speech-to-text with built-in audio intelligence, for teams who want the listening half done well.

$0.00–0.01/min all-in ≈ €0.00–0.01 ≈ £0.00–0.01 ≈ ₹0.24–0.72 ≈ R$0.01–0.04 ≈ A$0.00–0.01
Not yet call-tested · Prices 2026-06-03

Speechmatics Profiled

Enterprise speech-to-text with very broad language coverage and real on-prem options, for teams who self-host.

$0.00–0.01/min all-in ≈ €0.00–0.01 ≈ £0.00–0.01 ≈ ₹0.38–1.15 ≈ R$0.02–0.06 ≈ A$0.01–0.02
Not yet call-tested · Prices 2026-06-03

Want the numbers side by side? Open the full ranking table, build your own comparison, or estimate spend in the cost calculator.