Speechmatics

Speech-to-text

Enterprise speech-to-text with very broad language coverage and real on-prem options, for teams who self-host.

Best for: wide language coverage, or running speech-to-text on your own hardware
Watch for: one building block to wire up, not a finished agent

Paid link, we may earn a commission. How this works.

Our scores (editorial preview)
Overall: 4.6 / 10 (Fair)
Voice quality: 2
Voice range: 3
Ease of use: 5
Value: 9
All-in /min: $0.00–0.01
Headline /min: $0.00 (about $0.004 before rounding)
✓ HIPAA  ✓ SOC 2 Type II  ✓ GDPR

Scored on the same voice-agent rubric as the full platforms, so a building block like this scores low on the axes it does not address. Read its value score against its job.

The languages-and-deployment specialist. Speechmatics turns speech into text in 55+ languages and will run inside your own data centre, not just its cloud. It is one building block though, not a whole phone agent. No voice, no language model, no phone line.

What you'll pay

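About $0.00 to $0.01 for a minute of audio (roughly $0.004 to $0.012 before rounding). That covers transcription only: the phone line, the language model and the voice are not included and cost extra.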

That's roughly $0.24–0.72 an hour. Plans: $0/mo (Free).

Pricing

All-in: $0.00–0.01 /min (≈ €0.00–0.01, ≈ £0.00–0.01, ≈ ₹0.38–1.15, ≈ R$0.02–0.06, ≈ A$0.01–0.02). Headline: $0.00 /min. Both figures are rounded to the cent.

Cost breakdown
Platform: what the platform charges to run the agent, before the phone line and the AI usage are added on. No figure listed.
Speech-to-text: the step that turns what the caller says out loud into text the AI can read. $0.00 /min
Language model: the AI 'brain' that reads what the caller said and works out what to say back. Not included.
Text-to-speech: the step that turns the AI's written reply back into a spoken voice. Not included.
Telephony: the phone line itself, the service that connects the call to a real phone number. Usually billed on top of the platform. Not included.
All-in: the total you actually pay for one minute of conversation once every piece is added up (the platform, the AI, the voice and the phone line). $0.00–0.01 /min

Speechmatics prices per hour of audio, not per minute. The pricing page states that the Pro plan starts from $0.24/hr, which is about $0.004 a minute, the headline figure carried here. That is the floor: per-service rates run higher and vary by model and mode. A third-party 2026 breakdown (PulseSignal) lists Pro batch at $0.0050/min Standard and $0.0083/min Enhanced, and real-time at $0.0067/min Standard and $0.0117/min Enhanced, which is roughly $0.30 to $0.70 an hour depending on model and mode. We treat the vendor's own "from $0.24/hr" figure as primary and the per-service split as indicative until confirmed from the account portal.

The free tier is 480 minutes a month (8 hours). A 20% volume discount applies above 500 hours a month per service.

This is speech-to-text only: there is no language model, no text-to-speech and no telephony, so those components are priced at $0 here. To run a full phone agent you add an LLM, a voice engine and a phone line separately, each a cost on top.

Plans & what you get

Every plan in one place: the monthly fee, what each one includes, and the features it unlocks. Anything beyond a plan's allowance, or on a pay-as-you-go tier, is billed at the per-minute rate above. A blank in the features means the vendor's plan page does not state it for that plan, not that it is unavailable.

| | Free | Pro | Enterprise |
|---|---|---|---|
| Price | Free | | Custom |
| Included | 480 minutes | Pay per use | |
| Plan notes | 480 free minutes per month, no card required, 2 concurrent real-time sessions | Pay-as-you-go on usage, from $0.24/hr, capped at 6,000 hours/month, 50 concurrent real-time sessions | Custom pricing, no rate limits, on-prem/container deployment, volume discounts from 24,000 hours/year |

What each plan unlocks

| | Free | Pro | Enterprise |
|---|---|---|---|
| API access | Yes | Yes | |
| Concurrent calls | 2 real-time sessions | 50 real-time sessions | |
| Priority support | | | Custom deployment + volume pricing |
Only the Free plan bundles a set amount of audio a month (480 minutes). Pro is billed on usage, and Enterprise pricing is custom.

Prices in USD as set by the vendor · last checked 2026-06-03

At a glance

Speech-to-text: Speechmatics Ursa
Text-to-speech: none (speech-to-text only)
Languages (sample of the 55+ supported): en, es, fr, de, it, pt, nl, pl, ru, ar, hi, zh, ja, ko, cy
Integrations: Real-time API (streaming), Batch API (recorded files), On-prem containers (CPU / GPU), Kubernetes self-host, Virtual Appliance (on-prem VM), Native SDKs

Compliance

✓ HIPAA  ✓ SOC 2 Type II  ✓ GDPR

Our full take

Speechmatics is a speech-to-text engine, and that is the first thing to get straight. It listens to audio and writes down the words. It does not generate a reply, it does not speak back, and it does not dial a phone. So if you are shopping for a finished voice agent that answers your calls, this is not that. It is one of the parts you would build that agent from, and it is a good one.

Where it earns its place is languages. Speechmatics transcribes 55+ languages off a single model, which means you get the regional accents and dialects (Brazilian Portuguese, Canadian French, and so on) without bolting on a separate pack for each. Most of the cheaper speech-to-text engines top out at around seven to ten languages. Deepgram, the closest building-block vendor we cover, lists seven. If your callers speak Tagalog, Welsh, Swahili or Urdu, that gap is the entire reason to look here.

The second reason is where it runs. Most speech-to-text APIs only run in the vendor’s cloud: you send them audio and they send back text. Speechmatics will also run inside your own data centre, as a container on your own hardware (CPU or GPU), on Kubernetes, or as a pre-built virtual machine they call a Virtual Appliance. For a hospital or a bank that cannot let call audio leave the building, that on-premises option (meaning it runs on your own servers, not someone else’s cloud) is often a hard requirement, not a nice-to-have. It is the kind of thing you cannot retrofit, so it matters that it is there from the start.

Now the pricing, and here is the bit that trips people up. Speechmatics bills per hour of audio, not per minute like the agent platforms. The pricing page lists the Pro plan as starting from $0.24 an hour, which works out at about $0.004 a minute, the figure shown at the top of this page. Treat that as the floor, not the average. The headline is the cheapest service at the lowest tier, and the real rate climbs with the model you pick and the mode you run.
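The hour-to-minute conversion above can be checked in a couple of lines. This is only a sketch of the arithmetic: the function name is ours, and the only input taken from the vendor is the "from $0.24/hr" figure.

```python
def per_hour_to_per_minute(rate_per_hour: float) -> float:
    """Convert a per-hour audio rate (USD) into a per-minute rate."""
    return rate_per_hour / 60.0


# "From" price for the Pro plan, as quoted on the vendor pricing page.
pro_floor_per_hour = 0.24
pro_floor_per_minute = per_hour_to_per_minute(pro_floor_per_hour)

# Print with four decimals so the sub-cent figure is not rounded away.
print(f"${pro_floor_per_minute:.4f} per minute")
```

Rounding to the cent is why the same figure appears as $0.00 in the summary boxes on this page.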

Here is roughly how it splits, so you can see the workings. A third-party 2026 breakdown puts Pro batch transcription (processing a recorded file after the fact) at about $0.0050 a minute on the Standard model and $0.0083 on the higher-accuracy Enhanced model. Real-time transcription (live, as the audio streams in) runs around $0.0067 Standard and $0.0117 Enhanced. In per-hour terms that is roughly $0.30 to $0.70 an hour depending on model and mode. We are flagging that split as indicative rather than gospel, because we are sourcing it from a pricing aggregator, not from the vendor’s own rate card, which loads behind the account portal. The vendor’s own from $0.24/hr is what we are treating as primary. There is a free tier of 480 minutes a month (8 hours) to test with, no card needed, and a 20% volume discount once you cross 500 hours a month.
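To see how the per-minute split turns into the per-hour range, here is a rough sketch. The rates are the indicative third-party figures quoted above, not a confirmed vendor rate card, and the names `INDICATIVE_RATES_PER_MINUTE`, `per_hour` and `estimate_cost` are ours. The estimate deliberately ignores the free tier and the volume discount, because the source does not spell out exactly how either is applied.

```python
# Indicative per-minute rates (USD) from the third-party breakdown.
# Treat these as assumptions until confirmed against the vendor's rate card.
INDICATIVE_RATES_PER_MINUTE = {
    ("batch", "standard"): 0.0050,
    ("batch", "enhanced"): 0.0083,
    ("real-time", "standard"): 0.0067,
    ("real-time", "enhanced"): 0.0117,
}


def per_hour(rate_per_minute: float) -> float:
    """Convert a per-minute rate into a per-hour rate."""
    return rate_per_minute * 60


def estimate_cost(hours_of_audio: float, mode: str, model: str) -> float:
    """Rough usage cost in USD, ignoring free minutes and volume discounts."""
    return hours_of_audio * per_hour(INDICATIVE_RATES_PER_MINUTE[(mode, model)])


# Show the hourly equivalent of each mode and model combination.
for (mode, model), rate in INDICATIVE_RATES_PER_MINUTE.items():
    print(f"{mode:9} {model:8} ${per_hour(rate):.2f}/hr")
```

The cheapest and dearest combinations come out at about $0.30 and $0.70 an hour, which is where the range quoted above comes from.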

One honest caveat on cost. That per-minute number looks tiny next to a $0.06-a-minute agent platform, and it is, but it is not comparing like for like. Speechmatics is charging you for one job, the transcription. The platforms are charging for transcription plus the language model plus the voice plus the phone line bundled together. To build a full phone agent on Speechmatics you still have to pay for an LLM, a text-to-speech engine and telephony separately. Add those up and the real per-minute cost lands a lot closer to the bundled platforms than the $0.004 headline suggests.
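One way to make that comparison concrete is to work out how much room is left for the other three parts before a self-built stack costs as much as a bundled platform. The sketch below uses only the two figures from this page ($0.06 a minute bundled, $0.004 a minute headline transcription). The function names are ours, and real LLM, voice and telephony rates are left for you to supply, since this page does not source them.

```python
def remaining_budget_per_minute(bundled_rate: float, stt_rate: float) -> float:
    """Per-minute spend left for LLM, voice and phone line at price parity."""
    return bundled_rate - stt_rate


def diy_stack_per_minute(stt: float, llm: float, tts: float, telephony: float) -> float:
    """Total per-minute cost of a self-assembled agent; all inputs in USD."""
    return stt + llm + tts + telephony


BUNDLED_PLATFORM = 0.06  # the $0.06-a-minute comparison point used above
STT_HEADLINE = 0.004     # Speechmatics headline floor, per minute

headroom = remaining_budget_per_minute(BUNDLED_PLATFORM, STT_HEADLINE)
print(f"${headroom:.3f} per minute left for LLM, voice and phone line")
```

If the rates you are quoted for those three parts add up to more than that headroom, the self-built route costs more per minute than the bundled platform, before counting any engineering time.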

On compliance, Speechmatics is unusually well-documented for a building block. Its own security page states SOC 2 Type II, ISO/IEC 27001:2022, GDPR and full HIPAA compliance, with AES-256 encryption at rest and TLS 1.2 or higher in transit, plus a public trust centre where you can pull the actual reports. We have ticked HIPAA, SOC 2 Type II and GDPR here because the vendor states them directly. We left SOC 2 Type I unticked: the page names Type II, not Type I, and we do not assume one from the other. For a regulated buyer, that combination of on-prem deployment plus written certifications is the strong card.

My read: Speechmatics is the one you reach for when language coverage or on-premises deployment is non-negotiable, and you have the engineering to assemble the rest of the agent around it. The voice-quality and ease-of-use scores sit lower here than for a finished platform, and that is fair: this is infrastructure, not a product you switch on. If you just want calls answered without standing up your own stack, a bundled platform will get you there faster. If you need to transcribe twenty languages, or keep the audio on your own servers, very little else competes.

The 1 to 10 scores on this page are an editorial preview, our provisional read to get the framework in place, not a measured result. We have not run Speechmatics through our own test calls yet, so there is no Voxrater latency figure here. The pricing, language, deployment and compliance detail is sourced from Speechmatics’ own pricing, security, languages and deployments pages plus one third-party pricing breakdown, captured 2026-05-31.

Sources

  1. Speechmatics pricing verified 2026-06-02: Pro from $0.24/hr (= $0.004/min), 480 free minutes/mo; speech-to-text only, so no per-minute voice-output rate. · captured 2026-06-02
  2. Speechmatics pricing page re-captured 2026-06-02 for the quarterly re-verification (screenshot in evidence/). · captured 2026-06-02
  3. Speechmatics pricing page: per-plan features (Free, Pro, Enterprise), Pro from $0.24/hr, Free 480 min/month, 6,000 hr/month cap, 24,000 hr/year enterprise discount · captured 2026-05-31
  4. Third-party 2026 per-service breakdown: batch $0.0050/$0.0083 per min, real-time $0.0067/$0.0117 per min (Standard/Enhanced) · captured 2026-05-31
  5. Speechmatics security page: SOC 2 Type II, ISO/IEC 27001:2022, GDPR and HIPAA claims, AES 256 / TLS 1.2+, Azure + on-prem · captured 2026-05-31
  6. Speechmatics languages page: 55+ languages for speech-to-text with accent/dialect coverage · captured 2026-05-31
  7. Features and deployments: SaaS, on-prem containers (CPU/GPU), Kubernetes, Virtual Appliance, real-time + batch · captured 2026-05-31
  8. Speechmatics partner programme: build/market/sell tracks plus partner marketplace · captured 2026-05-31