Voice AI Moves Past Answering the Phone
Aircall acquires Piper AI, AethexAI raises $3M for Africa and Middle East voice AI, and Zendesk ships regional voice configuration, all in one week.
By SpringVanta
Three things happened in the same week.
Aircall, the phone platform used by 23,000 businesses, acquired Piper AI on June 4 to add revenue intelligence and CRM automation to its voice tools. Two days earlier, AethexAI, a startup founded by ex-Goldman Sachs and Meta engineers, announced $3 million in pre-seed funding to build voice AI infrastructure for Africa and the Middle East, markets where Western platforms fall apart on dialect, latency, and cost. And on June 1, Zendesk rolled out voice configuration for its AI agents, letting support teams pick from high-fidelity regional voices across 60 languages.
Separately, these are small announcements. Together they point to something specific: voice AI platforms are done competing on "can it answer the phone." The new competition is what happens after the call ends, and whether it works for everyone who needs it.
Aircall buys its way past the call
Aircall has been expanding its AI capabilities since acquiring Vogent in May for specialized voice models. The Piper AI acquisition is different: it targets the work that follows the call rather than the call itself.
Piper captures customer interactions across calls, video meetings, email, WhatsApp, and field activity, then converts them into CRM updates, deal scoring, and pipeline risk signals. It supports MEDDIC, BANT, and SPICED qualification frameworks out of the box. Customers using Piper reported cutting CRM data entry time by more than 50% in the first month and improving forecast accuracy by 50%.
The bet Aircall is making: sales teams shouldn't need a separate revenue intelligence tool on top of their phone system. If Aircall handles the call and Piper handles the follow-up, the whole post-call workflow lives in one place. That's a direct shot at Gong, Clari, and the other revenue intelligence vendors who analyze sales activity but don't own the channel where the conversation happened.
For SpringVanta buyers, this matters because it closes a gap that's been real in practice. Voice AI agents can answer calls, qualify leads, and book appointments. But someone still has to update the CRM, log the outcome, and trigger the next step. If the phone platform handles that automatically, the voice agent goes from answering calls to running the intake-to-pipeline workflow end to end.
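The post-call steps described above (update the CRM, log the outcome, trigger the next step) can be pictured as a small handler. This is a minimal sketch for illustration only: `CallOutcome`, `CrmRecord`, `apply_call_outcome`, and the stage and next-step names are all hypothetical, and none of it reflects how Aircall or Piper actually implement the workflow.

```python
from dataclasses import dataclass, field


@dataclass
class CallOutcome:
    """Hypothetical summary a voice agent might emit when a call ends."""
    lead_id: str
    qualified: bool
    appointment_booked: bool
    notes: str = ""


@dataclass
class CrmRecord:
    """Hypothetical, minimal CRM record for a single lead."""
    lead_id: str
    stage: str = "new"
    activity_log: list[str] = field(default_factory=list)
    next_step: str = ""


def apply_call_outcome(record: CrmRecord, outcome: CallOutcome) -> CrmRecord:
    """Log the call, update the lead's stage, and set the next step."""
    if record.lead_id != outcome.lead_id:
        raise ValueError("outcome does not belong to this record")

    # Step 1: log the outcome.
    record.activity_log.append(outcome.notes or "call completed")

    # Steps 2 and 3: update the stage and choose the follow-up (assumed rules).
    if outcome.appointment_booked:
        record.stage = "meeting_scheduled"
        record.next_step = "send_calendar_confirmation"
    elif outcome.qualified:
        record.stage = "qualified"
        record.next_step = "assign_to_sales_rep"
    else:
        record.stage = "unqualified"
        record.next_step = "add_to_nurture_sequence"
    return record


record = apply_call_outcome(
    CrmRecord(lead_id="lead-001"),
    CallOutcome(lead_id="lead-001", qualified=True,
                appointment_booked=False, notes="Asked for pricing"),
)
print(record.stage, record.next_step)
```

The point of the sketch is that all three manual steps collapse into one function call made when the call ends, which is the kind of consolidation the acquisition is aiming at.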
AethexAI builds for the markets everyone else skipped
Most voice AI platforms were trained on American and European English. That works fine in San Francisco and London. In Lagos, Cairo, or Dubai, it falls apart. The dialects are different. Code-switching between English, French, and Arabic is constant. Telecom infrastructure is patchy, and routing calls through Western data centers adds enough latency to make the whole thing unusable.
AethexAI's founders, Mariama Diallo and Ayooluwa Odemuyiwa, spent time on the ground in Africa and the Middle East before writing code. What they found: a call center in Egypt had automated a significant portion of its calls and then rolled the system back because the results were bad. Support centers across the region told them that finding engineers who could build voice automation at the right cost was a constant problem.
Their solution was to skip the standard tools. Rather than orchestrating through Vapi or LiveKit with large models hosted outside the region, they built their own Kora model series, ranging from 300 million to 1.7 billion parameters, and their own orchestration layer. Small models mean lower latency. Local deployment means no routing through distant data centers. The company claims a price of $0.03 per minute, an order of magnitude below what Western platforms charge.
To train the models, AethexAI used anonymized recordings from a call center partner and shipped hard drives to radio stations across Africa to collect audio data. A contributor network of university students handles annotation and local name pronunciation. The startup says it's already handling more than 17,000 calls per day.
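A rough back-of-the-envelope calculation shows what the quoted figures could mean in practice. The $0.03 per minute rate and the 17,000 calls per day are the company's own claims. The three-minute average call length is an assumption made purely for illustration, and the comparison rate is simply ten times the claimed rate (one reading of "an order of magnitude"), not a published price from any vendor.

```python
CLAIMED_RATE_USD_PER_MIN = 0.03   # AethexAI's claimed price
CALLS_PER_DAY = 17_000            # AethexAI's claimed daily call volume
ASSUMED_AVG_CALL_MINUTES = 3.0    # assumption, not from the company
COMPARISON_RATE_USD_PER_MIN = CLAIMED_RATE_USD_PER_MIN * 10  # assumed 10x


def daily_cost(rate_per_min: float, calls: int, avg_minutes: float) -> float:
    """Daily spend in USD: rate x calls x average minutes per call."""
    return round(rate_per_min * calls * avg_minutes, 2)


local_cost = daily_cost(
    CLAIMED_RATE_USD_PER_MIN, CALLS_PER_DAY, ASSUMED_AVG_CALL_MINUTES
)
comparison_cost = daily_cost(
    COMPARISON_RATE_USD_PER_MIN, CALLS_PER_DAY, ASSUMED_AVG_CALL_MINUTES
)

print(f"At the claimed rate:  ${local_cost:,.2f} per day")
print(f"At ten times that:    ${comparison_cost:,.2f} per day")
print(f"Difference:           ${comparison_cost - local_cost:,.2f} per day")
```

Changing `ASSUMED_AVG_CALL_MINUTES` scales both totals equally, so the tenfold gap holds at any call length.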
4DX Ventures, the lead investor, makes the market case plainly: enterprises in Africa and the Middle East process roughly three times the call volume of Western counterparts because voice remains the dominant channel. Western platforms were built for high-end GPU infrastructure and standard English. The gaps in dialect handling, code-switching, informal speech, and local telephony infrastructure are real enough that companies in these regions either can't use voice AI at all or have tried and failed.
For businesses evaluating voice AI, the takeaway is practical. If your customers or operations touch markets outside North America and Western Europe, the voice AI tools you're evaluating probably weren't built for those environments. Check whether the platform handles the specific dialects, languages, and network conditions where you operate. AethexAI's approach of building small, local models rather than relying on large remote ones is worth understanding even if you never become their customer.
Zendesk lets AI agents sound like where they're calling from
Zendesk's voice configuration update is smaller in scope but addresses a real friction point. Until now, Zendesk's voice AI agents had a default voice with no way to customize it for different markets. As of June 1, support teams can choose male or female voices across 60 languages, add high-fidelity regional voices for specific markets (Mexican Spanish, Brazilian Portuguese, Swiss German, Australian English), and preview voices in the dashboard before deploying.
This matters because voice is personal. When the AI on a support call sounds clearly foreign, or speaks in a neutral accent that doesn't match the caller's dialect, it creates distance. Zendesk is betting that teams will pay for that specificity, at least in the markets where accent accuracy affects trust and conversion rates.
The feature is available to customers in Zendesk's voice AI agents early access program. Changes apply immediately to new conversations with no changes needed to existing procedures or knowledge bases.
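One way to picture regional voice selection is a lookup that prefers a regional voice and falls back to a language default. This is a sketch under stated assumptions, not Zendesk's API or configuration format: the function `pick_voice`, the voice identifiers, and the fallback rule are hypothetical. Only the four regional variants come from the announcement.

```python
# Hypothetical voice identifiers for the regional variants named above.
REGIONAL_VOICES = {
    "es-MX": "regional-es-mx",  # Mexican Spanish
    "pt-BR": "regional-pt-br",  # Brazilian Portuguese
    "de-CH": "regional-de-ch",  # Swiss German
    "en-AU": "regional-en-au",  # Australian English
}

# Hypothetical per-language defaults used when no regional voice exists.
LANGUAGE_DEFAULTS = {
    "es": "default-es",
    "pt": "default-pt",
    "de": "default-de",
    "en": "default-en",
}

FALLBACK_VOICE = "default-en"  # assumed last-resort choice


def pick_voice(locale: str) -> str:
    """Return a regional voice if one exists, else the language default."""
    if locale in REGIONAL_VOICES:
        return REGIONAL_VOICES[locale]
    language = locale.split("-")[0].lower()
    return LANGUAGE_DEFAULTS.get(language, FALLBACK_VOICE)


for locale in ("es-MX", "es-AR", "sw-KE"):
    print(locale, "->", pick_voice(locale))
```

The fallback order is a design choice: a caller in a market without a regional voice still hears the right language, which is roughly the situation before the update.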

What the three moves share
These three announcements don't share a market or a customer segment. Aircall targets sales and support teams at SMBs and mid-market companies. AethexAI serves enterprises in Africa and the Middle East. Zendesk sells to support organizations globally.
What they share is the direction. Voice AI platforms are moving past the initial question of whether an AI can handle a phone call. The new questions are: can it update the CRM without human intervention? Can it operate in the specific language and dialect of the person calling? Can it work on the actual network infrastructure where the call happens?
If a platform can't do all three, it'll be limited to markets where the network is fast, the language is standard English, and someone manually handles everything after the call.
Sources
- Aircall acquires Piper AI — Aircall official announcement, June 2026
- Aircall Acquires Piper AI to Automate CRM Updates, Sales Follow-Ups — TechRepublic, June 4, 2026
- These two founders left Goldman and Meta to build voice AI for markets everyone else overlooked — TechCrunch, June 3, 2026
- AethexAI raises $3M to build voice AI infrastructure for Africa and the Middle East — Tech.eu, June 4, 2026
- Announcing voice configuration for voice AI agents — Zendesk, June 1, 2026