AethexAI, a voice AI startup, has raised $3 million in pre-seed funding to address gaps in automated customer service for Africa and the Middle East. The company, founded by Mariama Diallo and Ayooluwa Odemuyiwa, is building a platform tailored to these markets, including support for localized dialects of English, French, and Arabic. The startup is also opening its platform to enterprises that want to try its technology and sign up for its services, and releasing APIs and SDKs so developers can experiment with its models.

Rather than using existing orchestration tools like Vapi and LiveKit, AethexAI built its own small model and orchestration layer from scratch to handle the dialects spoken across its target markets. The decision was driven by the demands of operating in the region, where latency and jitter on automated calls were found to be 'outrageous.' The company’s models, part of its Kora series, range from 300 million to 1.7 billion parameters, a fraction of the size of typical large language models. To train them, the startup used anonymized recordings from a call center partner and shipped hard drives to radio stations across Africa to collect more audio data. It also built a contributor network of university students to annotate data and pronounce local names. As a result, the startup says, it now handles more than 17,000 calls per day.

AethexAI’s founders noted that most major voice AI players were not built with Africa and the Middle East in mind. Walter Baddoo, co-founder and managing partner of 4DX Ventures, argued that the market is fundamentally different from those most voice AI companies were built to serve. Enterprises in the region process roughly three times the call volume of their Western counterparts, as voice is still the dominant channel for customer interaction. Incumbent systems were built for Western markets, which are characterized by high-end GPU infrastructure, standard English and European speech environments, and enterprise workflows common in the U.S. and Europe. That creates real gaps when enterprises need systems that handle dialects, code-switching, and informal speech patterns, work within their existing telephony infrastructure, and fit their actual price points.

Source: TechCrunch