Open Voice is an open-source speech dataset (corpus) built from the voices of native speakers who want to help computer systems understand and speak African languages.

For Developers

Contribute to Open Voice by helping us design and build the product.

For Partners & Sponsors

Sponsor us to advance AI and promote an inclusive African AI future. Gain early access to AI solutions, recognition, and networking opportunities.

For Speech Contributors

Lend your voice to train computers to speak your native language. All it takes is a fun language game played on your device.

What is a speech dataset?

Imagine a collection of recordings in which people speak different languages with different accents. This data helps computers understand human speech, which makes it possible to build better technologies for everyone. We collect the recordings through a fun language game that anyone can play, then make them available for anyone to use to train machine learning models.

Why should you get involved?

Partner with us to support the growth and development of AI in Africa. As a partner of the Open Voice initiative, you’ll have the opportunity to contribute speech data, collaborate on projects, and showcase your commitment to advancing AI technology on the continent.

Gain access to cutting-edge AI solutions, thought leadership opportunities, and networking events as a valued partner. We offer a range of benefits designed to help you maximise the impact of your partnership and drive positive change in Africa.

Connect with a diverse network of industry experts, researchers, and AI enthusiasts to collaborate, share knowledge, and find new opportunities.

How our Dataset drives innovation and

Open Voice data is transforming technology and preserving cultural heritage. 

Automatic Speech Recognition (ASR)

Converting spoken language into written text.


Powering voice assistants, transcription services, and real-time captioning to make technology more accessible.

Natural Language Processing (NLP)

Enhancing text-based applications with speech data.


Improving sentiment analysis, named entity recognition, text summarisation, language translation, and conversational agents.

Natural Language Understanding (NLU)

Understanding the meaning and intent behind spoken language.


Developing sophisticated virtual assistants and chatbots that can handle complex queries and provide accurate responses.

Speaker Recognition

Identifying or verifying speakers based on their voice.


Enhancing security systems and personalising user experiences in voice-activated applications.

Speech Synthesis (Text-to-Speech - TTS)

Converting written text into spoken language.


Generating natural-sounding speech for audiobooks, navigation systems, and assistive technologies.

Emotion Detection

Identifying emotions from speech patterns.


Improving customer service interactions, mental health monitoring, and user experience personalisation.

Linguistic Research

Studying the phonetics, phonology, and prosody of African languages.


Advancing academic research, preserving languages, and developing educational materials.

Accessibility Tools

Assisting people with disabilities.


Creating voice-controlled interfaces and real-time transcription for people who are deaf or hard of hearing.

Cultural and Historical Preservation

Documenting and preserving endangered languages.


Creating audio archives for future generations and supporting language revitalisation efforts.

