Although Africa is home to a huge proportion of the world's languages – well over a quarter, according to some estimates – many are missing when it comes to the development of artificial intelligence (AI).
This stems from a lack of both investment and readily available data.
Most AI tools used today, such as ChatGPT, are trained on English, other European languages, and Chinese.
These have vast quantities of online text to draw from.
But because many African languages are mostly spoken rather than written down, there is little text available to train AI systems that would be useful for speakers of those languages.
For millions across the continent, this means being left out.
Researchers have recently released what is thought to be the largest known dataset of African languages.
Prof Vukosi Marivate from the University of Pretoria emphasized the importance of technology reflecting the languages in which people think and dream.
In response to this linguistic disparity, the African Next Voices project aims to create AI-ready datasets in 18 African languages, with plans to expand further.
The team's efforts have resulted in 9,000 hours of recorded speech from various regions, capturing everyday scenarios in farming, health, and education.
The project was made possible by a $2.2 million grant from the Gates Foundation.
The collected data is open access, allowing developers to build tools that work in African languages.
Kelebogile Mosime, a farmer from South Africa, uses an AI app that supports her native language and helps her solve farming problems.
Meanwhile, Lelapa AI, a South African startup, is committed to creating AI tools in local languages, addressing accessibility problems for non-English speakers in essential service sectors.
The overarching goal of such initiatives is not merely to enhance convenience but to preserve the cultural and historical significance of African languages.