
Saheed Azeez, a final-year Mechanical Engineering student at the University of Lagos (UNILAG), has developed YarnGPT, an AI-driven text-to-speech model that reads text aloud in a Nigerian accent. The project showcases the growing talent within Nigeria’s academic institutions and addresses a significant challenge: building AI systems that reflect local linguistic nuances.
The Genesis of YarnGPT

Azeez’s journey into artificial intelligence gained attention in November 2024 with the creation of Naijaweb, a dataset comprising 230 million GPT-2 tokens sourced from Nairaland, Nigeria’s largest online forum. This endeavor aimed to enhance AI’s comprehension of Nigerian languages and cultural contexts. Building upon this foundation, Azeez embarked on a more ambitious project: developing a text-to-speech model capable of delivering outputs in a Nigerian accent.
Overcoming Data Challenges
One of the primary hurdles in developing YarnGPT was acquiring high-quality audio data that reflects Nigerian speech patterns. Azeez initially turned to Nollywood, Nigeria’s film industry, which produces over 2,500 movies annually. However, poor audio quality and inaccurate subtitles made it hard to extract reliable data. To overcome this, Azeez supplemented his dataset with high-quality audio resources from platforms like Hugging Face, combining them with the Nigerian movie excerpts to train his model.
Technical Endeavors and Financial Constraints
Training a model of this nature demands substantial computational resources. Without a Graphics Processing Unit (GPU) of his own, Azeez relied on cloud computing services such as Google Colab. An initial investment of $50 (approximately ₦80,000) in cloud credits was used up without yielding the desired results, as the model underperformed. Undeterred, Azeez discovered Oute AI, a platform that had developed a text-to-speech model using an autoregressive approach. In this method the model predicts one unit at a time (a word or word fragment in a text model), appends it to the sequence so far, and then predicts the next unit, much as language models like ChatGPT generate sentences. This autoregressive framework provided more coherent and natural-sounding speech outputs.
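The autoregressive loop described above can be illustrated with a toy example. This is a minimal sketch, not YarnGPT's or Oute AI's actual code: the lookup table NEXT_TOKEN_PROBS and the functions predict_next and generate are hypothetical stand-ins for a trained neural network, and the tokens are plain words rather than the audio tokens a speech model would produce.

```python
# Hypothetical toy "model": for each token, the probability of each possible
# next token. A real text-to-speech model would be a neural network that
# scores thousands of candidate tokens using the whole sequence as context.
NEXT_TOKEN_PROBS = {
    "<start>": {"how": 0.9, "my": 0.1},
    "how": {"far": 0.8, "people": 0.2},
    "far": {"my": 0.7, "<end>": 0.3},
    "my": {"people": 0.9, "<end>": 0.1},
    "people": {"<end>": 1.0},
}


def predict_next(sequence):
    """Return the most likely next token (this toy looks only at the last token)."""
    probs = NEXT_TOKEN_PROBS[sequence[-1]]
    return max(probs, key=probs.get)


def generate(max_steps=10):
    """Generate tokens one at a time, feeding each prediction back in as context."""
    sequence = ["<start>"]
    for _ in range(max_steps):
        token = predict_next(sequence)
        if token == "<end>":
            break
        sequence.append(token)  # the new token becomes part of the input
    return sequence[1:]  # drop the <start> marker


print(generate())  # → ['how', 'far', 'my', 'people']
```

The key point is the feedback step: each predicted token is appended to the sequence before the next prediction is made, so earlier outputs shape later ones.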
The Intricacies of Audio Tokenization
A critical aspect of developing YarnGPT was tokenizing audio data. Unlike text, which has clear delimiters between words, audio is a continuous signal with no built-in boundaries between units. To process it, the model needed to segment the continuous sound waves into smaller, manageable units, converting the audio into a sequence of discrete values. This process enabled the model to understand and generate speech that mirrors the natural flow and intonation of Nigerian accents.
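To make the idea of turning a continuous signal into discrete values concrete, here is a minimal sketch. It is an illustration under simplifying assumptions, not the tokenizer YarnGPT uses: production systems typically rely on learned neural audio codecs, whereas this toy simply cuts the waveform into fixed-size frames and maps each frame's average amplitude to one of a few integer codes. The function names and the parameters frame_size and levels are hypothetical.

```python
import math


def make_sine(freq_hz=440.0, sample_rate=8000, num_samples=80):
    """Create a short test waveform with amplitudes in [-1, 1]."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(num_samples)]


def tokenize(samples, frame_size=8, levels=16):
    """Split samples into frames and map each frame to an integer code.

    Each frame's mean amplitude (in [-1, 1]) is scaled to a bucket index
    between 0 and levels - 1. Trailing samples that do not fill a whole
    frame are dropped.
    """
    tokens = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        mean = sum(frame) / frame_size
        code = int((mean + 1) / 2 * levels)
        code = max(0, min(levels - 1, code))  # clamp to the valid range
        tokens.append(code)
    return tokens


waveform = make_sine()
tokens = tokenize(waveform)
print(len(waveform), "samples ->", len(tokens), "tokens")
print(tokens)
```

Once audio is represented as a sequence of integers like this, a model can predict the next audio token in the same way a language model predicts the next piece of text.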
Community Engagement and Future Prospects

Azeez produced a concise video demonstration to introduce YarnGPT to a broader audience. It garnered significant attention on social media and among founders, including acknowledgment from industry leaders like Timi Ajiboye, co-founder of Helicarrier. The model’s capability extends beyond English, as it can articulate text in Nigerian languages such as Hausa, Igbo, and Yoruba. This versatility opens avenues for applications in content creation, language learning, and accessibility tools, providing voice-overs in Nigerian accents and languages.
The Broader Implications for Artificial Intelligence in Nigeria
While innovators like Azeez are making commendable strides, Nigeria faces challenges in the AI sector, primarily due to limited access to extensive datasets and computational resources. Azeez emphasizes the importance of localizing existing artificial intelligence models to cater to Nigerian languages and accents, suggesting that adapting pre-existing models could be a pragmatic approach to advancing AI development in the country.
The Nigerian government has expressed intentions to position the nation as a significant player in AI development. With talents like Azeez leading grassroots innovation, there is optimism that Nigeria can bridge the current gaps and emerge as a hub for AI excellence in Africa.
Azeez’s journey underscores the potential within Nigeria’s academic institutions and the importance of supporting such initiatives to foster technological advancement that is inclusive and reflective of local cultures.