Sierra Leone’s tongues have started to cross the invisible border that once kept them off phones, search engines and speech-to-text tools. Three concrete milestones show how fast the wall is crumbling.
First, Google Translate added Sierra Leonean Krio in its May 2022 expansion, which marked the product team’s first use of “zero-shot” techniques to learn a language from monolingual text alone. [blog]
Second, Meta’s FLORES-200 benchmark, released with the No Language Left Behind project, supplies professionally translated test sets for both Krio and Mende. Researchers now have a public yardstick for measuring machine-translation quality in these languages instead of relying on ad-hoc samples. [Hugging Face]
Third, community and faith networks are quietly building speech corpora: Global Recordings Network hosts more than five hours of digitised Mende audio sermons and Bible stories, all downloadable under a permissive licence. [globalrecordings.net]
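To make the "public yardstick" idea concrete, here is a minimal sketch of how a system output can be scored against a reference translation using character n-gram overlap. It is a simplified stand-in for the chrF family of metrics, not the official implementation; the function names and the English placeholder sentences are assumptions made for illustration only.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after collapsing runs of whitespace."""
    chars = " ".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Average character n-gram F-beta score in [0, 1] (simplified chrF)."""
    scores = []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one side is shorter than n characters
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        if precision + recall == 0:
            scores.append(0.0)
            continue
        f_beta = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
        scores.append(f_beta)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Placeholder sentences; a real evaluation would loop over benchmark pairs.
    reference = "the water is clean"
    print(simple_chrf("the water is clean", reference))
    print(simple_chrf("the water is clear", reference))
```

Character-level scoring is often preferred for languages with unsettled spelling, because a near-miss spelling still earns partial credit where a word-level metric would count it as a full error.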
Real recordings, real people
Progress isn’t limited to Krio. A 2020 episode of the Make Sierra Leone Famous podcast features host Vickie Remoe narrating Temne origin legends, one of the few high-quality Temne oral histories publicly available. [Spotify] Meanwhile, Lutheran Bible Translators runs an ongoing Limba “Scripture Engagement” program that pairs adult-literacy classes with radio broadcasts, producing fresh, annotated Limba text each month. [Lutheran] These materials may look small beside English news dumps, yet they are large enough for modern fine-tuning methods such as adapter training or transfer learning.
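A rough piece of arithmetic shows why small corpora can still be useful for adapter training: only a thin set of new weights is trained while the base model stays frozen. The sketch below counts parameters for a hypothetical six-layer model; the layer sizes and the bottleneck width are illustrative assumptions, not figures from any of the projects mentioned here.

```python
def feed_forward_params(d_model: int, d_ff: int, n_layers: int) -> int:
    """Weights and biases of the feed-forward sublayers (a rough proxy for model size)."""
    per_layer = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    return per_layer * n_layers


def adapter_params(d_model: int, bottleneck: int, n_layers: int) -> int:
    """One bottleneck adapter per layer: a down-projection followed by an up-projection."""
    per_layer = (d_model * bottleneck + bottleneck) + (bottleneck * d_model + d_model)
    return per_layer * n_layers


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    full = feed_forward_params(d_model=512, d_ff=2048, n_layers=6)
    small = adapter_params(d_model=512, bottleneck=64, n_layers=6)
    print(f"frozen feed-forward weights: {full:,}")
    print(f"trainable adapter weights:   {small:,}")
    print(f"share trained:               {small / full:.1%}")
```

Under these assumptions the adapters amount to a few percent of the weights they sit beside, which is one reason a few hours of audio or a few thousand sentences can be enough to adapt a model that was pretrained on much larger languages.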
What the labs have already proved
With the benchmark sentences and the growing audio archives, university teams have trained Krio-to-English translators that run on mid-range Android phones, and Njala linguistics students have produced the first experimental Mende speech-to-text model by pairing GRN audio with hand-typed transcripts. Early Temne models trained on just a few thousand sentences show measurable quality gains when multilingual transfer lets them share parameters with richer Krio and Mende models.
The gaps that still matter
Krio enjoys more digital text than any other Sierra Leonean language, but Temne, Limba, Kono and Susu remain “low-resource.” Closing that gap will require hours of conversational recordings that go beyond religious material, clearer conventions for spelling variants, and open licences so local start-ups can deploy speech engines without complex royalties.
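Part of the spelling-variant problem is mechanical and can be handled before any linguistic convention is agreed. The sketch below assumes an orthography that uses letters such as ɛ and ɔ, which contributors sometimes type with look-alike characters, and tone or accent marks that may arrive as separate combining characters. The `LOOKALIKES` table and the `normalize_spelling` name are hypothetical; a real project would build its table with speakers and linguists.

```python
import unicodedata

# Hypothetical look-alike table: characters that resemble the intended letter
# but are different Unicode code points.
LOOKALIKES = {
    "\u03b5": "\u025b",  # Greek small epsilon -> Latin small open e (ɛ)
    "\u2184": "\u0254",  # Latin small reversed c -> Latin small open o (ɔ)
}


def normalize_spelling(text: str) -> str:
    """Return text in NFC form with look-alikes replaced and whitespace collapsed."""
    text = unicodedata.normalize("NFC", text)       # compose base letter + combining mark
    text = text.translate(str.maketrans(LOOKALIKES))
    return " ".join(text.split())


if __name__ == "__main__":
    raw = "e\u0301   \u03b5 \u2184"
    cleaned = normalize_spelling(raw)
    # Print code points so the change is visible even when glyphs look identical.
    print([f"U+{ord(ch):04X}" for ch in cleaned])
```

Normalising this way means two transcripts of the same word compare as equal, so scarce training data is not split across visually identical spellings.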
Why it is worth the effort
When a language finds its way into keyboards, subtitles and chatbots, speakers gain more than cultural pride. Farmers in Kenema can ask agronomy questions in Mende; clinic staff in Port Loko can send vaccination reminders that Limba-speaking parents actually read; schoolchildren see their mother tongue on a screen and understand that it belongs there as much as English.
A practical invitation
Pombo Labs is focusing on verifiable work only. If you have a box of cassettes, WhatsApp voice notes, or time to correct transcripts, your contribution can feed the next round of models. The digital renaissance is no longer a thought experiment: Krio is on Google Translate, Krio and Mende have public benchmark test sets, and Mende and Temne recordings sit in publicly hosted audio libraries. With steady, documented additions, every Sierra Leonean language can secure its place in the AI era.