CDLI Inclusive Speech Technology
Senses Hub partners with CDLI, GDI Hub (UCL), and Strathmore University to build inclusive automatic speech recognition (ASR) for non-standard speech across African languages. The project is funded by UK International Development and Google.org. It is collecting 50 hours of speech data per language from speakers with dysarthria, stuttering, and cleft palate, across Kenyan English, Kiswahili, Ugandan English, Luganda, Kinyarwanda, and Rwandese English.
Key Partners
CDLI · GDI Hub · UCL · Strathmore University · iLabAfrica · Njeri Maria Foundation
Funders
UK International Development · Google.org
Non-Standard Speech Datasets
50 hours of speech data per language across 6 African languages. Conditions include dysarthria, stuttering, and cleft palate. An open-source release is planned.
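As a rough illustration of how progress toward the 50-hour target could be tracked, the sketch below sums recording durations per language and reports the hours still needed. The record fields (`language`, `duration_s`) and the function names are hypothetical and are not taken from the project's actual tooling.

```python
from collections import defaultdict

# Per-language target stated in the project description.
TARGET_HOURS = 50.0

# The six languages named on this page.
LANGUAGES = [
    "Kenyan English", "Kiswahili", "Ugandan English",
    "Luganda", "Kinyarwanda", "Rwandese English",
]

def hours_by_language(recordings):
    """Sum recording durations (given in seconds) per language, in hours."""
    totals = defaultdict(float)
    for rec in recordings:
        totals[rec["language"]] += rec["duration_s"] / 3600.0
    return dict(totals)

def remaining_hours(recordings, languages=LANGUAGES, target=TARGET_HOURS):
    """Return the hours still needed per language to reach the target."""
    totals = hours_by_language(recordings)
    return {
        lang: max(0.0, target - totals.get(lang, 0.0))
        for lang in languages
    }
```

Languages with no recordings yet simply report the full target as remaining, so the same function works from the first day of collection.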
Custom Cards Workshops
Co-design phrase collection in Nairobi (June 2025). Languages include English, Kiswahili, Sheng, and regional dialects.
Innovation Sprint 2025
5-month hackathon (Jul-Nov 2025) across 3 tracks: Research, Modelling, and Product. Prize pool: $5,000. Demo Day: 21 Nov at Strathmore University.
Demo Day video embed (captioned):
Ethics and Consent Framework
Dedicated section with a plain-language explanation of how speech data is collected, stored, anonymised, and used. This framework should be linked from all project pages sitewide.
- Informed consent before recording
- Participant-rights and withdrawal process
- Secure storage and role-based access controls
- Clear data use policy for research and open-source outputs
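The points above could be sketched in code roughly as follows. This is a minimal illustration only: the role names, permission names, and the `ConsentRecord` type are assumptions made for the example, not the project's actual access model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical roles and the actions each one may perform.
ROLE_PERMISSIONS = {
    "annotator": {"read_audio", "write_labels"},
    "researcher": {"read_audio", "read_labels"},
    "data_steward": {"read_audio", "read_labels", "write_labels", "delete_audio"},
}

@dataclass
class ConsentRecord:
    participant_id: str                      # pseudonymous ID, never a name
    consented_at: datetime                   # informed consent precedes recording
    withdrawn_at: Optional[datetime] = None  # set when the participant withdraws

    @property
    def active(self) -> bool:
        return self.withdrawn_at is None

    def withdraw(self) -> None:
        """Record a withdrawal; repeated calls keep the first timestamp."""
        if self.withdrawn_at is None:
            self.withdrawn_at = datetime.now(timezone.utc)

def can_access(role: str, action: str, consent: ConsentRecord) -> bool:
    """Allow an action only if the role permits it and consent is still active.

    Deletion stays available after withdrawal so the data can be removed.
    """
    if action not in ROLE_PERMISSIONS.get(role, set()):
        return False
    if action == "delete_audio":
        return True
    return consent.active
```

Checking consent status on every access, rather than only at collection time, is one way to make a withdrawal take effect immediately for all roles.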
Impact and Outputs
Updated quarterly via the CMS. Includes links to published papers and cdl-inclusion.com.
Get Involved - 4 Pathways
2. Annotator / Speech Therapist
Support labelling and clinical interpretation of speech data.
Apply as Specialist
3. Developer / Researcher
Build models, tools, and accessible applications with the datasets.
Join the Build
4. Organisation Partner
Collaborate on training, deployments, and community outreach.
Partner With Us
Contact: CDLI@senseshub.vision