Google Launches WAXAL African Languages Dataset

On February 3, Google launched WAXAL, an open speech dataset covering 21 African languages. It comprises over 11,000 hours of audio and nearly 2 million recordings, including about 1,250 hours of transcribed speech and 20 hours of studio recordings for text-to-speech (TTS). The dataset is owned by African partner institutions and released under a permissive license that supports both research and commercial deployment of automatic speech recognition (ASR) and TTS systems, advancing local digital sovereignty.
Scoring Rationale
An official, open dataset with partner ownership and a commercially usable license increases local capacity, but issues with dialect coverage and data quality remain.


