Welcome to the home for our corpus!

VoxClamantis v1.0 includes first-pass phoneme-level alignments for more than 600 languages, high-resource alignments for ~50 languages, and phonetic measures for all vowels and sibilants. The paper was presented at ACL 2020; check out the video here.

Our data is available through the links above. More information and tools for exploring, using, and analyzing the corpus will be added over time!

We welcome questions, comments, and contributions!

Map of the Corpus Languages