Corpora

The Providence (English) Corpus

The Providence Corpus consists of twice-monthly digital audio/video recordings of hour-long mother-child spontaneous speech interactions involving six English-speaking children aged approximately 1 to 3 years. The data were collected in and around southern New England from 2000 to 2004 and total approximately 363 hours. The child utterances are transcribed in broad phonetic transcription. This work was funded by NIH and carried out by Katherine Demuth and colleagues at Brown University in Providence, RI. The data are available on the CHILDES database.

Those wishing to use the corpus should cite the following reference:
Demuth, K., Culbertson, J. & Alter, J. 2006. Word-minimality, epenthesis, and coda licensing in the acquisition of English. Language & Speech, 49, 137-174.

The Lyon (French) Corpus

The Lyon Corpus consists of twice-monthly digital audio/video recordings of hour-long mother-child spontaneous speech interactions involving four French-speaking children aged approximately 1 to 3 years. The data were collected in Lyon, France from 2000 to 2004 and total approximately 185 hours. The child utterances are transcribed in broad phonetic transcription. The work was funded by NIH and carried out in collaboration with Harriet Jisa and colleagues at Dynamique du Langage at the University of Lyon 2, France. The data are available on the CHILDES database.

Those wishing to use the corpus should cite the following reference:
Demuth, K. & Tremblay, A. 2008. Prosodically-conditioned variability in children's production of French determiners. Journal of Child Language, 35, 99-127.

The Demuth Sesotho Corpus

The Demuth Sesotho Corpus contains 98 hours of spontaneous speech interactions with four children aged 2 to 4 years. Audio recordings were made at monthly intervals for 3-4 hours during interactions with family and peers in rural Lesotho. The data are morphologically tagged and available as part of the CHILDES database. Corpus preparation and research have been funded by NSF, Fulbright, and SSRC.

Those wishing to use this corpus should notify Katherine Demuth and cite the following reference:
Demuth, K. 1992. Acquisition of Sesotho. In D. Slobin (ed.), The Cross-Linguistic Study of Language Acquisition, vol. 3, 557-638. Hillsdale, N.J.: Lawrence Erlbaum Associates.

Download the Demuth Sesotho Corpus here.
