PhonBank Arabic Kern Corpus

Sophie Kern

University of Lyon
Sophie.Kern@cnrs.fr
website

Participants:	4
Type of Study:	naturalistic, longitudinal, monolingual
Location:	Tunisia
Media type:	audio
DOI:	doi:10.21415/T59S3X

Browsable transcripts
Phon data
CHAT data
Link to media folder

Citation information

Kern, Sophie, Barbara L. Davis & Inge Zink (2009). From babbling to first words in four languages: Common trends, cross language and individual differences. In Francesco d’Errico & Jean-Marie Hombert (eds.), Becoming eloquent: Advances in the emergence of language, human cognition and modern culture. John Benjamins Publishing Company.
Kern, Sophie & Barbara L. Davis (2009). Emergent complexity in early vocal acquisition: Cross-linguistic comparisons of canonical babbling. In François Pellegrino, Egidio Marsico, Ioana Chitoran & Christophe Coupé (eds.), Approaches to phonological complexity. Berlin: Mouton de Gruyter.
Kern, Sophie (2005). De l'universalité et des spécificités du développement langagier précoce. In Jean-Marie Hombert (ed.), Aux origines du langage et des langues. Paris: Fayard.
Davis, Barbara, Sophie Kern, Anne Vilain & Claire Lalevée (2008). Des babils à Babel: les premiers pas de la parole. Revue Française de Linguistique Appliquée. 13(2), 81-91.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

Participant Name	Age Range	Sessions	Sex
Fereyel	0;08.09 – 2;00.21	26	F
Iyed	0;08.22 – 2;00.00	30	F
Malek	0;10.11 – 2;00.04	24	M
Zaidaan	0;04.15 – 2;00.10	31	F
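
The age ranges above are written in years;months.days notation (for example, 0;08.09 is 0 years, 8 months, 9 days). For readers who want to work with these values programmatically, the following is a minimal Python sketch of one way to parse them. The names `Age`, `parse_age`, and `parse_age_range` are hypothetical helpers introduced here for illustration; they are not part of any TalkBank or Phon tool.

```python
import re
from typing import NamedTuple, Tuple


class Age(NamedTuple):
    """An age in years;months.days notation (hypothetical helper type)."""
    years: int
    months: int
    days: int

    def total_months(self) -> int:
        # Completed months only; the days component is ignored here.
        return self.years * 12 + self.months


# Matches strings such as "0;08.09" or "2;00.21".
AGE_PATTERN = re.compile(r"^(\d+);(\d{1,2})\.(\d{1,2})$")


def parse_age(text: str) -> Age:
    """Parse a single age string such as '0;08.09'."""
    match = AGE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a years;months.days age: {text!r}")
    years, months, days = (int(group) for group in match.groups())
    return Age(years, months, days)


def parse_age_range(text: str) -> Tuple[Age, Age]:
    """Parse a range such as '0;08.09 – 2;00.21' (en dash or hyphen)."""
    parts = re.split(r"\s*[–-]\s*", text.strip())
    if len(parts) != 2:
        raise ValueError(f"not an age range: {text!r}")
    return parse_age(parts[0]), parse_age(parts[1])


start, end = parse_age_range("0;08.09 – 2;00.21")
print(start.total_months(), end.total_months())
```

Applied to the first row of the table, the sketch yields a start age of 8 completed months and an end age of 24 completed months.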

Five types of data were collected. First, one hour of spontaneous vocalization data was audio- and video-recorded every two weeks from 8 months of age through 25 months of age. Recording took place in the children’s homes, and the parents were told to follow their normal activities with their child. Second, a minimum of 1,000 dictionary entries from the ambient language used by the parents of each child participant were analyzed for comparison with the child data for that language. Third, parental reports were administered using the adaptations of the MacArthur Communicative Development Inventories (Fenson et al., 1993) elaborated for Dutch and for French participants, respectively. Mothers filled out the questionnaire once a month. For the remaining languages there is no adaptation yet, but one could imagine using the spontaneous data to elaborate the same instrument. Fourth, an object manipulation categorization task was administered every two months. This task was designed to evaluate the children’s spontaneous nonverbal categorization abilities. Several toys, which were consistent across the language groups, served as stimuli. Each task involved a contrast of objects from two different categories (animals, means of locomotion, furniture).

Children who were developing normally, according to community standards and to reports from parents and physicians regarding developmental milestones, were observed in their normal daily environment. These children were becoming monolingual speakers of French, Romanian and Tunisian Arabic. The languages were chosen to represent different language families and for the contrasts they show in phonetic and phonological features of interest for early language development. These characteristics include word length, syllable types and phonemic inventory diversity.

After collection, the recordings were phonetically transcribed using the International Phonetic Alphabet (IPA). Broad phonetic transcriptions were used, supplemented by some diacritics (mainly for palatalized, pharyngealized and nasalized sounds and for sound duration). Tokens were treated as a single utterance string when the string was bounded by one second of silence, noise or adult speech.
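
The one-second boundary criterion for utterance strings can be illustrated with a short Python sketch. It is not the procedure the transcribers used. It assumes, purely for illustration, that each token is available as a (label, start, end) tuple with times in seconds, and it models only the gap between tokens, not whether that gap contains silence, noise or adult speech. The name `group_utterances` and the `min_gap` parameter are hypothetical.

```python
from typing import List, Tuple

# Hypothetical token representation: (label, start_seconds, end_seconds).
Token = Tuple[str, float, float]


def group_utterances(tokens: List[Token], min_gap: float = 1.0) -> List[List[Token]]:
    """Group tokens into utterance strings separated by gaps of at least min_gap seconds."""
    utterances: List[List[Token]] = []
    for token in sorted(tokens, key=lambda t: t[1]):
        if utterances:
            previous_end = utterances[-1][-1][2]
            if token[1] - previous_end < min_gap:
                # Gap shorter than the threshold: same utterance string.
                utterances[-1].append(token)
                continue
        # First token, or a gap of at least min_gap: start a new utterance.
        utterances.append([token])
    return utterances


tokens = [("ba", 0.0, 0.4), ("ba", 0.6, 1.0), ("da", 2.5, 2.9)]
for utterance in group_utterances(tokens):
    print([label for label, _, _ in utterance])
```

In this example the first two tokens are 0.2 seconds apart and fall into one utterance string, while the third follows a 1.5-second gap and starts a new one.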

This work was supported by the EUROCORE Program “The Origin of Man, Language and Languages” (OMLL) and the French CNRS program “Origine de l’Homme, du Langage et des Langues” (OHLL).
