PhonBank Arabic Kern Corpus


Sophie Kern

University of Lyon

website

Participants: 4
Type of Study: naturalistic, longitudinal, monolingual
Location: Tunisia
Media type: audio
DOI: doi:10.21415/T59S3X

Browsable transcripts

Phon data

CHAT data

Link to media folder

Citation information

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

Participant Name    Age Range            Sessions    Sex
Fereyel             0;08.09 – 2;00.21    26          F
Iyed                0;08.22 – 2;00.00    30          F
Malek               0;10.11 – 2;00.04    24          M
Zaidaan             0;04.15 – 2;00.10    31          F
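The ages above appear to follow the years;months.days notation used in CHAT transcripts. The sketch below shows one way to parse such strings and compare age ranges. The names `Age`, `parse_age` and `AGE_RANGES` are illustrative and not part of any TalkBank tool, and the conversion to months assumes a nominal 30-day month.

```python
import re
from typing import NamedTuple


class Age(NamedTuple):
    """An age in years;months.days notation, e.g. 0;08.09."""
    years: int
    months: int
    days: int

    def total_months(self) -> float:
        # Assumption: a nominal 30-day month, good enough for rough comparison.
        return self.years * 12 + self.months + self.days / 30


# Hypothetical pattern for strings such as "2;00.21".
AGE_PATTERN = re.compile(r"^(\d+);(\d{1,2})\.(\d{1,2})$")


def parse_age(text: str) -> Age:
    """Parse a years;months.days string into an Age tuple."""
    match = AGE_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a years;months.days age: {text!r}")
    years, months, days = (int(group) for group in match.groups())
    return Age(years, months, days)


# Age ranges copied from the participant table above.
AGE_RANGES = {
    "Fereyel": ("0;08.09", "2;00.21"),
    "Iyed": ("0;08.22", "2;00.00"),
    "Malek": ("0;10.11", "2;00.04"),
    "Zaidaan": ("0;04.15", "2;00.10"),
}

if __name__ == "__main__":
    for name, (first, last) in AGE_RANGES.items():
        span = parse_age(last).total_months() - parse_age(first).total_months()
        print(f"{name}: about {span:.1f} months between first and last session")
```

Because `Age` is a tuple of (years, months, days), two ages can also be compared directly with `<` and `>`.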

Five types of data were collected. First, one hour of spontaneous vocalization data was audio- and video-recorded every two weeks from 8 months through 25 months of age. Recording took place in the children’s homes, and the parents were told to follow their normal activities with their child. Second, at least 1,000 dictionary entries from the ambient language used by the parents of each child participant were analyzed for comparison with the child data for that language. Parental reports were administered using adaptations of the MacArthur Communicative Development Inventories (Fenson et al., 1993) developed respectively for Dutch and for French participants. Mothers filled out the questionnaire once a month. For the remaining languages there is no adaptation yet, but the spontaneous data could be used to develop a comparable instrument. An object manipulation categorization task was administered every two months. This task was designed to evaluate the children’s spontaneous nonverbal categorization abilities. Several toys, which were consistent across the language groups, served as stimuli. Each task involved a contrast of objects from two different categories (animal, means of locomotion, furniture).

The children were developing normally according to community standards and to reports from parents and physicians regarding developmental milestones, and they were observed in their normal daily environment. These children were becoming monolingual speakers of French, Romanian and Tunisian Arabic. The languages were chosen to represent different language families and for the contrasts they show in phonetic and phonological features of interest for early language development. These features include word length, syllable types and the diversity of the phonemic inventory.

After collection, the spontaneous vocalization data were phonetically transcribed using the International Phonetic Alphabet (IPA). Broad phonetic transcriptions were used, supplemented by some diacritics (mainly for palatalized, pharyngealized and nasalized sounds and for the duration of sounds). Tokens considered as single utterance strings were bounded by one second of silence, noise or adult speech.
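The one-second boundary criterion can be illustrated with a short sketch. It is a simplified illustration rather than the procedure actually used for this corpus: it assumes child vocalizations are already available as (start, end) times in seconds, and it treats any gap of at least one second as a boundary without distinguishing silence, noise or adult speech. The function name `group_utterances` and the `min_gap` parameter are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) of one child vocalization, in seconds


def group_utterances(
    intervals: List[Interval], min_gap: float = 1.0
) -> List[List[Interval]]:
    """Group vocalizations into utterance strings separated by >= min_gap seconds."""
    groups: List[List[Interval]] = []
    for start, end in sorted(intervals):
        # Gap between this vocalization and the end of the previous one, if any.
        if groups and start - groups[-1][-1][1] < min_gap:
            groups[-1].append((start, end))
        else:
            # Assumption: a gap of min_gap or more always starts a new string.
            groups.append([(start, end)])
    return groups


if __name__ == "__main__":
    vocalizations = [(0.0, 0.4), (0.9, 1.3), (2.5, 3.0)]
    for index, group in enumerate(group_utterances(vocalizations), start=1):
        print(f"utterance string {index}: {group}")
```

With the sample input, the first two vocalizations are 0.5 seconds apart and fall into one utterance string, while the third follows a 1.2-second gap and starts a new one.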

This work was supported by the EUROCORES Programme “The Origin of Man, Language and Languages” (OMLL) and the French CNRS program “Origine de l’Homme, du Langage et des Langues” (OHLL).