PhonBank French Paris Corpus

Aliyah Morgenstern
Université Sorbonne Nouvelle


Christophe Parisse
Université Paris Ouest

Participants: 7
Type of Study: naturalistic
Location: France
Media type: video
DOI: doi:10.21415/T5PS3B

Browsable transcripts

Phon data

CHAT data

Link to media folder

Citation information

Leroy, M., Mathiot, E., & Morgenstern, A. (2009). Pointing gestures and demonstrative words: Deixis between the ages of one and three. Studies in Language and Cognition, Cambridge Scholars Publishing. Editors: Jordan Zlatev, Marlene Johansson Falck, Carita Lundmark and Mats Andrén. 386-404.

Morgenstern, A., & Parisse, C. (2007). Codage et interprétation du langage spontané d'enfants de 1 à 3 ans. Corpus n°6 "Interprétation, contextes, codage", 55-78.

Morgenstern, A., & Sekali, M. (2009). What can child language tell us about prepositions? A contrastive corpus-based study of cognitive and social-pragmatic factors. Studies in Language and Cognition, Cambridge Scholars Publishing. Editors: Jordan Zlatev, Marlene Johansson Falck, Carita Lundmark and Mats Andrén. 261-275.

Morgenstern, A. (2006). Un JE en construction. Ontogenèse de l’auto-désignation chez l’enfant. Bibliothèque de Faits de langues. Ophrys.

Morgenstern, A., with the collaboration of Benazzo, S., Leroy, M., Mathiot, E., Parisse, C., & Sekali, M. (2009). L’enfant dans la langue. De l’observation du naturaliste à l’analyse du linguiste. Presses de la Sorbonne Nouvelle.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

The corpus was financed by the Agence Nationale de la Recherche as part of the research program “Acquisition du langage et Grammaticalisation”, directed by Aliyah Morgenstern (also called “Projet Léonard”, after the first child, who was filmed 15 years before the project started). The project ran for three years, from 2005 to 2008. Aliyah Morgenstern coordinated the project and did the filming for Léonard and Théophile. Martine Sekali filmed Madeleine. Françoise Bourdoux, Stéphanie Caet, and Aliyah Morgenstern did the transcription. Christophe Parisse aligned the transcription to the video, ran checks on the transcriptions, and created the morphological analyzer for French.

The aim of the project was to collect new French data, improve our transcription system and our coding, study the appearance and development of grammatical tools used by children between one and three years old, and compare them to the use of the same tools in adult speech. We tried to establish the order in which these markers appear and their link to the pragmatic context (requests, narratives, explanations...). We used the verbal and nonverbal context (situation, prosody, facial expressions and gestures) to conduct qualitative analyses. The children's verbal productions were analyzed in relation to the adults' role in the dialogue and their interpretation of their children's utterances. Each corpus was analyzed by the researcher who collected the data, along with another researcher who did not know the child. As the initial team grew, the data was studied in a broader framework. We brought together specialists from various fields of language acquisition in order to tackle language development from a multimodal and interdisciplinary perspective on the same longitudinal data. The analyses aimed at finding regularities in acquisition for each child and across the children.

All the researchers were given the video recordings and the transcriptions of the same three longitudinal follow-ups. The researchers each analyzed the same data set according to their competence and conducted a study in parallel with the others. Meetings were organized in order to share observations, discuss results and have a better view of the interface between the different levels of children’s linguistic development. This helped us propose a more ambitious project entitled COLAJE (Communication Langagière chez le Jeune Enfant) which obtained financing for another 4 years from the Agence Nationale de la Recherche (2008-2011).


Anaé was born on July 24th, 2006. She is the youngest of three children, with two older brothers. Her mother is a linguist and her father is a high school English teacher. Anaé is lively, stubborn, has a great sense of humor, and is very close to her mother. Aliyah Morgenstern and Marie Leroy have filmed her for 30 to 60 minutes a month since she was 1;04, always in her natural environment. Her linguistic development is rather fast, and she shows great creativity in her language constructions. Of particular note is her invention of her own rules regarding verbal morphology and gender, sometimes directly contradicting her parents’ explanations.

Antoine was born on April 10th, 2006. His father is a consultant in banking technology, and his mother is the manager of a travel agency. He is the eldest child in his family, with a little brother who is almost 3 years younger. Antoine is a gracious, social, and careful little boy who likes to explore the world around him. Christophe Parisse filmed him for the first time when he was only 13 days old! Since then, Christophe has met with the family for approximately one hour every month. Antoine is not very talkative and prefers to communicate through looks or by implication. It is therefore difficult to characterize his linguistic development, since he doesn’t produce a large number of utterances. However, when he does express himself, he speaks correctly.

Léonard was born on October 15th, 1990, the only child of a Parisian family. His mother is a writer who often told him stories where he was the hero. His father, a film producer and director, filmed him regularly. Léonard is a lively and imaginative little boy whose linguistic development was fairly rapid. Aliyah Morgenstern started filming him when he was 1;08, continuing her longitudinal study with one hour of filming a month, in his natural environment, until he turned 3;03.

Madeleine was born on April 14th, 2005. Her mother runs a production company and her father is a senior executive at an industrial catering company. When filming started, Madeleine had an older sister, 11 years her senior. Martine Sekali filmed Madeleine for one hour a month between the ages of 10 months and four years, always in Madeleine’s home. Since she turned four, the filming has taken place every three months. Madeleine is a very talkative little girl with an impressive linguistic development. At the age of 2;02 she already displayed a very precise vocabulary, grammatical markers, prepositions, and conjunctions, as well as various determiners. She is also capable of making jokes, telling stories, and describing her actions, and she masters various tenses and aspects. At the age of five she entered CP (cours préparatoire, the first year of primary school) already knowing how to read, and she is a very social and happy little girl. She now has a very active and dynamic three-year-old brother, Côme, who has just entered nursery school. Madeleine enjoys playing and talking with him.

Théophile was born on July 4th, 2005. His mother is a violinist and his father is a senior executive. He is good with his hands, curious, and always on the hunt for new experiences. When the recordings started, Théophile was an only child. Aliyah Morgenstern started to film Théophile when he was only seven months old, and she continues to film him on a regular basis in his home. Théophile’s longitudinal data shows that his early linguistic development was relatively slow. At 2;01 his vocabulary was very limited and often consisted of onomatopoeic repetitions. However, he did display several morphological markers, such as “boumé” (fell), the past participle he formed from “boum” (his word for fall). Although his language developed at a slower rate than that of Madeleine or Léonard at the beginning of the corpus, by the time he turned five Théophile had become quite talkative and funny. He loves to tell stories and jokes, and to play with his little brother and his baby sister.

Adrien was born on December 28th, 2004. He is an only child. His mother is a legal secretary and his father is an engineer. Adrien is a mischievous little boy who likes to play tricks, especially on his cat. His linguistic development is typical, and he loves to repeat new words. Naomi Yamaguchi filmed him for one hour a month from the age of 15 months and stopped recording when he was 4;11.20.

Ellie is British, born on March 6th, 2009, and is currently the youngest child in the COLAJE project. She lives in Rugby in Warwickshire, and is an only child. Her mother is a research assistant in biotechnology, and her father is an electronic engineer. Ellie is a lively, curious, and endearing little girl. Her grandmother, Bernadette Evans, has been filming her in her natural environment for about one hour a month since the end of her ninth month. Her linguistic development is on the fast side.

Warnings regarding transcript-video linkage:

Antoine: For file 21-1_08_05 the audio and video are out of sync at the end.

Julie: For 02-0_11_18 the last 16 minutes of video are not transcribed. (This is noted in a %com line.)