PhonBank German Stuttgart Corpus

Bernd Möbius
Computational Linguistics
Saarland University


Britta Lintfert
University of Stuttgart


Participants: 8
Type of Study: naturalistic
Location: Germany
Media type: audio
DOI: doi:10.21415/T5XP4V

Browsable transcripts

Phon data

CHAT data

Link to media folder

Citation information

Lintfert, Britta. 2009. Phonetic and Phonological Development of Stress in German. Universität Stuttgart Ph.D. Dissertation.

In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references.

Project Description

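| Name | Age Range | Sessions | Sex | Relatives |
|------|-----------|----------|-----|-----------|
| EL | 0;5 – 2;2 | 12 | F | BL, LL |
| HH | 0;5 – 3;1 | 16 | F | VH, CH |
| BW | 1;2 – 2;11 | 13 | M | NW, PW |
| NB | 0;8 – 2;8 | 28 | M | MB, DB |
| FZ | 1;0 – 2;6 | 15 | F | VZ |
| RL | 3;1 – 4;6 | 11 | M | AL |
| ED | 2;3 – 4;3 | 9 | F | AD, LD, VD |
| LL | 3;8 – 7;7 | 25 | F | BL, EL |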
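The age ranges above use the years;months notation (for example, 2;11 means 2 years and 11 months). As a minimal sketch of how such ages could be converted to months for analysis, the following Python helpers may be useful; the function names `age_to_months` and `age_range_months` are hypothetical and not part of any TalkBank or Phon tool.

```python
import re

# Matches ages written as years;months, e.g. "2;11" (assumed format, no days).
AGE_PATTERN = re.compile(r"^(\d+);(\d{1,2})$")


def age_to_months(age: str) -> int:
    """Convert a 'years;months' age string such as '2;11' to total months."""
    match = AGE_PATTERN.match(age.strip())
    if match is None:
        raise ValueError(f"not a years;months age: {age!r}")
    years, months = int(match.group(1)), int(match.group(2))
    if months > 11:
        raise ValueError(f"months must be 0-11: {age!r}")
    return years * 12 + months


def age_range_months(age_range: str) -> tuple[int, int]:
    """Convert a range such as '0;5 – 3;1' to (start, end) in months.

    Accepts either a hyphen or an en dash between the two ages.
    """
    start, end = re.split(r"\s*[-–]\s*", age_range.strip())
    return age_to_months(start), age_to_months(end)
```

For instance, `age_range_months("0;5 – 3;1")` gives the start and end of a recording period in months, which makes it straightforward to compare recording spans across children.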

The German-Stuttgart and German-TAKI corpora were constructed as part of a study on the acquisition of stress in German funded by the German Research Foundation. In this project the researchers recorded and analyzed acoustic data from 11 children, from 6 months up to 15 years, to develop an exemplar-based model of stress acquisition in German and to build a prosodically annotated speech corpus of babbling, first words, and meaningful speech from German-speaking children.

The recordings were made at the children’s homes in familiar play situations with their parents. We recorded using a Sony DAT TCD-D100 and a high-quality wireless microphone NADY LT-4 (Lavalier) E-701 (600 Ohm). We tried to keep the distance between the microphone and the mouth as constant as possible during the entire session to obtain high-quality recordings for acoustic analysis. The recorded data were transferred to a computer and downsampled to 16 kHz. The table above summarizes information about the children and the recordings.

  1. Babbling and first words: 5–18 months of age. Speech data from six children (3 boys, 3 girls) were collected between 5 and 18 months of age. The infants were audio-recorded every 6–8 weeks at their homes in familiar play situations with their parents, starting between five and seven months, when the first CV-syllable productions occur. No person unfamiliar to the child was present during the recordings. During a recording session a parent (normally the mother) played with the child and later on, at the one-word stage, looked at and talked about a picture book and picture cards. Before the recordings started, the parents were instructed on how to use the recording equipment. The parents’ utterances were recorded and analyzed as well. All of the infants lived in monolingual German-speaking families and had no unusual prenatal, sensory, or developmental concerns or hearing problems. At the age of 12 months, the speech development of all infants was tested using a parental questionnaire for early recognition of children at risk (Grimm & Doil, 2004).
  2. Mixing phase: 18–36 months of age. Between 18 and 36 months of age, we collected speech data every 6–8 weeks. Because we developed different recording tasks for this age, we called this phase the mixing phase. The recordings continued until the third birthday in the way described for the babbling phase. The children born in 2004 (OZ, BW, NB, FZ) were also tested at each recording session for their use of stress in multisyllabic words.
  3. Card Naming (from 18 months of age). We created picture cards representing two- and three-syllable German words with stress on the first, second, and third syllable, with the vowels /a/, /i/, and /o/ in stressed and unstressed position. From the age of 18 months, children were tested with these cards. With this task, the development of different stress patterns can be tested and described. The words were: Akrobat, Anorak, Banane, Bikini, Buchstabe, Eisenbahn, Elefant, Eskimo, Fliege, Flughafen, Fotograf, Gardine, Giraffe, Gitarre, Gorilla, Kamel, Kanone, Kobolde, Kokosnuss, Krokodil, Lastwagen, Lawine, Malerin, Matrose, Mikrofon, Müllwagen, Paket, Papagei, Pinguin, Pistole, Polizei, Postauto, Postbote, Prinzessin, Pullover, Radfahrer, Rennfahrer, Sandale, Saxophon, Schmetterling, Skifahrer, Spiegel, Stadion, Teddybär, Tomate, Trampolin, Trompete, Vulkan(e), Wohnwagen, and Zitrone.
  4. TAKI task: from 36 months of age. Beginning at 36 months, recordings were made every 10–12 weeks using the TAKI task proposed by Allen (1980). We created five pairs of animal toys and assigned nonce names to each. Within each pair, the nonce names differed only in the position of main stress. For example, one pair consisted of a brown bear and a polar bear: both were called “bimo”, but the stress fell on the first syllable for the brown bear and on the second syllable for the polar bear. The nonce names were all bisyllabic or trisyllabic and consisted of consonant-vowel (CV) syllables formed from the vowels /a i o/ and the consonants /b d m n/. All of these stress patterns are possible in German, although stress on the second or third syllable is much less common than stress on the first syllable.
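The construction of the TAKI nonce names can be illustrated with a short Python sketch that enumerates the space of CV-syllable names and their stress variants. This only shows the candidate space implied by the description; the source does not say how the five pairs were actually selected. The function names and the convention of writing the stressed syllable in upper case are assumptions made for illustration.

```python
from itertools import product

# Segment inventory given in the task description.
CONSONANTS = ("b", "d", "m", "n")
VOWELS = ("a", "i", "o")


def cv_syllables():
    """Return all consonant-vowel syllables, e.g. 'ba', 'bi', 'bo', ..."""
    return [c + v for c, v in product(CONSONANTS, VOWELS)]


def nonce_names(n_syllables):
    """Return every possible name as a tuple of CV syllables."""
    return list(product(cv_syllables(), repeat=n_syllables))


def stress_variants(syllables):
    """Return one variant per stress position.

    The stressed syllable is written in upper case (illustrative
    convention only), so ('bi', 'mo') yields 'BImo' and 'biMO'.
    """
    variants = []
    for stressed in range(len(syllables)):
        variants.append(
            "".join(
                syl.upper() if i == stressed else syl
                for i, syl in enumerate(syllables)
            )
        )
    return variants
```

With four consonants and three vowels there are 12 CV syllables, so `nonce_names(2)` contains 144 bisyllabic candidates, and `stress_variants(("bi", "mo"))` yields the two members of a minimal stress pair like the “bimo” example.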

Annotation labels

  1. CV coding (.cv). To study the development of syllable structure, we marked the beginning and ending of each vowel and consonant.
  2. Stable Vowel (.marks). The beginning and end of the stable phase of the vowel are marked. No influence of the surrounding context should be audible. The stable phase of a vowel is characterized by parallel formants (observable in the spectrogram) and a constant waveform (observable in the time signal). VA (Anfang) marks the beginning of the vowel and VE (Ende) the end. The index “u” is used for unstressed (unbetont) vowels and “b” for stressed (betont) vowels.
  3. Stress (.stress): Perceptual prominence for each syllable: no prominence (0), most prominent (1), prominent but not most (2).
  4. Orthographic transcription (.trans). Each utterance was annotated on the syllabic level by two trained transcribers.
  5. Cover Symbol Transcription (.phones–cover). Place and manner of articulation describe the consonantal structure: L=labial, A=alveolar, V=velar, G=glottal, O=other; P=plosive, N=nasal, F=fricative, G=glide, O=other. Tongue height and position describe the vocalic structure: H=high, M=mid, T=low (tief); V=front (vorne), Z=central (zentral), H=back (hinten).
  6. Narrow transcription in XSAMPA (.phones).
  7. Syllable structure coding (.sylstr): Syllable structure, words and syllable marker.
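As a rough illustration of how the cover symbols and stress codes listed above could be decoded, the following Python sketch maps them to readable labels. It assumes that each segment is coded as a two-letter string (place plus manner for consonants, height plus position for vowels) and that the caller already knows whether a code refers to a consonant or a vowel; the actual layout of the .phones–cover files is not described here, so both points are assumptions, as are the function names.

```python
# Lookup tables taken from the annotation label descriptions.
PLACE = {"L": "labial", "A": "alveolar", "V": "velar", "G": "glottal", "O": "other"}
MANNER = {"P": "plosive", "N": "nasal", "F": "fricative", "G": "glide", "O": "other"}
HEIGHT = {"H": "high", "M": "mid", "T": "low"}
POSITION = {"V": "front", "Z": "central", "H": "back"}
STRESS = {0: "no prominence", 1: "most prominent", 2: "prominent but not most"}


def decode_consonant(code):
    """Decode an assumed two-letter consonant code: place, then manner."""
    if len(code) != 2:
        raise ValueError(f"expected a two-letter code, got {code!r}")
    return PLACE[code[0]], MANNER[code[1]]


def decode_vowel(code):
    """Decode an assumed two-letter vowel code: height, then position."""
    if len(code) != 2:
        raise ValueError(f"expected a two-letter code, got {code!r}")
    return HEIGHT[code[0]], POSITION[code[1]]
```

Separate functions are used for consonants and vowels because several letters (G, H, V) carry different meanings depending on the segment type and on their position in the code.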

Corpus curation for PhonBank

  1. TextGrid creation: The recordings were originally segmented and coded in WaveSurfer. These files were converted into Praat TextGrids using a script. Annotations for Syllable Structure, Tones, Stress, XSAMPA transcription, and Orthographic transcription were converted into TextGrid tiers.
  2. Generation of record data: The TextGrids are visible within the Speech Analysis window in Phon, and were used to generate Phon record data. The default tiers Orthography and IPA Actual were generated from the TextGrids, while the stress and syllable structure tiers were added to record data as user-defined tiers.
  3. IPA Actual: XSAMPA transcriptions were converted to their respective IPA symbols and diacritics.
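The XSAMPA-to-IPA step can be sketched in Python as a longest-match lookup over a symbol table. This is not the conversion actually used for the corpus or by Phon; the table covers only a small subset of X-SAMPA symbols relevant to German, and the function name `xsampa_to_ipa` is hypothetical. Symbols not in the table, such as plain lowercase letters, are passed through unchanged.

```python
# Small, incomplete subset of the X-SAMPA to IPA correspondence.
XSAMPA_TO_IPA = {
    "S": "ʃ", "Z": "ʒ", "N": "ŋ", "C": "ç", "R": "ʁ", "?": "ʔ",
    "@": "ə", "6": "ɐ", "E": "ɛ", "I": "ɪ", "O": "ɔ", "U": "ʊ",
    "Y": "ʏ", "2": "ø", "9": "œ",
    ":": "ː",    # length mark
    '"': "ˈ",    # primary stress
    "%": "ˌ",    # secondary stress
    "_h": "ʰ",   # aspiration diacritic (two-character symbol)
}


def xsampa_to_ipa(text):
    """Convert an X-SAMPA string to IPA using greedy longest-match lookup."""
    # Try longer symbols first so "_h" is matched as one unit.
    keys = sorted(XSAMPA_TO_IPA, key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(XSAMPA_TO_IPA[key])
                i += len(key)
                break
        else:
            # No table entry: keep the character as it is.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Matching longer symbols first matters because X-SAMPA uses multi-character sequences for diacritics and some segments; a character-by-character replacement would split them incorrectly.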