PhonBank Setswana Matlhaku Corpus


Keneilwe Matlhaku
African Languages and Literature
University of Botswana

Participants: 3
Type of Study: naturalistic
Location: Botswana
Media type: audio
DOI: doi:10.21415/98C0-2Z80

Browsable transcripts

Phon data

CHAT data

Link to media folder

Citation Information

Matlhaku, Keneilwe (in preparation) Phonological and phonetic factors affecting the early consonantal development in Setswana. Ph.D. Dissertation. Memorial University of Newfoundland.

In accordance with CHILDES rules, any use of data from this corpus must be accompanied by at least one of the above references.
Participant   Age Range            Number of Sessions   Sex
B             3;02.22 – 3;06.05    10                   M
T             2;05.03 – 2;08.15    12                   F
W             1;10.18 – 2;02.02    11                   M

Project Description

This study documents three child learners of Setswana, specifically of the SeKwena dialect. All child participants were monolingual language learners, with Setswana as their only native language, in line with the linguistic profiles of their caregivers. B was audio-recorded from ages 3;02.22 to 3;06.05. T was recorded from ages 2;05.03 to 2;08.15. W was recorded from ages 1;10.18 to 2;02.02.
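For readers working with the session metadata, the ages above follow the years;months.days notation (for example, 3;02.22 is 3 years, 2 months and 22 days). The following is a minimal sketch, not part of the corpus tooling, of how such strings might be parsed. The function names `parse_age` and `completed_months` are hypothetical, and ignoring the days component when counting months is an assumption made for simplicity.

```python
import re

# Age notation assumed here: years;months.days, e.g. "3;02.22"
AGE_PATTERN = re.compile(r"^(\d+);(\d{1,2})\.(\d{1,2})$")


def parse_age(age):
    """Split an age string such as '3;02.22' into (years, months, days)."""
    match = AGE_PATTERN.match(age.strip())
    if match is None:
        raise ValueError(f"not a years;months.days age: {age!r}")
    years, months, days = (int(part) for part in match.groups())
    return years, months, days


def completed_months(age):
    """Age in completed months; the days component is ignored (assumption)."""
    years, months, _days = parse_age(age)
    return years * 12 + months


if __name__ == "__main__":
    # First and last session ages for B and W, as listed above
    for child, first, last in [("B", "3;02.22", "3;06.05"),
                               ("W", "1;10.18", "2;02.02")]:
        print(child, completed_months(first), completed_months(last))
```

Under these assumptions, B's sessions span 38 to 42 completed months and W's span 22 to 26.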

As a condition for the children to participate in the study, caregivers completed a questionnaire to ensure that all children had perceptual knowledge of basic Setswana words. The investigator verified that all children were typically developing and healthy, and had no history of vision or hearing problems. Additionally, there were no prior concerns about, or history of difficulties with, the children's speech or language development.

The data were recorded in the villages of Molepolole and Mankgodi, both situated in Kweneng District, Botswana. The data were collected in 2019 over a period of four months. The investigator carried out the recordings bi-weekly (or weekly) when possible, in order to maximize data sampling and collection. The children were audio-recorded in their daily environments, under the caregivers’ supervision, for a minimum of 30 minutes and a maximum of one hour per session. The caregivers participated minimally or not at all.

The investigator used picture books to elicit words covering as many of the sounds of Setswana as possible, in all positions in which they can appear. There were no fixed word lists or any other means of guided speech elicitation. Because of the relatively free nature of the data elicited, there was no guarantee that all children produced all the target sounds in all of the recordings. The picture books were used to facilitate communication and enhance engagement with the children. Elicitation started with the children being asked to informally and spontaneously name pictures they saw in a picture book. A prompting question was presented to the child if they did not immediately name the object or action they saw in the picture.

Spontaneous conversations also formed part of this study. The investigator allowed the children to guide the trajectory of the conversation, which enabled them to produce longer and/or more complex words, phrases and sentences. The investigator then focused on repeating the child’s production using the adult form (as opposed to the child's form, so as not to reinforce what could be an erroneous pronunciation) in order to facilitate subsequent identification of the speech forms attempted by the child. More generally, these strategies naturally followed the types of interaction that normally take place between a child and an adult, especially in contexts where the adult is focused on providing the child with a stimulating environment for language learning.

The children’s productions were audio-recorded with a Zoom H1n handy audio recorder. The recordings were stored in WAV format at CD quality (16-bit sample size at 44.1 kHz). The device was positioned out of sight and reach of the child in the recording setting. This method is the least intrusive, in that it involves virtually no change to the child's everyday environment, and avoids potential disruptions or issues related to self-consciousness, which could occur given children's natural awareness of, and interest toward, electronic devices.
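Users who download the media may wish to confirm that a file matches the format described above. The following is a minimal sketch using Python's standard `wave` module. The function names and the file path are hypothetical, and the number of channels is left as a parameter because it is not stated in this description.

```python
import wave

SAMPLE_WIDTH_BYTES = 2   # 16-bit samples, as described above
SAMPLE_RATE_HZ = 44100   # 44.1 kHz, as described above


def matches_described_format(path):
    """Return True if the WAV file is 16-bit and sampled at 44.1 kHz."""
    with wave.open(path, "rb") as wav:
        return (wav.getsampwidth() == SAMPLE_WIDTH_BYTES
                and wav.getframerate() == SAMPLE_RATE_HZ)


def uncompressed_audio_bytes(duration_seconds, channels=1):
    """Approximate size of the raw audio data, excluding the file header.

    The channel count is an assumption supplied by the caller.
    """
    return SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * channels * duration_seconds


if __name__ == "__main__":
    # A 30-minute session, assuming a single channel
    print(uncompressed_audio_bytes(30 * 60, channels=1))
```

Under the single-channel assumption, a 30-minute session amounts to roughly 159 MB of audio data, and a one-hour session to about twice that.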

These data were prepared using the Phon software program (https://www.phon.ca/). The data were segmented, transcribed, and annotated by the investigator, a native speaker of Setswana. The transcriptions were then verified for syllabification and alignment prior to data compilation.