This repository contains all data and code necessary to reproduce the analysis and figures for the paper Vetchinnikova, S., Konina, A., Williams, N., Mikušová, N., & Mauranen, A. (Forthcoming). Chunking up speech in real-time: Linguistic predictors and cognitive constraints. Language and Cognition.
The paper reports on a behavioral experiment in which 50 participants listened to 97 short extracts of natural speech and simultaneously marked chunk boundaries as they felt appropriate, using a purpose-built application for tablets (https://www.chunkitapp.online/). The extracts were then annotated at the word level for pause duration, prosodic boundary strength, syntactic boundary strength, chunk duration, and bigram surprisal. The effect of each predictor on chunk boundary perception was modelled using mixed-effects logistic regression with listeners and extracts as random effects. The results showed that listeners used all cues, suggesting cue degeneracy, which facilitates the substantial variation observed across listeners and extracts in cue importance and effect magnitude. Chunk duration had a strong effect, supporting the cognitive constraint hypothesis. The direction of the surprisal effect indicated that perceptual chunks were not multi-word units, inviting a distinction between perceptual and usage-based chunking.
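For readers unfamiliar with the bigram surprisal predictor, the sketch below shows one common way to compute it: the surprisal of a word is the negative base-2 logarithm of its probability given the preceding word. This is an illustration only, not the code used in the paper. The function name, the toy reference corpus, and the use of unsmoothed maximum-likelihood estimates are all assumptions made for the example; the actual corpus and estimation method are described in the paper and in the repository code.

```python
import math
from collections import Counter


def bigram_surprisal(tokens, corpus_tokens):
    """Return the surprisal (in bits) of each token given the previous one.

    Hypothetical helper for illustration: probabilities are unsmoothed
    maximum-likelihood estimates from `corpus_tokens`, which is an
    assumption and not necessarily the method used in the paper.
    """
    # Count each word as a context (every corpus token except the last).
    context_counts = Counter(corpus_tokens[:-1])
    # Count adjacent word pairs.
    bigram_counts = Counter(zip(corpus_tokens, corpus_tokens[1:]))

    surprisals = []
    for prev, word in zip(tokens, tokens[1:]):
        pair_count = bigram_counts[(prev, word)]
        if pair_count == 0:
            # Unseen bigram: infinite surprisal without smoothing.
            surprisals.append(math.inf)
        else:
            # -log2 P(word | prev) = log2(count(prev) / count(prev, word))
            surprisals.append(math.log2(context_counts[prev] / pair_count))
    return surprisals


# Toy reference corpus (invented for this example).
corpus = "you know what i mean you know it".split()
result = bigram_surprisal("you know what".split(), corpus)
print(result)
```

In this toy corpus "know" always follows "you", so it carries no surprisal, while "what" follows "know" only half the time and is therefore more surprising. The first word of an extract has no preceding context and receives no value in this sketch.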
The repository also includes a document detailing the syntactic annotation of the extracts and a folder with the audio files used in the experiment. Please refer to the readme file for a description of the uploaded files.