Studies on unsupervised and weakly supervised methods in computational modeling of early language acquisition

Research output: Thesis, Doctoral Thesis, Collection of Articles


This thesis addresses computational modeling of early language acquisition using statistical learning mechanisms. A growing body of evidence from experimental psychology and brain imaging studies indicates that human infants are sensitive to the statistical structure of sensory input and that their ability to extract statistics from speech signals plays a central role in learning their native language. The idea of domain-general statistical learning mechanisms in language acquisition contrasts with the nativist view, which has traditionally assumed that many language-specific innate factors exist in the human brain.

This thesis presents a series of computational studies addressing what kinds of representations are learnable from speech signals and what kinds of computational mechanisms are needed for the learning. The core idea is to model language acquisition from the perspective of a tabula rasa agent that has no prior knowledge of language or its relevant units, such as phones, phonemes, syllables, or words, but simply comes into being with a number of generic statistical learning algorithms. When exposed to speech input in different experimental settings, these algorithms start to model recurring patterns in the data and link these patterns to contextual variables such as simulated visual input associated with the speech contents. From a machine learning perspective, the studied methods correspond to unsupervised and weakly supervised machine learning algorithms, since language learning takes place without explicit supervision.

The studies show that spoken words can be learned from continuous speech based on the statistical structure of the speech input, without assuming a phonetic or other linguistically motivated intermediate representation of language. Different strategies for grounding the acoustic word patterns in their visual referents are also studied, and new methods are presented for segmenting speech into phone-like units and for clustering acoustic features into discrete categories. Finally, it is shown that frequency characteristics of the human auditory system can also be derived from the statistics of speech signals, suggesting that distributional learning in auditory perception may not be limited to learning linguistic representations of speech.
Translated title of the contribution: Ohjaamattomat ja heikosti ohjatut menetelmät kielenoppimisen laskennallisessa mallinnuksessa
Original language: English
Qualification: Doctor's degree
Awarding Institution
  • Aalto University
Supervisors/Advisors
  • Laine, Unto K., Supervising Professor
  • Laine, Unto K., Thesis Advisor
Print ISBNs: 978-952-60-5096-6
Electronic ISBNs: 978-952-60-5097-3
Publication status: Published - 2013
MoE publication type: G5 Doctoral dissertation (article)


Keywords
  • computational modeling
  • language acquisition
  • pattern discovery
  • speech processing
  • cognitive modeling
  • speech segmentation
  • unsupervised learning

