Studies on Bird Vocalization Detection and Classification of Species

Seppo Fagerlund

Research output: Thesis, Doctoral Thesis, Collection of Articles


The topic of this thesis is the automatic identification of bird vocalizations, and of bird species based on the sounds they produce. Two main approaches for recording bird sounds are presented: active recording and passive continuous recording. The aim of active recording is to capture the sounds of a particular bird species or individual. Passive continuous recordings, which can be captured without human presence, are used in acoustic monitoring and are intended to include all sounds in the local environment. The automatic identification system begins by segmenting distinct sound events from the recordings. The purpose of segmentation is to detect syllables in bird sounds. Active recordings of a single bird typically have a high signal-to-noise ratio, which helps in this task. Segmentation of passive continuous recordings is more demanding because many sound sources may be active simultaneously and the signal level of sound events varies. Audio events in recordings that comprise sounds from many sources also often overlap, which adds complexity to the segmentation phase. After the audio events have been segmented, feature extraction and classification are performed. In feature extraction, the audio signals are represented with a small number of attributes (compared to the original data) that characterize particular sound events. Feature extraction performs dimension reduction by removing redundant information from the original data. Suitable features depend on the data and should be selected so that they discriminate between sounds from different sources. The classification phase decides which class each sound event belongs to, based on the feature representation. The main focus of this thesis is to develop and examine feature representations for different types of bird sounds suitable for automatic classification. Special attention has been given to birds that produce inharmonic and noisy sounds, due to the diverse structure of their vocalizations. A method based on short time-domain structures was found to be efficient for many different types of sounds. It was also efficient for sound event detection in continuous recordings.
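The three stages named in the abstract (segmentation, feature extraction, classification) can be illustrated with a minimal sketch. This is not the method developed in the thesis: the energy-threshold segmenter, the two features (duration and spectral centroid), the nearest-mean classifier, and all function names and parameter values below are assumptions chosen only to show how the stages connect.

```python
import numpy as np


def segment_events(signal, frame_len=256, threshold_db=-20.0):
    """Return (start, end) sample indices of runs of high-energy frames.

    Assumption: a frame is "active" if its energy lies within
    threshold_db of the loudest frame in the recording.
    """
    n_frames = len(signal) // frame_len
    frames = np.reshape(signal[: n_frames * frame_len], (n_frames, frame_len))
    energy = np.sum(frames ** 2, axis=1) + 1e-12  # avoid log(0)
    energy_db = 10.0 * np.log10(energy / np.max(energy))
    active = energy_db > threshold_db
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], active, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[0::2], changes[1::2]
    return [(int(s) * frame_len, int(e) * frame_len) for s, e in zip(starts, ends)]


def extract_features(segment, sample_rate):
    """Reduce a segment to two attributes: duration (s), spectral centroid (Hz)."""
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / sample_rate)
    centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)
    duration = len(segment) / sample_rate
    return np.array([duration, centroid])


def classify(features, class_means):
    """Assign the label whose (hypothetical) mean feature vector is nearest."""
    labels = list(class_means)
    dists = [np.linalg.norm(features - np.asarray(class_means[k])) for k in labels]
    return labels[int(np.argmin(dists))]
```

A fixed energy threshold of this kind suits high signal-to-noise active recordings; it would struggle with the overlapping events and varying signal levels that the abstract describes for passive continuous recordings. In practice the features would also need scaling before a distance-based classifier is applied, since duration and frequency are on very different scales.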
Translated title of the contribution: Lintujen äänien havaitseminen ja lintulajien tunnistaminen
Original language: English
Qualification: Doctor's degree
Awarding Institution
  • Aalto University
Supervisors/Advisors
  • Laine, Unto, Supervisor
  • Laine, Unto, Advisor
Print ISBNs: 978-952-60-5921-1
Electronic ISBNs: 978-952-60-5922-8
Publication status: Published - 2014
MoE publication type: G5 Doctoral dissertation (article)


Keywords
  • bird song
  • bioacoustics
  • sound event detection
  • feature extraction
  • pattern recognition
  • automated recognition
