Abstract
The major impulse-like excitation in the speech signal is due to abrupt closure of the vocal folds, which takes place at the glottal closure instant (GCI) or epoch in each cycle. GCIs are used in many areas of speech science and technology, such as in prosody modification, voice source analysis, formant extraction and speech synthesis. It is difficult to observe these discontinuities (corresponding to GCIs) in the speech signal because of the superimposed time-varying response of the vocal tract system. This paper examines the phase part of different frequency components of the speech signal to extract epochs. Three analysis methods to decompose the speech signal into different frequency components are considered. These methods are the short-time Fourier transform (STFT), narrow bandpass filtering (NBPF), and single frequency filtering (SFF). The locations of the discontinuities in the speech signal are obtained from the instantaneous frequency (IF) (i.e., the time derivative of the phase) of each of the frequency components. A method for automatic detection of epochs using the amplitude weighted IF is proposed. Performance of the proposed epoch detection method is compared with four state-of-the-art methods in clean and telephone quality speech. The performance of the proposed method is comparable with the performance of the existing epoch detection methods for clean speech but better for telephone quality speech.
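As a rough illustration of the idea in the abstract that the IF is the time derivative of the phase, the sketch below computes the IF of a complex-valued frequency component and shows that a phase discontinuity appears as a spike in the IF track. It is not the authors' implementation: the function names, the synthetic 500 Hz test tone, the 8 kHz sampling rate, and in particular the amplitude-weighting formula (a plain amplitude-weighted average across components) are assumptions made only for this example.

```python
import numpy as np


def instantaneous_frequency(component, fs):
    """IF in Hz of a complex-valued component: the discrete time
    derivative of its unwrapped phase."""
    phase = np.unwrap(np.angle(component))
    return np.diff(phase) * fs / (2.0 * np.pi)


def amplitude_weighted_if(components, fs):
    """Assumed weighting (not taken from the paper): average the IF
    across components, weighting each by its amplitude envelope."""
    components = np.asarray(components)
    inst_freq = np.array([instantaneous_frequency(c, fs) for c in components])
    amplitude = np.abs(components[:, 1:])  # align with the np.diff output
    return (amplitude * inst_freq).sum(axis=0) / (amplitude.sum(axis=0) + 1e-12)


# Synthetic component: a 500 Hz complex tone with an abrupt phase jump
# of pi/2 introduced at sample 200 (a stand-in for a discontinuity).
fs = 8000
n = np.arange(400)
tone = np.exp(2j * np.pi * 500 * n / fs)
tone[200:] *= np.exp(1j * np.pi / 2)

if_track = instantaneous_frequency(tone, fs)

# The IF stays at 500 Hz except at the difference that spans the jump.
jump_index = int(np.argmax(np.abs(if_track - 500.0)))

# Two components with different amplitudes but identical phase.
weighted = amplitude_weighted_if([tone, 2.0 * tone], fs)
```

In this toy case the deviation in `if_track` is located at the first-difference index that spans the phase jump. Real speech components obtained with STFT, NBPF, or SFF would be noisier, and would need the analysis settings and detection logic described in the paper itself.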
Original language | English |
---|---|
Article number | 101443 |
Number of pages | 14 |
Journal | Computer Speech and Language |
Volume | 78 |
Early online date | 27 Aug 2022 |
DOIs | |
Publication status | Published - Mar 2023 |
MoE publication type | A1 Journal article-refereed |
Keywords
- speech analysis
- phase processing
- instantaneous frequency
- group delay
- excitation source
- glottal closure instants
- epochs
Fingerprint
Dive into the research topics of 'Analysis of Instantaneous Frequency Components of Speech Signals for Epoch Extraction'. Together they form a unique fingerprint.
Projects
- 1 Finished
- HEART: Speech-based biomarking of heart failure
Alku, P. (Principal investigator)
01/09/2020 → 31/08/2024
Project: Academy of Finland: Other research funding
Press/Media
- Recent Research from Aalto University Highlight Findings in Computer Science (Analysis of Instantaneous Frequency Components of Speech Signals for Epoch Extraction)
02/03/2023
Press/Media: Media appearance