Abstract
The efficiency of many speech processing methods relies on accurate modeling of the distribution of the signal spectrum, and a majority of prior works suggest that the spectral components follow the Laplace distribution. To improve the probability distribution models based on our knowledge of speech source modeling, we argue that the model should in fact be a multiplicative mixture model, including terms for voiced and unvoiced utterances. While prior works have applied Gaussian mixture models, we demonstrate that a mixture of generalized Gaussian models follows the observations more accurately. The proposed estimation method is based on measuring the ratio of $L_p$-norms between spectral bands. Such ratios follow the Beta distribution when the input signal is generalized Gaussian, so the estimated Beta parameters can be used to determine the underlying parameters of the mixture of generalized Gaussian distributions.
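The link between generalized Gaussian inputs and Beta-distributed ratios can be illustrated with a small simulation. The sketch below is not the paper's estimator; it only checks the underlying property under stated assumptions: both bands are i.i.d. generalized Gaussian with the same shape $p$ and the same scale, and the "ratio" is taken as the normalized form $E_1 / (E_1 + E_2)$, where $E_k = \sum_i |x_i|^p$ is the $p$-th power of the $L_p$-norm of band $k$. Under those assumptions $|x|^p$ is Gamma-distributed with shape $1/p$, so the ratio should follow $\mathrm{Beta}(N_1/p, N_2/p)$ for bands of $N_1$ and $N_2$ components. All function names are hypothetical.

```python
import random


def sample_generalized_gaussian(p, scale, rng):
    """Draw one sample with density proportional to exp(-|x/scale|**p).

    Uses the fact that |x/scale|**p is Gamma(1/p, 1) distributed.
    """
    g = rng.gammavariate(1.0 / p, 1.0)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * scale * g ** (1.0 / p)


def lp_energy(band, p):
    """Sum of |x|**p over a band, i.e. the L_p-norm raised to the power p."""
    return sum(abs(x) ** p for x in band)


def band_ratio(p, n1, n2, rng):
    """Normalized energy ratio E1 / (E1 + E2) for two synthetic bands.

    Assumption: both bands share the same shape p and unit scale.
    """
    band1 = [sample_generalized_gaussian(p, 1.0, rng) for _ in range(n1)]
    band2 = [sample_generalized_gaussian(p, 1.0, rng) for _ in range(n2)]
    e1 = lp_energy(band1, p)
    e2 = lp_energy(band2, p)
    return e1 / (e1 + e2)


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var


def simulate(p=1.0, n1=8, n2=12, trials=20000, seed=0):
    """Compare empirical ratio moments with those of Beta(n1/p, n2/p)."""
    rng = random.Random(seed)
    ratios = [band_ratio(p, n1, n2, rng) for _ in range(trials)]
    mean = sum(ratios) / trials
    var = sum((r - mean) ** 2 for r in ratios) / trials
    return mean, var, beta_mean_var(n1 / p, n2 / p)


if __name__ == "__main__":
    emp_mean, emp_var, (th_mean, th_var) = simulate()
    print("empirical mean/var:  ", emp_mean, emp_var)
    print("theoretical mean/var:", th_mean, th_var)
```

Running the simulation with a different `p` (for example `p=2.0` for the Gaussian case) changes the Beta parameters accordingly, which is the dependence that lets fitted Beta parameters carry information about the shape parameter.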
Original language | English |
---|---|
Title of host publication | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publisher | International Speech Communication Association |
Pages | 344-348 |
Number of pages | 5 |
Volume | 2017-August |
ISBN (Print) | 978-1-5108-4876-4 |
Publication status | Published - Aug 2017 |
MoE publication type | A4 Article in a conference publication |
Event | Interspeech, Stockholm, Sweden; Duration: 20 Aug 2017 → 24 Aug 2017; Conference number: 18; http://www.interspeech2017.org/ |
Publication series
Name | Interspeech: Annual Conference of the International Speech Communication Association |
---|---|
ISSN (Electronic) | 1990-9772 |
Conference
Conference | Interspeech |
---|---|
Country | Sweden |
City | Stockholm |
Period | 20/08/2017 → 24/08/2017 |
Keywords
- probability distribution mixture models
- speech production modeling