Abstract
An increase in fundamental frequency (f0) is one of the most robust and best-studied phenomena characterizing Lombard speech. In this work, three types of global transformation of f0 contours from normal speech to the Lombard condition are investigated: (1) a linear re-scaling of the quiet-condition contour to match the mean and standard deviation of f0 in Lombard speech, (2) a non-linear regression between the f0 values in the quiet condition and the corresponding f0 values in Lombard speech, and (3) a multiple non-linear regression using components obtained by a wavelet decomposition of the quiet-condition contours. The quality of the fits is evaluated on a phonetically controlled corpus of Finnish sentences with varying prosodic focus and ambient noise conditions. The results show that the non-linear regression yields a smaller root mean squared error than the simple re-scaling. Both methods are outperformed by the technique based on continuous wavelet transformation, which uses hierarchical information encoded in the speech signal. The findings are discussed in terms of their theoretical implications as well as their possible technological applications.
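As an illustration of transformation (1) and of the error measure named in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes time-aligned f0 contours given as lists of values in Hz for voiced frames only; the function names `rescale_f0` and `rmse` and the sample contours are hypothetical, and the paper may well operate on a different scale (for example log f0 or semitones).

```python
import math
import statistics


def rescale_f0(quiet_f0, lombard_mean, lombard_std):
    """Linearly map a quiet-condition contour onto a target mean and standard deviation.

    Each value is z-scored against the quiet contour and then re-expressed
    with the Lombard mean and (population) standard deviation.
    """
    quiet_mean = statistics.fmean(quiet_f0)
    quiet_std = statistics.pstdev(quiet_f0)
    if quiet_std == 0:
        raise ValueError("a flat contour cannot be rescaled to a target spread")
    scale = lombard_std / quiet_std
    return [lombard_mean + (value - quiet_mean) * scale for value in quiet_f0]


def rmse(predicted, target):
    """Root mean squared error between two contours of equal length."""
    if len(predicted) != len(target):
        raise ValueError("contours must be time-aligned and of equal length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(statistics.fmean(squared_errors))


if __name__ == "__main__":
    # Hypothetical contours (Hz), invented for illustration only.
    quiet = [100.0, 110.0, 120.0, 110.0, 100.0]
    lombard = [130.0, 150.0, 170.0, 150.0, 130.0]

    predicted = rescale_f0(
        quiet,
        lombard_mean=statistics.fmean(lombard),
        lombard_std=statistics.pstdev(lombard),
    )
    print("predicted Lombard contour:", predicted)
    print("RMSE against observed contour:", rmse(predicted, lombard))
```

Because the re-scaling is a single affine map applied to the whole contour, it can only shift and stretch the quiet-condition shape; this is consistent with the abstract's finding that the regression-based methods fit better.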
Original language | English |
---|---|
Pages | 489-493 |
Number of pages | 5 |
Publication | Proceedings of the International Conference on Speech Prosody |
Volume | 2016-January |
DOI - permanent links | |
Status | Published - 2016 |
Ministry of Education and Culture (OKM) publication type | A4 Article in conference proceedings |
Event | International Conference on Speech Prosody - Boston, United States. Duration: 31 May 2016 → 3 Jun 2016. Conference number: 8 |