Utilizing WAV2VEC in database-independent voice disorder detection

Research output: Article in book/conference proceedings › Conference article in proceedings › Scientific › peer-reviewed

12 Citations (Scopus)
64 Downloads (Pure)

Abstract

Automatic detection of voice disorders from acoustic speech signals can help to improve the reliability of medical diagnosis. However, the real-life environment in which speech signals are recorded for diagnosis can differ from the environment in which the detection system’s training data was originally collected. This mismatch between recording conditions can decrease detection performance in practical scenarios. In this work, we propose to use a pre-trained wav2vec 2.0 model as a feature extractor to build automatic detection systems for voice disorders. The embeddings from the first layers of the context network contain information about phones, and these features are useful in voice disorder detection. We evaluate the performance of the wav2vec features in single-database and cross-database scenarios to study their generalizability to unseen speakers and recording conditions. The results indicate that the wav2vec features generalize better than popular spectral and cepstral baseline features.
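The evaluation setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the database names `db_A` and `db_B` are placeholders, the helper names `mean_pool` and `evaluation_pairs` are hypothetical, and mean pooling of frame-level embeddings into one utterance-level vector is an assumption, since the abstract does not say how frame-level features are aggregated.

```python
from itertools import permutations


def mean_pool(frames):
    """Average frame-level vectors into one utterance-level vector.

    `frames` is a list of equal-length lists of floats, standing in for
    the per-frame embeddings taken from one layer of a feature extractor.
    Mean pooling is an assumption made for this sketch.
    """
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def evaluation_pairs(databases):
    """List (train_db, test_db, scenario) tuples for a set of databases.

    Single-database: train and test on the same database (with held-out
    speakers). Cross-database: train on one database and test on another,
    so the recording conditions are unseen as well.
    """
    single = [(db, db, "single-database") for db in databases]
    cross = [(a, b, "cross-database") for a, b in permutations(databases, 2)]
    return single + cross


# Placeholder database names, not those used in the paper.
for train_db, test_db, scenario in evaluation_pairs(["db_A", "db_B"]):
    print(f"{scenario}: train on {train_db}, test on {test_db}")
```

With two databases this yields two single-database runs and two cross-database runs; comparing their scores for each feature type is one way to quantify how well a feature set generalizes across recording conditions.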
Original language: English
Title of host publication: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’23)
Publisher: IEEE
Number of pages: 5
ISBN (electronic): 978-1-7281-6327-7
Publication status: Published - 2023
Ministry of Education and Culture (OKM) publication type: A4 Article in conference proceedings
Event: IEEE International Conference on Acoustics, Speech, and Signal Processing - Rhodes Island, Greece
Duration: 4 Jun 2023 to 10 Jun 2023

Publication series

Name: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
ISSN (electronic): 2379-190X

Conference

Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing
Abbreviated title: ICASSP
Country/Territory: Greece
City: Rhodes Island
Period: 04/06/2023 to 10/06/2023
