Exploration of temporal dynamics of frequency domain linear prediction cepstral coefficients for dialect classification

Rashmi Kethireddy*, Sudarsana Reddy Kadiri, Suryakanth V. Gangashetty

*Corresponding author for this work

Research output: Contribution to journal, Article, Scientific, peer-review

Abstract

Speakers exhibit dialectal traits in speech at the sub-segmental, segmental, and supra-segmental levels. Any feature representation for dialect classification should represent these traits appropriately. Traditional segmental features such as mel-frequency cepstral coefficients (MFCCs) fail to represent sub-segmental and supra-segmental dialectal traits. This study proposes using frequency domain linear prediction cepstral coefficients (FDLPCCs) for dialect classification, motivated by their long temporal summarization during pole estimation. The i-vectors and x-vectors derived from both the baseline features (MFCCs, linear prediction cepstral coefficients (LPCCs), perceptual LPCCs (PLPCCs), and RASTA-filtered PLPCCs (PLPCC-R)) and the proposed FDLPCC features are used to identify dialects, with a support vector machine (SVM) and a feed-forward neural network (FFNN) as classifiers. The proposed FDLPCC features are shown to outperform the baseline MFCC and PLPCC-R features (PLPCC-R being the best among the LPCC variants), with absolute improvements in unweighted average recall (UAR) of 3.4% and 3.9%, respectively, for the i-vector + SVM system and of 1.6% and 4.6%, respectively, for the i-vector + FFNN system. The proposed and baseline features are also found to carry complementary information. Furthermore, the results of the current study are compared with those of previous studies and are found to be better.
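Since the abstract reports its results in UAR, a minimal sketch of that metric may help: UAR is the mean of the per-class recalls, so each dialect counts equally regardless of how many utterances it has. The function name and the toy labels below are hypothetical and purely illustrative; they are not taken from the article or its data.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # number of reference samples per class
    correct = defaultdict(int)  # number of correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels for two dialect classes "A" and "B".
y_true = ["A", "A", "A", "A", "B", "B"]
y_pred = ["A", "A", "A", "B", "B", "A"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

On imbalanced data the two numbers differ, because plain accuracy is dominated by the larger class while UAR is not. An "absolute improvement" of, say, 3.4% in UAR then means a difference of 3.4 percentage points between two such scores.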

Original language: English
Article number: 108553
Journal: Applied Acoustics
Volume: 188
DOIs
Publication status: Published - Jan 2022
MoE publication type: A1 Journal article-refereed

Keywords

  • Dialect classification
  • Frequency domain linear prediction
  • i-vectors
  • Long temporal variations
  • x-vectors
