Dirichlet process mixture models for clustering i-vector data

Shreyas Seshadri, Ulpu Remes, Okko Rasanen

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    1 Citation (Scopus)

    Abstract

    Non-parametric Bayesian methods have recently gained popularity in several research areas dealing with unsupervised learning. These models can learn both the cluster models and the number of clusters from the properties of a dataset. The most commonly applied models use Dirichlet process priors with Gaussian components and are known as Dirichlet process Gaussian mixture models (DPGMMs). Recently, von Mises-Fisher mixture models (VMMs) have also been gaining popularity for modelling high-dimensional unit-normalized features such as text documents and gene expression data. VMMs are potentially more efficient than GMM-based models at modelling certain speech representations such as i-vector data, as they work with unit-normalized features based on cosine distance. The current work investigates the applicability of Dirichlet process VMMs (DPVMMs) to i-vector-based speaker clustering and verification, showing that they outperform DPGMMs in both tasks. In addition, we introduce a publicly available implementation of DPVMMs with variational inference.
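
    The link between unit-normalized features, cosine distance, and the von Mises-Fisher (vMF) density mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and involves no Dirichlet process prior or variational inference; it only shows that the vMF log-density is, up to a normalizing constant, the concentration kappa times the cosine similarity between a unit vector and the component mean direction. All function names and the shared kappa value are assumptions made for illustration.

```python
import math


def unit_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of the unit-normalized vectors."""
    a_hat, b_hat = unit_normalize(a), unit_normalize(b)
    return sum(x * y for x, y in zip(a_hat, b_hat))


def vmf_log_kernel(x, mu, kappa):
    """Unnormalized vMF log-density: kappa * mu^T x for unit vectors x and mu.

    The normalizing constant depends only on kappa and the dimension, so it
    is omitted here (an illustrative simplification).
    """
    return kappa * cosine_similarity(x, mu)


def assign_to_cluster(x, means, kappa=10.0):
    """Hard-assign x to the component with the highest log-kernel.

    Assumes equal mixture weights and a shared kappa (hypothetical choices),
    in which case the omitted normalizer is identical for all components.
    """
    scores = [vmf_log_kernel(x, mu, kappa) for mu in means]
    return max(range(len(means)), key=scores.__getitem__)
```

    Under these assumptions, a vector is assigned to the component whose mean direction it is closest to in cosine terms; for example, `assign_to_cluster([0.9, 0.1], [[1.0, 0.0], [0.0, 1.0]])` selects the first component.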

    Original language: English
    Title of host publication: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings
    Publisher: IEEE
    Pages: 5470-5474
    Number of pages: 5
    ISBN (Electronic): 9781509041176
    DOIs
    Publication status: Published - 16 Jun 2017
    MoE publication type: A4 Conference publication
    Event: IEEE International Conference on Acoustics, Speech, and Signal Processing - New Orleans, United States
    Duration: 5 Mar 2017 to 9 Mar 2017

    Publication series

    Name: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing
    Publisher: IEEE
    ISSN (Electronic): 2379-190X

    Conference

    Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing
    Abbreviated title: ICASSP
    Country/Territory: United States
    City: New Orleans
    Period: 05/03/2017 to 09/03/2017

    Keywords

    • Non-parametric methods
    • speaker clustering
    • unsupervised learning
    • variational inference
    • von Mises-Fisher mixtures
