Federated Learning for Privacy Preserving On-Device Speaker Recognition

Abraham Zewoudie, Tom Bäckström

Research output: Chapter in Book/Report/Conference proceeding > Conference article in proceedings > Professional


State-of-the-art speaker recognition systems are usually trained on a single computer using speech data collected from multiple users. However, these speech samples may contain private information that users are not willing to share. To avoid such potential breaches of privacy, we investigate the use of federated learning in speaker recognition. Distributed learning methods such as federated learning make it possible to train a shared model without sharing private data, by training the models on the edge devices where the data resides. In the proposed system, each edge device trains an individual model, which is subsequently sent to a secure aggregator. To provide contrasting data without transmitting any data, we use a generative adversarial network (GAN) to generate impostor data at the edge. The secure aggregator then merges the individual models into a global model and transmits it to the edge devices through a main server. Experimental results on the Voxceleb-1 dataset show that federated learning provides two advantages for speaker recognition systems. First, it preserves privacy, since the raw data does not leave the edge devices. Second, the aggregated model achieves a better average equal error rate than the individual models.
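The abstract does not state how the secure aggregator merges the individual models. As a rough illustration only, the sketch below assumes a sample-weighted average of the per-device parameters (in the style of federated averaging). The names `Weights`, `aggregate`, `edge_models` and `num_samples` are hypothetical, and the sketch covers neither the secure-aggregation protocol nor the GAN-based impostor generation.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: parameter name -> flattened parameter values.
Weights = Dict[str, List[float]]


def aggregate(models: Sequence[Weights], num_samples: Sequence[int]) -> Weights:
    """Merge per-device models into one global model by weighted averaging.

    Assumption (not taken from the source): each device's contribution is
    weighted by the number of local training samples it used.
    """
    if not models or len(models) != len(num_samples):
        raise ValueError("need exactly one sample count per model")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_model: Weights = {}
    for name, reference in models[0].items():
        merged = [0.0] * len(reference)
        for model, n in zip(models, num_samples):
            if len(model[name]) != len(reference):
                raise ValueError(f"shape mismatch for parameter {name!r}")
            for i, value in enumerate(model[name]):
                merged[i] += value * n / total
        global_model[name] = merged
    return global_model


# Two toy "edge device" models; only the parameters are shared, never raw speech.
edge_models = [
    {"embedding": [1.0, 2.0], "bias": [0.0]},
    {"embedding": [3.0, 4.0], "bias": [1.0]},
]
global_model = aggregate(edge_models, num_samples=[1, 3])
```

In this toy run the second device holds three times as much data, so the global parameters lie closer to its values. An unweighted mean or another merging rule would be an equally plausible reading of the abstract.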
Original language: English
Title of host publication: ISCA Symposium on Security and Privacy in Speech Communication proceedings
Publisher: International Speech Communication Association (ISCA)
Number of pages: 5
Publication status: Published - 2021
MoE publication type: D3 Professional conference proceedings
Event: ISCA Symposium on Security and Privacy in Speech Communication - Virtual, Online
Duration: 10 Nov 2021 to 12 Nov 2021
Conference number: 1


Conference: ISCA Symposium on Security and Privacy in Speech Communication
City: Virtual, Online

