Abstract
State-of-the-art speaker recognition systems are usually trained on a single computer using speech data collected from multiple users. However, these speech samples may contain private information that users are unwilling to share. To avoid such potential breaches of privacy, we investigate the use of federated learning in speaker recognition. Distributed learning methods such as federated learning make it possible to train a shared model without sharing private data, because the models are trained on the edge devices where the data resides. In the proposed system, each edge device trains an individual model, which is subsequently sent to a secure aggregator. To provide contrasting data without transmitting any data, we use a generative adversarial network (GAN) to generate impostor data at the edge. The secure aggregator then merges the individual models, builds a global model, and transmits the global model to the edge devices through a main server. Experimental results on the Voxceleb-1 dataset show that using federated learning in a speaker recognition system provides two advantages. Firstly, it preserves privacy, since the raw data does not leave the edge devices. Secondly, the aggregated model achieves a better average equal error rate than the individual models.
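The abstract does not state how the secure aggregator merges the individual models. As a minimal sketch only, the merging step could look like the following, assuming a FedAvg-style weighted average of parameters. The function name `federated_average`, the dictionary-of-lists parameter layout, and the weighting by local sample count are all illustrative assumptions, not the authors' method.

```python
from typing import Dict, List

# Hypothetical parameter layout: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def federated_average(client_params: List[Params],
                      client_sizes: List[int]) -> Params:
    """Merge per-device models into one global model.

    Assumption: a FedAvg-style average weighted by each device's
    number of local samples. The source does not specify this rule.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client model")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_params: Params = {}
    for name in client_params[0]:
        merged = [0.0] * len(client_params[0][name])
        for params, size in zip(client_params, client_sizes):
            weight = size / total  # this device's share of all samples
            for i, value in enumerate(params[name]):
                merged[i] += weight * value
        global_params[name] = merged
    return global_params


# Two toy "edge models" with a single two-element parameter each.
edge_models = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
global_model = federated_average(edge_models, client_sizes=[1, 3])
```

Under this assumption only model parameters reach the aggregator, which matches the abstract's point that raw speech never leaves the edge devices.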
Original language | English |
---|---|
Title of host publication | ISCA Symposium on Security and Privacy in Speech Communication proceedings |
Publisher | International Speech Communication Association |
Number of pages | 5 |
Publication status | Published - 2021 |
MoE publication type | D3 Professional conference proceedings |
Event | ISCA Symposium on Security and Privacy in Speech Communication, Virtual, Online; Duration: 10 Nov 2021 → 12 Nov 2021; Conference number: 1 |
Conference
Conference | ISCA Symposium on Security and Privacy in Speech Communication |
---|---|
City | Virtual, Online |
Period | 10/11/2021 → 12/11/2021 |