Evaluation of Zero Frequency Filtering based Method for Multi-pitch Streaming of Concurrent Speech Signals

Mariem Bouafif Mansali*, Tom Bäckström, Zied Lachiri

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review



Streaming multiple pitches from a mixture is a challenging problem in signal processing, and especially in speech separation. In this paper, we use a new system based on zero frequency filtering (ZFF) to stream the pitch of multiple concurrent speakers. We propose a workflow that estimates the pitch values of all sources in each frame and then streams them into trajectories, each corresponding to a distinct source. The method first detects and localizes the speakers involved in a mixture, then applies a ZFF-based approach in which the speakers' pitches are iteratively streamed from the observed mixture. The robustness of the proposed system is tested on mixtures of two and three overlapping speech signals collected in a reverberant environment. The results indicate that our proposal brings ZFF to a level competitive with another recently proposed streaming approach.
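For readers unfamiliar with ZFF, the following is a minimal single-speaker sketch of the classic zero frequency filtering steps (difference, two cascaded zero-frequency resonators, local-mean trend removal, epoch picking at negative-to-positive zero crossings). It is not the multi-speaker system evaluated in the paper: speaker detection, localization, and iterative streaming are not reproduced. The function name `zff_epochs`, its parameters, the 1.5-pitch-period trend window, and the three trend-removal passes are illustrative assumptions.

```python
import numpy as np


def zff_epochs(signal, fs, avg_pitch_period_s=0.010, passes=3):
    """Illustrative single-voice ZFF; returns (zff_signal, epoch_indices)."""
    s = np.asarray(signal, dtype=np.float64)

    # 1. First-order difference removes any DC offset.
    x = np.diff(s, prepend=s[0])

    # 2. Two cascaded resonators y[n] = 2y[n-1] - y[n-2] + x[n]
    #    are equivalent to four successive cumulative sums.
    y = x
    for _ in range(4):
        y = np.cumsum(y)

    # 3. Remove the polynomial trend by repeatedly subtracting a local mean.
    #    Window of about 1.5 average pitch periods (assumed choice).
    half = int(round(0.75 * avg_pitch_period_s * fs))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(passes):
        y = y - np.convolve(y, kernel, mode="same")

    # Zero padding in the convolution corrupts `passes * half` samples
    # at each end, so epochs are searched only in the interior.
    margin = passes * half
    core = y[margin:len(y) - margin]

    # 4. Epochs: negative-to-positive zero crossings of the ZFF signal.
    epochs = np.flatnonzero((core[:-1] < 0) & (core[1:] >= 0)) + 1 + margin
    return y, epochs


# Usage on a synthetic impulse train with an 80-sample period (100 Hz at 8 kHz).
fs = 8000
train = np.zeros(4000)
train[::80] = 1.0
_, epochs = zff_epochs(train, fs)
f0_estimate = fs / np.median(np.diff(epochs))
```

The pitch estimate here is simply the sampling rate divided by the median epoch spacing; for the synthetic train it should land near 100 Hz. Four cumulative sums grow quickly in magnitude, so on long recordings a practical implementation would likely process shorter segments.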
Original language: English
Title of host publication: 28th European Signal Processing Conference, EUSIPCO 2020 - Proceedings
Publisher: EURASIP – European Association For Signal Processing
Number of pages: 5
ISBN (Electronic): 978-9-0827-9705-3
Publication status: Published - 24 Jan 2021
MoE publication type: A4 Conference publication
Event: European Signal Processing Conference - Amsterdam, Netherlands
Duration: 24 Aug 2020 to 28 Aug 2020
Conference number: 28

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491
ISSN (Electronic): 2076-1465


Conference: European Signal Processing Conference
Abbreviated title: EUSIPCO


  • pitch estimation
  • Zero frequency filtering
  • Epochs
  • multipitch
  • streaming

