OATAO - Open Archive Toulouse Archive Ouverte

Evaluation of Post-Processing Algorithms for Polyphonic Sound Event Detection

Cances, Léo and Guyot, Patrice and Pellegrini, Thomas. Evaluation of Post-Processing Algorithms for Polyphonic Sound Event Detection. (2019) In: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2019), 20 October 2019 - 23 October 2019 (New Paltz, NY, United States).

(Document in English)


Official URL: https://doi.org/10.1109/WASPAA.2019.8937143


Sound event detection (SED) aims to identify sound events in recordings (audio tagging task) and then locate them temporally (segmentation task). The latter task ends with the segmentation of the frame-level class predictions, which determines the onsets and offsets of the sound events. This step is often overlooked in scientific publications. In this paper, we focus on the post-processing algorithms used to identify the sound event boundaries. We investigate post-processing steps involving smoothing, thresholding, and optimization. In particular, we evaluate different approaches to temporal segmentation, namely statistics-based and parametric methods. Experiments were carried out on the DCASE 2018 challenge task 4 data. We compared post-processing algorithms on the temporal prediction curves of two models: one based on the challenge's baseline and one based on Multiple Instance Learning (MIL). Results show the crucial impact of the post-processing methods on the final detection scores. When using ground truth audio tags to retain the final temporal predictions of interest, statistics-based methods yielded a 29.9% event-based F-score on the evaluation set with MIL. The best results were obtained using class-dependent parametric methods, with a 43.9% F-score. The post-processing methods and optimization algorithms have been compiled into a Python library named "aeseg".
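As an illustration of the kind of post-processing the abstract describes, the following minimal sketch smooths a frame-level prediction curve for one class, applies a fixed threshold, and reads onsets and offsets from the resulting binary sequence. It is not the "aeseg" API: the function names, the moving-average smoother, the 0.5 threshold, and the 20 ms frame duration are all assumptions chosen for the example.

```python
import numpy as np


def smooth(curve, window=5):
    """Moving-average smoothing of a 1-D frame-level probability curve.

    Hypothetical helper: the smoothing method and window size are assumptions.
    """
    kernel = np.ones(window) / window
    return np.convolve(curve, kernel, mode="same")


def segment(curve, threshold=0.5, frame_duration=0.02):
    """Return (onset, offset) pairs in seconds for runs of frames above threshold.

    Hypothetical helper: a fixed absolute threshold is only one possible choice.
    The offset is the start time of the first inactive frame after the event.
    """
    active = (np.asarray(curve) > threshold).astype(int)
    # Pad with zeros so events touching either end still yield both edges.
    padded = np.concatenate(([0], active, [0]))
    edges = np.diff(padded)
    onsets = np.flatnonzero(edges == 1)    # 0 -> 1 transitions
    offsets = np.flatnonzero(edges == -1)  # 1 -> 0 transitions
    return [(float(on * frame_duration), float(off * frame_duration))
            for on, off in zip(onsets, offsets)]


# Toy curve for one class, with one sustained event and one isolated spike.
curve = np.array([0.1, 0.2, 0.9, 0.8, 0.95, 0.1, 0.1, 0.7, 0.1])
raw_events = segment(curve, frame_duration=1.0)
smoothed_events = segment(smooth(curve, window=3), frame_duration=1.0)
```

On this toy curve, thresholding the raw predictions keeps the isolated one-frame spike as a second event, while smoothing first suppresses it. This is one simple way in which the choice of post-processing changes the detected event boundaries.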

Item Type: Conference or Workshop Item (Paper)
Additional Information: Thanks to IEEE editor. The definitive version is available at http://ieeexplore.ieee.org. This paper appears in Proceedings of WASPAA 2019. The original PDF of the article can be found at: https://ieeexplore.ieee.org/document/8937143. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
HAL Id: hal-02942302
Audience (conference): International conference proceedings
Institution: Université de Toulouse > Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE)
French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UT3 (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Deposited On: 01 Sep 2020 13:02
