Thlithi, Marwa and Pinquier, Julien and Pellegrini, Thomas and André-Obrecht, Régine Filterbank coefficients selection for segmentation in singer turns. (2016) In: 14th International Workshop on Content-Based Multimedia Indexing (CBMI 2016), 15 June 2016 - 17 June 2016 (Bucharest, Romania).
(Document in English)
PDF (Author's version), 367kB
Official URL: http://dx.doi.org/10.1109/CBMI.2016.7500273
Abstract
Audio segmentation is often the first step of audio indexing systems. It provides segments that are assumed to be acoustically homogeneous. In this paper, we report our recent experiments on segmenting music recordings into singer turns, by analogy with speaker turns in speech processing. We compare several acoustic features for this task: FilterBANK coefficients (FBANK) and Mel-frequency cepstral coefficients (MFCC). FBANK features were shown to outperform MFCC on a “clean” singing corpus. We describe a coefficient selection method that allowed further improvement on this corpus. A 75.8% F-measure was obtained with FBANK features selected with this method, corresponding to a 30.6% absolute gain compared to MFCC. On another corpus composed of ethno-musicological recordings, both feature types showed a similar performance of about 60%. This corpus is more difficult because instruments overlap with the singing and the audio recording quality is lower.
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Additional Information: | Thanks to IEEE editor. The definitive version is available at http://ieeexplore.ieee.org. This paper appears in Proceedings of CBMI 2016. Electronic ISBN: 978-1-4673-8695-1. The original PDF of the article can be found at: http://ieeexplore.ieee.org/document/7500273/. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
HAL Id: | hal-01447347 |
Audience (conference): | International conference proceedings |
Institution: | French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE); Université de Toulouse > Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE); Université de Toulouse > Université Toulouse III - Paul Sabatier - UT3 (FRANCE); Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE); Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE); Other partners > Université du Maine (FRANCE) |
Deposited On: | 18 Jan 2017 12:50 |