OATAO - Open Archive Toulouse Archive Ouverte

CNN-based phone segmentation experiments in a less-represented language

Manenti, Céline and Pellegrini, Thomas and Pinquier, Julien. CNN-based phone segmentation experiments in a less-represented language. (2016) In: 17th Annual Conference of the International Speech Communication Association (INTERSPEECH 2016), 8 September 2016 - 12 September 2016 (San Francisco, United States).

(Document in English)

PDF (Author's version), 569kB

Official URL: http://dx.doi.org/10.21437/Interspeech.2016

Abstract

In recent years, there has been renewed interest in unsupervised sub-lexical and lexical unit discovery. Segmenting speech into phone-like units may be a useful first step for such a task. In this article, we report speech segmentation experiments in Xitsonga, a less-represented language spoken in South Africa. We use convolutional neural networks (CNN) with static FBANK coefficients as input. At each sliding frame of the signal, the models make a binary decision on whether a boundary is present. We compare a model trained exclusively on Xitsonga data with a bootstrap model trained on a larger corpus in another language, the BUCKEYE U.S. English corpus. Using a model with two convolution layers, we obtained a 79% F-measure on BUCKEYE with a 20 ms error tolerance. This performance is equal to the human inter-annotator agreement rate. We then used this bootstrap model to segment Xitsonga data and compared the results obtained when adapting it with 1 to 20 minutes of Xitsonga data.
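As an illustration of how a boundary F-measure with a 20 ms error tolerance can be computed, here is a minimal Python sketch. It is not the authors' scoring code: the abstract does not describe the exact matching procedure, so the one-to-one greedy matching, the function name `boundary_f_measure`, and the example boundary times are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def boundary_f_measure(
    reference: Sequence[float],
    hypothesis: Sequence[float],
    tolerance: float = 0.020,
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for boundary times in seconds.

    Assumption: each reference boundary may be matched by at most one
    hypothesized boundary lying within +/- tolerance of it.
    """
    ref = sorted(reference)
    hyp = sorted(hypothesis)
    i = j = hits = 0
    # Two-pointer sweep over both sorted lists.
    while i < len(ref) and j < len(hyp):
        diff = hyp[j] - ref[i]
        if abs(diff) <= tolerance:
            hits += 1  # matched pair: consume both boundaries
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # hypothesis is too early for this and all later references
        else:
            i += 1  # reference is too early for this and all later hypotheses
    precision = hits / len(hyp) if hyp else 0.0
    recall = hits / len(ref) if ref else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f_measure


# Made-up boundary times (seconds), for illustration only.
p, r, f = boundary_f_measure(
    [0.10, 0.25, 0.40, 0.60],
    [0.11, 0.26, 0.45, 0.59, 0.80],
)
print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

In this made-up example, three of the five hypothesized boundaries fall within 20 ms of a reference boundary, so precision is 3/5 and recall is 3/4. Other scoring conventions exist (for instance, allowing one hypothesis to match several references), and they can yield different figures.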

Item Type: Conference or Workshop Item (Paper)
Additional Information: Thanks to ISCA: International Speech Communication Association. This paper appears in volume 2/5 of Proceedings of INTERSPEECH 2016, ISBN: 978-1-5108-3313-5, ISSN: 1990-9772, at: http://www.isca-speech.org/archive/Interspeech_2016/ The definitive version is available at: http://www.isca-speech.org/archive/Interspeech_2016/pdfs/0796.PDF
HAL Id: hal-01500519
Audience (conference): International conference proceedings
Institution: Université de Toulouse > Institut National Polytechnique de Toulouse - INPT (FRANCE)
French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UPS (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Deposited By: IRIT IRIT
Deposited On: 16 Mar 2017 15:58
