OATAO - Open Archive Toulouse Archive Ouverte

Unsupervised Speech Unit Discovery Using K-means and Neural Networks

Manenti, Céline and Pellegrini, Thomas and Pinquier, Julien Unsupervised Speech Unit Discovery Using K-means and Neural Networks. (2017) In: 5th International Conference on Statistical Language and Speech Processing (SLSP 2017), 23 October 2017 - 25 October 2017 (Le Mans, France).

(Document in English)

PDF (Author's version), 200kB

Official URL: https://doi.org/10.1007/978-3-319-68456-7_14

Abstract

Unsupervised discovery of sub-lexical units in speech is a problem of current interest to speech researchers. In this paper, we report experiments in which phone segmentation is followed by clustering of the resulting segments using k-means and a Convolutional Neural Network. We thus obtain an annotation of the corpus in pseudo-phones, which in turn allows us to find pseudo-words. We compare the results for two different segmentations: manual and automatic. To check the portability of our approach, we compare the results for three different languages (English, French and Xitsonga). The originality of our work lies in using neural networks in an unsupervised way that differs from the common auto-encoder-based method for unsupervised speech unit discovery. On the Xitsonga corpus, for instance, we obtained phone-level purity scores of 46% and 42% with manual and automatic segmentations, respectively, using 30 pseudo-phones. Based on the inferred pseudo-phones, we discovered about 200 pseudo-words.
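
The purity scores quoted in the abstract can be illustrated with a small sketch. This record does not give the paper's exact formula, so the code assumes the standard cluster-purity definition: each cluster (pseudo-phone) is credited with its most frequent reference phone label, and purity is the fraction of segments that match the majority label of their cluster. The function name `purity` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def purity(cluster_ids, reference_labels):
    """Standard cluster purity (assumed definition, not confirmed by the paper).

    cluster_ids: one cluster (pseudo-phone) index per segment.
    reference_labels: one reference phone label per segment.
    """
    if len(cluster_ids) != len(reference_labels):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must not be empty")

    # Count how often each reference label occurs inside each cluster.
    counts_by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, reference_labels):
        counts_by_cluster[cluster][label] += 1

    # Each cluster contributes the size of its majority label.
    majority_total = sum(
        counts.most_common(1)[0][1] for counts in counts_by_cluster.values()
    )
    return majority_total / len(cluster_ids)


# Toy data (hypothetical): six segments grouped into three pseudo-phones.
toy_clusters = [0, 0, 0, 1, 1, 2]
toy_phones = ["a", "a", "t", "t", "t", "s"]
toy_score = purity(toy_clusters, toy_phones)
```

In the toy data, five of the six segments carry the majority label of their cluster, so the score is 5/6. Under this definition, a 46% purity with 30 pseudo-phones would mean that just under half of the segments share the dominant reference phone of their cluster.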

Item Type:Conference or Workshop Item (Paper)
Additional Information:Thanks to Springer editor. This paper appears in volume 10583 of Lecture Notes in Computer Science (ISSN: 0302-9743, ISBN: 978-3-319-68455-0). The original PDF is available at: https://link.springer.com/chapter/10.1007/978-3-319-68456-7_14
HAL Id:hal-02559766
Audience (conference):International conference proceedings
Uncontrolled Keywords:
Institution:Université de Toulouse > Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE)
French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UT3 (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Deposited On:21 Apr 2020 10:01