OATAO - Open Archive Toulouse Archive Ouverte Open Access Week

Char+CV-CTC: Combining Graphemes and Consonant/Vowel Units for CTC-Based ASR Using Multitask Learning

Heba, Abdelwahab and Pellegrini, Thomas and Lorré, Jean-Pierre and André-Obrecht, Régine Char+CV-CTC: Combining Graphemes and Consonant/Vowel Units for CTC-Based ASR Using Multitask Learning. (2019) In: 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019), 15 September 2019 - 19 September 2019 (Graz, Austria).

(Document in English)

PDF (Author's version) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader

Official URL: https://doi.org/10.21437/Interspeech.2019-1975


Previous work has shown that end-to-end neural-based speech recognition systems can be improved by adding auxiliary tasks at intermediate layers. In this paper, we report multitask learning (MTL) experiments in the context of connectionist temporal classification (CTC) based speech recognition at character level. We compare several MTL architectures that jointly learn to predict characters (sometimes called graphemes) and consonant/vowel (CV) binary labels. The best approach, which we call Char+CV-CTC, adds up the character and CV logits to obtain the final character predictions. The idea is to put more weight on the vowel (consonant) characters when the vowel (consonant) symbol ‘V’ (‘C’) is predicted in the auxiliary-task branch of the network. Experiments were carried out on the Wall Street Journal (WSJ) corpus. Char+CV-CTC achieved the best ASR results with a 2.2% Character Error Rate and a 6.1% Word Error Rate (WER) on the Eval92 evaluation subset. This model outperformed its monotask model counterpart by 0.7% absolute in WER and also achieved almost the same performance of 6.0% as a strong baseline phone-based Time Delay Neural Network (“TDNN-Phone+TR2”) model.

Item Type:Conference or Workshop Item (Paper)
Additional Information:Abstract : https://www.isca-speech.org/archive/Interspeech_2019/abstracts/1975.html
HAL Id:hal-02419431
Audience (conference):International conference proceedings
Uncontrolled Keywords:
Institution:French research institutions > Centre National de la Recherche Scientifique - CNRS (FRANCE)
Université de Toulouse > Institut National Polytechnique de Toulouse - Toulouse INP (FRANCE)
Université de Toulouse > Université Toulouse III - Paul Sabatier - UT3 (FRANCE)
Université de Toulouse > Université Toulouse - Jean Jaurès - UT2J (FRANCE)
Université de Toulouse > Université Toulouse 1 Capitole - UT1 (FRANCE)
Other partners > Linagora (FRANCE)
Laboratory name:
Deposited On:02 Dec 2019 14:20

Repository Staff Only: item control page