Title: Using prosody to improve Mandarin automatic speech recognition
Authors: Ni, Chong-Jia; Liu, Wen-Ju; Xu, Bo
Author affiliation: [Ni, Chong-Jia; Liu, Wen-Ju; Xu, Bo] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
Source: Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010
Abstract: This paper discusses how to model and train Mandarin prosody-dependent acoustic models and how to decode input speech with a prosody-dependent speech recognition system. We use automatic prosody labeling methods to annotate the prosodic break type and stress type of each syllable in a continuous speech corpus, and we apply our proposed methods to train prosody-dependent tonal syllable models, addressing the data sparsity that arises after prosody labeling. We also use MSD-HSMM to model factors that influence prosody, such as pitch and duration, and we combine the MSD-HSMM model, a GMM-based prosody-dependent tonal syllable duration model, and a Maximum Entropy-based syntactic prosody model for decoding. Compared with the baseline system, our prosody-dependent speech recognition systems significantly improve the tonal syllable correct rate. © 2010 ISCA.