Friday, 9 February 2018

Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition. (arXiv:1802.02656v1 [cs.CL])

The performance of automatic speech recognition systems degrades with increasing mismatch between the training and testing scenarios. Differences in speaker accents are a significant source of such mismatch. The traditional approach to dealing with multiple accents involves pooling data from several accents during training and building a single model in multi-task fashion, where tasks correspond to individual accents. In this paper, we explore an alternative model in which we jointly learn an accent classifier and a multi-task acoustic model. Experiments on the American English Wall Street Journal and British English Cambridge corpora demonstrate that our joint model outperforms the strong multi-task acoustic model baseline. We obtain a 5.94% relative improvement in word error rate on British English and a 9.47% relative improvement on American English. This illustrates that jointly modeling accent information with the acoustics improves acoustic model performance.
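The results above are reported as relative improvements in word error rate (WER), that is, the reduction in WER as a fraction of the baseline WER rather than a difference in absolute percentage points. A minimal sketch of that arithmetic follows. The function name and the WER values are hypothetical, chosen only for illustration; the abstract does not give the absolute WERs.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction of new_wer over baseline_wer, in percent."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs (in percent), NOT taken from the paper.
baseline = 10.0  # assumed multi-task baseline WER
joint = 9.0      # assumed joint-model WER

# An absolute drop of 1.0 point from a 10.0 baseline is a 10% relative improvement.
print(f"{relative_wer_improvement(baseline, joint):.2f}% relative improvement")
```

Read this way, a 5.94% relative improvement means the joint model's WER is about 94% of the baseline's, whatever the baseline's absolute value is.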



from cs updates on arXiv.org http://ift.tt/2G0Dbpt