TY - GEN
T1 - Development of the MIT ASR system for the 2016 Arabic Multi-genre Broadcast Challenge
AU - Alhanai, Tuka
AU - Hsu, Wei Ning
AU - Glass, James
PY - 2017/2/7
Y1 - 2017/2/7
N2 - The Arabic language, with over 300 million speakers, has significant diversity and breadth. This proves challenging when building an automated system to understand what is said. This paper describes an Arabic Automatic Speech Recognition system developed on a 1,200-hour speech corpus that was made available for the 2016 Arabic Multi-genre Broadcast (MGB) Challenge. A range of Deep Neural Network (DNN) topologies were modeled, including Feed-forward, Convolutional, Time-Delay, Recurrent Long Short-Term Memory (LSTM), Highway LSTM (H-LSTM), and Grid LSTM (G-LSTM). The best performance came from a sequence discriminatively trained G-LSTM neural network. The best overall Word Error Rate (WER) was 18.3% (p < 0.001) on the development set, after combining hypotheses of 3- and 5-layer sequence discriminatively trained G-LSTM models that had been rescored with a 4-gram language model.
AB - The Arabic language, with over 300 million speakers, has significant diversity and breadth. This proves challenging when building an automated system to understand what is said. This paper describes an Arabic Automatic Speech Recognition system developed on a 1,200-hour speech corpus that was made available for the 2016 Arabic Multi-genre Broadcast (MGB) Challenge. A range of Deep Neural Network (DNN) topologies were modeled, including Feed-forward, Convolutional, Time-Delay, Recurrent Long Short-Term Memory (LSTM), Highway LSTM (H-LSTM), and Grid LSTM (G-LSTM). The best performance came from a sequence discriminatively trained G-LSTM neural network. The best overall Word Error Rate (WER) was 18.3% (p < 0.001) on the development set, after combining hypotheses of 3- and 5-layer sequence discriminatively trained G-LSTM models that had been rescored with a 4-gram language model.
KW - Arabic
KW - Automatic Speech Recognition
KW - Deep Neural Networks
KW - MGB Challenge
UR - http://www.scopus.com/inward/record.url?scp=85015984030&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85015984030&partnerID=8YFLogxK
U2 - 10.1109/SLT.2016.7846280
DO - 10.1109/SLT.2016.7846280
M3 - Conference contribution
AN - SCOPUS:85015984030
T3 - 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings
SP - 299
EP - 304
BT - 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 IEEE Workshop on Spoken Language Technology, SLT 2016
Y2 - 13 December 2016 through 16 December 2016
ER -