TY - GEN
T1 - Spoken Metro Station Name Identification
T2 - 6th International Conference on Computational Intelligence in Communications and Business Analytics, CICBA 2024
AU - Mukherjee, Himadri
AU - Marciano, Matteo
AU - Dhar, Ankita
AU - Alaei, Alireza
AU - Roy, Kaushik
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Tourism is a growing industry and a significant source of income for state and central governments all over the world. It also supports the livelihood of many locals in tourist spots. One of the problems often faced by tourists is navigating a city's tourist spots. Signboards, banners, and informative texts are often written in local languages and, at times, English. This poses difficulties for travelers who know neither the local languages nor English. They face a daunting challenge when trying to navigate the local area, and frequently become victims of dishonest individuals who exploit their lack of knowledge. This ultimately gives the place a poor image in the eyes of the world. Voice-based systems can be beneficial in this context. These systems can enable visitors to ask about different places, get directions, and learn about attractions and other things to do in a city. They can get accurate answers by just “asking” the system about a place, thus avoiding the need to read or write the dominant languages of that place. This can furthermore help impaired people in their daily travel. This paper proposes a deep learning-based approach to address some of the above-mentioned issues. At the outset, the system is trained to recognize the names of metro stations in Kolkata (capital city of West Bengal, India) from speech. This functionality can not only help tourists navigate the city but also speed up ticketing within metro stations by introducing voice-based input to the automated ticket vending machines. To evaluate the proposed system, several experiments were performed on a dataset of 24 metro station names in Kolkata, and a best recognition accuracy of over 95% was obtained in non-studio conditions.
AB - Tourism is a growing industry and a significant source of income for state and central governments all over the world. It also supports the livelihood of many locals in tourist spots. One of the problems often faced by tourists is navigating a city's tourist spots. Signboards, banners, and informative texts are often written in local languages and, at times, English. This poses difficulties for travelers who know neither the local languages nor English. They face a daunting challenge when trying to navigate the local area, and frequently become victims of dishonest individuals who exploit their lack of knowledge. This ultimately gives the place a poor image in the eyes of the world. Voice-based systems can be beneficial in this context. These systems can enable visitors to ask about different places, get directions, and learn about attractions and other things to do in a city. They can get accurate answers by just “asking” the system about a place, thus avoiding the need to read or write the dominant languages of that place. This can furthermore help impaired people in their daily travel. This paper proposes a deep learning-based approach to address some of the above-mentioned issues. At the outset, the system is trained to recognize the names of metro stations in Kolkata (capital city of West Bengal, India) from speech. This functionality can not only help tourists navigate the city but also speed up ticketing within metro stations by introducing voice-based input to the automated ticket vending machines. To evaluate the proposed system, several experiments were performed on a dataset of 24 metro station names in Kolkata, and a best recognition accuracy of over 95% was obtained in non-studio conditions.
KW - Convolutional neural network
KW - Inclusive tourism
KW - Spectrogram
KW - Station names
UR - http://www.scopus.com/inward/record.url?scp=85219184023&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85219184023&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-81342-9_4
DO - 10.1007/978-3-031-81342-9_4
M3 - Conference contribution
AN - SCOPUS:85219184023
SN - 9783031813412
T3 - Communications in Computer and Information Science
SP - 40
EP - 50
BT - Computational Intelligence in Communications and Business Analytics - 6th International Conference, CICBA 2024, Revised Selected Papers
A2 - Singh, Jyoti Prakash
A2 - Singh, Maheshwari Prasad
A2 - Singh, Amit Kumar
A2 - Mukhopadhyay, Somnath
A2 - Mandal, Jyotsna K.
A2 - Dutta, Paramartha
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 23 January 2024 through 25 January 2024
ER -