Spoken Metro Station Name Identification: A Deep Learning-Based Approach

Himadri Mukherjee, Matteo Marciano, Ankita Dhar, Alireza Alaei, Kaushik Roy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Tourism is a growing industry and a significant source of income for state and central governments all over the world. It also supports the livelihood of many locals in tourist spots. One of the problems tourists often face is navigating through a city's tourist spots. Signboards, banners, and informative texts are often written in local languages and, at times, English. This poses several difficulties for travelers who know neither the local languages nor English. They face a daunting challenge when trying to navigate the local area, and frequently become victims of dishonest individuals who exploit their lack of knowledge. This ultimately gives the place a poor image in the eyes of the world. Voice-based systems can be beneficial in this context. These systems can enable visitors to ask about different places, get directions, and learn about attractions and other things to do in a city. Visitors can get accurate answers by simply "asking" the system about a place, without needing to read or write the dominant languages of that place. This can also help people with impairments in their daily travel. This paper proposes a deep learning-based approach to address some of the above-mentioned issues. At the outset, the system is trained to recognize the metro station names in Kolkata (the capital city of West Bengal, India) from speech. This functionality can not only help tourists navigate the city but also speed up ticketing within metro stations by introducing voice-based input to the automated ticket vending machines. To evaluate the proposed system, several experiments were performed on a dataset of 24 metro station names in Kolkata, and a best recognition accuracy of over 95% was obtained in non-studio conditions.

Original language: English (US)
Title of host publication: Computational Intelligence in Communications and Business Analytics - 6th International Conference, CICBA 2024, Revised Selected Papers
Editors: Jyoti Prakash Singh, Maheshwari Prasad Singh, Amit Kumar Singh, Somnath Mukhopadhyay, Jyotsna K. Mandal, Paramartha Dutta
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 40-50
Number of pages: 11
ISBN (Print): 9783031813412
DOIs
State: Published - 2025
Event: 6th International Conference on Computational Intelligence in Communications and Business Analytics, CICBA 2024 - Patna, India
Duration: Jan 23 2024 to Jan 25 2024

Publication series

Name: Communications in Computer and Information Science
Volume: 2366 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 6th International Conference on Computational Intelligence in Communications and Business Analytics, CICBA 2024
Country/Territory: India
City: Patna
Period: 1/23/24 to 1/25/24

Keywords

  • Convolutional neural network
  • Inclusive tourism
  • Spectrogram
  • Station names

ASJC Scopus subject areas

  • General Computer Science
  • General Mathematics
