Training AI to Recognize Objects of Interest to the Blind and Low Vision Community

Tharangini Sankarnarayanan, Lev Paciorkowski, Khevna Parikh, Giles Hamilton-Fletcher, Chen Feng, Diwei Sheng, Todd E. Hudson, John Ross Rizzo, Kevin C. Chan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recent object detection models show promising advances in their architecture and performance, expanding potential applications for the benefit of persons with blindness or low vision (pBLV). However, object detection models are usually trained on generic data rather than on datasets that focus on the needs of pBLV. Hence, for applications that locate objects of interest to pBLV, object detection models need to be trained specifically for this purpose. Informed by prior interviews, questionnaires, and Microsoft's ORBIT research, we identified thirty-five objects pertinent to pBLV. We used this user-centric feedback to gather images of these objects from the Google Open Images V6 dataset. We subsequently trained a YOLOv5x model on this dataset to recognize these objects of interest. We demonstrate that the model can identify objects that previous generic models could not, such as those related to tasks of daily functioning (e.g., coffee mug, knife, fork, and glass). Crucially, we show that careful pruning of a dataset with severe class imbalances leads to a rapid, noticeable two-fold improvement in the overall performance of the model, as measured by the mean average precision over intersection over union (IoU) thresholds from 0.5 to 0.95 (mAP50-95). Specifically, mAP50-95 improved from 0.14 to 0.36 on the seven least prevalent classes in the training dataset. Overall, we show that careful curation of training data can improve training speed and object detection outcomes. We show clear directions for effectively customizing training data to create models that focus on the desires and needs of pBLV.

Clinical Relevance - This work demonstrated the benefits of developing assistive AI technology customized to individual users or the wider BLV community.
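
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of intersection over union (IoU) and the set of IoU thresholds that mAP50-95 averages over. It is not taken from the paper: the function names `iou` and `thresholds_met` are hypothetical, the boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates, and the 0.05 step between thresholds is the usual COCO-style convention, which the abstract does not state explicitly.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed convention: ten thresholds 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_met(predicted_box, true_box):
    """Count how many of the IoU thresholds a single prediction satisfies."""
    overlap = iou(predicted_box, true_box)
    return sum(1 for t in IOU_THRESHOLDS if overlap >= t)
```

Under this convention, a predicted box counts as a correct detection at a given threshold only if its IoU with the ground-truth box reaches that threshold; average precision is computed separately at each threshold and then averaged, so mAP50-95 rewards tightly localized boxes more than a single-threshold metric would.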

Original language: English (US)
Title of host publication: 2023 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350324471
State: Published - 2023
Event: 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Sydney, Australia
Duration: Jul 24 2023 → Jul 27 2023

Publication series

Name: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
ISSN (Print): 1557-170X

Conference

Conference: 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023
Country/Territory: Australia
City: Sydney
Period: 7/24/23 → 7/27/23

ASJC Scopus subject areas

  • Signal Processing
  • Biomedical Engineering
  • Computer Vision and Pattern Recognition
  • Health Informatics
