Simulating zero-resource spoken term discovery

Jerome White, Douglas W. Oard

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


If search engines are ever to index all of the spoken content in the world, they will need to handle hundreds of languages for which no automatic speech recognition systems exist. Zero-resource spoken term discovery, in which repeated content is detected in some acoustic representation, offers a potentially useful source of indexing features. This paper describes a text-based simulation of a zero-resource spoken term discovery system that allows any information retrieval test collection to be used as a basis for early development of information retrieval techniques. It is proposed that these techniques can later be applied to actual zero-resource spoken term discovery results.

Original language: English (US)
Title of host publication: CIKM 2017 - Proceedings of the 2017 ACM Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Number of pages: 4
ISBN (Electronic): 9781450349185
State: Published - Nov 6 2017
Event: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017 - Singapore, Singapore
Duration: Nov 6 2017 - Nov 10 2017

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings
Volume: Part F131841


Other: 26th ACM International Conference on Information and Knowledge Management, CIKM 2017


  • N-gram retrieval
  • Simulation
  • Zero resource term discovery

ASJC Scopus subject areas

  • General Decision Sciences
  • General Business, Management and Accounting

