DiVA: Indexing high-dimensional data by diving into vector approximations

Konstantinos Tsakalozos, Spiros Evangelatos, Alex Delis

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Contemporary multimedia, scientific and medical applications use indexing structures to access their high-dimensional data. Yet, in sufficiently high-dimensional spaces, conventional tree-based access methods are eventually outperformed by simple serial scans. Vector quantization has been effectively used to index data that are mostly distributed uniformly. However, in real-world applications, clustered data and skewed query distributions are the norm. In this paper, we propose DiVA, an approach that selectively adapts the quantization step to accommodate varying indexing needs. This adaptation mechanism triggers the restructuring and possible expansion of DiVA so as to provide finer indexing granularity and enhanced access performance in certain hot areas of the search space. User-supplied policies help both identify such hot areas and satisfy versatile application requirements. Experimentation with our detailed prototype shows that in a real-world data set, DiVA yields up-to 64%reduced I/O compared to competing methods such as the VA-file and the A-tree.

Original languageEnglish (US)
Title of host publicationElectronic Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, ICME 2011
DOIs
StatePublished - 2011
Event2011 12th IEEE International Conference on Multimedia and Expo, ICME 2011 - Barcelona, Spain
Duration: Jul 11 2011Jul 15 2011

Publication series

NameProceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print)1945-7871
ISSN (Electronic)1945-788X

Other

Other2011 12th IEEE International Conference on Multimedia and Expo, ICME 2011
CountrySpain
CityBarcelona
Period7/11/117/15/11

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

Fingerprint Dive into the research topics of 'DiVA: Indexing high-dimensional data by diving into vector approximations'. Together they form a unique fingerprint.

Cite this