Generalized substring selectivity estimation

Zhiyuan Chen, Flip Korn, Nick Koudas, S. Muthukrishnan

    Research output: Contribution to journalArticlepeer-review


    An approach to selectivity estimation of generalized Boolean substring queries with a focus on conjunctive multidimensional and Boolean queries was presented. The set hashing, a Monte Carlo technique, was used to succinctly represent the set of tuples containing a given substring as a signature vector of hash values. The analysis showed that using only linear storage, a large number of cross-counts were generated including those for complex co-occurrences of substrings.

    Original languageEnglish (US)
    Pages (from-to)98-132
    Number of pages35
    JournalJournal of Computer and System Sciences
    Issue number1
    StatePublished - 2003

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
    • Applied Mathematics
    • Computer Networks and Communications
    • Computational Theory and Mathematics


    Dive into the research topics of 'Generalized substring selectivity estimation'. Together they form a unique fingerprint.

    Cite this