40 Years of suffix trees

Alberto Apostolico, Maxime Crochemore, Martin Farach-Colton, Zvi Galil, S. Muthukrishnan

    Research output: Contribution to journalReview articlepeer-review

    Abstract

    The suffix tree is the core data structure in string analysis. It has a rich history, with connections to compression, matching, automata, data structures and more. There are powerful techniques to build suffix trees and use them efficiently in many applications. The appearance of suffix trees merged with some interesting and independent developments in information theory. In his famous approach to the notion of information, Kolmogorov equated the information or structure in a string to the length of the shortest program that would be needed to produce that string by a Universal Turing Machine. In his original paper, Weiner listed a few applications of his 'bi-tree' including most notably offline string searching: preprocessing a text file to support queries that return the occurrences of a given pattern in time linear in the length of the pattern. Udi Manber and Eugene W. Myers took a different approach, however. In 1990, they introduced the ?suffix array,?31 which eliminated most of the structure of the suffix tree, but was still able to implement many of the same operations, requiring space equal to 2 integers per text character and searching in time.

    Original languageEnglish (US)
    Pages (from-to)66-73
    Number of pages8
    JournalCommunications of the ACM
    Volume59
    Issue number4
    DOIs
    StatePublished - Apr 2016

    ASJC Scopus subject areas

    • Computer Science(all)

    Fingerprint Dive into the research topics of '40 Years of suffix trees'. Together they form a unique fingerprint.

    Cite this