On the sorting-complexity of suffix tree construction

Martin Farach-Colton, Paolo Ferragina, S. Muthukrishnan

    Research output: Contribution to journalArticlepeer-review

    Abstract

    The suffix tree of a string is the fundamental data structure of combinatorial pattern matching. We present a recursive technique for building suffix trees that yields optimal algorithms in different computational models. Sorting is an inherent bottleneck in building suffix trees and our algorithms match the sorting lower bound. Specifically, we present the following results. (1) Weiner [1973], who introduced the data structure, gave an optimal O(n)-time algorithm for building the suffix tree of an n-character string drawn from a constant-size alphabet. In the comparison model, there is a trivial Ω(n log n)-time lower bound based on sorting, and Weiner's algorithm matches this bound. For integer alphabets, the fastest known algorithm is the O(n log n) time comparison-based algorithm, but no super-linear lower bound is known. Closing this gap.

    Original languageEnglish (US)
    Pages (from-to)987-1011
    Number of pages25
    JournalJournal of the ACM
    Volume47
    Issue number6
    DOIs
    StatePublished - 2000

    Keywords

    • Dam model
    • External-memory data structures
    • Ram model
    • Sorting complexity
    • Suffix array
    • Suffix tree

    ASJC Scopus subject areas

    • Software
    • Control and Systems Engineering
    • Information Systems
    • Hardware and Architecture
    • Artificial Intelligence

    Fingerprint

    Dive into the research topics of 'On the sorting-complexity of suffix tree construction'. Together they form a unique fingerprint.

    Cite this