Abstract
The suffix tree of a string is the fundamental data structure of combinatorial pattern matching. We present a recursive technique for building suffix trees that yields optimal algorithms in different computational models. Sorting is an inherent bottleneck in building suffix trees and our algorithms match the sorting lower bound. Specifically, we present the following results. (1) Weiner [1973], who introduced the data structure, gave an optimal O(n)-time algorithm for building the suffix tree of an n-character string drawn from a constant-size alphabet. In the comparison model, there is a trivial Ω(n log n)-time lower bound based on sorting, and Weiner's algorithm matches this bound. For integer alphabets, the fastest known algorithm is the O(n log n) time comparison-based algorithm, but no super-linear lower bound is known. Closing this gap.
Original language | English (US) |
---|---|
Pages (from-to) | 987-1011 |
Number of pages | 25 |
Journal | Journal of the ACM |
Volume | 47 |
Issue number | 6 |
DOIs | |
State | Published - 2000 |
Keywords
- Dam model
- External-memory data structures
- Ram model
- Sorting complexity
- Suffix array
- Suffix tree
ASJC Scopus subject areas
- Software
- Control and Systems Engineering
- Information Systems
- Hardware and Architecture
- Artificial Intelligence