The suffix tree is the core data structure in string analysis. It has a rich history, with connections to compression, matching, automata, data structures and more. There are powerful techniques to build suffix trees and use them efficiently in many applications. The appearance of suffix trees merged with some interesting and independent developments in information theory. In his famous approach to the notion of information, Kolmogorov equated the information or structure in a string to the length of the shortest program that would be needed to produce that string by a Universal Turing Machine. In his original paper, Weiner listed a few applications of his 'bi-tree' including most notably offline string searching: preprocessing a text file to support queries that return the occurrences of a given pattern in time linear in the length of the pattern. Udi Manber and Eugene W. Myers took a different approach, however. In 1990, they introduced the ?suffix array,?31 which eliminated most of the structure of the suffix tree, but was still able to implement many of the same operations, requiring space equal to 2 integers per text character and searching in time.
ASJC Scopus subject areas
- General Computer Science