TY - GEN

T1 - String matching under a general matching relation

AU - Muthukrishnan, S.

AU - Ramesh, H.

N1 - Funding Information:
In standard string matching, each symbol matches only itself. However, there are some string matching problems, in which a symbol may match a number of other *The work of this author was initiated while visiting IBM T.J. Watson Research Center, Yorktown Heights, and supported in part by NSF/DARPA grant CCR-89-06949 and NSF grant CCR-91-03953. lThe work of this author was supported in part by NSF grants CCR-8902221 and CCR-8906949.

PY - 1992

Y1 - 1992

N2 - In standard string matching, each symbol matches only itself. In other string matching problems, e.g., the string matching with “don’t-cares” problem, a symbol may match several symbols. In general, an arbitrary many-to-many matching relation might hold between symbols. We consider a general string matching problem in which such a matching relation is specified and those text positions are sought at which the pattern matches under this relation. Depending upon the existence of a simple, easily recognizable property in the given matching relation, we show that string matching either requires time linear in the text and pattern lengths or is at least as hard as boolean multiplication. Since the existence of a linear time algorithm for boolean multiplication has been a long-standing open question, designing linear time algorithms for matching relations in the latter category appears to be hard. As an application, we show that the matching relations of several independently studied string matching problems do indeed fall into the latter (hard) category. We also initiate the study of a generic string matching algorithm that works for any matching relation. We give an algorithm that given any matching relation, pattern and text runs in O(n(sm)1/3 polylog(m)), where n and m are the sizes of the text and the pattern respectively, and s is a factor related to the size of the given matching relation. This complexity is o(nm) except for very dense matching relations.

AB - In standard string matching, each symbol matches only itself. In other string matching problems, e.g., the string matching with “don’t-cares” problem, a symbol may match several symbols. In general, an arbitrary many-to-many matching relation might hold between symbols. We consider a general string matching problem in which such a matching relation is specified and those text positions are sought at which the pattern matches under this relation. Depending upon the existence of a simple, easily recognizable property in the given matching relation, we show that string matching either requires time linear in the text and pattern lengths or is at least as hard as boolean multiplication. Since the existence of a linear time algorithm for boolean multiplication has been a long-standing open question, designing linear time algorithms for matching relations in the latter category appears to be hard. As an application, we show that the matching relations of several independently studied string matching problems do indeed fall into the latter (hard) category. We also initiate the study of a generic string matching algorithm that works for any matching relation. We give an algorithm that given any matching relation, pattern and text runs in O(n(sm)1/3 polylog(m)), where n and m are the sizes of the text and the pattern respectively, and s is a factor related to the size of the given matching relation. This complexity is o(nm) except for very dense matching relations.

UR - http://www.scopus.com/inward/record.url?scp=85027608972&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85027608972&partnerID=8YFLogxK

U2 - 10.1007/3-540-56287-7_118

DO - 10.1007/3-540-56287-7_118

M3 - Conference contribution

AN - SCOPUS:85027608972

SN - 9783540562870

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 356

EP - 367

BT - Foundations of Software Technology and Theoretical Computer Science - 12th Conference, Proceedings

A2 - Shyamasundar, Rudrapatna

PB - Springer Verlag

T2 - 12th Conference on Foundations of Software Technology and Theoretical Computer Science, 1992

Y2 - 18 December 1992 through 20 December 1992

ER -