TY - GEN
T1 - Permutation editing and matching via embeddings
AU - Cormode, Graham
AU - Muthukrishnan, S.
AU - Sahinalp, Süleyman Cenk
N1 - Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2001
Y1 - 2001
AB - If the genetic maps of two species are modelled as permutations of (homologous) genes, the number of chromosomal rearrangements in the form of deletions, block moves, inversions, etc. to transform one such permutation to another can be used as a measure of their evolutionary distance. Motivated by such scenarios, we study problems of computing distances between permutations as well as matching permutations in sequences, and finding the most similar permutation from a collection (nearest neighbor). We adopt a general approach: embed permutation distances of relevance into well-known vector spaces in an approximately distance-preserving manner, and solve the resulting problems on the well-known spaces. Our results are as follows: We present the first known approximately distance-preserving embeddings of these permutation distances into well-known spaces. Using these embeddings, we obtain several results, including the first known efficient solution for approximately solving nearest neighbor problems with permutations and the first known algorithms for finding permutation distances in the data stream model. We consider a novel class of problems called permutation matching problems, which are similar to string matching problems except that the pattern is a permutation (rather than a string), and present linear or near-linear time algorithms for approximately solving permutation matching problems; in contrast, the corresponding string problems take significantly longer.
UR - http://www.scopus.com/inward/record.url?scp=84879509047&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84879509047&partnerID=8YFLogxK
U2 - 10.1007/3-540-48224-5_40
DO - 10.1007/3-540-48224-5_40
M3 - Conference contribution
AN - SCOPUS:84879509047
SN - 3540422870
SN - 9783540422877
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 481
EP - 492
BT - Automata, Languages and Programming - 28th International Colloquium, ICALP 2001, Proceedings
A2 - Orejas, Fernando
A2 - Spirakis, Paul G.
A2 - van Leeuwen, Jan
PB - Springer Verlag
T2 - 28th International Colloquium on Automata, Languages and Programming, ICALP 2001
Y2 - 8 July 2001 through 12 July 2001
ER -