TY - GEN

T1 - A robust model for finding optimal evolutionary trees extended abstract

AU - Farach, Martin

AU - Kannan, Sampath

AU - Warnow, Tandy

N1 - Publisher Copyright:
© 1993 ACM.

PY - 1993/6/1

Y1 - 1993/6/1

N2 - Constructing evolutionary trees for species sets is a fundamental problem in computational biology. One of the standard models assumes the ability to compute distances between every pair of species and seeks to find an edge-weighted tree T in which the distance dTij in the tree between the leaves of T corresponding to the species i and j fits the observed distance, Sometimes the desired tree is ultrametric, so that the tree can be rooted with the root equidistant to each leaf. Many measures for evaluating the "fit" between a distance function d and the path-distance T have been prposed, and most such measures have resulted in NP-hard optimization problems. T in this paper we propose a measure of fit which models the inaccuracy in the data, and present several problems for constructing additive and ultrametric trees using this measure. Many of the resultant optimization problems are ;VP-hard, and one (finding a minimum size ultrametric tree which increments from an input matrix) is hard to approximate. Specifically, there is a constant c > 0 such that unless P = NP, no polynomial time algorithm can exist which finds an approximate solution within the ratio of ne. However, we also present tight upper and lower bounds for the L∞-Minimum Increment to Ultrametric. Thus, wc present perhaps the first algorithm for constructing phylogcnctic trees from distance matrices which finds optimal trees for a reasonable criterion in polynomial time.

AB - Constructing evolutionary trees for species sets is a fundamental problem in computational biology. One of the standard models assumes the ability to compute distances between every pair of species and seeks to find an edge-weighted tree T in which the distance dTij in the tree between the leaves of T corresponding to the species i and j fits the observed distance, Sometimes the desired tree is ultrametric, so that the tree can be rooted with the root equidistant to each leaf. Many measures for evaluating the "fit" between a distance function d and the path-distance T have been prposed, and most such measures have resulted in NP-hard optimization problems. T in this paper we propose a measure of fit which models the inaccuracy in the data, and present several problems for constructing additive and ultrametric trees using this measure. Many of the resultant optimization problems are ;VP-hard, and one (finding a minimum size ultrametric tree which increments from an input matrix) is hard to approximate. Specifically, there is a constant c > 0 such that unless P = NP, no polynomial time algorithm can exist which finds an approximate solution within the ratio of ne. However, we also present tight upper and lower bounds for the L∞-Minimum Increment to Ultrametric. Thus, wc present perhaps the first algorithm for constructing phylogcnctic trees from distance matrices which finds optimal trees for a reasonable criterion in polynomial time.

UR - http://www.scopus.com/inward/record.url?scp=0027307380&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0027307380&partnerID=8YFLogxK

U2 - 10.1145/167088.167132

DO - 10.1145/167088.167132

M3 - Conference contribution

AN - SCOPUS:0027307380

T3 - Proceedings of the Annual ACM Symposium on Theory of Computing

SP - 137

EP - 145

BT - Proceedings of the 25th Annual ACM Symposium on Theory of Computing, STOC 1993

PB - Association for Computing Machinery

T2 - 25th Annual ACM Symposium on Theory of Computing, STOC 1993

Y2 - 16 May 1993 through 18 May 1993

ER -