Efficient algorithms for inverting evolution

Martin Farach, Sampath Kannan

    Research output: Chapter in Book/Report/Conference proceedingConference contribution

    Abstract

    Evolution is a stochastic process which operates on the DNA of species. The evolutionary process leaves tell-tale signs in the DNA which can be used to construct phylogenies, or evolutionary trees, for a set of species. Maximum Likelihood Estimations (MLE) methods seek the evolutionary tree which is most likely to have produced the DNA under consideration. While these methods are widely accepted and intellectually satisfying, they have been computationally intractable. In this paper, we address the intractability of MLE methods as follows. We introduce a metric on stochastic process models of evolution. We show that this metric is meaningful by proving that in order for any algorithm to distinguish between two stochatic models that are close according to this metric, it needs to be given a lot of observations. We complement this result with a simple and efficient algorithm for inverting the stochastic process of evolution, that is, for building the tree from observations on the DNA of the species. Our result can be viewed as a result on the PAC-learnability of the class of distributions produced by tree-like processes. Though there have been many heuristics suggested for this problem, our algorithm is the first one with a guaranteed convergence rate, and further, this rate is within a polynomial of the lower-bound rate we establish. Ours is also the the first polynomial-time algorithm which is guaranteed to converge at all to the correct tree.

    Original languageEnglish (US)
    Title of host publicationProceedings of the 28th Annual ACM Symposium on Theory of Computing, STOC 1996
    PublisherAssociation for Computing Machinery
    Pages230-236
    Number of pages7
    ISBN (Electronic)0897917855
    DOIs
    StatePublished - Jul 1 1996
    Event28th Annual ACM Symposium on Theory of Computing, STOC 1996 - Philadelphia, United States
    Duration: May 22 1996May 24 1996

    Publication series

    NameProceedings of the Annual ACM Symposium on Theory of Computing
    VolumePart F129452
    ISSN (Print)0737-8017

    Conference

    Conference28th Annual ACM Symposium on Theory of Computing, STOC 1996
    Country/TerritoryUnited States
    CityPhiladelphia
    Period5/22/965/24/96

    ASJC Scopus subject areas

    • Software

    Fingerprint

    Dive into the research topics of 'Efficient algorithms for inverting evolution'. Together they form a unique fingerprint.

    Cite this