On the entropy of DNA: Algorithms and measurements based on memory and rapid convergence

Martin Farach, Michiel Noordewier, Serap Savari, Larry Shepp, Abraham Wyner, Jacob Ziv

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    We have applied the information theoretic notion of entropy to characterize DNA sequences. We consider a genetic sequence signal that is too small for asymptotic entropy estimates to be accurate, and for which similar approaches have previously failed. We prove that the match length entropy estimator has a relatively fast convergence rate and demonstrate experimentally that by using this entropy estimator, we can indeed extract a meaningful signal from segments of DNA. Further, we derive a method for detecting certain signals within DNA, known as splice junctions, with significantly better performance than previously known methods. The main result of this paper is that the entropy of genetic material which is ultimately expressed in protein sequences is higher than that of the material which is discarded. This is an unexpected result, since current biological theory holds that the discarded sequences ("introns") are capable of tolerating random changes to a greater degree than the retained sequences ("exons").

    Original language: English (US)
    Title of host publication: Proceedings of the 6th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1995
    Publisher: Association for Computing Machinery
    Pages: 48-57
    Number of pages: 10
    ISBN (Electronic): 0898713498
    State: Published - Jan 22 1995
    Event: 6th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1995 - San Francisco, United States
    Duration: Jan 22 1995 → Jan 24 1995

    Publication series

    Name: Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms

    Other

    Other: 6th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 1995
    Country/Territory: United States
    City: San Francisco
    Period: 1/22/95 → 1/24/95

    ASJC Scopus subject areas

    • Software
    • General Mathematics
