Computing similarity between RNA strings

Vineet Bafna, S. Muthukrishnan, R. Ravi

    Research output: Chapter in Book/Report/Conference proceedingConference contribution

    Abstract

    Ribonucleic acid (RNA) strings are strings over the four-letter alphabet {A, C, G, U} with a secondary structure of base-pairing between A- U and C - G pairs in the string 1. Edges are drawn between two bases that are paired in the secondary structure and these edges have traditionally been assumed to be noncrossing. The noncrossing base-pairing naturally leads to a tree-like representation of the secondary structure of RNA strings. In this paper, we address several notions of similarity between two RNA strings that take into account both the primary sequence and secondary base-palring structure of the strings. We present efficient algorithms for exact matching and approximate matching between two RNA strings. We define a notion of alignment between two RNA strings and devise algorithms based on dynamic programming. We then present a method for optimally aligning a given RNA string with unknown secondary structure to one with known sequence and structure, thus attacking the structure prediction problem in the case when the structure of a closely related sequence is known. The techniques employed to prove our results include reductions to well-known string matching problems allowing wild cards and ranges, and speeding up dynamic programming by using the tree structures implicit in the secondary structure of RNA strings. Keywords: RNA structure, edit distances, approximate matching, string algorithms, trees.

    Original languageEnglish (US)
    Title of host publicationCombinatorial Pattern Matching - 6th Annual Symposium, CPM 1995, Proceedings
    EditorsZvi Galil, Esko Ukkonen
    PublisherSpringer Verlag
    Pages1-16
    Number of pages16
    ISBN (Print)3540600442, 9783540600442
    DOIs
    StatePublished - 1995
    Event6th Annual Symposium on Combinatorial Pattern Matching, CPM 1995 - Espoo, Finland
    Duration: Jul 5 1995Jul 7 1995

    Publication series

    NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume937
    ISSN (Print)0302-9743
    ISSN (Electronic)1611-3349

    Other

    Other6th Annual Symposium on Combinatorial Pattern Matching, CPM 1995
    Country/TerritoryFinland
    CityEspoo
    Period7/5/957/7/95

    Keywords

    • Approximate matching
    • Edit distances
    • RNA structure
    • String algorithms
    • Trees

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science

    Fingerprint

    Dive into the research topics of 'Computing similarity between RNA strings'. Together they form a unique fingerprint.

    Cite this