BACKGROUND: As the rough draft of the human genome sequence nears a finished product and other genome-sequencing projects accumulate sequence data exponentially, bioinformatics is emerging as an important tool for studies of transposon biology. In particular, L1 elements exhibit a variety of sequence structures after insertion into the human genome that are amenable to computational analysis. We carried out a detailed analysis of the anatomy and distribution of L1 elements in the human genome using a new computer program, TSDfinder, designed to identify transposon boundaries precisely. RESULTS: Structural variants of L1 elements shared similar trends in the length and quality of their target site duplications (TSDs) and poly(A) tails. Furthermore, we found no correlation between the composition and genomic location of the pre-insertion locus and the resulting anatomy of the L1 insertion. We verified that L1 insertions with TSDs have the 5'-TTAAAA-3' cleavage site associated with L1 endonuclease activity. In addition, the second target DNA cut required for L1 insertion weakly matches the consensus pattern TTAAAA. On the other hand, the L1-internal breakpoints of deleted and inverted L1 elements do not resemble L1 endonuclease cleavage sites. Finally, the genome sequence data indicate that whereas singly inverted elements are common, doubly inverted elements are almost never found. CONCLUSIONS: The sequence data give no indication that the creation of L1 structural variants depends on characteristics of the insertion locus. In addition, the formation of 5' truncated and 5' inverted L1s are probably not due to the action of the L1 endonuclease.
|Published - Sep 19 2002
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics
- Cell Biology