Sep 12 2011
The DNA sequence of a genome is not static. A considerable amount of DNA jumps around from place to place. While other genetic elements compete for representation at a given locus, transposable elements accumulate by copying themselves to new locations in the genome. Transposable elements are sequences of DNA that can move to different positions within the genome of a single cell, a process called transposition. In the process, they can cause mutations and change the amount of DNA in the genome.
Consequently, there may be tens of thousands of active copies of a single transposable element dotted around the genome of a single individual, and different individuals may have their insertions in different places in the genome. Being small and adapted to integrating themselves into novel places in the genome, transposable elements are also capable of moving between species. Because of this amazingly expansive drive (both within and between species), transposable elements appear to have colonized all eukaryotic species and have radiated into a bewildering array of subtypes. Most species have multiple types or families of transposable elements, each present in multiple copies per genome. About half of our own genome is derived from transposable elements.
There are three main types of transposable elements, which have little in common in structure or mechanism other than the fact that they are relatively short and encode few proteins. DNA (or Class II) transposons typically encode one protein. They usually move by a mechanism analogous to cut and paste, rather than copy and paste, using an enzyme called transposase, which recognizes the ends of an element, cuts it out, and reinserts it elsewhere in the genome. Although cutting and pasting does not by itself duplicate the element, these mechanisms can still lead to an increase in copy number, for example when the gap left at the donor site is repaired from a template that still carries the element. Transposons that insert into coding sequence typically produce insertion-type frameshift mutations.
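The cut and paste idea can be illustrated with a minimal Python sketch that treats the genome as a plain string. This is only a toy model under stated assumptions: the function names, the coordinates, and the sequences are hypothetical, and real transposition involves details (such as duplication of the target site) that are not modeled here.

```python
def cut_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """Excise genome[start:end] and reinsert it at index `target`
    of the sequence that remains after excision."""
    element = genome[start:end]
    remainder = genome[:start] + genome[end:]
    if not 0 <= target <= len(remainder):
        raise ValueError("target lies outside the remaining sequence")
    return remainder[:target] + element + remainder[target:]


def shifts_reading_frame(insert_length: int) -> bool:
    """An insertion shifts the downstream reading frame unless its
    length is a multiple of three (one codon = three bases)."""
    return insert_length % 3 != 0


# Hypothetical toy sequences, not real genomic data.
element = "TTGGTTA"  # a 7-base stand-in for a transposon
genome = "AAAA" + element + "CCCCGGGG"
moved = cut_and_paste(genome, 4, 4 + len(element), 6)

print(moved)                               # the element now sits at a new position
print(len(moved) == len(genome))           # moving alone does not change genome size
print(shifts_reading_frame(len(element)))  # a 7-base insertion shifts the frame
```

In this sketch the element changes position while the total length and copy number stay the same, which is why an increase in copy number needs some additional step beyond the move itself.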
The other two classes, both retrotransposons (Class I), transpose via an RNA intermediate through the action of reverse transcriptase. Retrotransposons are transcribed into RNA, and the RNA is then copied into DNA by a reverse transcriptase and inserted back into the genome. Barbara McClintock (1902-1992) first discovered transposable elements in 1952, in the form of the Ac and Ds DNA transposons of maize.
The simpler class, long interspersed repetitive elements (LINEs), typically encodes one or two proteins. LINEs are a group of genetic elements that are found in large numbers in eukaryotic genomes. They are transcribed into RNA using an RNA polymerase II promoter that resides inside the LINE. LINEs encode a multifunctional enzyme with domains for DNA binding, DNA cleavage, and the reverse transcription of RNA into DNA. The reverse transcriptase has a higher specificity for the LINE RNA than for other RNA and makes a DNA copy of the RNA that can be integrated into the genome at a new site. Unlike most host genes, which have their promoter region upstream of their transcription start site, many LINEs have an internal promoter. By carrying its own promoter, the element increases the probability that it will be transcribed regardless of where it happens to land in the genome. (3) Because LINEs move by copying themselves (instead of excising and reinserting, as DNA transposons do), they enlarge the genome. The human genome, for example, contains roughly 850,000 LINE copies, which together make up about 21% of the genome. (4, 9)
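Why copying enlarges the genome can be shown with a small Python sketch. It is a toy model only: the genome is a plain string, the RNA intermediate and reverse transcription are collapsed into a single string copy, and the function name, coordinates, and sequences are hypothetical.

```python
def copy_and_paste(genome: str, start: int, end: int, target: int) -> str:
    """Leave the source element in place and insert a new copy at `target`."""
    if not 0 <= target <= len(genome):
        raise ValueError("target lies outside the genome")
    # Stand-in for transcription to RNA followed by reverse transcription to DNA.
    new_copy = genome[start:end]
    return genome[:target] + new_copy + genome[target:]


# Hypothetical toy sequences, not real genomic data.
element = "TTGGTTA"
genome = "AAAA" + element + "CCCCGGGG"

round_one = copy_and_paste(genome, 4, 4 + len(element), 15)
round_two = copy_and_paste(round_one, 4, 4 + len(element), 0)

for label, seq in [("start", genome), ("round 1", round_one), ("round 2", round_two)]:
    print(label, len(seq), seq.count(element))
```

Each round adds one more copy and lengthens the sequence by the size of the element, whereas a pure cut and paste move would leave both numbers unchanged.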
LTR retrotransposons, named for the long terminal repeats (LTRs) that flank them, encode five to six proteins: typically two to three structural proteins (such as capsid and nucleocapsid) and three enzymes (protease, reverse transcriptase, and integrase). They are thought to be an amalgam of the other two types, since the reverse transcriptase is homologous to that of LINEs and the integrase is homologous to some transposases. About 8% of the human genome and approximately 10% of the mouse genome are composed of LTR retrotransposons. (5)
Short interspersed repetitive elements (SINEs) are short DNA sequences (<500 bases) that do not encode any proteins themselves but instead have evolved to parasitize the LINE retrotransposon machinery. Lacking a functional reverse transcriptase of their own, SINEs rely on other mobile elements for transposition. With about 1,500,000 copies, SINEs make up about 13% of the human genome. (6) While historically viewed as “junk DNA,” recent research suggests that in some rare cases both LINEs and SINEs have been incorporated into novel genes and so contributed new functionality. The distribution of these elements has been implicated in some genetic diseases and cancers. The most common SINEs in primates are called Alu sequences. We have about one million copies of Alu and its relatives in our DNA, of which about 7,000 are unique to humans. It is estimated that about 10.7% of the human genome consists of Alu sequences. Alu elements appear to influence gene expression through their many insertions in and around genes. Alu elements are about 280 base pairs long, do not contain any coding sequences, and carry a recognition site for the restriction enzyme AluI (hence the name).
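As a rough illustration of what recognition by a restriction enzyme means, the Python sketch below scans a sequence for the AluI recognition site, AGCT, which the enzyme cuts between the G and the C to leave blunt ends. The function names and the example sequence are hypothetical; the sequence is a made-up toy string, not a real Alu element.

```python
ALUI_SITE = "AGCT"   # AluI recognition sequence
CUT_OFFSET = 2       # the enzyme cuts between G and C: AG^CT


def find_alui_sites(seq: str) -> list[int]:
    """Return the 0-based start index of every AGCT site in `seq`."""
    seq = seq.upper()
    sites = []
    pos = seq.find(ALUI_SITE)
    while pos != -1:
        sites.append(pos)
        pos = seq.find(ALUI_SITE, pos + 1)
    return sites


def digest(seq: str) -> list[str]:
    """Split `seq` into the fragments a complete AluI digest would give."""
    seq = seq.upper()
    fragments, previous = [], 0
    for site in find_alui_sites(seq):
        cut = site + CUT_OFFSET
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


toy_sequence = "GGCCAGCTTTAAAGCTGG"  # hypothetical, for illustration only
print(find_alui_sites(toy_sequence))
print(digest(toy_sequence))
```

Because AGCT reads the same on both strands (it is its own reverse complement), scanning one strand is enough to find every site.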
Single-nucleotide variations within Alu elements have been linked to human disease. For example, a single-nucleotide polymorphism (SNP) in the promoter region of the myeloperoxidase (MPO) gene has been associated with a variety of disorders, including Alzheimer’s disease, lung cancer, stomach cancer, and lupus nephritis. (8)
Alu insertions are associated with several diseases: (7)
• Breast cancer
• Ewing’s sarcoma
• Familial hypercholesterolemia
• Diabetes mellitus type II
Despite their proliferative capacities, there is abundant evidence from genome sequencing studies that transposable elements often go extinct within a host species, with active copies of the element disappearing from the gene pool. In the human genome there are hundreds of thousands of “fossil” DNA transposons grouped into 63 families, all of which proliferated to varying degrees at various times in our evolutionary history and appear to be completely inactive today. (9)
Transposable elements have some very sophisticated enzymatic capabilities that in certain circumstances may be useful to the rest of the genome. An apparently clear-cut co-option of a transposable element has taken place in the evolution of the vertebrate immune system. Most vertebrates have immunoglobulin (Ig) and T-cell receptor (TCR) genes that are “split” and must be re-assembled by recombination before they can be expressed. The split nature of immunoglobulin and T-cell receptor genes appears to derive from germ line insertion of a transposable element into an ancestral receptor gene soon after the evolutionary divergence of jawed and jawless vertebrates.
This recombination, called V(D)J recombination, occurs only in lymphocytes and in most vertebrates is responsible for generating much of the diversity of antigen receptors within an individual organism, since the assembly process produces slightly different genes in different cells. Assembly is initiated by proteins encoded by the RAG1 and RAG2 genes, which cleave the Ig and TCR genes in a manner very similar to the step that initiates DNA-based transposition. Moreover, the RAG1 and RAG2 genes are immediately adjacent to each other in the genome. These observations led to the suggestion that RAG1, RAG2, and the repeating domain they recognize in the Ig and TCR genes are descendants of ancient transposons that have since become domesticated for host benefit. (10)
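How assembling a gene from separate segments generates diversity can be sketched with simple combinatorics in Python. The segment names and pool sizes below are hypothetical toy values chosen for illustration, not counts from any real locus, and the sketch ignores the extra variation introduced where segments are joined.

```python
from itertools import product

# Hypothetical toy segment pools; real loci carry different numbers of segments.
V_SEGMENTS = ["V1", "V2", "V3", "V4"]
D_SEGMENTS = ["D1", "D2"]
J_SEGMENTS = ["J1", "J2", "J3"]


def assemble_receptors(v_pool, d_pool, j_pool):
    """List every receptor gene that can be built from one V, one D and one J."""
    return ["-".join(combo) for combo in product(v_pool, d_pool, j_pool)]


receptors = assemble_receptors(V_SEGMENTS, D_SEGMENTS, J_SEGMENTS)
print(len(receptors))   # number of distinct assemblies from these toy pools
print(receptors[:3])    # a few example assemblies
```

Even with these small pools, nine segments yield 24 distinct assemblies, which illustrates how different cells can end up carrying different receptor genes.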
Genetic instability is one of the principal hallmarks and causative factors in cancer. Human transposable elements have been reported to cause human diseases, including several types of cancer, through insertional mutagenesis of genes critical for preventing or driving malignant transformation. (11)
Portions excerpted from Fundamentals of Generative Medicine copyright 2010, Drum Hill Publishing, USA.
1. Sawyer SA, Parsch J, Zhang Z, Hartl DL. Prevalence of positive selection among nearly neutral amino acid replacements in Drosophila. Proc. Natl. Acad. Sci. U.S.A. 104 (16): 6504–10 (2007)
2. Vega F, Medeiros A. Chromosomal translocations involved in non-Hodgkin lymphomas. Arch. Pathol. Lab. Med. 127 (9): 1148–60 (2003)
3. Burt A, Trivers R. Genes in Conflict: The Biology of Selfish Genetic Elements. Belknap Press, Harvard University Press, Cambridge, MA (2006)
4. Singer MF. SINEs and LINEs: highly repeated short and long interspersed sequences in mammalian genomes. Cell 28 (3): 433–4 (1982)
5. McCarthy EM, McDonald JF. Long terminal repeat retrotransposons of Mus musculus. Genome Biol. 5 (3): R14 (2004)
6. Ibid 3.
7. Batzer MA, Deininger PL. Alu repeats and human genomic diversity. Nat. Rev. Genet. 3 (5): 370–9 (2002)
9. Lander ES et al. Initial sequencing and analysis of the human genome. Nature 409 (6822): 860–921 (2001)
10. Agrawal A, Eastman QM, Schatz DG. Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system. Nature 394 (6695): 744–51 (1998)
11. Belancio V, Roy-Engel A, Deininger P. All y’all need to know ‘bout retroelements in cancer. Semin. Cancer Biol. 20 (4): 200–210 (2010)