The human genome is a large store of information that, over the course of evolution, has undergone many changes, shifts, mutations and corrections. However, the parts of a gene, the chromosomal structures, and the general gene-to-mRNA transcriptional mechanism have remained relatively similar. This section explains the components of a gene, chromosomal structures, and the gene transcription mechanism. Understanding these basic principles and components is necessary to understand cancer: mutations within these integral components can promote cancer and lead to the formation of tumors.
Core promoters contain the TATA box, other transcription factor binding sites and the transcription start site (TSS) (1). The TATA box (5’-TATAAA-3’) lies about 25 bp upstream of the TSS (position -25) and is bound by the TATA-binding protein subunit of transcription factor II D (TFIID) (2). This binding stabilizes TFIID for the formation of the RNA polymerase II preinitiation complex (2). Other transcription factors (TFs) also bind to their respective binding sites to form this complex (2). Some genes, however, contain only an initiator element for regulatory factor binding (2). The TSS marks the beginning of transcription and is designated the +1 site (3).
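As a rough illustration of the positional relationship described above, the following Python sketch scans the region upstream of a known TSS for the TATAAA motif and reports where it starts relative to the +1 site. The function name, the search window of -40 to -20 and the toy sequence are all assumptions made for this example; real promoter annotation uses position weight matrices and experimental evidence rather than exact string matching.

```python
def find_tata_boxes(sequence, tss_index, motif="TATAAA", window=(-40, -20)):
    """Return motif start positions relative to the TSS (e.g. -25).

    tss_index is the 0-based index of the +1 nucleotide in `sequence`.
    The search window is an illustrative assumption, not a biological rule.
    """
    seq = sequence.upper()
    start = max(0, tss_index + window[0])
    # The motif must start no later than tss_index + window[1].
    end = max(start, min(len(seq), tss_index + window[1] + len(motif)))
    hits = []
    pos = seq.find(motif, start, end)
    while pos != -1:
        hits.append(pos - tss_index)  # index tss_index - 1 is position -1
        pos = seq.find(motif, pos + 1, end)
    return hits


# Toy promoter: TATA box starting 25 bp upstream of the +1 site (index 40).
toy_promoter = "G" * 15 + "TATAAA" + "C" * 19 + "ACGT"
print(find_tata_boxes(toy_promoter, tss_index=40))  # prints [-25]
```

Because there is no position 0 in the usual numbering (the base before +1 is -1), an upstream index converts to a promoter coordinate by simple subtraction.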
CpG islands may also be found on the core promoter or slightly upstream of it. They are characterized by a higher frequency of CpG dinucleotides (a cytosine linked to a guanine by a phosphodiester bond) than the rest of the genome (4). Methylation is common in CpG island shores, the regions up to 2 kb from an island, and can change the expression of various proteins (5).
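To make the idea of CpG enrichment concrete, the sketch below computes two quantities commonly used to describe a candidate CpG island: the GC fraction and the ratio of observed to expected CpG dinucleotides. The default thresholds (at least 200 bp, GC fraction above 0.5, observed/expected ratio above 0.6) follow the criteria usually attributed to reference (4); the function names and test sequences are hypothetical and chosen only for illustration.

```python
def cpg_stats(sequence):
    """Return (gc_fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides read 5' to 3'
    gc_fraction = (c + g) / n
    expected = c * g / n  # CpG count expected if C and G were independent
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(sequence, min_length=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the three threshold criteria to a single sequence window."""
    gc_fraction, obs_exp = cpg_stats(sequence)
    return len(sequence) >= min_length and gc_fraction > min_gc and obs_exp > min_obs_exp


print(looks_like_cpg_island("CG" * 100))  # CpG-dense toy window
print(looks_like_cpg_island("AT" * 100))  # no C or G at all
```

In practice these statistics are computed in sliding windows along a chromosome; the single-window version here only shows the arithmetic.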
Introns are sequences that are transcribed into the initial mRNA product (the pre-mRNA) but are spliced out during post-transcriptional modification (6). One function of introns is to provide alternative splicing options (12). Some sequences serve as exons only some of the time, so the same gene can yield two different proteins.
Untranslated Regions (UTRs)
There are also 5’ and 3’ UTRs, which, as their name suggests, are not translated into the final protein product. 5’ UTRs often form secondary structures, which aid in mRNA stability (7) and translational start site recognition (8). At the 3’ end of the 5’ UTR is the Kozak sequence (ACCAUGG), which surrounds the initiation codon AUG (8); the 5’ UTR also contains regulatory elements such as upstream open reading frames (uORFs) (9). The 3’ UTR, on the other hand, contains the sequence AAUAAA followed by a GU-rich downstream element, which together signal polyadenylation, the addition of a poly(A) tail to the end of the newly synthesized mRNA molecule (10, 14). The poly(A) tail protects the mRNA from degradation by cellular enzymes (11). It also plays a role in transcription termination, translation and localization of the mRNA (11).
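The polyadenylation signal described above can be sketched as a simple search: locate each AAUAAA hexamer in a pre-mRNA and measure how GU-rich the sequence immediately downstream is. The 30-nucleotide window and the 0.6 threshold are assumptions chosen for illustration, not measured values, and the function name and example transcript are hypothetical.

```python
def find_polya_signals(pre_mrna, signal="AAUAAA", downstream_window=30,
                       min_gu_fraction=0.6):
    """Return (position, GU fraction downstream, passes threshold) per hit.

    Accepts RNA or DNA letters; T is converted to U before searching.
    """
    rna = pre_mrna.upper().replace("T", "U")
    results = []
    pos = rna.find(signal)
    while pos != -1:
        after = pos + len(signal)
        downstream = rna[after:after + downstream_window]
        gu_count = sum(base in "GU" for base in downstream)
        fraction = gu_count / len(downstream) if downstream else 0.0
        results.append((pos, fraction, fraction >= min_gu_fraction))
        pos = rna.find(signal, pos + 1)
    return results


# Toy transcript: signal at index 3 followed by a purely GU stretch.
print(find_polya_signals("CCC" + "AAUAAA" + "GUGUGUGUGU"))
```

A hit with a low downstream GU fraction would be a weaker candidate under these assumed settings, which mirrors the idea that both elements contribute to the signal.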
Exons are the regions of a gene that, after transcription, are spliced together during post-transcriptional modification and leave the nucleus as part of the mature mRNA. They are not necessarily translated; for example, the 5' and 3' UTRs are part of exons. Most exons are constitutive, meaning that they are always included in the final mRNA product (12). However, alternative splicing can create different proteins by omitting or including certain exons (called cassette exons). Splice sites can also be shifted, giving alternative 5' and 3' splice sites.
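Cassette-exon skipping can be modeled in a few lines: the mature mRNA is the ordered concatenation of exon sequences, and an alternative isoform leaves one of them out. The exon names and sequences below are invented for the example, and the model deliberately ignores splice-site recognition and the spliceosome.

```python
def splice(exons, skip=()):
    """Join exon sequences in order, omitting any exon whose name is in `skip`."""
    return "".join(seq for name, seq in exons if name not in skip)


# Hypothetical three-exon gene; exon2 acts as the cassette exon.
exons = [
    ("exon1", "AUGGCU"),
    ("exon2", "GAAUUC"),
    ("exon3", "UGGUAA"),
]

full_isoform = splice(exons)
skipped_isoform = splice(exons, skip={"exon2"})

print(full_isoform)     # all three exons joined
print(skipped_isoform)  # exon2 omitted, giving a shorter mRNA
```

Here the skipped exon is six nucleotides long, a multiple of three, so the downstream reading frame is preserved; a cassette exon whose length is not a multiple of three would shift the frame instead.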
The figure below illustrates an example of alternative splicing, in which different splicing patterns produce different mRNA forms and therefore different protein products. The protein products shown differ in size and in which exons are included and/or which introns are retained. The product shown on the right, for example, is missing an exon that is present in the alternative product, resulting in a shorter protein.
Transcription termination in eukaryotes varies with the RNA polymerase. RNA polymerase I requires a polymerase-specific termination factor that associates with a sequence downstream of the transcription unit (13). RNA polymerase II relies on a functional polyadenylation signal for termination (14). RNA polymerase III terminates transcription after it transcribes a run of consecutive uracils (13).
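The RNA polymerase III rule is the simplest of the three to express in code. The sketch below reports where the first run of consecutive uracils begins in a transcript. The minimum run length of four is an assumption made for this example (the text above says only "a series of uracils"), and the function name is hypothetical.

```python
import re


def pol3_termination_site(transcript, min_run=4):
    """Return the 0-based index where the first U run of at least
    `min_run` bases starts, or None if there is no such run."""
    rna = transcript.upper().replace("T", "U")
    match = re.search("U{%d,}" % min_run, rna)
    return match.start() if match else None


print(pol3_termination_site("GCGCAUUUUUGC"))  # prints 5
print(pol3_termination_site("GCGUAUGC"))      # prints None
```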
Eukaryotic transcription begins with the binding of RNA polymerase to the promoter, with the help of enhancer sequences and activator proteins (15). There are three types of RNA polymerases in eukaryotes. RNA polymerase I transcribes 28S, 18S and 5.8S rRNAs, RNA polymerase II transcribes mRNAs, and RNA polymerase III transcribes tRNAs and 5S rRNAs (16). Small RNAs such as snRNAs and scRNAs are transcribed by either RNA polymerase II or III (16). The eukaryotic promoter generally contains a TATA box, which has the consensus sequence TATAAA and lies 25-35 bp upstream of the +1 transcription start site, and the TFIIB recognition element, which has the consensus sequence (G/C)(G/C)(G/A)CGCC (15).
Various transcription factors are also required for transcription to occur. The following steps describe mRNA transcription by RNA polymerase II. First, TFIID, which consists of the TATA-binding protein (TBP) and 10-12 other polypeptides called TBP-associated factors (TAFs), binds to the TATA box. TFIIB then binds to TBP and recruits RNA polymerase II to the site. TFIIE and TFIIH are recruited next; TFIIH acts both as a helicase that unwinds the DNA and as a protein kinase that phosphorylates and activates RNA polymerase II (16). Transcription then begins at the +1 site of the DNA.
RNA polymerase reads the template strand from 3’ to 5’ and synthesizes the RNA from 5’ to 3’, so the transcript has the same sequence as the coding strand. Because the product is an RNA molecule, every thymine is replaced by uracil. Elongation continues until the termination site is reached.
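The relationship between the two DNA strands and the transcript can be checked with a short Python sketch. It builds the RNA by base-pairing against a template strand written 3’ to 5’, then confirms that the result matches the coding strand with thymine swapped for uracil. The function names and the six-base example are hypothetical.

```python
# RNA base that pairs with each DNA base on the template strand.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the RNA (5' to 3') from a template strand written 3' to 5'."""
    return "".join(PAIRING[base] for base in template_3_to_5.upper())


def coding_strand_to_rna(coding_5_to_3):
    """Same transcript, derived from the coding strand: T becomes U."""
    return coding_5_to_3.upper().replace("T", "U")


coding = "ATGGCT"    # 5'-ATGGCT-3'
template = "TACCGA"  # 3'-TACCGA-5', complementary to the coding strand

print(transcribe(template))          # prints AUGGCU
print(coding_strand_to_rna(coding))  # prints AUGGCU
```

Both routes give the same RNA, which is why gene sequences are conventionally reported as the coding strand.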
As mentioned before in the transcription terminator section, the three RNA polymerases terminate by different means. Transcription by RNA polymerase II is terminated in response to the polyadenylation signal, AAUAAA followed by a GU-rich sequence (14). Deletion or inactivation of the polyadenylation signal was shown to increase the amount of transcript containing sequences beyond the polyadenylation site, indicating that the polymerase reads through when the signal is absent (14).
1) Smale, S.T. and Kadonaga, J.T. (2003). The RNA Polymerase II Core Promoter. Annu. Rev. Biochem. 72, 449-479.
2) Lee, T.I. and Young, R.A. (2000). Transcription of eukaryotic protein-coding genes. Annu. Rev. Genet. 34, 77-137.
3) Cooper, G.M. (2000). Transcription in Prokaryotes. The Cell: A Molecular Approach. 2nd edition. (Sunderland MA: Sinauer Associates). Available from: http://www.ncbi.nlm.nih.gov/books/NBK9935/
4) Gardiner-Garden, M. and Frommer, M. (1987). CpG islands in vertebrate genomes. Journal of Molecular Biology 192(2), 261-282.
5) Irizarry, R.A., Ladd-Acosta, C., Wen, B., Wu, Z., Montano, C., Onyango, P., Cui, H., Gabo, K., Rongione, M., Webster, M., Ji, H., Potash, J., Sabunciyan, S., and Feinberg, A.P. (2009). Genome-wide methylation analysis of human colon cancer reveals similar hypo- and hypermethylation at conserved tissue-specific CpG island shores. Nat. Genet. 41(2), 178-186.
6) Clancy, S. (2008). RNA Splicing: Introns, Exons and Spliceosome. Nature Education 1(1), 31. Available from: http://www.nature.com/scitable/topicpage/rna-splicing-introns-exons-and-spliceosome-12375
7) Babendure, J.R., Babendure, J.L., Ding, J.H., and Tsien, R.Y. (2006). Control of mammalian translation by mRNA structure near caps. RNA 12(5), 851-861.
8) Kozak, M. (1991). Structural Features in Eukaryotic mRNAs That Modulate the Initiation of Translation. The Journal of Biological Chemistry 266(30), 19867-19870.
9) Vilela, C. and McCarthy, J.E. (2003). Regulation of fungal gene expression via short open reading frames in the mRNA 5'untranslated region. Mol. Microbiol. 49(9), 859-867.
10) Proudfoot, N.J., Furger, A. and Dye, M.J. (2002). Integrating mRNA Processing with Transcription. Cell 108(4), 501-512.
11) Guhaniyogi, J. and Brewer, G. (2001). Regulation of mRNA stability in mammalian cells. Gene 265(1-2), 11-23.
12) Black, D.L. (2003). Mechanisms of alternative pre-messenger RNA splicing. Annu. Rev. Biochem. 72, 291-336.
13) Lodish, H., Berk, A., Zipursky, S.L., et al. (2000). Section 11.1 Transcription Termination. Molecular Cell Biology. 4th edition. (New York: W. H. Freeman). Available from: http://www.ncbi.nlm.nih.gov/books/NBK21601/
14) Connelly, S. and Manley, J.L. (1988). A functional mRNA polyadenylation signal is required for transcription termination by RNA polymerase II. Genes and Development 2, 440-452.
15) Clancy, S. (2008). DNA transcription. Nature Education 1(1), 41. Available from: http://www.nature.com/scitable/topicpage/dna-transcription-426#
16) Cooper, G.M. (2000). Eukaryotic RNA Polymerases and General Transcription Factors. The Cell: A Molecular Approach. 2nd edition. (Sunderland MA: Sinauer Associates). Available from: http://www.ncbi.nlm.nih.gov/books/NBK9935/