What are you going to learn?

  • What are transcription and translation?
  • How does transcription work?
  • What are post-transcriptional modifications?
  • What are introns and exons?
  • What is alternative splicing?
  • terms: transcription bubble, transcription factors, promoter, TATA box, transcription initiation complex, terminator, capping, polyadenylation, splicing, pre-mRNA, snRNA, snRNP, spliceosome

Transcription is the first step of gene expression, the process in which the information in a gene is used to synthesize a product, usually a protein. Transcription is the synthesis of RNA from a DNA template or, in other words, the transcribing (rewriting) of the DNA code into RNA. The second step is translation, in which a protein is synthesized from the RNA.

💡
transcription = DNA → RNA
💡
translation = RNA → protein

Transcription begins when RNA polymerase unwinds the two DNA strands. The unwound region is called a transcription bubble. One strand then acts as a template for the synthesis of the RNA. RNA polymerase synthesizes the RNA by adding new ribonucleotides according to the basic base-pairing rules, with one exception: RNA doesn't contain thymine (T) but a different base called uracil (U), which also pairs with adenine (A - U). Like DNA polymerase, RNA polymerase can add new nucleotides in the 5'-to-3' direction only. For that reason, it always reads the template strand in the 3'-to-5' direction. Unlike DNA polymerase, RNA polymerase can start an RNA chain without a primer.

💡
RNA base-pairing = adenine + uracil (A - U), cytosine + guanine (C - G)
💡
RNA polymerase unwinds the two DNA strands and synthesizes the RNA strand in the 5'-to-3' direction. It does not need a primer.
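
The base-pairing rules above can be sketched in a few lines of Python. This is only a toy model: the function name `transcribe` and the example sequence are made up for illustration, and the template is assumed to be written in the 3'-to-5' direction so that the resulting RNA reads 5'-to-3'.

```python
# Pairing rules for building RNA on a DNA template: A pairs with U (not T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') for a DNA template written 3'->5'."""
    rna = []
    for base in template_3_to_5.upper():
        if base not in DNA_TO_RNA:
            raise ValueError(f"not a DNA base: {base!r}")
        rna.append(DNA_TO_RNA[base])
    return "".join(rna)


# Made-up template strand, written 3'->5'.
print(transcribe("TACGGATTC"))  # AUGCCUAAG
```

Note that no starting fragment (primer) is passed in, which mirrors the fact that RNA polymerase can begin a chain on its own.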

However, RNA polymerase needs accessory proteins called transcription factors in order to start transcribing. These transcription factors bind to a region preceding the gene called the promoter. The promoter often contains a sequence of thymine and adenine nucleotides, the so-called TATA box. The transcription factors together with RNA polymerase form a transcription initiation complex, and then transcription can start.

💡
TATA box = a sequence of thymine and adenine in the promoter
💡
transcription initiation complex = transcription factors + RNA polymerase; initiates transcription
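
As a rough illustration of what "a sequence of thymine and adenine in the promoter" means, the sketch below scans a promoter for a TATA-like motif. The pattern `TATA[AT]A[AT]` is a simplified consensus used here as an assumption, and the function name and example sequence are made up; real promoter recognition by transcription factors is far more involved.

```python
import re

# Simplified assumption: a TATA box looks like TATA, then A or T, then A, then A or T.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")


def find_tata_box(promoter: str):
    """Return the 0-based start index of the first TATA-like motif, or None."""
    match = TATA_BOX.search(promoter.upper())
    return match.start() if match else None


promoter = "GGCCGCTATAAAAGGCGCGC"  # made-up sequence, not from a real gene
print(find_tata_box(promoter))  # 6
print(find_tata_box("GGCCGCGGCC"))  # None
```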

RNA polymerase synthesizes the RNA until it reaches a certain sequence called a terminator, where it stops and releases the new RNA. Because only one DNA strand is transcribed, the RNA is single-stranded.

💡
terminator = a region that terminates the transcription
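
The idea of "transcribe until the terminator" can be sketched as follows. Everything specific here is an assumption made for illustration: the motif `GGGCCC` is a hypothetical placeholder rather than a real terminator, and the model simply stops where the motif begins, which is a simplification of how termination actually works.

```python
# Hypothetical placeholder motif; real terminators are more complex than a fixed string.
TERMINATOR = "GGGCCC"
PAIR = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_until_terminator(template: str, terminator: str = TERMINATOR) -> str:
    """Transcribe a template (written 3'->5') and stop where the terminator begins."""
    stop = template.find(terminator)
    if stop == -1:
        stop = len(template)  # no terminator found: transcribe to the end
    return "".join(PAIR[base] for base in template[:stop])


# Made-up template: six bases of "gene", then the placeholder terminator.
print(transcribe_until_terminator("TACGGAGGGCCCTTT"))  # AUGCCU
```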

Usually, genes specify the amino acid sequences of proteins. Simply put, these genes determine which amino acids are going to be present in a certain protein. The RNA molecules we get from transcribing these genes are called messenger RNAs (mRNAs). These genes are transcribed by a special type of RNA polymerase, RNA polymerase II.

However, an RNA molecule can also be the final product of a gene. Such RNAs are called nonmessenger RNAs, for example, rRNA, tRNA, etc. These RNAs are transcribed by RNA polymerase I and RNA polymerase III.

💡
RNA polymerase II = transcribes genes that specify the amino acid sequences of proteins; its product is an mRNA
💡
RNA polymerase I & III = transcribe genes that code for nonmessenger RNAs (rRNA, tRNA, etc.)

In eukaryotes, transcription takes place in the nucleus, where the DNA is stored. Translation, the process during which a protein is synthesized, takes place on ribosomes in the cytoplasm, so the RNA must be transported there. Before it leaves the nucleus, the RNA must undergo several processing steps: 1) capping, 2) polyadenylation, 3) splicing. This is called post-transcriptional processing. Until these steps are finished, the RNA is referred to as precursor mRNA (pre-mRNA).

💡
post-transcriptional processing = 1) capping, 2) polyadenylation, 3) splicing

RNA capping is a process in which a guanine nucleotide bearing a methyl group is attached to the 5' end of the RNA.

Polyadenylation is a process in which a series of adenine nucleotides is added to the 3' end of the RNA molecule. As multiple adenine nucleotides are added, the process is called polyadenylation (poly = many).

RNA capping and polyadenylation are important as they protect the RNA molecule from degradation, facilitate its transport from the nucleus to the cytoplasm and facilitate translation.
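
Capping and polyadenylation can be pictured as adding something to each end of the RNA. The sketch below models the RNA as a plain string; the label `m7G-` for the methylated guanine cap, the function name, and the short default tail length are illustrative assumptions chosen for readability, not biological values.

```python
def cap_and_polyadenylate(pre_mrna: str, tail_length: int = 20) -> str:
    """Return the RNA with a 5' cap label in front and a poly(A) tail at the 3' end."""
    if tail_length < 0:
        raise ValueError("tail_length must not be negative")
    cap = "m7G-"              # stands for the methylated guanine nucleotide at the 5' end
    tail = "A" * tail_length  # series of adenine nucleotides at the 3' end
    return cap + pre_mrna + tail


print(cap_and_polyadenylate("AUGCCUAAG", tail_length=5))  # m7G-AUGCCUAAGAAAAA
```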

Splicing is necessary because of the special structure of eukaryotic protein-coding genes. These genes contain two kinds of sequences, called introns and exons. Introns are noncoding sequences, in other words, sequences that do not contain instructions on how to make a protein. Exons, on the other hand, are coding sequences that contain these instructions. As introns do not code for proteins, they need to be removed. This is done during RNA splicing: introns are cut out of the RNA molecule and the exons are joined together. After this process, the RNA is ready to leave the nucleus and is referred to as messenger RNA (mRNA).

💡
introns = noncoding sequences removed during splicing
💡
exons = coding sequences
💡
pre-mRNA → post-transcriptional processing → mRNA
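
Removing introns and joining exons can be sketched as simple string slicing. This is a minimal illustration under stated assumptions: the exon positions are given by hand as 0-based, half-open (start, end) pairs, and the sequences are made up. In the cell, the boundaries are recognized by the splicing machinery, not supplied as coordinates.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments of a pre-mRNA; everything between them (introns) is dropped."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Made-up pre-mRNA: exon, intron, exon, intron, exon.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUGGA" + "GUACCCAG" + "UAA"
exons = [(0, 6), (18, 24), (32, 35)]

print(splice(pre_mrna, exons))  # AUGGCCUUUGGAUAA
```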

RNA splicing is carried out by special RNA molecules called small nuclear RNAs (snRNAs). Together with proteins, snRNAs form small nuclear ribonucleoproteins (snRNPs, "snurps"), which assemble into a large structure called the spliceosome that is responsible for the process.

💡
spliceosome = a complex of small nuclear ribonucleoproteins (snRNPs = snRNAs + proteins); carries out RNA splicing

Interestingly, the transcript of a single eukaryotic gene can often be spliced in several different ways, yielding different mRNAs and therefore different proteins. This is called alternative splicing.
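
One way to picture alternative splicing is to list the mRNAs obtained when some exons are left out. The sketch below is a toy model under loud assumptions: it always keeps the first and last exon, allows any of the middle exons to be skipped, and uses made-up exon sequences. Exon skipping is only one form of alternative splicing, and real genes do not produce every combination.

```python
from itertools import combinations


def alternative_transcripts(exons):
    """Yield mRNAs that keep the first and last exon plus any in-order subset of the middle exons."""
    first, *middle, last = exons  # requires at least two exons
    for count in range(len(middle) + 1):
        for kept in combinations(middle, count):
            yield "".join((first, *kept, last))


# Made-up exons for illustration.
variants = list(alternative_transcripts(["AUGGCC", "UUUGGA", "CCCAAA", "UAA"]))
for mrna in variants:
    print(mrna)
```

With two middle exons, this toy model yields four different transcripts from one gene.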
