Gene to protein

FROM GENE TO PROTEIN

The information content of DNA is in the form of specific sequences of nucleotides.

DNA dictates the synthesis of proteins, which are the links between genotype and phenotype.

The symptoms of an inherited disease reflect a person's inability to synthesize a particular enzyme.

The one gene - one enzyme hypothesis came first, but not all proteins are enzymes, and yet their synthesis depends on specific genes.
It was revised to the one gene - one protein hypothesis, but many proteins are composed of several polypeptides, each of which has its own gene.

Therefore, the hypothesis has been restated as the one gene - one polypeptide hypothesis.

Transcription and translation are the two main processes linking gene to protein
The bridge between DNA and protein synthesis is RNA.

RNA is chemically similar to DNA, except that it contains ribose as its sugar and substitutes the nitrogenous base uracil for thymine.

An RNA molecule almost always consists of a single strand.

The specific sequence of hundreds or thousands of nucleotides in each gene carries the information for the primary structure of a protein, the linear order of the 20 possible amino acids.

To get from DNA, written in one chemical language, to protein, written in another, requires two major stages, transcription and translation.

During transcription, a DNA strand provides a template for the synthesis of a complementary RNA strand. Fig. 17.2

This process is used to synthesize any type of RNA from a DNA template.

Transcription of a gene produces a messenger RNA (mRNA) molecule.

During translation, the information contained in the order of nucleotides in mRNA is used to determine the amino acid sequence of a polypeptide.

Translation occurs at ribosomes.

The basic mechanics of transcription and translation are similar in eukaryotes and prokaryotes.

Because bacteria lack nuclei, transcription and translation are coupled.

In a eukaryotic cell, almost all transcription occurs in the nucleus and translation occurs mainly at ribosomes in the cytoplasm.

In addition, before the primary transcript can leave the nucleus, it is modified in various ways during RNA processing; only the finished mRNA is exported to the cytoplasm.

DNA -> RNA -> protein.

Nucleotide triplets specify amino acids

In the triplet code, three consecutive bases specify an amino acid, creating 4^3 (64) possible code words. Fig. 17.3.
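
The arithmetic behind the triplet code can be checked with a short sketch. It assumes nothing beyond the four RNA bases; the variable names are illustrative only. One- and two-base words give only 4 and 16 combinations, too few for 20 amino acids, while three-base words give 64.

```python
from itertools import product

BASES = "UCAG"  # the four RNA bases

# Number of possible code words for word lengths 1, 2, and 3.
for length in (1, 2, 3):
    print(length, 4 ** length)   # 1 4, then 2 16, then 3 64

# Every ordered triplet of bases is a possible code word (codon).
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))   # 64
print(codons[:4])    # ['UUU', 'UUC', 'UUA', 'UUG']
```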

During transcription, one DNA strand, the template strand, provides a template for ordering the sequence of nucleotides in an RNA transcript.

Uracil is the complementary base to adenine.
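
The base-pairing rule for transcription can be illustrated with a minimal sketch. The template sequence and the function name are made up for illustration, and the template is assumed to be written 3'->5' so that the RNA comes out 5'->3'.

```python
# DNA template base -> complementary RNA base (U pairs with A in place of T).
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') built along a template strand given 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5)

template = "TACACCGAAATT"      # hypothetical template strand, written 3'->5'
print(transcribe(template))    # AUGUGGCUUUAA
```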

During translation, blocks of three nucleotides, codons, are decoded into a sequence of amino acids. The codons are read in the 5'->3' direction along the mRNA.

Each codon specifies which one of the 20 amino acids will be incorporated at the corresponding position along a polypeptide.

Nirenberg determined the first match: UUU coded for the amino acid phenylalanine.

He created an artificial mRNA molecule entirely of uracil and added it to a test tube mixture of amino acids, ribosomes, and other components for protein synthesis.

This "poly(U)" translated into a polypeptide containing a single amino acid, phenylalanine, in a long chain.

By the mid-1960s the entire code was deciphered. Fig 17.4.

61 of 64 triplets code for amino acids.

The codon AUG not only codes for the amino acid methionine but also indicates the start of translation.

Three codons do not indicate amino acids but signal the termination of translation.
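
These counts can be reproduced with a small lookup table. The sketch below assumes the standard genetic code written as a 64-letter string of one-letter amino acid abbreviations in U, C, A, G order, with "*" marking a stop codon; the names BASES, AMINO_ACIDS, and CODON_TABLE are illustrative.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for the 64 codons in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

print(len(sense_codons))     # 61
print(stop_codons)           # ['UAA', 'UAG', 'UGA']
print(CODON_TABLE["AUG"])    # M (methionine, also the start signal)
print(CODON_TABLE["UUU"])    # F (phenylalanine)
```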

To extract the message from the genetic code requires specifying the correct starting point.

This establishes the reading frame and subsequent codons are read in groups of three nucleotides.
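
How the starting point fixes the reading frame can be sketched as follows. This is a simplification: it assumes translation begins at the first AUG in the message and stops at the first in-frame stop codon. The message and the function name are hypothetical.

```python
from itertools import product

# Standard genetic code in UCAG order; "*" marks a stop codon.
CODON_TABLE = dict(zip(
    ("".join(t) for t in product("UCAG", repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))

def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                          # no start codon, nothing is read
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":              # stop codon ends translation
            break
        peptide.append(amino_acid)
    return "".join(peptide)

mrna = "GGAUGUGGCUUUAAGC"    # hypothetical message with a short 5' leader
print(translate(mrna))       # MWL
```

The two leader bases are skipped because the frame is set by the AUG, not by the first base of the message.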

The genetic code must have evolved very early in the history of life

The genetic code is nearly universal, shared by organisms from the simplest bacteria to the most complex plants and animals.

In laboratory experiments, genes can be transcribed and translated after they are transplanted from one species to another.

This has permitted bacteria to be programmed to synthesize certain human proteins after insertion of the appropriate human genes.

Transcription is the DNA-directed synthesis of RNA. Fig 17.6a

Messenger RNA is transcribed from the template strand of a gene.

RNA polymerase separates the DNA strands and bonds the RNA nucleotides to the 3' end of the growing polymer as they base-pair along the DNA template.

Genes are read 3'->5', creating a 5'->3' RNA molecule.

Specific sequences of nucleotides along the DNA mark where gene transcription begins and ends.

RNA polymerase attaches and initiates transcription at the promoter, "upstream" of the information contained in the gene, the transcription unit. Fig 17.7

The terminator signals the end of transcription.

Bacteria have a single type of RNA polymerase that synthesizes all RNA molecules.

Eukaryotes have three RNA polymerases (I, II, and III) in their nuclei.

RNA polymerase II is used for mRNA synthesis.

Transcription can be separated into three stages: initiation, elongation, and termination.

Initiation - The presence of a promoter sequence determines which strand of the DNA helix is the template.

Within the promoter is the starting point for the transcription of a gene.

The promoter also includes a binding site for RNA polymerase upstream of the start point.

In eukaryotes, proteins called transcription factors recognize the promoter region, especially a TATA box, and bind to the promoter.

After the transcription factors have bound to the promoter, RNA polymerase binds to them to create a transcription initiation complex.

Elongation - RNA polymerase then starts transcription. Fig 17.6b

As RNA polymerase moves along the DNA, it untwists the double helix, and adds nucleotides to the 3' end of the growing strand.

Behind the point of RNA synthesis, the double helix re-forms and the RNA molecule peels away.

A single gene can be transcribed simultaneously by several RNA polymerases. This helps the cell make the encoded protein in large amounts.

Termination - Transcription proceeds until after the RNA polymerase transcribes a terminator sequence in the DNA.

Eukaryotic cells modify RNA after transcription

At the 5' end of the pre-mRNA molecule, a modified form of guanine is added, the 5' cap, which helps protect mRNA from hydrolytic enzymes. Fig 17.8.

At the 3' end, an enzyme adds the poly(A) tail.

It inhibits hydrolysis, and enables ribosome attachment and the export of mRNA from the nucleus.

RNA splicing. Fig 17.9

Most eukaryotic genes and their RNA transcripts have long noncoding stretches of nucleotides.

Noncoding segments, introns, lie between coding regions, exons, which are translated into amino acid sequences, plus the leader and trailer sequences.

RNA splicing removes introns and joins exons to create a mRNA molecule with a continuous coding sequence.

This splicing is accomplished by a spliceosome. Fig 17.10.

Spliceosomes consist of a variety of proteins and several small nuclear ribonucleoproteins (snRNPs).

Each snRNP has several protein molecules and a small nuclear RNA molecule (snRNA).

In this process, the snRNA acts as a ribozyme, an RNA molecule that functions as an enzyme.
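
The net effect of splicing (introns removed, exons joined) can be sketched in a few lines. The transcript, the exon coordinates, and the function name are hypothetical. The lower-case letters only mark the intron for readability, and the sketch says nothing about how a spliceosome finds the boundaries.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments (0-based, end-exclusive) in order; drop the rest."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical primary transcript: exon, intron (lower case), exon.
pre_mrna = "AUGUGG" + "guaaguuuucag" + "CUUUAA"
exons = [(0, 6), (18, 24)]

print(splice(pre_mrna, exons))   # AUGUGGCUUUAA
```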

RNA splicing appears to have several functions.

           1. Some introns contain sequences that control gene activity in some way.

           2. Splicing may regulate the passage of mRNA from the nucleus to the cytoplasm.

           3. Splicing enables one gene to encode more than one polypeptide.

Alternative RNA splicing gives rise to two or more different polypeptides, depending on which segments are treated as exons. Fig 19.11.
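
A toy illustration of alternative splicing, assuming a hypothetical gene with four made-up exon sequences: treating a different segment as an intron yields a different mature mRNA from the same gene.

```python
# Hypothetical exon sequences for one gene, in genomic order.
EXONS = {"E1": "AUGGCU", "E2": "GAAUUC", "E3": "AAAGGG", "E4": "UGGUAA"}

def mature_mrna(included: list[str]) -> str:
    """Join the chosen exons, always in their genomic order."""
    return "".join(seq for name, seq in EXONS.items() if name in included)

isoform_a = mature_mrna(["E1", "E2", "E4"])   # E3 treated as an intron
isoform_b = mature_mrna(["E1", "E3", "E4"])   # E2 treated as an intron

print(isoform_a)   # AUGGCUGAAUUCUGGUAA
print(isoform_b)   # AUGGCUAAAGGGUGGUAA
```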

Proteins often have a modular architecture with discrete structural and functional regions called domains. Fig 17.11.

In many cases, different exons code for different domains of a protein.

Introns increase the opportunity for recombination between two alleles of a gene.

Exon shuffling could lead to new proteins through novel combinations of functions.

Translation is the RNA-directed synthesis of a polypeptide. Fig 17.12.

Transfer RNA (tRNA) (2-dimensional image Fig 17.13a; 3-dimensional and symbol Fig 17.13b) transfers amino acids from the cytoplasm's pool to a ribosome.

The ribosome adds each amino acid carried by tRNA to the growing end of the polypeptide chain.

During translation, each type of tRNA links a mRNA codon with the appropriate amino acid.

Each tRNA arriving at the ribosome carries a specific amino acid at one end and has a specific nucleotide triplet, an anticodon, at the other.

Codon by codon, tRNAs deposit amino acids in the prescribed order and the ribosome joins them into a polypeptide chain.

tRNA molecules are transcribed from DNA templates in the nucleus. Each tRNA is used repeatedly.

           1. To pick up its designated amino acid in the cytosol.

           2. To deposit the amino acid at the ribosome.

           3. To return to the cytosol to pick up another copy of that amino acid.

The anticodons of some tRNAs recognize more than one codon. The rules for base pairing between the third base of the codon and anticodon are relaxed (called wobble).

At the wobble position, U on the anticodon can bind with A or G in the third position of a codon.
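
The wobble rule stated above can be expressed as a small check. This sketch encodes only the one relaxed pairing named in the text (anticodon U with codon A or G); other wobble pairings exist and are left out. Codon and anticodon are both assumed to be written 5'->3', and the example anticodon is hypothetical.

```python
# Standard pairs, written as (codon base, anticodon base).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
# The only relaxed pair modeled here: codon G with anticodon U.
WOBBLE_EXTRA = {("G", "U")}

def anticodon_reads(codon: str, anticodon: str) -> bool:
    """Return True if the anticodon can pair with the codon (both 5'->3')."""
    partner = anticodon[::-1]   # antiparallel: partner[i] faces codon[i]
    first_two = all(
        (c, a) in WATSON_CRICK for c, a in zip(codon[:2], partner[:2])
    )
    third = (codon[2], partner[2])
    return first_two and (third in WATSON_CRICK or third in WOBBLE_EXTRA)

print(anticodon_reads("GAA", "UUC"))   # True  (standard pairing)
print(anticodon_reads("GAG", "UUC"))   # True  (wobble: U with G)
print(anticodon_reads("GAC", "UUC"))   # False
```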

Each amino acid is joined to the correct tRNA by aminoacyl-tRNA synthetase. Fig 17.14. The 20 different synthetases match the 20 different amino acids.

The synthetase catalyzes a covalent bond between them, forming aminoacyl-tRNA or activated amino acid.

Ribosomes facilitate the specific coupling of the tRNA anticodons with mRNA codons.

Each ribosome has a large and a small subunit. Fig 17.15

These are composed of proteins and ribosomal RNA (rRNA), the most abundant RNA in the cell.

Each ribosome has a binding site for mRNA and three binding sites for tRNA molecules.

The P site holds the tRNA carrying the growing polypeptide chain.

The A site carries the tRNA with the next amino acid.

Discharged tRNAs leave the ribosome at the E site.

Ribosomal RNA, rather than protein, is the catalyst for peptide bond formation.

Translation can be divided into three stages: Initiation, Elongation, and Termination

Initiation brings together mRNA, a tRNA with the first amino acid, and the two ribosomal subunits. Fig 17.17.

First, a small ribosomal subunit binds with mRNA and a special initiator tRNA, which carries methionine and attaches to the start (initiator) codon, AUG.

Initiation factors bring in the large subunit such that the initiator tRNA occupies the P site.

Elongation consists of a series of three-step cycles as each amino acid is added to the preceding one. Fig 17.18.

Codon recognition

Peptide bond formation

Translocation

Termination occurs when one of the three stop codons reaches the A site. Fig 17.19.

Typically a single mRNA is used to make many copies of a polypeptide simultaneously.

Multiple ribosomes, polyribosomes, may trail along the same mRNA. Fig 17.20.

During and after synthesis, a polypeptide coils and folds to its three-dimensional shape spontaneously.

In addition, proteins may require posttranslational modifications.

These may include the addition of sugars, lipids, or phosphate groups to amino acids.

Enzymes may remove some amino acids or cleave whole polypeptide chains.

Two or more polypeptides may join to form a protein.

Ribosomes, free and bound.

Free ribosomes are suspended in the cytosol and synthesize proteins that reside in the cytosol.

Bound ribosomes are attached to the cytosolic side of the endoplasmic reticulum. Fig. 17.21.

They synthesize proteins of the endomembrane system as well as proteins secreted from the cell.

Translation in all ribosomes begins in the cytosol, but a polypeptide destined for the endomembrane system or for export has a specific signal peptide region at or near the leading end.

A signal recognition particle (SRP) binds to the signal peptide and attaches it and its ribosome to a receptor protein in the ER membrane.

After binding, the SRP leaves and protein synthesis resumes with the growing polypeptide snaking across the membrane into the cisternal space via a protein pore.

Other kinds of signal peptides are used to target polypeptides to mitochondria, chloroplasts, the nucleus, and other organelles that are not part of the endomembrane system.

In these cases, translation is completed in the cytosol before the polypeptide is imported into the organelle.

RNA plays multiple roles in the cell: a review Table 17.1.

Comparing protein synthesis in prokaryotes (Fig 17.22) and eukaryotes (Fig 17.25)

One big difference is that prokaryotes can transcribe and translate the same gene simultaneously.

The new protein quickly diffuses to its operating site.

In eukaryotes, the nuclear envelope segregates transcription from translation.

In addition, extensive RNA processing takes place between these two processes.

Point mutations can affect protein structure and function

Mutations are changes in the genetic material of a cell (or virus).

These include large-scale mutations in which long segments of DNA are affected (for example, translocations, duplications, and inversions).

A chemical change in just one base pair of a gene causes a point mutation.

If these occur in gametes or cells producing gametes, they may be transmitted to future generations.

For example, sickle-cell disease is caused by a mutation of a single base pair in the gene that codes for one of the polypeptides of hemoglobin. Fig 17.23

A point mutation that results in the replacement of a pair of complementary nucleotides with another nucleotide pair is called a base-pair substitution. Fig 17.24a.

Some base-pair substitutions have little or no impact on protein function.

Missense mutations are those that still code for an amino acid but change the indicated amino acid.

Nonsense mutations change an amino acid codon into a stop codon, nearly always leading to a nonfunctional protein.
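
The three outcomes of a base-pair substitution can be told apart by comparing what the codon specifies before and after the change. The sketch below works at the mRNA level and assumes the original codon codes for an amino acid. The function name and the label "silent" (for a substitution that leaves the amino acid unchanged) are choices made here. The GAG to GUG example is the same kind of change as the sickle-cell substitution, glutamate to valine.

```python
from itertools import product

# Standard genetic code in UCAG order; "*" marks a stop codon.
CODON_TABLE = dict(zip(
    ("".join(t) for t in product("UCAG", repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))

def classify_substitution(original: str, mutant: str) -> str:
    """Classify a single-codon change; assumes `original` is a sense codon."""
    before, after = CODON_TABLE[original], CODON_TABLE[mutant]
    if before == after:
        return "silent"      # same amino acid, no change in the protein
    if after == "*":
        return "nonsense"    # amino acid codon became a stop codon
    return "missense"        # a different amino acid is specified

print(classify_substitution("GAG", "GAA"))   # silent   (E -> E)
print(classify_substitution("GAG", "GUG"))   # missense (E -> V)
print(classify_substitution("GAG", "UAG"))   # nonsense (E -> stop)
```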

Insertions and deletions are additions or losses of nucleotide pairs in a gene. Fig 17.24b.

These have a disastrous effect on the resulting protein more often than substitutions do.

Unless these mutations occur in multiples of three, they cause a frameshift mutation.

All the nucleotides downstream of the deletion or insertion will be improperly grouped into codons.
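
The regrouping of downstream codons can be seen by splitting a message into triplets before and after an insertion. The message and the inserted bases are hypothetical. A one-base insertion shifts every later codon, while a three-base insertion adds one codon and leaves the rest of the frame intact.

```python
def codons(mrna: str) -> list[str]:
    """Split a message into complete triplets, starting at its first base."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]

normal = "AUGUGGCUUUAA"
one_inserted = normal[:4] + "C" + normal[4:]       # single-base insertion
three_inserted = normal[:3] + "GCU" + normal[3:]   # insertion of one whole codon

print(codons(normal))           # ['AUG', 'UGG', 'CUU', 'UAA']
print(codons(one_inserted))     # ['AUG', 'UCG', 'GCU', 'UUA']
print(codons(three_inserted))   # ['AUG', 'GCU', 'UGG', 'CUU', 'UAA']
```

In the single-base case the original stop codon UAA is no longer read in frame, so the altered message would not terminate where the normal one does.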

Mutations can occur during DNA replication, DNA repair, or DNA recombination.

These are called spontaneous mutations.

Mutagens are chemical or physical agents that interact with DNA to cause mutations.

Physical agents include high-energy radiation like X-rays and ultraviolet light.

Mutagens are closely tied to cancer: most carcinogens are mutagenic and most mutagens are carcinogenic.

What is a gene?

The Mendelian concept of a gene views it as a discrete unit of inheritance that affects phenotype.

A gene is a specific nucleotide sequence along a region of a DNA molecule.

A gene is a region of DNA whose final product is either a polypeptide or an RNA molecule.