THE ORGANIZATION AND CONTROL OF EUKARYOTIC GENOMES
Gene expression in eukaryotes
differs from the same process in prokaryotes in two main ways.
- The typical multicellular eukaryotic
genome is much larger than that of a bacterium.
- Cell specialization limits the expression of many genes to specific
cells.
The human genome, with an estimated 35,000 genes, includes an enormous
amount of DNA that does not program the synthesis of RNA or protein.
Chromatin structure
- Eukaryotic DNA is precisely
combined with large amounts of protein.
During interphase, chromatin fibers are highly extended.
If extended, each DNA molecule would be about 6 cm long.
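The 6 cm figure can be checked with a quick calculation. The sketch below assumes the standard B-DNA rise of about 0.34 nm per base pair and an illustrative chromosome size of 1.8 x 10^8 base pairs; neither number appears in these notes, and the function name is made up for the example.

```python
# Rough check of the "about 6 cm" figure for one fully extended DNA molecule.
# Assumptions (not from the notes): B-DNA rises ~0.34 nm per base pair,
# and a hypothetical chromosome holds 1.8e8 base pairs.
NM_PER_BASE_PAIR = 0.34
NM_TO_CM = 1e-7  # 1 nm = 1e-7 cm


def extended_length_cm(base_pairs: float) -> float:
    """Return the length in centimeters of fully extended DNA."""
    return base_pairs * NM_PER_BASE_PAIR * NM_TO_CM


if __name__ == "__main__":
    print(f"{extended_length_cm(1.8e8):.2f} cm")
```

Under these assumptions the result lands close to 6 cm, thousands of times longer than the diameter of a typical nucleus, which is why the packing levels described next are needed.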
DNA packing
Fig 19.1
- First level - Histone proteins
Their positively charged amino acids bind tightly to negatively
charged DNA.
The five types of histones are very similar from one eukaryote
to another, and similar proteins are found even in bacteria.
Unfolded chromatin has the appearance of beads on a string. Each
bead is a nucleosome, in which DNA winds around a core of histone
proteins.
The beaded string seems to remain essentially intact throughout
the cell cycle.
Histones leave the DNA only transiently during DNA replication.
They stay with the DNA during transcription.
By changing shape and position, nucleosomes allow RNA-synthesizing
polymerases to move along the DNA.
- Level two - As chromosomes enter mitosis, the beaded string
coils to form the 30-nm chromatin fiber.
- Level three - This fiber forms looped domains
attached to a scaffold of nonhistone proteins.
- Level four - The looped domains coil and fold
to produce the characteristic metaphase chromosome.
- Interphase chromatin is generally
much less condensed than the chromatin of mitosis, but the 30-nm
fibers and looped domains remain intact.
The chromatin of each chromosome occupies a restricted area within
the interphase nucleus.
Interphase chromosomes have areas that remain highly condensed,
heterochromatin, and less compacted areas, euchromatin.
Genome Organization at the DNA Level
- In eukaryotes, most of the
DNA (about 97% in humans) does not code for protein or
RNA.
1. Some noncoding regions are regulatory
sequences.
2. Others are introns.
3. The rest is largely repetitive DNA, present
in many copies in the genome.
- In mammals about 10-15% of
the genome is tandemly repetitive DNA, or satellite
DNA.
These differ in density from other
regions, so they form a separate band after differential ultracentrifugation.
There are three types of satellite
DNA, differentiated by the total length of DNA at each site.
Table 19.1.
- Some genetic disorders are
caused by abnormally long stretches of tandemly repeated nucleotide
triplets within the affected gene.
Fragile X syndrome is caused
by hundreds to thousands of repeats of CGG in the fragile X gene.
Huntington's disease is caused
by repeats of CAG that are translated into a protein with
a long string of glutamines.
The severity of these diseases and
their age of onset are correlated with the number
of repeats.
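Because severity tracks the number of repeats, the quantity of interest is the longest uninterrupted run of a triplet. The sketch below counts such runs with a regular expression. The two sequences are invented for illustration and are not real alleles, and the function name is hypothetical.

```python
import re


def longest_triplet_run(sequence: str, triplet: str) -> int:
    """Return the largest number of back-to-back copies of `triplet`."""
    pattern = re.compile(f"(?:{re.escape(triplet)})+")
    runs = [
        len(match.group(0)) // len(triplet)
        for match in pattern.finditer(sequence.upper())
    ]
    return max(runs, default=0)


# Hypothetical sequences made up for this example only.
short_allele = "ATG" + "CAG" * 5 + "CCGTAA"
long_allele = "ATG" + "CAG" * 45 + "CCGTAA"

if __name__ == "__main__":
    for name, seq in [("short", short_allele), ("long", long_allele)]:
        # Each CAG codon is read as glutamine, so the run length is also
        # the length of the glutamine string in the protein.
        print(name, longest_triplet_run(seq, "CAG"))
```

The same function works for the CGG repeats of fragile X syndrome by passing `"CGG"` as the triplet.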
- About 25-40% of most mammalian
genomes consists of interspersed repetitive DNA.
These sequences appear at multiple sites in the
genome.
The copies are similar but usually not identical
to each other.
Gene families
- While most genes are present
as a single copy per haploid set of chromosomes, multigene
families exist as a collection of identical or very similar
genes.
These likely evolved from a single ancestral gene.
The members of multigene families may be clustered or dispersed
in the genome.
- Families of identical genes are usually clustered
tandemly. Fig 19.2.
They usually consist of the genes for RNA products or those for histone
proteins.
The three largest rRNA molecules are encoded in a single transcription
unit that is repeated tandemly hundreds to thousands of times.
This transcript is cleaved to yield three rRNA molecules that
combine with proteins and one other kind of rRNA to form ribosomal
subunits.
- Nonidentical genes
The classic examples are two related families
of globin genes, α (alpha) and β (beta), which encode the subunits of
hemoglobin and are located on different chromosomes. Fig 19.3.
The different versions of each globin subunit are expressed at
different times in development.
Within both families are sequences that are expressed during
the embryonic, fetal, and/or adult stage of development.
The embryonic and fetal hemoglobins have higher affinity for
oxygen than do adult forms, ensuring transfer of oxygen from
mother to developing fetus.
- The differences in genes arise
from mutations that accumulate in the gene copies over generations.
These mutations may even lead to enough changes to form pseudogenes,
DNA segments that have sequences similar to real genes but that
do not yield functional proteins.
Gene amplification, loss, or rearrangement
- The nucleotide sequence of
an organism's genome may be altered in a systematic way during
its lifetime.
These changes do not affect gametes.
Their effects are confined to particular
cells and tissues.
- In gene amplification, certain
genes are replicated as a way to increase expression of these
genes.
In amphibians, the genes for rRNA not only have a normal complement
of multiple copies but millions of additional copies are synthesized
in a developing ovum.
This assists the cell in producing enormous numbers of ribosomes
for protein synthesis after fertilization.
- In some insect cells, whole chromosomes
or parts of chromosomes are lost early in development.
- Rearrangement of the loci of genes in somatic cells
may have a powerful effect on gene expression.
Transposons are stretches of DNA that
can move from one location to another within the genome.
About 10% of the human genome consists of transposons.
If one "jumps" into a coding sequence of another gene,
it can prevent normal gene function.
If the transposon is inserted in a regulatory area, it may increase
or decrease transcription.
- Most transposons are retrotransposons
(Fig 19.5), in which the transcribed RNA includes the code for
an enzyme that catalyzes the insertion of the retrotransposon
and may include a gene for reverse transcriptase.
Reverse transcriptase uses the RNA molecule originally transcribed
from the retrotransposon as a template to synthesize a
double-stranded DNA copy.
This can populate the eukaryotic genome with multiple copies
of its sequence.
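The copy-and-paste character of retrotransposition can be sketched with strings. This is a deliberately simplified model: it tracks a single strand, ignores base pairing and the enzymes involved, and uses an invented genome and element. All function names are hypothetical.

```python
def transcribe(dna: str) -> str:
    """RNA copy of a DNA sequence, written in the same sense (T becomes U)."""
    return dna.replace("T", "U")


def reverse_transcribe(rna: str) -> str:
    """DNA copy of an RNA sequence, written in the same sense (U becomes T)."""
    return rna.replace("U", "T")


def retrotranspose(genome: str, element: str, insert_at: int) -> str:
    """Insert a new DNA copy of `element` at `insert_at`.

    The original copy stays where it was, so each round adds one copy.
    """
    if element not in genome:
        raise ValueError("element not found in genome")
    new_copy = reverse_transcribe(transcribe(element))
    return genome[:insert_at] + new_copy + genome[insert_at:]


# Toy genome and element, invented for illustration.
genome = "AAAA" + "GATTACA" + "CCCCCCCC"
after_one_jump = retrotranspose(genome, "GATTACA", insert_at=15)

if __name__ == "__main__":
    print(genome.count("GATTACA"), "copy before")
    print(after_one_jump.count("GATTACA"), "copies after")
```

Because the donor copy is never removed, repeated rounds increase the copy number, which is how one element can come to occupy many sites.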
- Major rearrangements of at
least one set of genes occur during immune system differentiation.
B lymphocytes produce immunoglobulins, or antibodies,
that specifically recognize and combat viruses, bacteria, and
other invaders. Fig 19.6.
Each differentiated cell produces one specific type of antibody
that attacks a specific invader.
Functional antibody genes are pieced together from physically
separated DNA regions.
Each immunoglobulin consists of four polypeptide chains, each with
a constant region and a variable region; the variable regions give
each antibody its unique function.
As a B lymphocyte differentiates, one of several hundred possible
variable segments is connected to the constant section by deleting
the intervening DNA.
The random combinations of different variable and constant regions
create an enormous variety of different polypeptides, which combine
with others to form complete antibody molecules.
As a result, the mature immune system can make millions of different
kinds of antibodies from millions of subpopulations of B lymphocytes.
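The joining step can be pictured as deleting everything between the chosen variable segment and the constant segment. The sketch below models a stretch of germline DNA as a list of labeled segments. The segment names and the tiny number of variable segments are assumptions made for illustration; real loci contain several hundred.

```python
def rearrange(segments: list[str], chosen_v: str, constant: str = "C") -> list[str]:
    """Join `chosen_v` to the constant segment by deleting the DNA between them."""
    v_index = segments.index(chosen_v)
    c_index = segments.index(constant)
    if v_index >= c_index:
        raise ValueError("variable segment must lie upstream of the constant segment")
    # Keep everything up to and including the chosen V, then resume at C.
    return segments[: v_index + 1] + segments[c_index:]


# Hypothetical germline layout: five variable segments, an intron, one constant segment.
germline = [f"V{i}" for i in range(1, 6)] + ["intron", "C"]

if __name__ == "__main__":
    for v in ("V2", "V5"):
        print(v, "->", rearrange(germline, v))
```

Each choice of variable segment yields a different rearranged gene, so a cell that deletes its DNA this way is committed to one polypeptide from then on.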
The Control of Gene Expression
- Each cell expresses only a
small fraction of its genes.
Genes are continually turned on and off in response to signals from
the cell's internal and external environments.
Gene expression must be controlled on a long-term basis during
cellular differentiation.
Highly specialized cells express only a tiny fraction of their
genes.
Problems with gene expression and control can lead to imbalance
and diseases, including cancers.
- The control of gene expression
can occur at any step in the pathway from gene to functional
protein. Fig 19.7.
These levels of control include chromatin packing, transcription,
RNA processing, translation, and various alterations
to the protein product.
Chromatin packing modifications
- Genes of densely condensed
heterochromatin are usually not expressed.
- Chemical modifications of
chromatin play a key
role in chromatin structure and transcription regulation.
- DNA methylation
Inactive DNA is generally highly methylated compared to DNA that
is actively transcribed.
For example, the inactivated mammalian X chromosome in females
is heavily methylated.
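After each round of DNA replication, methylation enzymes correctly
methylate the daughter strand, so methylation patterns are passed on
through cell divisions.
This accounts for genomic imprinting, in which methylation
turns off either the maternal or the paternal allele of certain genes.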
- Histone acetylation and deacetylation appear to play a
direct role in the regulation of gene transcription.
Acetylated histones grip DNA less tightly, providing easier access
for transcription proteins in this region.
Some of the enzymes responsible for acetylation or deacetylation
are associated with or are components of transcription factors
that bind to promoters.
- DNA methylation and histone
deacetylation may cooperate to repress transcription.
Initiation of transcription
is the most important and universally used control point in gene
expression.
- Control elements are noncoding DNA segments that regulate
transcription by binding transcription factors. Fig 19.8.
Eukaryotic RNA polymerase requires the help of transcription factors
before transcription can begin.
One transcription factor recognizes the TATA box.
- Distal control elements, enhancers,
may be thousands of nucleotides away from the promoter or even
downstream of the gene or within an intron. Fig 19.9.
Bending of DNA enables transcription factors, activators,
bound to enhancers to contact the protein initiation complex
at the promoter.
- Eukaryotic genes also have
repressor proteins that bind to DNA control elements called
silencers.
Repression may operate mostly at the level of chromatin modification.
- Each transcription factor generally has
a DNA-binding domain and a protein-binding
domain that recognizes other transcription factors.
- Genes coding for the enzymes
of a metabolic pathway may be scattered over different chromosomes.
Coordinate gene expression depends on the association of a specific
control element or collection of control elements with every
gene of a dispersed group.
A common group of transcription factors binds to these elements, promoting
simultaneous gene transcription.
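One way to picture coordinate control is as a lookup: every gene in the dispersed group carries the same control element, so any signal that activates the matching transcription factors reaches all of them at once. The gene names, chromosome numbers, and control-element labels below are hypothetical.

```python
# Hypothetical genes, chromosomes, and control-element labels.
genes = {
    "enzyme_A": {"chromosome": 2, "control_elements": {"CE1", "CE7"}},
    "enzyme_B": {"chromosome": 11, "control_elements": {"CE1"}},
    "enzyme_C": {"chromosome": 17, "control_elements": {"CE1", "CE3"}},
    "unrelated": {"chromosome": 5, "control_elements": {"CE4"}},
}


def genes_activated(gene_table: dict, bound_elements: set[str]) -> list[str]:
    """Return genes carrying at least one control element that is currently bound."""
    return sorted(
        name
        for name, info in gene_table.items()
        if info["control_elements"] & bound_elements
    )


if __name__ == "__main__":
    # Transcription factors that recognize CE1 switch on the whole pathway,
    # even though the three genes sit on different chromosomes.
    print(genes_activated(genes, {"CE1"}))
```

The chromosome field plays no part in the lookup, which mirrors the point in the notes: shared control elements, not physical position, are what tie the group together.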
Post-transcriptional mechanisms
Gene expression may be
blocked or stimulated by any post-transcriptional step.
- In alternative RNA splicing,
different mRNA molecules are produced from the same primary transcript,
depending on which RNA segments are treated as exons and which
as introns. Fig 19.11.
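The number of possible mRNAs grows quickly with the number of segments that can be either kept or skipped. The sketch below enumerates the products for a toy transcript. The segment names are invented, and the model assumes each optional segment is kept or skipped independently, which real splicing does not guarantee.

```python
from itertools import combinations


def spliced_mrnas(segments: list[str], optional: set[str]) -> list[str]:
    """List every mRNA obtained by keeping or skipping each optional segment.

    Segments not named in `optional` are always treated as exons.
    """
    optional_in_order = [s for s in segments if s in optional]
    mrnas = []
    for k in range(len(optional_in_order) + 1):
        for skipped in combinations(optional_in_order, k):
            mrnas.append("-".join(s for s in segments if s not in skipped))
    return mrnas


# Toy primary transcript with two segments that may be treated as introns.
transcript = ["E1", "E2", "E3", "E4"]
variants = spliced_mrnas(transcript, optional={"E2", "E3"})

if __name__ == "__main__":
    for mrna in variants:
        print(mrna)
```

With n optional segments this simple model gives 2^n products from one gene, so a single primary transcript can encode several related polypeptides.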
- Regulation of mRNA degradation.
Prokaryotic mRNA molecules may be degraded
after only a few minutes.
Eukaryotic mRNAs typically endure for
hours and can even last days or weeks.
For example, in red blood cells the mRNAs
for the hemoglobin polypeptides are unusually stable and are
translated repeatedly in these cells.
A common pathway of mRNA breakdown begins with enzymatic shortening
of the poly(A) tail.
This triggers the enzymatic removal of the 5' cap.
This is followed by rapid degradation of the mRNA by nucleases.
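The effect of mRNA lifetime on expression can be illustrated with a simple decay calculation. The sketch assumes first-order (exponential) decay, which is a common simplification and not something stated in these notes, and the half-lives used are illustrative values chosen to match "a few minutes" and "hours".

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of an mRNA population left after `elapsed_min` minutes.

    Assumes simple exponential decay with the given half-life.
    """
    return 0.5 ** (elapsed_min / half_life_min)


if __name__ == "__main__":
    # Illustrative half-lives: 3 minutes (short-lived) and 10 hours (stable).
    for label, half_life in [("short-lived", 3.0), ("stable", 600.0)]:
        left = fraction_remaining(60.0, half_life)
        print(f"{label}: {left:.6f} of the mRNA remains after one hour")
```

Under these assumptions almost none of the short-lived message survives an hour, while most of the stable message is still available for translation, so changing mRNA stability changes how much protein a transcript can yield.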
- Control of translation
- Translation of specific mRNAs
can be blocked by regulatory proteins that bind to specific sequences
or structures within the 5' leader region of mRNA.
This prevents the mRNA from attaching to ribosomes.
Protein factors required to initiate translation in eukaryotes
offer targets for simultaneously controlling translation of all
the mRNA in a cell.
This allows the cell to shut down translation if environmental
conditions are poor.
- Eukaryotic polypeptides
must often be processed to yield functional proteins.
Regulation may occur at any of these steps: cleavage, chemical modification,
and transport to the appropriate destination.
For example, cystic fibrosis results from mutations in
the gene for a chloride ion channel protein that prevent the protein
from reaching the plasma membrane.
The defective protein is rapidly degraded.
The cell limits the lifetimes of normal proteins by selective
degradation.
- Proteins intended for degradation
are marked by the attachment of ubiquitin proteins. Fig 19.12.
Giant proteasomes recognize the ubiquitin and degrade
the tagged protein.
The Molecular Biology of Cancer
- Cancer is a disease in which cells escape from the control
mechanisms that normally regulate cell growth and division.
The underlying genetic changes can result from random spontaneous mutations
or from environmental influences such as chemical carcinogens or physical
mutagens.
Cancer-causing genes, oncogenes, arise from proto-oncogenes,
normal genes that code for proteins that stimulate normal cell growth and
division and have essential functions in normal cells. Fig 19.13.
An oncogene arises from a genetic change that leads to an increase
in the amount of the proto-oncogene's protein or in the activity of each
protein molecule.
- These genetic changes include
movements of DNA within the genome, amplification of
proto-oncogenes, and point mutations in the gene.
- Malignant cells frequently have chromosomes that have
been broken and rejoined incorrectly.
This may translocate a fragment to a location near an
active promoter or other control element.
- Amplification increases the number of gene copies.
- A point mutation may
lead to translation of a protein that is more active or
longer-lived.
- Mutations to genes whose normal products inhibit
cell division, tumor-suppressor genes, also contribute
to cancer.
Some tumor-suppressor proteins normally repair damaged DNA.
Others control the adhesion of cells to each other or to an extracellular
matrix, crucial for normal tissues.
Still others are components of cell-signaling pathways that inhibit
the cell cycle.
Oncogene proteins and faulty
tumor-suppressor proteins interfere with normal signaling pathways.
Fig 19.14.
- Mutations in two key genes,
the ras proto-oncogene and the p53
tumor-suppressor gene, occur in about 30% and 50% of human cancers,
respectively.
The products of both genes are components of signal-transduction pathways
that convey external signals to the DNA.
- Ras, the product of the ras gene, is a G protein
that relays a growth signal, leading to the synthesis of a protein
that stimulates the cell cycle.
Many ras oncogenes have a point mutation that leads to a hyperactive
version of the Ras protein that can issue signals on its own,
resulting in excessive cell division.
- The tumor-suppressor protein
encoded by the normal p53 gene is a transcription factor
that promotes synthesis of growth-inhibiting proteins.
A mutation that knocks out the p53 gene can lead to excessive
cell growth and cancer.
- The p53 gene is often called
the "guardian angel of the genome".
Damage to the cell's DNA leads to expression of the p53 gene.
The p53 protein can:
- activate the p21 gene, which halts the
cell cycle.
- turn on genes involved in DNA repair.
- activate "suicide genes" whose
protein products cause cell death.
Multiple mutations underlie the development of cancer
If cancer results from an accumulation
of mutations, and if mutations occur throughout life, then the
longer we live, the more likely we are to develop cancer.
- Colorectal cancer (Fig 19.15), with 135,000 new cases in the U.S.
each year, illustrates
a multi-step cancer path.
The first sign is often a polyp, a small benign growth
in the colon lining with fast dividing cells.
Through gradual accumulation of mutations that activate oncogenes
and knock out tumor-suppressor genes, the polyp can develop into
a malignant tumor.
- About a half dozen DNA changes
must occur for a cell to become fully cancerous.
These usually include the appearance of at least one active
oncogene and the mutation or loss of several tumor-suppressor
genes.
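The link between age and cancer risk follows from simple probability if several independent changes must all occur in the same cell lineage. The toy model below is only an illustration: it assumes each required change happens independently with a fixed yearly probability, and the value 0.01 is an arbitrary placeholder, not a measured rate. The six required changes come from the "about a half dozen" figure above.

```python
def prob_all_hits(age_years: int, hits_needed: int, p_per_year: float) -> float:
    """Probability that every required change has occurred by a given age.

    Toy assumption: each change arises independently with probability
    `p_per_year` in any one year and is never reversed.
    """
    p_one_hit = 1.0 - (1.0 - p_per_year) ** age_years
    return p_one_hit ** hits_needed


if __name__ == "__main__":
    P = 0.01  # arbitrary illustrative yearly probability per change
    for age in (30, 50, 70):
        six = prob_all_hits(age, 6, P)
        five = prob_all_hits(age, 5, P)  # one change already present at birth
        print(f"age {age}: six hits {six:.2e}, five hits {five:.2e}")
```

In this model the probability rises steeply with age, and a lineage that starts with one change already in place is always ahead of one that starts with none.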
- Viruses, especially retroviruses, play a role
in about 15% of human cancer cases worldwide.
These include some types of leukemia, liver cancer, and cancer
of the cervix.
Viruses promote cancer development by integrating their DNA into
that of infected cells.
By this process, a retrovirus may donate an oncogene to the cell.
Alternatively, insertion of viral DNA may disrupt a tumor-suppressor
gene or convert a proto-oncogene to an oncogene.
- The fact that multiple genetic
changes are required to produce a cancer cell helps explain the
predispositions to cancer that run in some families.
An individual inheriting an oncogene or a mutant allele of a
tumor-suppressor gene will be one step closer to accumulating
the necessary mutations for cancer to develop.
About 15% of colorectal cancers involve inherited mutations.
Between 5% and 10% of breast cancer cases show an inherited predisposition.