A MAP OF GENETICS

Mendel’s second law 3
Modified Mendelian ratios 6
Evolution 1, 14, 18, 19, 20
Mendel’s first law 2
Linkage 4, 17
Independent assortment 3
Crossing-over 4, 16
Recombination 3, 4, 16
Inheritance of more than one gene 3, 4, 19
Single gene inheritance 2
Organelle gene inheritance 3
Single gene mutations 2, 16
Quantitative inheritance 3, 19
Equal segregation 2
Chromosomal genes 2, 3, 4
Population genetics 18, 19, 20
Meiosis and mitosis in eukaryotes 2, 3
Inheritance and recombination of genes on bacterial chromosomes and plasmids and of phage genes 5, 10
DNA replication 1, 7
Chromosome rearrangements 17
THE GENOME (DNA, genes, chromosomes) 1, 2, 3, 5, 7, 14, 15
Transposons 15
Epigenetics 12
Transcription 8, 11, 12
Gene regulation 11, 12, 14
Translation 9, 11, 12
Gene interaction 6, 11, 12, 13, 14
Development 11, 12, 13

The map displays the general divisions of genetics in boxes, with arrows showing the main connections between them covered in this book. Orange, broadly, is inheritance, purple is function, and green is change. Numbers are chapters covering the topic, with main discussions in bold.
Introduction to Genetic Analysis
About the Authors

Anthony J. F. Griffiths is a Professor of Botany, Emeritus, at the University of British Columbia. His research focuses on developmental genetics using the model fungus Neurospora crassa. He has served as president of the Genetics Society of Canada and two terms as Secretary-General of the International Genetics Federation. He was recently awarded the Fellow Medal of the International Mycological Association.
[ Barbara Moon.]
Susan R. Wessler is a Distinguished Professor of Genetics in the Department of Botany and Plant Sciences at the University of California, Riverside. Her research focuses on plant transposable elements and their contribution to gene and genome evolution. Dr. Wessler was elected to the National Academy of Sciences in 1998. As a Howard Hughes Medical Institute Professor, she developed and teaches a series of dynamic genome courses in which undergraduates can experience the excitement of scientific discovery. [ Iqbal Pittawala.]
Sean B. Carroll is Vice President for Science Education at the Howard Hughes Medical Institute and a Professor of Molecular Biology and Genetics at the University of Wisconsin–Madison. Dr. Carroll is a leader in the field of evolutionary developmental biology and was elected to the National Academy of Sciences in 2007. He is also the author of Brave Genius, Endless Forms Most Beautiful, The Making of the Fittest, and Remarkable Creatures, a finalist for the National Book Award in Nonfiction in 2009.
[ Sean Carroll.]
John Doebley is a Professor of Genetics at the University of Wisconsin–Madison. He studies the genetics of crop domestication using the methods of population and quantitative genetics. He was elected to the National Academy of Sciences in 2003 and served as the president of the American Genetic Association in 2005. He teaches general genetics and evolutionary genetics at the University of Wisconsin.
[ John Doebley.]
Introduction to Genetic Analysis
ELEVENTH EDITION

Anthony J. F. Griffiths, University of British Columbia
Susan R. Wessler, University of California, Riverside
Sean B. Carroll, Howard Hughes Medical Institute; University of Wisconsin–Madison
John Doebley, University of Wisconsin–Madison
Publisher: Kate Ahr Parker
Senior Acquisitions Editor: Lauren Schultz
Executive Marketing Manager: John Britch
Marketing Assistant: Bailey James
Developmental Editor: Erica Champion
Editorial Assistant: Alexandra Garrett
Supplements Editor: Erica Champion
Executive Media Editor: Amanda Dunning
Media Editors: Donna Broadman, Erica Champion, Tue Tran
Art Director: Diana Blume
Cover and Interior Designer: Vicki Tomaselli
Senior Project Editor: Jane O’Neill
Permissions Manager: Jennifer MacMillan
Photo Editor: Richard Fox
Illustration Coordinator: Janice Donnola
Illustrations: Dragonfly Media Group
Production Supervisor: Susan Wein
Composition and Layout: Sheridan Sellers
Printing and Binding: RR Donnelley
Cover Photo: Susan Schmitz/Shutterstock
Library of Congress Preassigned Control Number: 2014957104
Hardcover: ISBN-13: 978-1-4641-0948-5, ISBN-10: 1-4641-0948-6
Loose-Leaf: ISBN-13: 978-1-4641-8804-6, ISBN-10: 1-4641-8804-1

© 2015, 2012, 2008, 2005 by W. H. Freeman and Company
All rights reserved
Printed in the United States of America
First printing

W. H. Freeman and Company
A Macmillan Education Imprint
41 Madison Avenue, New York, NY 10010
Houndmills, Basingstoke RG21 6XS, England
www.macmillanhighered.com
Contents in Brief

Preface
1 The Genetics Revolution

PART I Transmission Genetics
2 Single-Gene Inheritance
3 Independent Assortment of Genes
4 Mapping Eukaryote Chromosomes by Recombination
5 The Genetics of Bacteria and Their Viruses
6 Gene Interaction

PART II From DNA to Phenotype
7 DNA: Structure and Replication
8 RNA: Transcription and Processing
9 Proteins and Their Synthesis
10 Gene Isolation and Manipulation
11 Regulation of Gene Expression in Bacteria and Their Viruses
12 Regulation of Gene Expression in Eukaryotes
13 The Genetic Control of Development
14 Genomes and Genomics

PART III Mutation, Variation, and Evolution
15 The Dynamic Genome: Transposable Elements
16 Mutation, Repair, and Recombination
17 Large-Scale Chromosomal Changes
18 Population Genetics
19 The Inheritance of Complex Traits
20 Evolution of Genes and Traits

A Brief Guide to Model Organisms
Appendix A: Genetic Nomenclature
Appendix B: Bioinformatics Resources for Genetics and Genomics
Glossary
Answers to Selected Problems
Index

Contents

Preface

1 The Genetics Revolution
  1.1 The Birth of Genetics
    Gregor Mendel—A monk in the garden
    Mendel rediscovered
    The central dogma of molecular biology
  1.2 After Cracking the Code
    Model organisms
    Tools for genetic analysis
  1.3 Genetics Today
    From classical genetics to medical genomics
    Investigating mutation and disease risk
    When rice gets its feet a little too wet
    Recent evolution in humans

PART I Transmission Genetics

2 Single-Gene Inheritance
  2.1 Single-Gene Inheritance Patterns
    Mendel’s pioneering experiments
    Mendel’s law of equal segregation
  2.2 The Chromosomal Basis of Single-Gene Inheritance Patterns
    Single-gene inheritance in diploids
    Single-gene inheritance in haploids
  2.3 The Molecular Basis of Mendelian Inheritance Patterns
    Structural differences between alleles at the molecular level
    Molecular aspects of gene transmission
    Alleles at the molecular level
  2.4 Some Genes Discovered by Observing Segregation Ratios
    A gene active in the development of flower color
    A gene for wing development
    A gene for hyphal branching
    Predicting progeny proportions or parental genotypes by applying the principles of single-gene inheritance
  2.5 Sex-Linked Single-Gene Inheritance Patterns
    Sex chromosomes
    Sex-linked patterns of inheritance
    X-linked inheritance
  2.6 Human Pedigree Analysis
    Autosomal recessive disorders
    Autosomal dominant disorders
    Autosomal polymorphisms
    X-linked recessive disorders
    X-linked dominant disorders
    Y-linked inheritance
    Calculating risks in pedigree analysis

3 Independent Assortment of Genes
  3.1 Mendel’s Law of Independent Assortment
  3.2 Working with Independent Assortment
    Predicting progeny ratios
    Using the chi-square test on monohybrid and dihybrid ratios
    Synthesizing pure lines
    Hybrid vigor
  3.3 The Chromosomal Basis of Independent Assortment
    Independent assortment in diploid organisms
    Independent assortment in haploid organisms
    Independent assortment of combinations of autosomal and X-linked genes
    Recombination
  3.4 Polygenic Inheritance
  3.5 Organelle Genes: Inheritance Independent of the Nucleus
    Patterns of inheritance in organelles
    Cytoplasmic segregation
    Cytoplasmic mutations in humans
    mtDNA in evolutionary studies

4 Mapping Eukaryote Chromosomes by Recombination
  4.1 Diagnostics of Linkage
    Using recombinant frequency to recognize linkage
    How crossovers produce recombinants for linked genes
    Linkage symbolism and terminology
    Evidence that crossing over is a breakage-and-rejoining process
    Evidence that crossing over takes place at the four-chromatid stage
    Multiple crossovers can include more than two chromatids
  4.2 Mapping by Recombinant Frequency
    Map units
    Three-point testcross
    Deducing gene order by inspection
    Interference
    Using ratios as diagnostics
  4.3 Mapping with Molecular Markers
    Single nucleotide polymorphisms
    Simple sequence length polymorphisms
    Detecting simple sequence length polymorphisms
    Recombination analysis using molecular markers
  4.4 Centromere Mapping with Linear Tetrads
  4.5 Using the Chi-Square Test to Infer Linkage
  4.6 Accounting for Unseen Multiple Crossovers
    A mapping function
    The Perkins formula
  4.7 Using Recombination-Based Maps in Conjunction with Physical Maps
  4.8 The Molecular Mechanism of Crossing Over

5 The Genetics of Bacteria and Their Viruses
  5.1 Working with Microorganisms
  5.2 Bacterial Conjugation
    Discovery of conjugation
    Discovery of the fertility factor (F)
    Hfr strains
    Mapping of bacterial chromosomes
    F plasmids that carry genomic fragments
    R plasmids
  5.3 Bacterial Transformation
    The nature of transformation
    Chromosome mapping using transformation
  5.4 Bacteriophage Genetics
    Infection of bacteria by phages
    Mapping phage chromosomes by using phage crosses
  5.5 Transduction
    Discovery of transduction
    Generalized transduction
    Specialized transduction
    Mechanism of specialized transduction
  5.6 Physical Maps and Linkage Maps Compared

6 Gene Interaction
  6.1 Interactions Between the Alleles of a Single Gene: Variations on Dominance
    Complete dominance and recessiveness
    Incomplete dominance
    Codominance
    Recessive lethal alleles
  6.2 Interaction of Genes in Pathways
    Biosynthetic pathways in Neurospora
    Gene interaction in other types of pathways
  6.3 Inferring Gene Interactions
    Sorting mutants using the complementation test
    Analyzing double mutants of random mutations
  6.4 Penetrance and Expressivity

PART II From DNA to Phenotype

7 DNA: Structure and Replication
  7.1 DNA: The Genetic Material
    Discovery of transformation
    Hershey–Chase experiment
  7.2 DNA Structure
    DNA structure before Watson and Crick
    The double helix
  7.3 Semiconservative Replication
    Meselson–Stahl experiment
    The replication fork
    DNA polymerases
  7.4 Overview of DNA Replication
  7.5 The Replisome: A Remarkable Replication Machine
    Unwinding the double helix
    Assembling the replisome: replication initiation
  7.6 Replication in Eukaryotic Organisms
    Eukaryotic origins of replication
    DNA replication and the yeast cell cycle
    Replication origins in higher eukaryotes
  7.7 Telomeres and Telomerase: Replication Termination

8 RNA: Transcription and Processing
  8.1 RNA
    Early experiments suggest an RNA intermediate
    Properties of RNA
    Classes of RNA
  8.2 Transcription
    Overview: DNA as transcription template
    Stages of transcription
  8.3 Transcription in Eukaryotes
    Transcription initiation in eukaryotes
    Elongation, termination, and pre-mRNA processing in eukaryotes
  8.4 Intron Removal and Exon Splicing
    Small nuclear RNAs (snRNAs): the mechanism of exon splicing
    Self-splicing introns and the RNA world
  8.5 Small Functional RNAs That Regulate and Protect the Eukaryotic Genome
    miRNAs are important regulators of gene expression
    siRNAs ensure genome stability
    Similar mechanisms generate siRNA and miRNA

9 Proteins and Their Synthesis
  9.1 Protein Structure
  9.2 The Genetic Code
    Overlapping versus nonoverlapping codes
    Number of letters in the codon
    Use of suppressors to demonstrate a triplet code
    Degeneracy of the genetic code
    Cracking the code
    Stop codons
  9.3 tRNA: The Adapter
    Codon translation by tRNA
    Degeneracy revisited
  9.4 Ribosomes
    Ribosome features
    Translation initiation, elongation, and termination
    Nonsense suppressor mutations
  9.5 The Proteome
    Alternative splicing generates protein isoforms
    Posttranslational events

10 Gene Isolation and Manipulation
  10.1 Overview: Isolating and Amplifying Specific DNA Fragments
  10.2 Generating Recombinant DNA Molecules
    Genomic DNA can be cut up before cloning
    The polymerase chain reaction amplifies selected regions of DNA in vitro
    DNA copies of mRNA can be synthesized
    Attaching donor and vector DNA
    Amplification of donor DNA inside a bacterial cell
    Making genomic and cDNA libraries
  10.3 Using Molecular Probes to Find and Analyze a Specific Clone of Interest
    Finding specific clones by using probes
    Finding specific clones by functional complementation
    Southern- and Northern-blot analysis of DNA
  10.4 Determining the Base Sequence of a DNA Segment
  10.5 Aligning Genetic and Physical Maps to Isolate Specific Genes
    Using positional cloning to identify a human-disease gene
    Using fine mapping to identify genes
  10.6 Genetic Engineering
    Genetic engineering in Saccharomyces cerevisiae
    Genetic engineering in plants
    Genetic engineering in animals

11 Regulation of Gene Expression in Bacteria and Their Viruses
  11.1 Gene Regulation
    The basics of prokaryotic transcriptional regulation: genetic switches
    A first look at the lac regulatory circuit
  11.2 Discovery of the lac System: Negative Control
    Genes controlled together
    Genetic evidence for the operator and repressor
    Genetic evidence for allostery
    Genetic analysis of the lac promoter
    Molecular characterization of the Lac repressor and the lac operator
  11.3 Catabolite Repression of the lac Operon: Positive Control
    The basics of lac catabolite repression: choosing the best sugar to metabolize
    The structures of target DNA sites
    A summary of the lac operon
  11.4 Dual Positive and Negative Control: The Arabinose Operon
  11.5 Metabolic Pathways and Additional Levels of Regulation: Attenuation
  11.6 Bacteriophage Life Cycles: More Regulators, Complex Operons
    Molecular anatomy of the genetic switch
    Sequence-specific binding of regulatory proteins to DNA
  11.7 Alternative Sigma Factors Regulate Large Sets of Genes

12 Regulation of Gene Expression in Eukaryotes
  12.1 Transcriptional Regulation in Eukaryotes: An Overview
  12.2 Lessons from Yeast: The GAL System
    Gal4 regulates multiple genes through upstream activation sequences
    The Gal4 protein has separable DNA-binding and activation domains
    Gal4 activity is physiologically regulated
    Gal4 functions in most eukaryotes
    Activators recruit the transcriptional machinery
    The control of yeast mating type: combinatorial interactions
  12.3 Dynamic Chromatin
    Chromatin-remodeling proteins and gene activation
    Modification of histones
    Histone methylation can activate or repress gene expression
    The inheritance of histone modifications and chromatin structure
    Histone variants
    DNA methylation: another heritable mark that influences chromatin structure
  12.4 Activation of Genes in a Chromatin Environment
    The β-interferon enhanceosome
    Enhancer-blocking insulators
  12.5 Long-Term Inactivation of Genes in a Chromatin Environment
    Mating-type switching and gene silencing
    Heterochromatin and euchromatin compared
    Position-effect variegation in Drosophila reveals genomic neighborhoods
    Genetic analysis of PEV reveals proteins necessary for heterochromatin formation
  12.6 Gender-Specific Silencing of Genes and Whole Chromosomes
    Genomic imprinting explains some unusual patterns of inheritance
    But what about Dolly and other cloned mammals?
    Silencing an entire chromosome: X-chromosome inactivation
  12.7 Post-Transcriptional Gene Repression by miRNAs

13 The Genetic Control of Development
  13.1 The Genetic Approach to Development
  13.2 The Genetic Toolkit for Drosophila Development
    Classification of genes by developmental function
    Homeotic genes and segmental identity
    Organization and expression of Hox genes
    The homeobox
    Clusters of Hox genes control development in most animals
  13.3 Defining the Entire Toolkit
    The anteroposterior and dorsoventral axes
    Expression of toolkit genes
  13.4 Spatial Regulation of Gene Expression in Development
    Maternal gradients and gene activation
    Drawing stripes: integration of gap-protein inputs
    Making segments different: integration of Hox inputs
  13.5 Post-transcriptional Regulation of Gene Expression in Development
    RNA splicing and sex determination in Drosophila
    Regulation of mRNA translation and cell lineage in C. elegans
    Translational control in the early embryo
    miRNA control of developmental timing in C. elegans and other species
  13.6 From Flies to Fingers, Feathers, and Floor Plates: The Many Roles of Individual Toolkit Genes
  13.7 Development and Disease
    Polydactyly
    Holoprosencephaly
    Cancer as a developmental disease

14 Genomes and Genomics
  14.1 The Genomics Revolution
  14.2 Obtaining the Sequence of a Genome
    Turning sequence reads into an assembled sequence
    Whole-genome sequencing
    Traditional WGS
    Next-generation whole-genome shotgun sequencing
    Whole-genome-sequence assembly
  14.3 Bioinformatics: Meaning from Genomic Sequence
    The nature of the information content of DNA
    Deducing the protein-encoding genes from genomic sequence
  14.4 The Structure of the Human Genome
    Noncoding functional elements in the genome
  14.5 The Comparative Genomics of Humans with Other Species
    Phylogenetic inference
    Of mice and humans
    Comparative genomics of chimpanzees and humans
  14.6 Comparative Genomics and Human Medicine
    The exome and personalized genomics
    Comparative genomics of nonpathogenic and pathogenic E. coli
  14.7 Functional Genomics and Reverse Genetics
    “’Omics”
    Reverse genetics

PART III Mutation, Variation, and Evolution

15 The Dynamic Genome: Transposable Elements
  15.1 Discovery of Transposable Elements in Maize
    McClintock’s experiments: the Ds element
    Autonomous and nonautonomous elements
    Transposable elements: only in maize?
  15.2 Transposable Elements in Prokaryotes
    Bacterial insertion sequences
    Prokaryotic transposons
    Mechanism of transposition
  15.3 Transposable Elements in Eukaryotes
    Class 1: retrotransposons
    Class 2: DNA transposons
    Utility of DNA transposons for gene discovery
  15.4 The Dynamic Genome: More Transposable Elements Than Ever Imagined
    Large genomes are largely transposable elements
    Transposable elements in the human genome
    The grasses: LTR-retrotransposons thrive in large genomes
    Safe havens
  15.5 Regulation of Transposable Element Movement by the Host
    Genome surveillance in animals and bacteria

16 Mutation, Repair, and Recombination
  16.1 The Phenotypic Consequences of DNA Mutations
    Types of point mutation
    The molecular consequences of point mutations in a coding region
    The molecular consequences of point mutations in a noncoding region
  16.2 The Molecular Basis of Spontaneous Mutations
    Luria and Delbrück fluctuation test
    Mechanisms of spontaneous mutations
    Spontaneous mutations in humans: trinucleotide-repeat diseases
  16.3 The Molecular Basis of Induced Mutations
    Mechanisms of mutagenesis
    The Ames test: evaluating mutagens in our environment
  16.4 Biological Repair Mechanisms
    Direct reversal of damaged DNA
    Base-excision repair
    Nucleotide-excision repair
    Postreplication repair: mismatch repair
    Error-prone repair: translesion DNA synthesis
    Repair of double-strand breaks
    The involvement of DSB repair in meiotic recombination
  16.5 Cancer: An Important Phenotypic Consequence of Mutation
    How cancer cells differ from normal cells
    Mutations in cancer cells

17 Large-Scale Chromosomal Changes
  17.1 Changes in Chromosome Number
    Aberrant euploidy
    Aneuploidy
    The concept of gene balance
  17.2 Changes in Chromosome Structure
    Deletions
    Duplications
    Inversions
    Reciprocal translocations
    Robertsonian translocations
    Applications of inversions and translocations
    Rearrangements and cancer
    Identifying chromosome mutations by genomics
  17.3 Overall Incidence of Human Chromosome Mutations

18 Population Genetics
  18.1 Detecting Genetic Variation
    Single nucleotide polymorphisms (SNPs)
    Microsatellites
    Haplotypes
    Other sources and forms of variation
    The HapMap Project
  18.2 The Gene-Pool Concept and the Hardy–Weinberg Law
  18.3 Mating Systems
    Assortative mating
    Isolation by distance
    Inbreeding
    The inbreeding coefficient
    Population size and inbreeding
  18.4 Genetic Variation and Its Measurement
  18.5 The Modulation of Genetic Variation
    New alleles enter the population: mutation and migration
    Recombination and linkage disequilibrium
    Genetic drift and population size
    Selection
    Forms of selection
    Balance between mutation and drift
    Balance between mutation and selection
  18.6 Biological and Social Applications
    Conservation genetics
    Calculating disease risks
    DNA forensics
    Googling your DNA mates

19 The Inheritance of Complex Traits
  19.1 Measuring Quantitative Variation
    Types of traits and inheritance
    The mean
    The variance
    The normal distribution
  19.2 A Simple Genetic Model for Quantitative Traits
    Genetic and environmental deviations
    Genetic and environmental variances
    Correlation between variables
  19.3 Broad-Sense Heritability: Nature Versus Nurture
    Measuring heritability in humans using twin studies
  19.4 Narrow-Sense Heritability: Predicting Phenotypes
    Gene action and the transmission of genetic variation
    The additive and dominance effects
    A model with additivity and dominance
    Narrow-sense heritability
    Predicting offspring phenotypes
    Selection on complex traits
  19.5 Mapping QTL in Populations with Known Pedigrees
    The basic method
    From QTL to gene
  19.6 Association Mapping in Random-Mating Populations
    The basic method
    GWA, genes, disease, and heritability
20 Evolution of Genes and Traits
  20.1 Evolution by Natural Selection
  20.2 Natural Selection in Action: An Exemplary Case
    The selective advantage of HbS
    The molecular origins of HbS
  20.3 Molecular Evolution: The Neutral Theory
    The development of the neutral theory
    The rate of neutral substitutions
    The signature of purifying selection on DNA
  20.4 Cumulative Selection and Multistep Paths to Functional Change
    Multistep pathways in evolution
    The signature of positive selection on DNA sequences
  20.5 Morphological Evolution
    Adaptive changes in a pigment-regulating protein
    Gene inactivation
    Regulatory-sequence evolution
    Loss of characters through regulatory-sequence evolution
    Regulatory evolution in humans
  20.6 The Origin of New Genes and Protein Functions
    Expanding gene number
    The fate of duplicated genes

A Brief Guide to Model Organisms
Appendix A: Genetic Nomenclature
Appendix B: Bioinformatics Resources for Genetics and Genomics
Glossary
Answers to Selected Problems
Index
Preface
Since its first edition in 1974, Introduction to Genetic Analysis has emphasized the power and incisiveness of the genetic approach in biological research and its applications. Over its many editions, the text has continuously expanded its coverage as the power of traditional genetic analysis has been extended with the introduction of recombinant DNA technology and then genomics. In the eleventh edition, we continue this tradition and show how the flowering of this powerful type of analysis has been used for insight into research in biology, agriculture, and human health.
Pedagogical Tools

One of the important new features in this edition is the inclusion of lists of learning outcomes at the beginning of each chapter. Learning outcomes are crucial components of understanding. One of the tenets of the constructivist theory of learning is that although understanding might be a series of new mental circuits, the learner can never be sure of what is in his or her brain until called upon for some type of performance. Indeed, understanding has even been defined by some as flexible performance capacity. The lists of goals show learners what precise performances are expected of them. The notes that follow show how the benefits of the learning outcomes in this book can be maximized for instructors who wish to use them.

LEARNING OUTCOMES
After completing this chapter, you will be able to
• Perform a quantitative analysis of the progeny of a dihybrid testcross to assess whether or not the two genes are linked on the same chromosome.
• Extend the same type of analysis to several loci to produce a map of the relative positions of loci on a chromosome.
• In ascomycete fungi, map the centromeres to other linked loci.
• In asci, predict allele ratios stemming from specific steps in the heteroduplex model of crossing over.

Classroom sessions large and small (for example, lectures and tutorials) should be structured as far as possible on learning outcomes closely paralleling those in these chapters. At various stages in the classes students should be asked to demonstrate their understanding of the material just covered by attaining one or more learning outcomes. In writing examination or test questions, the instructor should try to stick closely to learning outcomes. When reviewing test results, show in what ways the outcomes have been attained or not attained by the learner.

Students should read the list of learning outcomes before embarking on a chapter. Although it will not be possible to understand most of them before reading the chapter, their wording gives a good idea of the lay of the land, and shows the extent of what the instructor’s expectations are. Ideally, after reading a section of the chapter, it is a good idea for a student to go back to the list and match the material covered to an outcome. This process should be repeated at the end of the chapter by scanning the sections and making a complete match with each outcome as far as possible. In solving the end-of-chapter problems, try to focus effort on the skills described in the learning outcomes. Students should use the learning outcomes for rapid review when studying for exams; they should try to imagine ways that they will be expected to demonstrate understanding through the application of the outcomes.

The general goal of a course in genetics is to learn how to think and work like a geneticist. The learning outcomes can fractionate this general goal into the many different skills required in this analytical subject.

In this edition we have replaced “Messages” with “Key Concepts.” Messages have been in the book since its first edition in 1974. In the 1960s and 1970s, perhaps due to the popularity of Marshall McLuhan’s principle “The medium is the message,” the word message was in common use, and teachers were often asked, “What is your message?” Although with the rise of electronic media it is perhaps time for a resurgence of McLuhan’s principle, we felt that the word message no longer has the meaning it had in 1974.
New Coverage of Modern Genetic Analysis

One of our goals is to show how identifying genes and their interactions is a powerful tool for understanding biological properties. In the eleventh edition, we present a completely rewritten introductory Chapter 1, with a focus on modern applications of genetics. From there, the student follows the process of a traditional genetic dissection, starting with a step-by-step coverage of single-gene identification in Chapter 2, gene mapping in Chapter 4, and identifying pathways and networks by studying gene interactions in Chapter 6. New genomic approaches to identifying and locating genes are explored in Chapters 10, 14, and 19.

Flood-intolerant and flood-tolerant rice
FIGURE 1-20 An Indian farmer with rice variety Swarna that is not tolerant to flooding (left) compared to variety Swarna-Sub1 that is tolerant (right). This field was flooded for 10 days. The photo was taken 27 days after the flood waters receded. [ Ismail et al., “The contribution of submergence-tolerant (Sub 1) rice varieties to food security in flood-prone rainfed lowland areas in Asia,” Field Crops Research 152, 2013, 83–93, © Elsevier.]

SUB1 gene increases rice yield under flooding
FIGURE 1-21 Yield comparison between variety Swarna that is not tolerant to flooding (purple circles) and variety Swarna-Sub1 that is tolerant (green circles). Yield in tons per hectare (y-axis) versus duration of flooding in days (x-axis). [ Data from Ismail et al., “The contribution of submergence-tolerant (Sub 1) rice varieties to food security in flood-prone rainfed lowland areas in Asia,” Field Crops Research 152, 2013, 83–93.]
• A reconceptualized Chapter 1 now piques student interest in genetics by presenting a selection of modern applications in biology, evolution, medicine, and agriculture. After a brief history of the study of genetics and a review of some fundamentals, the chapter describes four stories of how genetics is used today.
• Classical genetic dissection is given a more gradual introduction in Chapters 2 and 4. Chapter 2 begins with a new introduction to forward genetics and the role of genetic analysis in identifying traits of single-gene inheritance. Crosses are depicted visually as well as mathematically. The concepts of dominance and recessiveness are explained in terms of haplosufficiency and haploinsufficiency. The use of chi-square analysis in Chapter 4 has been rewritten for clarity.
• The modern applications of genetics introduced in Chapter 1 continue in Chapter 14, which introduces new genomic techniques such as RNA-seq and exome sequencing and applies them to problems in medicine. The search for meaning in noncoding segments of the genome is an important frontier in genomics, and the ENCODE project has been added to this chapter to represent that search.
Focus on Key Advances in Genetics

We have enhanced coverage of several cutting-edge topics in the eleventh edition.

Chromatin remodeling and epigenetics: Previously spread among several chapters, the flourishing field of epigenetics is now consolidated and completely updated in Chapter 12. In Section 12.3, “Dynamic Chromatin,” we discuss the three major mechanisms of altering chromatin structure: chromatin remodeling, histone modification, and histone variants. Changes throughout this section provide more detail and clarity, based on recent advances in the field.

Genome surveillance: Cutting-edge research in transposable elements has uncovered genome surveillance systems in plants, animals, and bacteria similar to that previously identified in C. elegans. Chapter 15 now provides an overview of piRNAs in animals and crRNAs in bacteria, and allows students to compare and contrast those approaches to Tc1 elements in worms and MITEs in plants.
Modifications of histone tails
FIGURE 12-13 (a) Histone tails protrude from the nucleosome core (purple). (b) Examples of histone tail modifications are shown. Circles with A represent acetylation while circles with M represent methylation. See text for details.
Inactivation of TEs following insertion into pi-clusters
FIGURE 15-27 Insertion of the green and pink transposons into a pi-cluster in the genome results in the degradation of transcripts from these two transposons by the steps shown and described in the text. In contrast, the yellow transposon will remain active until copies insert by chance into a pi-cluster.
Enduring Features Coverage of model organisms The eleventh edition retains the enhanced coverage of model systems in formats that are practical and flexible for both students and instructors. • Chapter 1 introduces some key genetic model organisms and highlights some of the successes achieved through their use. • Model Organism boxes presented in context where appropriate provide additional information about the organism in nature and its use experimentally. • A Brief Guide to Model Organisms, at the back of the book, provides quick access to essential, practical information about the uses of specific model organisms in research studies. • An Index to Model Organisms, on the endpapers at the back of the book, provides chapter-by-chapter page references to discussions of specific organisms in the text, enabling instructors and students to easily find and assemble comparative information across organisms.
Problem sets No matter how clear the exposition, deep understanding requires the student to personally engage with the material. Hence our efforts to encourage student problem solving. Building on its focus on genetic analysis, the eleventh edition provides students with opportunities to practice problem-solving skills—both in the text and online through the following features. • Versatile Problem Sets. Problems span the full range of degrees of difficulty. They are categorized according to level of difficulty—basic or challenging. • Working with the Figures. An innovative set of problems included at the back of each chapter asks students pointed questions about figures in the chapter. These questions encourage students to think about the figures and help them to assess their understanding of key concepts. • Solved Problems. Found at the end of each chapter, these worked examples illustrate how geneticists apply principles to experimental data. • Unpacking the Problems. A genetics problem draws on a complex matrix of concepts and information. “Unpacking the Problem” helps students learn to approach problem solving strategically, one step at a time, concept on concept. • NEW Multiple-choice versions of the end-of-chapter problems are available on our online LaunchPad for quick gradable quizzing and easily gradable homework assignments. The Unpacking the Problem tutorials from the text have been converted to in-depth online tutorials and expanded to help students learn to solve problems and think like a geneticist. New videos demonstrate how to solve selected difficult problems.
How genetics is practiced today A feature called “What Geneticists Are Doing Today” suggests how genetic techniques are being used today to answer specific biological questions, such as “What is the link between telomere shortening and aging?” or “How can we find missing components in a specific biological pathway?”
Media and Supplements The LaunchPad is a dynamic, fully integrated learning environment that brings together all the teaching and learning resources in one place. It features the fully interactive e-Book, end-of-chapter practice problems now assignable as homework, animations, and tutorials to help students with difficult-to-visualize concepts. This learning system also includes easy-to-use, powerful assessment tracking and grading tools, a personalized calendar, an announcement center, and communication tools all in one place to help you manage your course. Some examples: • Hundreds of self-graded end-of-chapter problems allow students to practice their problem-solving skills. Most of the open-ended end-of-chapter questions have been carefully rewritten to create high-quality, analytical multiple-choice versions for assigning. • Animations help students visualize genetics. • Unpacking the Problem tutorials from the text have been converted and expanded to help students learn to solve problems and think like a geneticist. These in-depth online tutorials guide students toward the solution, offering guidance as needed via hints and detailed feedback. • NEW Problem-solving videos walk students through solving difficult problems from the text.
Teaching resources for instructors
Electronic teaching resources are available online at the LaunchPad, at http://www.whfreeman.com/launchpad/iga11e. The LaunchPad includes all the electronic resources listed below for teachers. Contact your W. H. Freeman sales representative to learn how to log on as an instructor.
e-Book
The e-Book fully integrates the text and its interactive media in a format that features a variety of helpful study tools (full-text, Google-style searching; note taking; bookmarking; highlighting; and more). Available as a stand-alone item or on the LaunchPad.
Clicker Questions
Jump-start discussions, illuminate important points, and promote better conceptual understanding during lectures.
Layered PowerPoint Presentations
Illuminate challenging topics for students by deconstructing intricate genetic concepts, sequences, and processes step-by-step in a visual format.
All Images from the Text
More than 500 illustrations can be downloaded as JPEGs and PowerPoint slides. Use high-resolution images with enlarged labels to project clearly for lecture hall presentations. Additionally, these JPEG and PowerPoint files are available without labels for easy customization in PowerPoint.
67 Continuous-Play Animations
A comprehensive set of animations, updated and expanded for the eleventh edition, covers everything from basic molecular genetic events and lab techniques to analyzing crosses and genetic pathways. The complete list of animations appears on page xix.
Assessment Bank
This resource brings together a wide selection of genetics problems for use in testing, homework assignments, or in-class activities. Searchable by topic and provided in MS Word format, as well as in LaunchPad and Diploma, the assessment bank offers a high level of flexibility.
Student Solutions Manual (ISBN: 1-4641-8794-0)
The Student Solutions Manual contains complete worked-out solutions to all the problems in the textbook, including the “Unpacking the Problem” exercises. Available on the LaunchPad and the Instructor’s Web site as easy-to-print Word files.
Understanding Genetics: Strategies for Teachers and Learners in Universities and High Schools (ISBN: 0-7167-5216-6)
Written by Anthony Griffiths and Jolie Mayer-Smith, this collection of articles focuses on problem solving and describes methods for helping students improve their ability to process and integrate new information.
Resources for students at http://www.whfreeman.com/launchpad/iga11e
LaunchPad 6-month Access Card (ISBN: 1-4641-8793-2)
The LaunchPad contains the following resources for students:
• Self-Graded End-of-Chapter Problems: To allow students to practice their problem-solving skills, most of the open-ended end-of-chapter questions have been carefully rewritten to create high-quality, analytical multiple-choice versions for assigning.
• Online Practice Tests: Students can test their understanding and receive immediate feedback by answering online questions that cover the core concepts in each chapter. Questions are page referenced to the text for easy review of the material.
• Animations: A comprehensive set of animations, updated and expanded for the eleventh edition, covers everything from basic molecular genetic events and lab techniques to analyzing crosses and genetic pathways. The complete list of animations appears on the facing page.
• Interactive “Unpacking the Problem”: An exercise from the problem set for many chapters is available online in interactive form. As with the text version, each Web-based “Unpacking the Problem” uses a series of questions to step students through the thought processes needed to solve a problem. The online version offers immediate feedback to students as they work through the problems as well as convenient tracking and grading functions. Authored by Craig Berezowsky, University of British Columbia.
• NEW Problem-Solving Videos: Twenty-five problem-solving videos walk students through solving difficult problems from the text.
Student Solutions Manual (ISBN: 1-4641-8794-0)
The Solutions Manual contains complete worked-out solutions to all the problems in the textbook, including the “Unpacking the Problem” exercises. Used in conjunction with the text, this manual is one of the best ways to develop a fuller appreciation of genetic principles.
Other genomic and bioinformatic resources for students: Text Appendix A, Genetic Nomenclature, lists model organisms and their nomenclature. Text Appendix B, Bioinformatic Resources for Genetics and Genomics, builds on the theme of introducing students to the latest genetic research tools by providing students with some valuable starting points for exploring the rapidly expanding universe of online resources for genetics and genomics.
Animations
Sixty-seven animations are fully integrated with the content and figures in the text chapters. These animations are available on the LaunchPad and the Book Companion site.
CHAPTER 1
A Basic Plant Cross (Figure 1-3)
The Central Dogma (Figure 1-10)
CHAPTER 2
Mitosis (Chapter Appendix 2-1)
Meiosis (Chapter Appendix 2-2)
X-Linked Inheritance in Flies (Figure 2-17)
CHAPTER 3
Punnett Square and Branch Diagram Methods for Predicting the Outcomes of Crosses (Figure 3-4)
Meiotic Recombination Between Unlinked Genes by Independent Assortment (Figures 3-8 and 3-13)
Analyzing a Cross: A Solved Problem (Solved Problem 2)
CHAPTER 4
Crossing Over Produces New Allelic Combinations (Figures 4-2 and 4-3)
Meiotic Recombination Between Linked Genes by Crossing Over (Figure 4-7)
A Molecular Model of Crossing Over (Figure 4-21)
A Mechanism of Crossing Over: A Heteroduplex Model (Figure 4-21)
A Mechanism of Crossing Over: Genetic Consequences of the Heteroduplex Model
Mapping a Three-Point Cross: A Solved Problem (Solved Problem 2)
CHAPTER 5
Bacterial Conjugation and Mapping by Recombination (Figures 5-11 and 5-17)
CHAPTER 6
Interactions Between Alleles at the Molecular Level, RR: Wild-Type
Interactions Between Alleles at the Molecular Level, rr: Homozygous Recessive, Null Mutation
Interactions Between Alleles at the Molecular Level, r′r′: Homozygous Recessive, Leaky Mutation
Interactions Between Alleles at the Molecular Level, Rr: Heterozygous, Complete Dominance
Screening and Selecting for Mutations
A Model for Synthetic Lethality (Figure 6-20)
CHAPTER 7
DNA Replication: The Nucleotide Polymerization Process (Figure 7-15)
DNA Replication: Coordination of Leading and Lagging Strand Synthesis (Figure 7-20)
DNA Replication: Replication of a Chromosome (Figure 7-23)
CHAPTER 8
Transcription in Prokaryotes (Figures 8-7 to 8-10)
Transcription in Eukaryotes (Figures 8-12 and 8-13)
Mechanism of RNA Splicing (Figures 8-16 and 8-17)
CHAPTER 9
Peptide-Bond Formation (Figure 9-2)
tRNA Charging (Figure 9-7)
Translation (Figures 9-14 to 9-16)
Nonsense Suppression at the Molecular Level: The rod ns Nonsense Mutation (Figure 9-18)
Nonsense Suppression at the Molecular Level: The tRNA Nonsense Suppressor (Figure 9-18)
Nonsense Suppression at the Molecular Level: Nonsense Suppression of the rod ns Allele (Figure 9-18)
CHAPTER 10
Polymerase Chain Reaction (Figure 10-3)
Plasmid Cloning (Figure 10-9)
Finding Specific Cloned Genes by Functional Complementation: Functional Complementation of the Gal− Yeast Strain and Recovery of the Wild-Type GAL Gene
Finding Specific Cloned Genes by Functional Complementation: Making a Library of Wild-Type Yeast DNA
Finding Specific Cloned Genes by Functional Complementation: Using the Cloned GAL Gene as a Probe for GAL mRNA
SDS Gel Electrophoresis and Immunoblotting
Dideoxy Sequencing of DNA (Figure 10-17)
Creating a Transgenic Mouse (Figures 10-29 and 10-30)
CHAPTER 11
Regulation of the Lactose System in E. coli: Assaying Lactose Presence/Absence Through the Lac Repressor (Figure 11-6)
Regulation of the Lactose System in E. coli: OC lac Operator Mutations (Figure 11-8)
Regulation of the Lactose System in E. coli: I− Lac Repressor Mutations (Figure 11-9)
Regulation of the Lactose System in E. coli: IS Lac Superrepressor Mutations (Figure 11-10)
CHAPTER 12
Three-Dimensional Structure of Nuclear Chromosomes (Figure 12-11)
Gal4 Binding and Activation (Figures 12-6 through 12-9)
Chromatin Remodeling (Figures 12-13 and 12-14)
CHAPTER 13
Drosophila Embryonic Development
Sex Determination in Flies (Figure 13-23)
CHAPTER 14
DNA Microarrays: Using an Oligonucleotide Array to Analyze Patterns of Gene Expression (Figure 14-20)
DNA Microarrays: Synthesizing an Oligonucleotide Array
Yeast Two-Hybrid Systems (Figure 14-21)
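CHAPTER 15
Replicative Transposition (Figure 15-9)
Life Cycle of a Retrovirus (Figure 15-11)
The Ty1 Mechanism of Retrotransposition (Figures 15-13 and 15-14)
CHAPTER 16
Replication Slippage Creates Insertion or Deletion Mutations (Figure 16-8)
UV-Induced Photodimers and Excision Repair (Figure 16-19)
Base-Excision Repair, Nucleotide Excision Repair, and Mismatch Repair (Figures 16-20, 16-22, and 16-23)
CHAPTER 17
Autotetraploid Meiosis (Figure 17-6)
Meiotic Nondisjunction at Meiosis I (Figure 17-12)
Meiotic Nondisjunction at Meiosis II (Figure 17-12)
Chromosome Rearrangements: Paracentric Inversion, Formation of Paracentric Inversions (Figure 17-27)
Chromosome Rearrangements: Paracentric Inversion, Meiotic Behavior of Paracentric Inversions (Figure 17-28)
Chromosome Rearrangements: Reciprocal Translocation, Formation of Reciprocal Translocations (Figure 17-30)
Chromosome Rearrangements: Reciprocal Translocation, Meiotic Behavior of Reciprocal Translocations (Figure 17-30)
Chromosome Rearrangements: Reciprocal Translocation, Pseudolinkage of Genes by Reciprocal Translocations (Figure 17-32)
Acknowledgments
We extend our thanks and gratitude to our colleagues who reviewed this edition and whose insights and advice were most helpful:
Anna Allen, Howard University Melissa Antonio, California Baptist University Dave Bachoon, Georgia College & State University Brianne Barker, Drew University Lina Begdache, Binghamton University Edward Berger, Dartmouth College Aimee Bernard, University of Colorado Denver Jaime Blair, Franklin & Marshall College Jay Brewster, Pepperdine University Doug Broadfield, Florida Atlantic University Mirjana Brockett, Georgia Institute of Technology Judy Brusslan, California State University, Long Beach Gerald Buldak, Loyola University Chicago Aaron Cassill, University of Texas at San Antonio Helen Chamberlin, Ohio State University Henry Chang, Purdue University Randolph Christensen, Coe College Mary Clancy, University of New Orleans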
Acknowledgments We extend our thanks and gratitude to our colleagues who reviewed this edition and whose insights and advice were most helpful: Anna Allen, Howard University Melissa Antonio, California Baptist University Dave Bachoon, Georgia College & State University Brianne Barker, Drew University Lina Begdache, Binghamton University Edward Berger, Dartmouth College Aimee Bernard, University of Colorado Denver Jaime Blair, Franklin & Marshall College Jay Brewster, Pepperdine University Doug Broadfield, Florida Atlantic University Mirjana Brockett, Georgia Institute of Technology Judy Brusslan, California State University, Long Beach Gerald Buldak, Loyola University Chicago Aaron Cassill, University of Texas at San Antonio Helen Chamberlin, Ohio State University Henry Chang, Purdue University Randolph Christensen, Coe College Mary Clancy, University of New Orleans
Craig Coleman, Brigham Young University Matthew Collier, Wittenberg University Shannon Compton, University of Massachusetts–Amherst Diane Cook, Louisburg College Victoria Corbin, University of Kansas Claudette Davis, George Mason University Ann Marie Davison, Kwantlen Polytechnic University Elizabeth De Stasio, Lawrence University Matt Dean, University of Southern California Michael Dohm, Chaminade University Robert Dotson, Tulane University Chunguang Du, Montclair State University Erastus Dudley, Huntingdon College Edward Eivers, California State University, Los Angeles Robert Farrell, Penn State University David Foltz, Louisiana State University Wayne Forrester, Indiana University Rachael French, San Jose State University
Shirlean Goodwin, University of Memphis Topher Gee, UNC Charlotte John Graham, Berry College Theresa Grana, University of Mary Washington Janet Guedon, Duquesne University Patrick Gulick, Concordia University Richard Heineman, Kutztown University Anna Hicks, Memorial University Susan Hoffman, Miami University Stanton Hoegerman, College of William and Mary Margaret Hollingsworth, University at Buffalo Nancy Huang, Colorado College Jeffrey Hughes, Millikin University Varuni Jamburuthugoda, Fordham University Pablo Jenik, Franklin & Marshall College Aaron Johnson, University of Colorado School of Medicine Anil Kapoor, University of La Verne Jim Karagiannis, University of Western Ontario Kathleen Karrer, Marquette University Jessica Kaufman, Endicott College Darrell Killian, Colorado College Dennis Kraichely, Cabrini College Anuj Kumar, University of Michigan Janice Lai, Austin Community College Evan Lau, West Liberty University Min-Ken Liao, Furman University Sarah Lijegren, University of Mississippi Renyi Liu, University of California, Riverside Diego Loayza, Hunter College James Lodolce, Loyola University Chicago Joshua Loomis, Nova Southeastern University Amy Lyndaker, Ithaca College Jessica Malisch, Claremont McKenna College Patrick Martin, North Carolina A&T State University Presley Martin, Hamline University Dmitri Maslov, University of California, Riverside Maria Julia Massimelli, Claremont McKenna College Endre Mathe, Vasile Goldis Western University of Arad Herman Mays, University of Cincinnati Thomas McGuire, Penn State Abington Mark Meade, Jacksonville State University Ulrich Melcher, Oklahoma State University Philip Meneely, Haverford College Ron Michaelis, Rutgers University Chris Mignone, Berry College Sarah Mordan-McCombs, Franklin College of Indiana
Ann Murkowski, North Seattle Community College Saraswathy Nair, University of Texas at Brownsville Sang-Chul Nam, Texas A&M International University Scot Nelson, University of Hawaii at Manoa Brian Nichols, University of Illinois at Chicago Todd Nickle, Mount Royal University Juliet Noor, Duke University Mohamed Noor, Duke University Daniel Odom, California State University, Northridge Kirk Olsen, East Los Angeles College Kavita Oommen, Georgia State University Maria Orive, University of Kansas Laurie Pacarynuk, University of Lethbridge Patricia Phelps, Austin Community College Martin Poenie, University of Texas at Austin Jennifer Powell, Gettysburg College Robyn Puffenbarger, Bridgewater College Jason Rauceo, John Jay College (CUNY) Eugenia Ribiero-Hurley, Fordham University Ronda Rolfes, Georgetown University Edmund Rucker, University of Kentucky Jeffrey Sands, Lehigh University Monica Sauer, University of Toronto at Scarborough, UTSC Ken Saville, Albion College Pratibha Saxena, University of Texas at Austin Jon Schnorr, Pacific University Malcolm Schug, University of North Carolina at Greensboro Deborah Schulman, Lake Erie College Allan Showalter, Ohio University Elaine Sia, University of Rochester Robert Smith, Nova Southeastern University Joyce Stamm, University of Evansville Tara Stoulig, Southeastern Louisiana University Julie Torruellas Garcia, Nova Southeastern University Virginia Vandergon, California State University, Northridge Charles Vigue, University of New Haven Susan Walsh, Rollins College Michael Watters, Valparaiso University Roger Wartell, Georgia Institute of Technology Matthew White, Ohio University Dwayne Wise, Mississippi State University Andrew Wood, Southern Illinois University Mary Alice Yund, UC Berkeley Extension Malcom Zellars, Georgia State University Deborah Zies, University of Mary Washington
Tony Griffiths would like to acknowledge the pedagogical insights of David Suzuki, who was a co-author of the early editions of this book, and whose teaching in the media is now an inspiration to the general public around the world. Great credit is also due to Jolie Mayer-Smith and Barbara Moon, who introduced Tony to the power of the constructivist approach applied to teaching genetics. Sean Carroll would like to thank Leanne Olds for help with the artwork for Chapters 11, 12, 13, 14, and 20. John Doebley would like to thank his University of Wisconsin colleagues Bill Engels, Carter Denniston, and Jim Crow, who shaped his approach to teaching genetics. The authors also thank the team at W. H. Freeman for their hard work and patience. In particular we thank our developmental and supplements editor, Erica Champion; senior acquisitions editor Lauren Schultz; senior project editor Jane O’Neill; and copy editor Teresa Wilson. We also thank Susan Wein, production supervisor; Diana Blume, art director; Vicki Tomaselli, cover and text designer; Sheridan Sellers, page layout; Janice Donnola, illustration coordinator; Jennifer MacMillan, permissions manager; Amanda Dunning, executive media editor; and Alexandra Garrett, editorial assistant. Finally, we especially appreciate the marketing and sales efforts of John Britch, executive marketing manager, and the entire sales force.
CHAPTER 1
The Genetics Revolution
Learning Outcomes After completing this chapter, you will be able to • Describe the way in which modern genetics developed. • List the main cellular constituents involved in gene expression and action. • Give some examples of how genetics has influenced modern medicine, agriculture, and evolution.
DNA (deoxyribonucleic acid) is the molecule that encodes genetic information. The strings of four different chemical bases in DNA store genetic information in much the same way that strings of 0’s and 1’s store information in computer code. [ Sergey Nivens/Shutterstock.]
Outline
1.1 The birth of genetics
1.2 After cracking the code
1.3 Genetics today
Genetics is a form of information science. Geneticists seek to understand the rules that govern the transmission of genetic information at three levels: from parent to offspring within families, from DNA to gene action within and between cells, and over many generations within populations of organisms. These three foci of genetics are known as transmission genetics, molecular-developmental genetics, and population-evolutionary genetics. The three parts of this text examine these three foci of genetics.
The science of genetics was born just over 100 years ago. Since that time, genetics has profoundly changed our understanding of life, from the level of the individual cell to that of a population of organisms evolving over millions of years. In 1900, William Bateson, a prominent British biologist, wrote presciently that an “exact determination of the laws of heredity will probably work more change in man’s outlook on the world, and in his power over nature, than any other advance in natural knowledge that can be foreseen.” Throughout this text, you will see the realization of Bateson’s prediction. Genetics has driven a revolution in both the biological sciences and society in general.
In this first chapter, we will look back briefly at the history of genetics, and in doing so, we will review some of the basic concepts of genetics that were discovered over the last 100 years. After that, we will look at a few examples of how genetic analysis is being applied to critical problems in biology, agriculture, and human health today. You will see how contemporary research in genetics integrates concepts discovered decades ago with recent technological advances. You will see that genetics today is a dynamic field of investigation in which new discoveries are continually advancing our understanding of the biological world.
Like begets like
FIGURE 1-1 Family groups in the gray wolf show familial resemblances for coat colors and patterning. [(Top) altrendo nature/Getty Images; (bottom) Bev McConnell/Getty Images.]
1.1 The Birth of Genetics Throughout recorded history, people around the world have understood that “like begets like.” Children resemble their parents, the seed from a tree bearing flavorful fruit will in turn grow into a tree laden with flavorful fruit, and even members of wolf packs show familial resemblances (Figure 1-1). Although people were confident in these observations, they were left to wonder as to the underlying mechanism. The Native American Hopi tribe of the Southwestern United States understood that if they planted a red kernel of maize in their fields, it would grow into a plant that also gave red kernels. The same was true for blue, white, or yellow kernels. So they thought of the kernel as a message to the gods in the Earth about the type of maize the Hopi farmers hoped to harvest. Upon receiving this message, the gods would faithfully return them a plant that produced kernels of the desired color. In the 1800s in Europe, horticulturalists, animal breeders, and biologists also sought to explain the resemblance between parents and offspring. A commonly held view at that time was the blending theory of inheritance, or the belief that inheritance worked like the mixing of fluids such as paints. Red and white paints, when mixed, give pink; and so a child of one tall parent and one short parent could be expected to grow to a middling height. While blending theory seemed to work at times, it was also clear that there were exceptions, such as tall children born to parents of average height. Blending theory also provided no mechanism by which the “heredity fluids” it imagined, once mixed, could be separated—the red and white paints cannot be reconstituted from the pink. Thus, the long-term expectation of blending theory over many generations of intermating among individuals is that all members of the population will come to express the same average value of a trait. Clearly, this is not how nature works. Human populations have people with a range of
heights, from short to tall, and we have not all narrowed in on a single average height despite the many generations that human populations have dwelled on Earth.
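The long-term prediction of blending theory can be checked numerically. The following Python sketch is an illustration only and is not part of the historical record. It assumes a starting population whose heights are normally distributed (the mean of 170 and standard deviation of 10 are hypothetical values), random mating, and offspring that are exactly the average of their two parents. Under those assumptions, the variation in height is roughly halved in each generation.

```python
import random
import statistics


def blend_generation(population, rng):
    """Build the next generation: each offspring is the average of two
    parents drawn at random (with replacement) from the current one."""
    return [
        (rng.choice(population) + rng.choice(population)) / 2
        for _ in range(len(population))
    ]


def simulate_blending(n_individuals=2000, generations=10, seed=1):
    """Return the population variance of height in each generation.

    The starting mean (170) and standard deviation (10) are
    hypothetical values chosen only for illustration.
    """
    rng = random.Random(seed)
    population = [rng.gauss(170.0, 10.0) for _ in range(n_individuals)]
    variances = [statistics.pvariance(population)]
    for _ in range(generations):
        population = blend_generation(population, rng)
        variances.append(statistics.pvariance(population))
    return variances


variances = simulate_blending()
for generation, variance in enumerate(variances):
    print(f"generation {generation}: variance = {variance:.3f}")
```

With these assumed settings, the variance after ten generations is a tiny fraction of the starting variance. This is the uniformity that blending theory predicts and that real populations do not show.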
Gregor Mendel: A monk in the garden
While the merits and failings of blending theory were being debated, Gregor Mendel, an Austrian monk, was working to understand the rules that govern the transmission of traits from parent to offspring after hybridization among different varieties of pea plants (Figure 1-2). The setting for his work was the monastery garden in the town of Brünn, Austria (Brno, Czech Republic, today). From 1856 to 1863, Mendel cross-pollinated or intermated different varieties of the pea plant. One of his experiments involved crossing a pea variety with purple flowers to one with white flowers (Figure 1-3). Mendel recorded that the first hybrid generation of offspring from this cross all had purple flowers, just like one of the parents. There was no blending. Then, Mendel self-pollinated the first-generation hybrid plants and grew a second generation of offspring. Among the progeny, he saw plants with purple flowers as well as plants with white flowers. Of the 929 plants, he recorded 705 with purple flowers and 224 with white flowers (Figure 1-4). He observed that there were roughly 3 purple-flowered plants for every 1 white-flowered plant.

Gregor Mendel
FIGURE 1-2 Gregor Mendel was an Austrian monk who discovered the laws of inheritance. [James King-Holmes/Science Source.]
One of Mendel’s experiments
FIGURE 1-3 The mating scheme for Mendel’s experiment involving the crossing of purple- and white-flowered varieties of pea plants. The purple and white circles signify the gene variants for purple vs. white flower color. Gametes carry one gene copy; the plants each carry two gene copies. The “×” signifies a cross-pollination between the purple- and white-flowered plants.
FIGURE 1-4 Excerpts from Mendel’s 1866 publication, Versuche über Pflanzen-Hybriden (Experiments on plant hybrids). [Augustinian Abbey in Old Brno, Courtesy of the Masaryk University, Mendel Museum.]

How did Mendel explain his results? Clearly, blending theory would not work since that theory predicts a uniform group of first-generation hybrid plants with light purple flowers. So Mendel proposed that the factors that control traits act like particles rather than fluids and that these particles do not blend together but are passed intact from one generation to the next. Today, Mendel’s particles are known as genes. Mendel proposed that each individual pea plant has two copies of the gene controlling flower color in each of the cells of the plant body (somatic cells). However, when the plant forms sex cells, or gametes (eggs and sperm), only one copy of the gene enters into these reproductive cells (see Figure 1-3). Then, when egg and sperm unite to start a new individual, once again there will be two copies of the flower color gene in each cell of the plant body.
Mendel had some further insights. He proposed that the gene for flower color comes in two gene variants, or alleles: one that conditions purple flowers and one that conditions white flowers. He proposed that the purple allele of the flower color gene is dominant to the white allele such that a plant with one purple allele and one white allele would have purple flowers. Only plants with two white alleles would have white flowers (see Figure 1-3).
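Mendel’s model can be written out as a short calculation. The Python sketch below is an illustration under stated assumptions: the allele symbols P (purple, dominant) and p (white, recessive) and the function names are ours and are not notation taken from Mendel’s paper. It enumerates the equally likely unions of gametes in each cross and then uses a chi-square statistic to compare Mendel’s counts of 705 purple and 224 white with a 3 : 1 expectation.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Enumerate the equally likely offspring genotypes of a cross.

    Each parent is a two-letter genotype; each gamete carries one of
    the two alleles. "P" and "p" are hypothetical allele symbols.
    """
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))


def phenotype(genotype, dominant="P"):
    """One copy of the dominant allele is enough for purple flowers."""
    return "purple" if dominant in genotype else "white"


def chi_square(observed, expected_fractions):
    """Pearson chi-square statistic for observed counts against expected fractions."""
    total = sum(observed.values())
    return sum(
        (observed[k] - total * expected_fractions[k]) ** 2
        / (total * expected_fractions[k])
        for k in observed
    )


# First generation: purple (PP) crossed with white (pp).
f1 = cross("PP", "pp")

# Second generation: self-pollination of the first-generation hybrid (Pp).
f2 = cross("Pp", "Pp")
f2_phenotypes = Counter()
for genotype, count in f2.items():
    f2_phenotypes[phenotype(genotype)] += count

# Expected fractions under the model, compared with the counts in the text.
n_outcomes = sum(f2_phenotypes.values())
expected = {k: v / n_outcomes for k, v in f2_phenotypes.items()}
observed = {"purple": 705, "white": 224}
chi2 = chi_square(observed, expected)

print("F1 genotypes:", dict(f1))
print("F2 genotypes:", dict(f2))
print("F2 phenotypes:", dict(f2_phenotypes))
print(f"chi-square = {chi2:.4f}")
```

The statistic comes to about 0.39, well below 3.84, the critical value for one degree of freedom at the 5 percent level, so the observed counts are consistent with a 3 : 1 ratio.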
Mendel’s two conclusions, (1) that genes behaved like particles that do not blend together and (2) that one allele is dominant to the other, enabled him to explain the lack of blending in the first-generation hybrids and the reappearance of white-flowered plants in the second-generation hybrids with a 3 : 1 ratio of purple- to white-flowered plants. This revolutionary advance in our understanding of inheritance will be fully discussed in Chapter 2.
How did Mendel get it right when so many others before him were wrong? Mendel chose a good organism and good traits to study. The traits he studied were all controlled by single genes. Traits that are controlled by several genes, as many traits are, would not have allowed him to discover the laws of inheritance so easily. Mendel was also a careful observer, and he kept detailed records of each of his experiments. Finally, Mendel was a creative thinker capable of reasoning well beyond the ideas of his times.
Mendel’s particulate theory of inheritance was published in 1866 in the Proceedings of the Natural History Society of Brünn (see Figure 1-4). At that time, his work was noticed and read by some other biologists, but its implications and importance went unappreciated for over 30 years. Unlike Charles Darwin, whose discovery of the theory of evolution by natural selection made him world-renowned virtually overnight, when Mendel died in 1884, he was more or less unknown in the world of science. As biochemist Erwin Chargaff put it, “There are people who seem to be born in a vanishing cap. Mendel was one of them.”
KEY CONCEPT Gregor Mendel demonstrated that genes behave like particles and not fluids.
Mendel rediscovered
William Bateson gave genetics its name
As the legend goes, when the British biologist William Bateson (Figure 1-5) boarded a train bound for a conference in London in 1900, he had no idea how profoundly his world would change during the brief journey. Bateson carried with him a copy of Mendel’s 1866 paper on the hybridization of plant varieties. Bateson had recently learned that biologists in Germany, the Netherlands, and Austria had each independently reproduced Mendel’s 3 : 1 ratio, and they each cited Mendel’s original work. This trio had rediscovered Mendel’s laws of inheritance. Bateson needed to read Mendel’s paper. By the time he stepped off the train, Bateson had a new mission in life. He understood that the mystery of inheritance had been solved. He soon became a relentless apostle of Mendel’s laws of inheritance. A few years later, in 1905, Bateson coined the term genetics, meaning the study of inheritance. The genetics revolution had begun.
When Mendel’s laws of inheritance were rediscovered in 1900, a flood of new thinking and ideas was unleashed. Mendelism became the organizing principle for much of biology. There were many new questions to be asked about inheritance. Table 1-1 summarizes the chronology of seminal discoveries made over the following decades and the chapters of this text that cover each of these topics. Let’s look briefly at a few of the questions and their answers that transformed the biological sciences.
Where in the cell are Mendel’s genes? The answer came in 1910, when Thomas H. Morgan at Columbia University in New York demonstrated that Mendel’s genes are located on chromosomes, proving the chromosome theory of inheritance. The idea was not new. Walter Sutton, who was raised on a farm in Kansas and later served as a surgeon for the U.S. Army during World War I, had proposed the chromosome theory of inheritance in 1903. Theodor Boveri, a German biologist, independently proposed it at the same time. It was a compelling hypothesis, but there were no experimental data to support it.
This changed in 1910, when Morgan proved the chromosome theory of inheritance using Mendelian genetics and the fruit fly as his experimental organism. In Chapter 4, you will retrace Morgan’s experiments that proved genes are on chromosomes.

Can Mendelian genes explain the inheritance of continuously variable traits like human height? While 3 : 1 segregation ratios could be directly observed for simple traits like flower color, many traits show a continuous range of values in second-generation hybrids without simple ratios like 3 : 1. In 1918, Ronald Fisher, the British statistician and geneticist, resolved how Mendelian genes explained the inheritance of continuously variable traits like height in people (Figure 1-6). Fisher’s core idea
FIGURE 1-5 William Bateson, the British zoologist and evolutionist who introduced the term genetics for the study of inheritance and promoted Mendel’s work. [SPL/Science Source.]
Continuous variation for height
FIGURE 1-6 Students at the Connecticut Agriculture College in 1914 show a range of heights. Ronald Fisher proposed that continuously variable traits like human height are controlled by multiple Mendelian genes. Height labels in the figure run from 4:10 to 6:2 (feet:inches). [A. F. Blakeslee, “Corn and Men,” Journal of Heredity 5, 11, 1914, 511–518.]
6 CHAPTER 1
The Genetics Revolution
Table 1-1 Key Events in the History of Genetics
Year | Event | Chapters
1865 | Gregor Mendel showed that traits are controlled by discrete factors now known as genes. | 2, 3
1869 | Friedrich Miescher isolated DNA from the nuclei of white blood cells. | 7
1903 | Walter Sutton and Theodor Boveri hypothesized that chromosomes are the hereditary elements. | 4
1905 | William Bateson introduced the term “genetics” for the study of inheritance. | 2
1908 | G. H. Hardy and Wilhelm Weinberg proposed the Hardy–Weinberg law, the foundation for population genetics. | 18
1910 | Thomas H. Morgan demonstrated that genes are located on chromosomes. | 4
1913 | Alfred Sturtevant made a genetic linkage map of the Drosophila X chromosome, the first genetic map. | 4
1918 | Ronald Fisher proposed that multiple Mendelian factors can explain continuous variation for traits, founding the field of quantitative genetics. | 19
1931 | Harriet Creighton and Barbara McClintock showed that crossing over is the cause of recombination. | 4, 16
1941 | Edward Tatum and George Beadle proposed the one-gene–one-polypeptide hypothesis. | 6
1944 | Oswald Avery, Colin MacLeod, and Maclyn McCarty provided compelling evidence that DNA is the genetic material in bacterial cells. | 7
1946 | Joshua Lederberg and Edward Tatum discovered bacterial conjugation. | 5
1948 | Barbara McClintock discovered mobile elements (transposons) that move from one place to another in the genome. | 15
1950 | Erwin Chargaff showed DNA composition follows some simple rules for the relative amounts of A, C, G, and T. | 7
1952 | Alfred Hershey and Martha Chase proved that DNA is the molecule that encodes genetic information. | 7
1953 | James Watson and Francis Crick determined that DNA forms a double helix. | 7
1958 | Matthew Meselson and Franklin Stahl demonstrated the semiconservative nature of DNA replication. | 7
1958 | Jérôme Lejeune discovered that Down syndrome resulted from an extra copy of the 21st chromosome. | 17
1961 | François Jacob and Jacques Monod proposed that enzyme levels in cells are controlled by feedback mechanisms. | 11
1961–1967 | Marshall Nirenberg, Har Gobind Khorana, Sydney Brenner, and Francis Crick “cracked” the genetic code. | 9
1968 | Motoo Kimura proposed the neutral theory of molecular evolution. | 18, 20
1977 | Fred Sanger, Walter Gilbert, and Allan Maxam invented methods for determining the nucleotide sequences of DNA molecules. | 10
1980 | Christiane Nüsslein-Volhard and Eric F. Wieschaus defined the complex of genes that regulate body plan development in Drosophila. | 13
1989 | Francis Collins and Lap-Chee Tsui discovered the gene causing cystic fibrosis. | 4, 10
1993 | Victor Ambros and colleagues described the first microRNA. | 13
1995 | First genome sequence of a living organism (Haemophilus influenzae) published. | 14
1996 | First genome sequence of a eukaryote (Saccharomyces cerevisiae) published. | 14
1998 | First genome sequence of an animal (Caenorhabditis elegans) published. | 14
2000 | First genome sequence of a plant (Arabidopsis thaliana) published. | 14
2001 | The sequence of the human genome first published. | 14
2006 | Andrew Fire and Craig Mello win the Nobel Prize for their discovery of gene silencing by double-stranded RNA. | 8
2012 | John Gurdon and Shinya Yamanaka win the Nobel Prize for their discovery that just four regulatory genes can convert adult cells into stem cells. | 8, 12
1.1 The Birth of Genetics 7
The one-gene–one-enzyme model
[Pathway diagram: Substrate → Ornithine (Enzyme A, encoded by Gene A) → Citrulline (Enzyme B, encoded by Gene B) → Arginine (Enzyme C, encoded by Gene C).]
FIGURE 1-7 The one-gene–one-enzyme model proposed that genes encode enzymes that carry out biochemical functions within cells. Tatum and Beadle proposed this model based on the study of the synthesis of arginine (an amino acid) in the bread mold Neurospora crassa.
was that continuous traits are each controlled by multiple Mendelian genes. Fisher’s insight is known as the multifactorial hypothesis. In Chapter 19, we will dissect the mathematical model and experimental evidence for Fisher’s hypothesis.

How do genes function inside cells in a way that enables them to control different states for a trait like flower color? In 1941, Edward Tatum and George Beadle proposed that genes encode enzymes. Using bread mold (Neurospora crassa) as their experimental organism, they demonstrated that genes encode the enzymes that perform metabolic functions within cells (Figure 1-7). In the case of the pea plant, there is a gene that encodes an enzyme required to make the purple pigment in the cells of a flower. Tatum and Beadle’s breakthrough became known as the one-gene–one-enzyme hypothesis. You’ll see how they developed this hypothesis in Chapter 6.

What is the physical nature of the gene? Are genes composed of protein, nucleic acid, or some other substance? In 1944, Oswald Avery, Colin MacLeod, and Maclyn McCarty offered the first compelling experimental evidence that genes are made of deoxyribonucleic acid (DNA). They showed that DNA extracted from a virulent strain of bacteria carried the necessary genetic information to transform a nonvirulent strain into a virulent one. You’ll learn exactly how they demonstrated this in Chapter 7.

How can DNA molecules store information? In the 1950s, there was something of a race among several groups of geneticists and chemists to answer this question. In 1953, James Watson and Francis Crick, working at Cambridge University in England, won that race. They determined that the molecular structure of DNA was in the form of a double helix: two strands of DNA wound side-by-side in a spiral. Their structure of the double helix is like a twisted ladder (Figure 1-8). The sides of the ladder are made of sugar and phosphate groups.
The rungs of the ladder are made of four bases: adenine (A), thymine (T), guanine (G), and cytosine (C). The bases face the center, and each base is hydrogen bonded to the base facing it in the opposite strand. Adenine in one strand is always paired with thymine in the other by two hydrogen bonds, whereas guanine is always paired with cytosine by three hydrogen bonds. The bonding specificity is based on the complementary shapes and charges of the bases. The sequence of A, T, G, and C represents the coded information carried by the DNA molecule. You will learn in Chapter 7 how this was all worked out.

How are genes regulated? Cells need mechanisms to turn genes on or off in specific cell and tissue types and at specific times during development. In 1961, François Jacob and Jacques Monod made a conceptual breakthrough on this question. Working on the genes necessary to metabolize the sugar lactose in the bacterium Escherichia coli, they demonstrated that genes have regulatory elements that regulate gene expression, that is, whether a gene is turned on or off (Figure 1-9). The regulatory elements are specific DNA sequences to which a regulatory protein binds and acts as either an activator or repressor of the expression of the gene. In Chapter 11, you will explore the logic behind the experiments of Jacob and Monod with E. coli, and in Chapter 12, you will explore the details of gene regulation in eukaryotes.
The structure of DNA
FIGURE 1-8 (a) The double-helical structure of DNA, showing the sugar–phosphate backbone in blue and paired bases in brown. (b) A flattened representation of DNA showing how A always pairs with T and G with C. Each row of dots between the bases represents a hydrogen bond.
How is the information stored in DNA decoded to synthesize proteins? While the discovery of the double-helical structure of DNA was a watershed for biology, many details were still unknown. Precisely how information was encoded into DNA and how it was decoded to form the enzymes that Tatum and Beadle had shown to be the workhorses of gene action remained unknown. Over the years 1961 through 1967, teams of molecular geneticists and chemists working in several countries answered these questions when they “cracked the genetic code.” What this means is that they deduced how a string of DNA nucleotides, each with one of four different bases (A, T, C, or G), encodes the set of 20 different amino acids that are the building blocks of proteins. They also discovered that there is a messenger molecule made of ribonucleic acid (RNA) that carries information in the DNA in the nucleus to the cytoplasm where proteins are synthesized. By 1967, the basic flowchart for information transmission in cells was known. This flowchart is called the central dogma of molecular biology.

KEY CONCEPT The rediscovery of Mendel’s laws launched a new era in which geneticists resolved many fundamental questions about the nature of the gene and the flow of genetic information within cells. During this era, geneticists learned that genes reside on chromosomes and are made of DNA. Genes encode proteins that conduct the basic enzymatic work within cells.
Genes have regulatory and coding regions
[Diagram labels: regulatory protein; regulatory element (GGGCCC); RNA polymerase complex; site where the RNA polymerase complex binds; direction of transcription; protein coding sequence.]
FIGURE 1-9 The structure of a protein-coding gene showing a regulatory DNA element (GGGCCC) to which a regulatory protein binds, the promoter region where the RNA polymerase complex binds to initiate transcription, and a protein-coding region.
The central dogma of molecular biology
In 1958, Francis Crick introduced the phrase “central dogma” to represent the flow of genetic information within cells from DNA to RNA to protein, and he drew a simple diagram to summarize these relationships (Figure 1-10a). Curiously, Crick chose the word dogma thinking that it meant “hypothesis,” which was his intention, unaware that its actual meaning is “a belief that is to be accepted without doubt.” Despite this awkward beginning, the phrase had an undeniable power and it has survived.

Figure 1-10b captures much of what was learned about the biochemistry of inheritance from 1905 until 1967. Let’s review the wealth of knowledge that this simple figure captures. At the left, you see DNA and a circular arrow representing DNA replication, the process by which a copy of the DNA is produced. This process enables each of the two daughter cells that result from cell division to have a
Information transfer among biological molecules
FIGURE 1-10 (a) One version of Francis Crick’s sketch of the central dogma, showing information flow between biological molecules. The circular arrow represents DNA replication, the central straight arrow represents the transcription of DNA into RNA, and the right arrow the translation of RNA into protein. (b) More detailed sketch showing how the two strands of the DNA double helix are independently replicated, how the two strands are disassociated for transcription, and how the messenger RNA (mRNA) is translated into protein at the ribosome.
complete copy of all the DNA in the parent cell. In Chapter 7, you will explore the details of the structure of DNA and its replication. Another arrow connects DNA to RNA, symbolizing how the sequence of base pairs in a gene (DNA) is copied to an RNA molecule. The process of RNA synthesis from a DNA template is called transcription. One class of RNA molecules made by transcription is messenger RNA, or mRNA for short. mRNA is the template for protein synthesis. In Chapter 8, you’ll discover how transcription is accomplished. The final arrow in Figure 1-10b connects mRNA and protein. This arrow symbolizes protein synthesis, or the translation of the information in the specific sequence of bases in the mRNA into the sequence of amino acids that compose a protein. Proteins are the workhorses of cells, comprising enzymes, structural components of the cell, and molecules for cell signaling. The process of translation takes place at the ribosomes in the cytoplasm of each cell. In Chapter 9, you will learn how the genetic code is written in three-letter words called codons. A codon is a set of three consecutive nucleotides in the mRNA that specifies an amino acid in a protein. CGC specifies the amino acid arginine, AGC specifies serine, and so forth. Since Crick proposed the central dogma, additional pathways of genetic information flow have been discovered. We now know that there are classes of RNA that do not code for proteins, instances in which mRNA is edited after transcription, and cases in which the information in RNA is copied back to DNA (see Chapters 8, 9, and 15).
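The flow from DNA to mRNA to protein described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DNA sequence is hypothetical, the codon table is deliberately partial (it lists just the codons used here, taken from the standard genetic code), and transcription is simplified to rewriting the coding strand with U in place of T rather than modeling synthesis from the template strand.

```python
# Partial codon table (standard genetic code); only the codons used below are listed.
CODON_TABLE = {
    "AUG": "Met",   # methionine, also the usual start codon
    "CGC": "Arg",   # arginine, as in the text
    "AGC": "Ser",   # serine, as in the text
    "UAA": "Stop",
    "UAG": "Stop",
    "UGA": "Stop",
}

def transcribe(coding_strand: str) -> str:
    """Return the mRNA that matches the DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")

def translate(mrna: str) -> list[str]:
    """Read the mRNA one codon (three bases) at a time until a stop codon or the end."""
    protein = []
    usable_length = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    for i in range(0, usable_length, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons not in this partial table
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

dna = "ATGCGCAGCTAA"  # hypothetical coding-strand sequence
mrna = transcribe(dna)
print(mrna)             # AUGCGCAGCUAA
print(translate(mrna))  # ['Met', 'Arg', 'Ser']
```

A complete translator would need all 64 codons; the partial table keeps the example short and limited to assignments stated in the text or in the standard code.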
1.2 After Cracking the Code
With the basic laws of inheritance largely worked out by the end of the 1960s, a new era of applying genetic analysis to a broad spectrum of biological questions flourished. To this end, much effort has been and continues to be invested in developing the resources and tools to address these questions. Geneticists focused their research on a small number of species known as “model organisms” that are well suited for genetic analysis. They also developed an impressive array of tools for manipulating and analyzing DNA.
Model organisms
Geneticists make special use of a small set of model organisms for genetic analysis. A model organism is a species used in experimental biology with the presumption that what is learned from the analysis of that species will hold true for other species, especially other closely related species. The philosophy underlying the use of model organisms in biology was wryly expressed by Jacques Monod: “Anything found to be true of E. coli must also be true of elephants.”1

As genetics matured and focused on model organisms, Mendel’s pea plants fell by the wayside, but Morgan’s fruit flies rose to prominence to become one of the most important model organisms for genetic research. New species were added to the list. An inconspicuous little plant that grows as a weed, called Arabidopsis thaliana, became the model plant species, and a minute roundworm called Caenorhabditis elegans that lives in compost heaps became a star of genetic analysis in developmental biology (Figure 1-11).

What features make a species suitable as a model organism? (1) Small organisms that are easy and inexpensive to maintain are very convenient for research. So fruit flies are good, blue whales not so good. (2) A short generation time is imperative because geneticists, like Mendel, need to cross different strains and then study their

1 F. Jacob and J. Monod, Cold Spring Harbor Quant. Symp. Biol. 26, 1963, 393.
Model organisms are dispersed across the tree of life
[Tree labels: Eukaryotes: fruit fly (Drosophila melanogaster), nematode (Caenorhabditis elegans), mouse (Mus musculus), yeast (Saccharomyces cerevisiae), mouse-eared cress (Arabidopsis thaliana). Archaea. Eubacteria: Mycoplasma genitalium, Bacillus subtilis, Helicobacter pylori, E. coli.]
FIGURE 1-11 The tree shows evolutionary relationships among the major groups of organisms: Bacteria, Archaea, and Eukaryota (plants, fungi, and animals). [(Clockwise, from top, center) Sinclair Stammers/Science Source; SciMAT/Science Source; Darwin Dale/Science Source; Biophoto Associates/Science Photo Library; Imagebroker.net/SuperStock; © blickwinkel/Alamy.]
first- and second-generation hybrids. The shorter the generation time, the sooner the experiments can be completed. (3) A small genome is useful. As you will learn in Chapter 15, some species have large genomes and others small genomes in terms of the total number of DNA base pairs. Much of the extra size of large genome species is composed of repetitive DNA elements between the genes. If a geneticist is looking for genes, these can be more easily found in organisms with smaller genomes and fewer repetitive elements. (4) Organisms that are easy to cross or mate and that produce large numbers of offspring are best.

As you read this textbook, you will encounter certain organisms over and over. Organisms such as Escherichia coli (a bacterium), Saccharomyces cerevisiae (baker’s yeast), Caenorhabditis elegans (nematode or roundworm), Drosophila melanogaster (fruit fly), and Mus musculus (mouse) have been used repeatedly in experiments and revealed much of what we know about how inheritance works. Model organisms can be found on diverse branches of the tree of life (see Figure 1-11), representing bacteria, fungi, algae, plants, and invertebrate and vertebrate animals.
This diversity enables each geneticist to use a model best suited to a particular question. Each model organism has a community of scientists working on it who share information and resources, thereby facilitating each other’s research.

Mendel’s experiments were possible because he had several different varieties of pea plants, each of which carried a different genetic variant for traits such as purple versus white flowers, green versus yellow seeds, or tall versus dwarf stems. For each of the model species, geneticists have assembled large numbers of varieties (also called strains or stocks) with special genetic characters that make them useful in research. There are strains of fruit flies that have trait variants such as red versus white eyes. There are strains of mice that are prone to develop specific forms of cancer or other disease conditions such as diabetes. For baker’s yeast, there is a collection of nearly 5000 deletion stocks, each of these having just one gene deleted from the genome. These stocks enable geneticists to study the function of each gene by examining how yeast is affected when the gene is removed. Since baker’s yeast has about 6000 total genes, this collection of 5000 deletion stocks covers most of the genes in the genome.

The different strains of each model organism are available to researchers through stock centers that maintain and distribute the strains. Lists of available stocks are on the Internet (see Appendix B). To view an example for mouse stocks, go to the link http://jaxmice.jax.org/. Then, click the “Find JAX mice” button at the top of the page. Next, enter the word “black” in the search field and click the Search button. Now, click the “C57BL/6J” link. You will see an image and information on a commonly used C57-Black mouse strain. Other search terms such as “albino” or “obese” will link you with strains with other features.
KEY CONCEPT Most genetic studies are performed on one of a limited number of model organisms that have features that make them especially suited for genetic analysis.
Tools for genetic analysis
Geneticists and biochemists have also created an incredible array of tools for characterizing and manipulating DNA, RNA, and proteins. Many of these tools are described in Chapter 10 or in other chapters relevant to a specific tool. There are a few themes to mention here.

First, geneticists have harnessed the cell’s own machinery for copying, pasting, cutting, and transcribing DNA, enabling researchers to perform these reactions inside test tubes. The enzymes that perform each of these functions in living cells have been purified and are available to researchers: DNA polymerases can make a copy of a single DNA strand by synthesizing a matching strand with the complementary sequence of A’s, C’s, G’s, and T’s. Nucleases can cut DNA molecules in specific locations or degrade an entire DNA molecule into single nucleotides. Ligases can join two DNA molecules together end-to-end. Using DNA polymerase or other enzymes, DNA can also be “labeled” or “tagged” with a fluorescent dye or radioactive element so that the DNA can be detected using a fluorescence or radiation detector.

Second, geneticists have developed methods to clone DNA and the genes it encodes. Here, cloning refers to making many copies (clones) of a DNA molecule. The common way of doing this involves isolating a relatively small DNA molecule (up to a few thousand base pairs in length) from an organism of interest. The DNA molecule might be an entire gene or a portion of a gene. The molecule is inserted into a host organism (often E. coli) where it is replicated many times by the host’s DNA polymerase. Having many copies of a gene is important for a vast array of experiments used to characterize and manipulate it.

Third, geneticists have developed methods to insert foreign DNA molecules into the genomes of many species, including those of all the model organisms.
This process is called transformation, and it is possible, for instance, to transform genes from one species into the genome of another. The recipient species then becomes a genetically modified organism (GMO). Figure 1-12 shows a tobacco plant in which a gene from the firefly was inserted, enabling the tobacco plant to emit light or glow in the dark. Fourth, geneticists have developed a large set of methods based on hybridizing DNA molecules to one another (or to RNA molecules). The two complementary strands of DNA in the double helix are bound together by hydrogen bonds, either G ≡ C or A = T. These bonds can be broken by heat (denatured) in an aqueous solution to give two single-stranded DNA molecules (Figure 1-13a). When the solution is cooled under controlled conditions, DNA molecules with complementary strands will preferentially hybridize with one another. DNA hybridization methods have enabled many discoveries. For example, the cloned DNA of a gene can be tagged with a fluorescent dye and then hybridized to chromosomes fixed on a microscope slide, revealing the chromosome on which the gene is located (Figure 1-13b). Fifth, geneticists and biochemists have developed multiple methods for determining the exact sequence of all the A’s, C’s, G’s, and T’s in the genomes, chromosomes, or genes of an organism. The process used to decipher the exact sequence of A’s, C’s, G’s, and T’s in a DNA molecule is called DNA sequencing, and it has allowed geneticists to read the language of life. Finally, over the last 20 years, researchers have created molecular and mathematical tools for analyzing the entire genome of an organism in a single experiment. These efforts gave birth to the field of genomics—the study of the structure and function of entire genomes (see Chapter 14). 
Genomic tools have enabled geneticists to assemble mind-boggling amounts of information on model organisms, including the complete DNA sequence of their genome, lists of all their genes, catalogs of variants in these genes, data on the cell and tissue types in which each gene is expressed, and much more. To get an idea of what is available, try browsing FlyBase (http://flybase.org/), the genomic data site for the fruit fly (see also Appendix B).
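The hybridization principle described above, in which a strand pairs with its complement through A = T and G ≡ C hydrogen bonds, can be sketched in Python. This is a minimal illustration, not a model of real annealing conditions: it assumes perfect complementarity, writes every strand 5′ to 3′, and uses a hypothetical probe sequence.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# A = T pairs are held by two hydrogen bonds, G ≡ C pairs by three.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def can_hybridize(strand_a: str, strand_b: str) -> bool:
    """True if the two 5'->3' strands are perfectly complementary (antiparallel)."""
    return strand_b == reverse_complement(strand_a)

def hydrogen_bond_count(strand: str) -> int:
    """Number of hydrogen bonds holding this strand to its perfect complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand)

probe = "ATGCGC"  # hypothetical probe sequence
target = reverse_complement(probe)
print(target)                        # GCGCAT
print(can_hybridize(probe, target))  # True
print(hydrogen_bond_count(probe))    # 16
```

Counting bonds this way hints at why G ≡ C-rich duplexes tend to need more heat to denature, although real melting behavior depends on more than the bond count.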
Genetically modified tobacco
FIGURE 1-12 This genetically modified tobacco plant has a gene from the firefly inserted into its genome, giving it the capability to emit light. [D. W. Ow et al., “Transient and Stable Expression of the Firefly Luciferase Gene in Plant Cells and Transgenic Plants,” Science 234, 4778, 1986, 856–859.]
KEY CONCEPT Progress in genetics has both produced and been catalyzed by the development of molecular and mathematical tools for the analysis of single genes and whole genomes.
Strands of nucleic acids hybridize to complementary sequences
[Panel (a) labels: heat (denature); cool (anneal).]
FIGURE 1-13 (a) The two strands of the DNA double helix can be dissociated by heat in aqueous solutions. Upon cooling under controlled conditions, strands reassociate, or hybridize, with their complement. (b) A cloned copy of the human BAPX1 gene was tagged with a green fluorescent dye. The fluorescent-tagged DNA was then denatured and allowed to hybridize to the chromosomes in a single cell. The fluorescent-tagged clone hybridized to the location on chromosome 4 (green fluorescent regions) where the gene is located. [(b) C. Tribioli and T. Lufkin, “Molecular cloning, chromosomal mapping and developmental expression of BAPX1, a novel human homeobox-containing gene homologous to Drosophila bagpipe,” Gene, 203, 2, 1997, 225–233, Fig. 6, © Elsevier.]
1.3 Genetics Today
In an interview in 2008, Princeton University geneticist Leonid Kruglyak remarked, “You have this clear, tangible phenomenon in which children resemble their parents. Despite what students get told in elementary-school science, we just don’t know how that works.” Although Kruglyak’s remark might seem disparaging of the progress made in the understanding of inheritance over the last 100 years, this was certainly not his intention. Rather, his remark highlights that despite the paradigm-shifting discoveries of the nineteenth and twentieth centuries, enigmas abound in genetics and the need for new thinking and new technologies has never been greater. Mendel, Morgan, Fisher, Watson, Crick, and many others (see Table 1-1) delimited the foundation of the laws of inheritance, but the details that rest atop that foundation remain obscure in many ways. The six feet of DNA in the single cell of a human zygote encodes the information needed to transform that cell into an adult, but exactly how this works is understood only in the sparsest details.

In this section, we will review four recent advances in genetics, discoveries of enough importance and general interest that they were featured in the popular press. Reading about these discoveries will both reveal the power of genetics to answer critical questions about life and highlight how this knowledge can be applied to addressing problems in society. This textbook and the course of study in which you are engaged should convey a dual message: the science of genetics has profoundly changed our understanding of life, but it is also a youthful field in the midst of a dynamic phase of its development.
From classical genetics to medical genomics
Meet patient VI-1 (Figure 1-14a). Her name is Louise Benge, and as a young woman, she developed a crippling illness. Starting in her early 20s, she began to experience excruciating pain in her legs after walking as little as a city block. At first, she ignored the pain, then spoke with her primary care physician, and later visited a long line of specialists. She was given a battery of tests and X rays, and these revealed the problem: her arteries from her aorta on down to her legs were calcified, clogged with calcium phosphate deposits (Figure 1-14b). It was a disease for which her doctors had no name and no therapy. She had a disease, but not a diagnosis.

Louise Benge has an undiagnosed disease
FIGURE 1-14 (a) Louise Benge developed an undiagnosed disease as a young woman. (b) An X ray revealed that Louise Benge’s disease condition caused calcification of the arteries in her legs. [(a) Jeannine Mjoseth, NHGRI/www.genome.gov; (b) National Human Genome Research Institute (NHGRI).]

There was only one thing left to do; her primary care physician referred Benge to the Undiagnosed Diseases Program (UDP) at the National Institutes of Health in Bethesda, Maryland. The UDP is a group of MDs and scientists that has connections with specialists throughout the National Institutes of Health in every imaginable field of medicine. This is the team that is asked to tackle the most challenging cases. Working with Benge, the UDP team subjected her to nearly every test in their arsenal, and soon they found the underlying defect that caused her disease. Benge had a very low level of an enzyme called CD73. This enzyme is involved in signaling between cells, and specifically it sends a signal that blocks calcification. Now the UDP doctors could give Benge a diagnosis. They named her disease “arterial calcification due to deficiency of CD73,” or ACDC.

What intrigued the UDP team about Benge’s case was that she was not alone in having this disease. Benge had two brothers and two sisters, and all of them had arterial calcification. Remarkably, however, Benge’s parents were unaffected. Moreover, Benge and her siblings all had children and none of these children had arterial calcification. This pattern of inheritance suggested that the underlying cause might be genetic. Specifically, it suggested that Benge and all of her siblings inherited two defective copies of either CD73 or a gene that influences CD73 expression, one from their mother and one from their father. A person with one good copy and one defective copy can be normal, but if both of a person’s copies are defective, then they lack the function that the gene provides. The situation is just like Mendel’s white-flowered pea plants. Since the functional allele is dominant to the dysfunctional allele, ACDC, like white flowers, only appears if an individual carries two defective alleles.

Tracing a disease gene through a family tree
FIGURE 1-15 Family tree or pedigree showing the inheritance of the mutant gene causing arterial calcification due to deficiency of CD73 (ACDC). Squares are males, and circles are females. Horizontal lines connecting a male and female are matings. Vertical lines connect a mating pair to its offspring. Roman numerals designate generations; Arabic numerals designate individuals within generations. Half-filled squares or circles indicate an individual carrying one copy of the mutant gene. Filled squares or circles indicate an individual with two copies of the mutant gene and who have the ACDC disease. Either individual I-1 or I-2 must have carried the mutant gene, but which one carried it is uncertain as indicated by the “?”. Blue arrow indicates Louise Benge. Red arrows show the path of the mutant gene through the generations. [Data from C. St. Hilaire et al., New England Journal of Medicine 364, 2011, 432–442.]

The UDP team delved further into Benge’s family history and learned that Benge’s parents were third cousins (Figure 1-15). This revelation fit well with the idea that the cause was a defective gene. When a husband and wife are close relatives such as third cousins, there is an increased chance that they will both have inherited the same version of a defective gene from their common ancestor and that they will both pass on this defective gene to their children.
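The recessive pattern described above, in which two unaffected carrier parents can have affected children, can be worked through with a small Punnett-square sketch in Python. The allele symbols are assumptions chosen for illustration: "A" stands for the functional allele and "a" for the defective one. The sketch also assumes simple Mendelian segregation and that each child inherits independently of its siblings.

```python
from fractions import Fraction
from itertools import product

def offspring_genotypes(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Punnett square: each parent passes on one of its two alleles with equal chance."""
    outcomes: dict[str, Fraction] = {}
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted(allele1 + allele2))  # "aA" and "Aa" are the same genotype
        outcomes[genotype] = outcomes.get(genotype, Fraction(0)) + Fraction(1, 4)
    return outcomes

def is_affected(genotype: str) -> bool:
    """Recessive disorder: affected only when both copies are the defective allele."""
    return genotype == "aa"

# Both parents are unaffected carriers (one functional copy, one defective copy).
cross = offspring_genotypes("Aa", "Aa")
p_affected = sum(p for genotype, p in cross.items() if is_affected(genotype))

print(cross)            # AA: 1/4, Aa: 1/2, aa: 1/4
print(p_affected)       # 1/4
print(p_affected ** 5)  # 1/1024, the chance that five of five children are affected
```

Under these assumptions each child of two carriers has a one-in-four chance of being affected, so a sibship in which all five children are affected is expected only rarely.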
Children with one copy of a defective gene are often normal, but a child who inherits a defective copy from both parents is likely to have a genetic disorder. In Figure 1-15, we can see how this works. Benge’s mother and father (indiIntroduction to Genetic Analysis, 11e viduals V-1 and V-2 in the figure) have the same great-great-grandparents (I-1 and Figure 01.15 #129 I-2). If one of these great-great-grandparents had a mutant gene for CD73, then it 04/01/14 could have been passed down over the generations to both 05/01/14 Benge’s mother and 05/12/14 father (follow the red arrows). After that, if Benge received the mutant copy from Dragonfly Media Group both her mother and her father, then both of her copies would be defective. Each of Benge’s siblings would also need to have inherited two mutant copies from their parents to explain the fact that they have ACDC. The chance of all of this happening is very small. If both of Benge’s parents had one mutant copy, then the chance that Benge and all four of her siblings would receive a mutant copy from
BOX 1-1
Single Nucleotide Polymorphisms
Genetic variation is any difference between two copies of the same gene or DNA molecule. The simplest form of genetic variation one might observe at a single nucleotide site is a difference in the nucleotide base present, whether adenine, cytosine, guanine, or thymine. These types of variants are called single nucleotide polymorphisms (SNPs), and they are the most common type of variation in most, if not all, organisms. The figure shows two copies of a DNA molecule from the same region of a chromosome. Notice that the bases are the same in the two molecules except where one molecule has a CG pair and the other a TA pair. If we read strand 1 of the two molecules, then the top molecule has a “G” and the lower molecule an “A” at the SNP site.
[Figure: two copies (Copy 1 and Copy 2) of the same double-stranded DNA segment, each with Strand 1 and Strand 2 labeled. The copies are identical except at the marked SNP site.]
both parents is only 1 in 1024. In Chapter 2, you’ll learn how to calculate such probabilities.
With this hint from the family history, the UDP team now knew where to look in the genome for the mutant gene. They needed to look for a segment on one of the chromosomes for which the copy that Benge inherited from her mother is identical to the copy she inherited from her father. Moreover, each of Benge’s siblings must also have two copies of this segment identical to Benge’s. Such regions are very rare in people unless their parents are related, as in the case of Benge since her parents are third cousins. Generally, a segment of a chromosome that is just a few hundred base pairs long will have several differences in the sequence of A’s, C’s, G’s, and T’s between the copy we inherited from our mother and the one we inherited from our father. These differences are known as single nucleotide polymorphisms, or SNPs for short (see Box 1-1).
The UDP team used a new genomic technology, called a DNA microarray (see Chapter 18), that allowed them to study one million base-pair positions across the genome. At each of these base-pair positions along the chromosomes, the team could see where Benge’s two chromosomal segments were identical, and whether all of Benge’s siblings also carried two identical copies in this segment. For Benge, a portion of only 1/512 of her genome is expected to have two identical copies, and the chance that all four of her siblings will also have the same two identical copies is far smaller.
Looking over the genome-wide SNP data, the UDP team found exactly the type of chromosome segment for which they were looking. There was a small segment on one of Benge’s chromosomes for which she and her siblings all had the same two identical copies. Furthermore, they discovered that the gene that encodes the CD73 enzyme is located in this segment.
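The 1-in-1024 figure quoted above follows from simple multiplication, which can be sketched in a few lines of Python. The sketch assumes, as the text does, that each parent carries one mutant and one normal copy; it further assumes that a parent passes on either copy with probability 1/2 and that the five children inherit independently. The variable names are illustrative only.

```python
from fractions import Fraction

# Assumption: a carrier parent transmits the mutant copy with probability 1/2.
p_from_mother = Fraction(1, 2)
p_from_father = Fraction(1, 2)

# Probability that one child receives a mutant copy from both parents.
p_child_affected = p_from_mother * p_from_father

# Benge plus her four siblings: five children, assumed independent.
n_children = 5
p_all_affected = p_child_affected ** n_children

print(p_child_affected)  # 1/4
print(p_all_affected)    # 1/1024
```

Using exact fractions rather than floating-point numbers keeps the result in the same "1 in N" form that the text uses.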
This result suggested that Benge and her siblings all had two identical copies of the same defective CD73-encoding gene. The team seemed to have found the needle in a haystack for which they were looking; however, there was one last experiment to perform. The team needed to identify the specific defect in the defective CD73 gene that Benge and her siblings had inherited. After determining the DNA sequence for the CD73 gene from Benge and her siblings, the team found the defect in the gene, the “smoking gun.” The defective gene encoded only a short, or truncated, protein; it did not encode the complete sequence of amino acids. One of the DNA
codons with letters TCG that encodes the amino acid serine was mutated to TAG, which signals the truncation of the protein. The protein made from Benge’s version of the CD73 gene was truncated so it could not signal cells in the arteries to keep the calcification pathway turned off.
Louise Benge’s journey from first experiencing pain in her legs to learning that she had a new disease called ACDC was a long one. The diagnosis of her disease was a triumph made possible by the integration of classic transmission genetics and genomics. Knowing the defect underlying the disease ACDC allowed the doctors to try a medication that they would never have considered before they knew that the cause was a defective CD73 enzyme. The medication in question is called etidronate, and it can substitute for CD73 in signaling cells to keep the calcification pathway turned off. Clinical trials with etidronate are currently underway for ACDC patients and are scheduled for completion in 2017.
KEY CONCEPT Classical transmission genetics provides the foundation for modern medical genetics. The integration of classical genetics and genomic technologies can allow the causes of inherited diseases to be readily identified.
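The effect of the TCG-to-TAG change described above can be illustrated with a toy translation routine. This is a sketch only: the codon table is deliberately partial, and the short DNA sequences are invented for illustration and are not the real CD73 sequence.

```python
# Partial codon table (DNA coding-strand codons); "*" marks a stop codon.
# Only the codons needed for this illustration are included.
CODON_TABLE = {
    "ATG": "M",  # methionine (start)
    "GCT": "A",  # alanine
    "TCG": "S",  # serine
    "AAA": "K",  # lysine
    "GGC": "G",  # glycine
    "TAG": "*",  # stop
    "TAA": "*",  # stop
}

def translate(dna):
    """Translate codon by codon, ending at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical sequences, NOT the real CD73 gene.
normal = "ATGGCTTCGAAAGGCTAA"
mutant = "ATGGCTTAGAAAGGCTAA"  # third codon changed from TCG to TAG

print(translate(normal))  # MASKG
print(translate(mutant))  # MA
```

A single-letter change converts a serine codon into a stop codon, so translation ends early and the resulting protein is truncated.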
Investigating mutation and disease risk
Shortly after the rediscovery of Mendel’s work, the German physician Wilhelm Weinberg reported that there seems to be a higher incidence of short-limbed dwarfism (achondroplasia) among children born last in German families than among those born first. A few decades later, British geneticist J. B. S. Haldane observed another unusual pattern of inheritance. The genealogies of some British families suggested that new mutations for the blood-clotting disorder hemophilia tended to arise in men more frequently than in women. Taken together, these two observations suggested that the risk of an inherited disorder for a child is greater as the parents age and also that fathers are more likely than mothers to contribute new mutations to their children.
Over the ensuing decades, Weinberg’s and Haldane’s observations were supported by other studies, but the data were not conclusive. Tracing a new mutation in a child to the father versus the mother was fraught with uncertainty, and there was a scarcity of families well-suited for the study of the link between parental age and new disease mutations. These factors prevented definitive conclusions on the relationship between parental age and the occurrence of new mutations.
In 2012, advances in genomics and DNA sequencing technology (see Chapter 14) allowed new analyses proving that Weinberg’s and Haldane’s suspicions were correct and providing a very detailed picture of the origin of new mutations within families. Here is how it was accomplished. A team of geneticists in Iceland studied 78 “trios,” each a family group of a mother, a father, and their child (Figure 1-16). For some families, they had data for three generations, including a child plus its parents and at least one set of grandparents. The researchers determined the complete genome sequence of each individual with DNA isolated from their blood cells, compiling genome sequences from a total of 219 individuals.
Since each individual possesses two copies of every chromosome (i.e., two copies of the human genome), their data actually include the sequences of 438 genomes. With these genome sequences in hand, the researchers could comb through the data for new or de novo mutations: unique DNA variants that exist in a child but in neither of its parents. Their focus was on point mutations, changes of one letter in the DNA code to another that can occur during DNA replication (see Chapter 16). An example is a change of an adenine (A) to a guanine (G) (Figure 1-17). The logic of the discovery process used by the Icelandic geneticists is outlined in Figure 1-17, which shows a segment of DNA for each member of a trio. Each
Family pedigrees
FIGURE 1-16 Two pedigrees are shown: a simple trio and a three-generation family. Squares are males, and circles are females. Horizontal lines indicate a mating. Vertical lines connect a mating pair to its offspring.
Tracing the origin of a new point mutation
Mother  Copy M1  • • CAGCAGATTGCTGCTTTGTATGAG • •
Mother  Copy M2  • • CAGCTGATTGCTGCTTTGTATGAG • •
Father  Copy F1  • • CAGCTGATTGCTGCTTTGTAGGAG • •
Father  Copy F2  • • CAACTGATTGCTGCTTTGTATGAG • •
Child   Copy M1  • • CAGCAGATTGCTGCTTTGTATGAG • •
Child   Copy F2  • • CAACTGATTGCTTCTTTGTATGAG • •
individual has two copies of the segment. Notice that copy M1 in the mother has a SNP (green letter) that distinguishes it from copy M2. Similarly, there are two SNPs (purple letters) that distinguish the father’s two copies of this segment. Comparing the child to the parents, we see that the child inherited copy M1 from its mother and copy F2 from its father. Look closer at the child’s two copies of the segment, and you’ll notice something else. There is a unique variant (red letter) that occurs in the child but in neither of its parents. This is a de novo point mutation. In this case, it is a mutation from a guanine (G) to a thymine (T). We can see that the mutation arose in the father since it is on the F2 copy of the segment.
Where and exactly when did the new mutation depicted in Figure 1-17 arise? Most of our bodies are composed of somatic cells that make up everything from our brain to our blood. However, we also have a special lineage of cells called the germline that divide to produce eggs in women and sperm in men. New mutations that arise in somatic cells as they divide during the growth and development of our bodies are not passed on to our offspring. However, a new mutation that occurs in the germline can be transmitted to the offspring. The mutation depicted in Figure 1-17 arose in the germline of the father.
With the genome sequence data for the trios, the Icelandic geneticists made some pretty startling discoveries. First, among the 78 children in the study, they observed a total of 4933 new point mutations. Each child carried about 63 unique mutations that did not exist in its parents. Most of these occurred in parts of the genome where they have only a small chance to pose a health risk, but 62 of the 4933 mutations caused potentially damaging changes to the genes such that they altered the amino acid sequence of the protein encoded.
Second, among the mutations that could be assigned a parent of origin, there were on average 55 from the father for every 14 from the mother. The children were inheriting nearly four times as many new mutations from their fathers as their mothers. The Icelandic team had confirmed Haldane’s prediction made 90 years earlier. The genome sequences also allowed the team to test Weinberg’s prediction that the frequency of mutation rises with the age of the parents. For each trio, the researchers knew the ages of the mother and the father at the time of conception. When they investigated whether the frequency of mutation rises with the mother’s age when controlling for the age of the father, the team found no evidence that it did. Older mothers did not pass on more new point mutations to their offspring than younger ones. (Older mothers are known to produce more chromosomal aberrations than younger mothers, such as an extra copy of the 21st chromosome that causes Down syndrome; see Chapter 17.) Next, they examined the relationship between mutation and the age of the father when controlling for the age of the mother. Here, they found a powerful relationship. The older the father, the higher the frequency of new point mutations (Figure 1-18). In fact, for
FIGURE 1-17 A short segment of DNA from one of the chromosomes is shown. Each individual has two copies of the segment. In the mother, these are labeled M1 and M2; in the father, F1 and F2. The child inherited copy M1 from its mother and F2 from its father. The version of F2 in the child carries a new point mutation (red). Single nucleotide polymorphisms (SNPs) that distinguish the different copies are shown in green (mother) and purple (father).
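The logic of Figure 1-17 can be sketched as a small program: match each of the child's copies to the most similar parental copy, then report any base that differs from that parental copy as a candidate de novo mutation. The sequences below are transcribed from the figure as best they can be read; the function names are hypothetical, and the sketch ignores complications such as sequencing errors and recombination.

```python
def differences(reference, query):
    """Return (position, reference_base, query_base) for each mismatch (0-based)."""
    return [(i, r, q) for i, (r, q) in enumerate(zip(reference, query)) if r != q]

# Parental copies as read from Figure 1-17.
parental_copies = {
    "M1": "CAGCAGATTGCTGCTTTGTATGAG",
    "M2": "CAGCTGATTGCTGCTTTGTATGAG",
    "F1": "CAGCTGATTGCTGCTTTGTAGGAG",
    "F2": "CAACTGATTGCTGCTTTGTATGAG",
}

# The child's two copies of the same segment.
child_copies = [
    "CAGCAGATTGCTGCTTTGTATGAG",
    "CAACTGATTGCTTCTTTGTATGAG",
]

def trace_origin(child_copy):
    """Find the closest parental copy and list the bases that differ from it."""
    best = min(
        parental_copies,
        key=lambda name: len(differences(parental_copies[name], child_copy)),
    )
    return best, differences(parental_copies[best], child_copy)

results = [trace_origin(copy) for copy in child_copies]
for origin, de_novo in results:
    print(origin, de_novo)
# M1 []
# F2 [(12, 'G', 'T')]
```

The first copy matches the mother's M1 exactly. The second is closest to the father's F2 and differs from it at one position, a G-to-T change, which is the new mutation highlighted in the figure.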
The number of new point mutations increases with father’s age
FIGURE 1-18 Plot of the number of new point mutations in each child (y-axis: number of new mutations observed) by the age of the child’s father (x-axis: age of father at conception of child, in years). Each dot represents one of the 78 children studied. The diagonal line indicates the rate of increase in new mutations with the father’s age. [ Data from A. Kong et al., Nature 488, 2012, 471–475.]
each year of increase in his age, a father will pass on two additional new mutations to his children. A 20-year-old father will pass on about 25 new mutations to each of his children, but a 40-year-old father will pass on about 65 new mutations. Weinberg’s observation made 100 years earlier was confirmed.
Why does the age of the father matter, while that of the mother seems to have no effect on the frequency of new point mutations? The answer lies in the different ways by which men and women form gametes. In women, as in the females of other mammals, the process of making eggs takes place largely before a woman is born. Thus, when a woman is born she possesses in her ovaries a set of egg precursor cells that will mature into egg cells without further rounds of DNA replication. For a woman, from the point when she was conceived until the formation of the egg cells in her ovaries, there are about 24 rounds of cell division, 23 of which have a round of chromosome (DNA) replication and an opportunity for a copying error or mutation. All 23 of these rounds of chromosome replication occur before a woman is born, so there are no additional rounds after her birth and no chance for additional mutations as she ages. Thus, older mothers contribute no more new point mutations to their children than younger mothers.
Sperm production is altogether different. The cell divisions that produce sperm continue throughout a man’s life, and there are many more rounds of cell division in sperm formation than in egg formation. Sperm produced by 20-year-old men will have experienced about 150 rounds of DNA replication from the time of the man’s conception, almost seven times as many as for the eggs produced by 20-year-old women. By the time a man is age 40, his sperm will have a history that involves over 25 times as many rounds of DNA replication as for eggs in a woman of the same age.
Thus, there is much more risk of new point mutations occurring during these extra rounds of cell division and DNA replication with the increase in the age of the father. There is one final twist to the remarkable project performed by the Icelandic geneticists. The 78 trios that they studied were chosen because the children in most of the trios had inherited disorders. These included 44 children with autism spectrum disorder and 21 with schizophrenia. For all these children, there were no other cases of these disorders among their relatives, suggesting that their
condition was due to a new mutation. As anticipated, the researchers observed a correlation between the father’s age and disease risk: older fathers were more likely to have children with autism and schizophrenia. In several cases, the DNA data for the child and parents also allowed the researchers to identify specific new mutations in genes that likely caused the disorder. For example, one child with autism inherited a new mutation in the EPH receptor B2 (EPHB2) gene that functions in the nervous system and in which a mutation had previously been found in an autistic child.
Studies such as this can have important implications for individuals and society. Some men who intend to delay parenting until later in life might choose to freeze samples of their sperm while still young. This study also informs us that changes in society can impact the number of new mutations that enter the human gene pool. If men choose to delay fatherhood for postsecondary education or establishing their careers, there will be an associated increase in the number of new mutations among their children. It is common knowledge that infertility rises with age for women; as is often stated, a woman’s “biological clock” is ticking once she is past puberty. This work by the Icelandic geneticists informs us that a clock is ticking for men as well.
KEY CONCEPT Genome sequences of parents and their children clarify the factors that contribute to new point mutations. Fathers contribute four times as many new mutations to their offspring as do mothers. The number of new mutations passed on from a father to his children rises with the age of the father.
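The numbers quoted in this case study can be checked with a little arithmetic. The sketch below recomputes the average number of new mutations per child and the father-to-mother ratio. It also draws a straight line through the two paternal ages mentioned in the text (about 25 mutations at age 20 and about 65 at age 40). Treating the relationship as exactly linear is a simplifying assumption made here for illustration, and the function name is hypothetical.

```python
# Figures quoted in the text.
n_children = 78
total_new_mutations = 4933
mean_per_child = total_new_mutations / n_children  # about 63

paternal, maternal = 55, 14
father_to_mother_ratio = paternal / maternal  # a little under 4

# Two points quoted in the text: (age of father, new mutations passed on).
age_young, mutations_young = 20, 25
age_old, mutations_old = 40, 65

# Slope of the straight line through those two points (mutations per year).
slope = (mutations_old - mutations_young) / (age_old - age_young)

def expected_paternal_mutations(age):
    """Linear estimate from the two quoted points (an illustrative assumption)."""
    return mutations_young + slope * (age - age_young)

print(round(mean_per_child, 1))           # 63.2
print(round(father_to_mother_ratio, 1))   # 3.9
print(slope)                              # 2.0
print(expected_paternal_mutations(30))    # 45.0
```

The slope of 2.0 matches the statement that a father passes on about two additional new mutations for each year of increase in his age.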
When rice gets its feet a little too wet
Among the cereal crops, rice is unique. Whereas wheat, barley, maize, and the other grain crops grow solely in dry fields, rice is commonly grown in flooded fields called paddies (Figure 1-19). The ability of rice to grow in flooded fields offers it an advantage: rice can survive modest flooding (up to 25 cm of standing water) in the paddies, but most weeds cannot. So rice farmers can use flooding to control the weeds in their field while their rice thrives. The strategy works well where farmers have irrigation systems to control the water levels in their paddies and heavy rains do not exceed their capacity to
Rice growing in a flooded field or paddy
FIGURE 1-19 Rice is grown in fields with standing water called paddies. Rice is adapted to tolerate modest levels of standing water, but the water suppresses the growth of weeds that could compete with the rice. [ © Dinodia/AGE Fotostock.]
control these levels. If the water in the paddies gets too deep (greater than 50 cm) for a prolonged period, then the rice plants, like the weeds, can suffer or even die. Paddy agriculture, as practiced in the lowlands of India, Southeast Asia, and West Africa, relies on natural rainfall, rather than irrigation, to flood the fields. This circumstance poses a risk. When the rains are heavy, water depth in the paddies can exceed 50 cm and completely submerge the plants, causing rice plants to either suffer a loss in yield or simply die. Of the 60 million hectares of rain-fed lowland paddies, one-third experience damaging floods on a regular basis. The heavy rains and monsoons that flood the fields are estimated to cause a loss of rice worth more than US$1 billion each year. In India, Indonesia, and Bangladesh alone, 4 million tons of rice are lost to flooding each year, enough to feed 30 million people. Since this loss is mostly incurred by the poorest farmers, it can lead to malnourishment and even starvation. In the early 1990s, David Mackill, a plant geneticist and breeder at the International Rice Research Institute, had an idea about how to improve rice so that it could tolerate being submerged in flood waters. He identified a remarkable variety of rice called FR13A that could survive submergence and even thrive after the plants remained fully submerged in deep water for up to two weeks. Unfortunately, FR13A had a low yield and the quality of its grain was marginal. So Mackill set out to transfer FR13A’s genetic factor(s) for submergence tolerance into a rice variety with a higher yield and higher grain quality. He first crossed FR13A and a superior variety of rice and then for several generations crossed the hybrid plants back to the superior variety until he had created an improved form of rice that combined submergence tolerance and high yield. 
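The repeated backcrossing described above dilutes the donor genome by about half in each generation, which is why several generations are needed. The sketch below computes the expected fraction of the genome still derived from the donor variety (here FR13A) after the initial cross and a given number of backcrosses. It uses the standard textbook expectation, ignoring selection for the desired trait and linkage around it. The text does not state how many backcross generations Mackill used, so the range shown is purely illustrative, and the function name is hypothetical.

```python
from fractions import Fraction

def expected_donor_fraction(n_backcrosses):
    """Expected donor share of the genome after the first cross (hybrid = 1/2)
    followed by n backcrosses to the superior (recurrent) variety.
    Assumes no selection and no linkage effects."""
    return Fraction(1, 2) ** (n_backcrosses + 1)

for n in range(6):
    fraction = expected_donor_fraction(n)
    print(n, fraction, f"{float(fraction):.1%}")
```

In practice the breeder selects plants that retain the donor's desired trait in every generation, so the chromosome region carrying that trait stays in the line while most of the rest of the donor genome is progressively replaced.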
Mackill had achieved his initial goal of transferring submergence tolerance into a superior variety, but the genetic basis for why FR13A was submergence tolerant remained obscure. Was FR13A’s submergence tolerance controlled by many genes on multiple chromosomes, or might it be mostly controlled by just one gene? To delve into the genetic basis of submergence tolerance, Mackill and his team conducted a form of genetic analysis called quantitative trait locus (QTL) mapping (see Chapter 19). A QTL is a genetic locus that contributes incrementally or quantitatively to variation for a trait. Mendel’s gene for flower color had two categorical alleles: one for purple flowers and the other for white flowers. QTL have alleles that usually engender only partial changes such as the difference between a pale purple and a medium purple. Using QTL mapping, Mackill learned that the secret to FR13A exceptionalism was mostly due to a single genetic locus or QTL on one of the rice chromosomes. He named this locus SUB1 for “submergence tolerant.” With the chromosomal location of SUB1 revealed, it was time to delve even deeper and identify the molecular nature of SUB1. What type of protein did it encode? How did the allele of SUB1 found in FR13A allow the plant to cope with submergence? What is the physiological response that enables the plant to survive submergence? To address these questions, molecular geneticists Pamela Ronald at the University of California, Davis, and Julia Bailey-Serres at the University of California, Riverside, joined the team. Working with Mackill, this expanded team zeroed in on the chromosome segment containing the SUB1 QTL and determined that it encompasses a member of a class of genes called ethylene response factors (ERFs). ERF genes encode regulatory proteins that bind to regulatory elements in other genes and thereby regulate their expression. Thus, SUB1 is a gene that regulates the expression of other genes. 
Moreover, they determined that the allele of SUB1 in FR13A is switched on in response to submergence, while the allele of SUB1 found in submergence-sensitive varieties is not switched on by submergence.
The next question was, how does switching on SUB1 enable FR13A to survive complete submergence? To answer this question, let’s review how ordinary rice plants respond to submergence. When a plant is completely submerged, oxygen levels in its cells drop to a low level, and the concentration of ethylene, a plant hormone, in the cells increases. Ethylene signals the plant to escape submergence by elongating its leaves and stems to keep its “head” above water. This escape strategy works fine as long as the water is not so deep that the plant fails to grow enough to position its stems and leaves above the flood waters. If the flood waters are too deep, then the plant cannot grow enough to escape. As a plant in such deeply flooded circumstances grows to escape the flood water, it uses up all its energy reserves (carbohydrates), becomes spindly and weak, and eventually dies. How does the FR13A variety manage to survive submergence while many other types of rice cannot? FR13A has a different strategy that could be called sit tight. In response to complete submergence, rather than attempt rapid growth to escape the flood, an FR13A plant using the sit-tight strategy becomes quiescent. It stops the elongation growth response, thereby preventing itself from burning up all its reserve carbohydrates and becoming weak and spindly. With the sit-tight strategy, a plant can remain in a quiescent, submerged state for up to two weeks and then emerge healthy and resume normal growth when the flood waters recede. The sit-tight strategy of FR13A is controlled by SUB1, which acts as the master switch or regulatory gene to activate this strategy. When the flood waters rise, the concentration of the plant hormone ethylene increases in plant cells. Because SUB1 is an ERF, it is switched on in response to the elevated ethylene levels. Then, the protein that SUB1 encodes orchestrates the plant’s response by switching on (or off) a battery of genes involved in plant growth and metabolism. 
In FR13A plants that become submerged, genes involved in stem and leaf elongation as part of the escape strategy are switched off, as are genes involved in mobilizing the energy reserves (carbohydrates) needed to fuel the escape strategy. Using the tools of molecular genetics and genomics such as DNA microarrays (see Chapters 10 and 14), the rice team was able to decipher the extensive catalog of genes controlling
Flood-intolerant and flood-tolerant rice
FIGURE 1-20 An Indian farmer with rice variety Swarna that is not tolerant to flooding (left) compared to variety Swarna-Sub1 that is tolerant (right). This field was flooded for 10 days. The photo was taken 27 days after the flood waters receded. [ Ismail et al., “The contribution of submergence-tolerant (Sub 1) rice varieties to food security in flood-prone rainfed lowland areas in Asia,” Field Crops Research 152, 2013, 83–93, © Elsevier.]
organ elongation, carbon metabolism, flowering, and photosynthesis that are regulated by SUB1 to achieve the sit-tight response.
With the basic genetics of SUB1 elucidated, it was time to put this knowledge to work. The team repeated Mackill’s early breeding work to transfer the flood tolerance into a superior variety. Now, however, since they knew the precise location of SUB1 on one of the chromosomes, they could transfer it into a superior variety with surgical precision. This precision is important because it enabled the team to avoid transferring other undesirable genes at the same time. For this project, they worked with a submergence-intolerant, but superior, Indian variety, called Swarna, which is widely grown and favored by farmers. The new line they created is called Swarna-Sub1, and it has lived up to expectations.
Field trials showed a striking difference in plant survival and yield between Swarna and Swarna-Sub1 when there is complete submergence (Figure 1-20). As shown in Figure 1-21, Swarna-Sub1 provides higher yield than the original Swarna under all different levels of flooding. In various trials, SUB1 improved yield by 1 to 3 tons of grain per hectare.
SUB1 gene increases rice yield under flooding
FIGURE 1-21 Yield comparison between variety Swarna that is not tolerant to flooding (purple circles) and variety Swarna-Sub1 that is tolerant (green circles). Yield in tons per hectare (y-axis) versus duration of flooding in days (x-axis). [ Data from Ismail et al., “The contribution of submergence-tolerant (Sub 1) rice varieties to food security in flood-prone rainfed lowland areas in Asia,” Field Crops Research 152, 2013, 83–93.]
With the support and sponsorship of international research organizations, governmental agencies, and philanthropies, Swarna-Sub1 and other superior varieties carrying the SUB1 allele from FR13A have now been distributed to farmers. In 2008, only 700 farmers were growing SUB1 enhanced rice, but by 2012, that number had grown to 3.8 million farmers. By 2014, the number of farmers growing rice with SUB1 should climb to 5 million, adding considerably to food security among some of the world’s poorest farmers.
In the long run, the impact of the SUB1 research may not be limited to rice. Many crops are subjected to damaging floods that reduce yields or destroy the crop altogether. The genetic research on SUB1 has provided a deep understanding of the molecular genetics of how plants respond to flooding. With this knowledge, it will be possible to manipulate the genomes of other crop plants so that they too can withstand getting their feet a little too wet.
KEY CONCEPT Genetics and genomics are playing a leading role in improving crop plants. The basic principles of genetics that you will learn during your genetics course are the foundation for these advances.
Recent evolution in humans
One goal of genetics is to understand the rules that govern how genes and the information they encode change over the generations within populations. The genes in populations change over time for several different reasons. For example, as we have seen, mutation in the germline can cause a new gene variant or allele to occur in the next generation that was not present in the current generation. Another factor is natural selection, which was first described by Charles Darwin. Briefly, if individuals with a certain gene variant contribute more offspring to the next generation than individuals who lack that variant, then the frequency of that variant will rise over time in the population. The last three chapters of the text focus on rules governing the transmission of genes from one generation to the next within populations.
Over the past decade, evolutionary geneticists have described in remarkable detail how genetic changes have enabled human populations to adapt to the
conditions of life on different parts of the globe. This work revealed that three factors have been particularly powerful in shaping the types of gene variants that occur in different human populations. These factors are (1) pathogens such as malaria or smallpox; (2) local climatic conditions including solar radiation, temperature, and altitude; and (3) diet, such as the relative amounts of meat, cereals, or dairy products eaten. In Chapter 20, you’ll learn how a genetic variant in the hemoglobin gene has enabled people in Africa to adapt to the ravages of malaria. Let’s look briefly at examples of genetic adaptations to climate and diet. We’ll start with a case of human adaptation to life at high altitude.
Adaptation to high altitude
In their effort to colonize the Andes mountains of South America, Spanish colonists established towns high up in the mountains near the settlements of the native peoples. Soon they realized something was wrong. Spanish parents were not producing children. At Potosi, Bolivia, which is situated 4000 meters above sea level, it was 53 years after the founding of the town before the first child was born to Spanish parents. As noted by the Spanish priest Father Cobo, “The Indians are healthiest and where they multiply the most prolifically is in these same cold air-tempers, which is quite the reverse of what happens to the children of the Spaniards, most of whom when born in such regions do not survive.”² Unlike the Andean natives, the Spanish were experiencing chronic mountain sickness (CMS), a condition caused by their inability to obtain enough oxygen from the thin air of the mountains.
Since early observations like these, geneticists have invested much effort into the study of human adaptation to high altitude in South America, Tibet, and Ethiopia. What enables the natives of these regions to flourish while lowlanders who move to high elevations suffer the grave health consequences of CMS? Let’s look at the case in Tibet, where the Tibetan highlanders live at altitudes up to 4000 meters above sea level (Figure 1-22). The high Tibetan Plateau was colonized by people about 3000 years ago, and the people who colonized Tibet are closely related to the modern Han Chinese. However, at high altitude, native Tibetans are far less likely than Han Chinese to experience CMS and conditions such as pulmonary hypertension and the associated formation of blood clots that underlie it.
Tibetans are genetically adapted to life at high elevation
FIGURE 1-22 A young Tibetan woman. Inset shows the location of Tibet in Asia. [ Stefan Auth/imagebroker/AGE Fotostock; (inset) Planet Observer/UIG/Getty Images.]
To understand the genetics of how Tibetans adapted to life at high elevation, a research team led by Cynthia Beall of Case Western Reserve University compared Tibetans to Han Chinese at over 500,000 SNPs across the genome. Since Tibetans and Chinese are closely related, one expects each SNP variant to occur at about the same frequency in both groups. If the T variant of a SNP occurs at a frequency of 10 percent in Han Chinese, it should also be at about 10 percent in Tibetans. However, if the variant is associated with improved health at high elevation, its frequency would have risen among Tibetans over the many generations since they colonized the Tibetan Plateau, because Tibetans with this variant would have been healthier and have had more surviving children than those who lacked it. Charles Darwin’s natural selection would be at work. When the research team analyzed their SNP data, the SNPs in one gene stood out. The gene is called EPAS1, and some SNPs in it
2 V. J. Vitzthum, “The home team advantage: Reproduction in women indigenous to high altitude,” Journal of Experimental Biology 204, 2001, 3141–3150.
1.3 Genetics Today 25
Tibetans have a special variant of the EPAS1 gene 9
EPAS1
8
Statistical test value
7 6 5 4
1
2
3
4
5
6
7
8
9
10
Chromosome F I G U R E 1-2 3 Twenty-two human chromosomes are arrayed from left to right. The y-axis shows results from a statistical test of whether there is a significant difference in SNP frequency between Tibetans and Han Chinese. Each small dot represents one of the SNPs that was tested. SNPs above the horizontal red line are significantly different. Only the SNPs in the EPAS1 gene show a significant difference. [ C. Beall et al. Proceedings of the National Academy of Sciences USA, 107, 25, 2010, 11459–11464, Fig. 1.]
occur at very different frequencies in Tibetans (87 percent) and Han Chinese (9 percent). Their results are shown in Figure 1-23. In this figure, the human chromosomes, numbered 1 through 22, are along the x-axis, and a measure of the difference in SNP variant frequency between Tibetans and Chinese is on the y-axis. Each dot represents a SNP. SNPs that fall above the horizontal red line are those for which the frequency difference between Tibetans and Han Chinese is so large that the gene near these SNPs must have provided some advantage to people who colonized the Tibetan Plateau. The SNPs in EPAS1 fall above this line. These results suggest that Tibetans have a special variant of EPAS1 that helps them adapt to life at high elevation. To understand this better, let’s first review what is known about EPAS1. This gene regulates the number of red blood cells (RBCs) that our bodies produce. Moreover, it regulates the number of RBCs in Introduction to Genetic Analysis, 11e response to the level of oxygen in our tissues. When oxygen levels in our tissues Figure 01.23 #000 are low, EPAS1 signals the body to produce more RBCs. 04/02/14 05/01/14 Why does EPAS1 direct our bodies to produce more RBCs when the oxygen Dragonfly Media Groupare low? The EPAS1 response to low oxygen may be how our levels in our tissues bodies normally respond to anemia (too few red blood cells). People with low RBC counts get too little oxygen in their tissues, and so EPAS1 could signal the body to make more RBCs to correct anemia. This mechanism could explain why people who live at low elevation need the EPAS1 gene. Now, let’s think about how a person from low elevation would respond if they move to high elevation. Because of the thin air at high elevation, their tissues would get less oxygen. If their bodies interpreted low oxygen due to thin air as a sign of anemia, then EPAS1 would try to correct the problem by signaling their
body to make more RBCs. However, since they are not anemic and already have enough RBCs, their blood would become overloaded with RBCs. Too many RBCs can cause pulmonary hypertension and the formation of blood clots, the conditions underlying CMS.

Finally, how could a new variant of EPAS1 have helped Tibetans avoid CMS and adapt to high elevation? The answer to this question is not known, and it is now being actively investigated, but here is one hypothesis. Unlike lowlanders, Tibetans maintain relatively normal levels of RBCs at high elevation, and they have a lower risk of blood clot formation and pulmonary hypertension than lowlanders who move to high elevation. Thus, the Tibetan version of EPAS1 may no longer cause the overproduction of RBCs at high elevation, while providing another mechanism to cope with the thin air. The Tibetan variant of EPAS1 helps them live at high elevation without suffering from CMS.

Lactose tolerance

Before the invention of agriculture about 10,000 to 12,000 years ago, human populations subsisted on foods harvested from nature by hunting wild animals and gathering wild fruits and vegetables. At that time, no human populations used dairy products. Cattle were yet to be domesticated, and methods for milking cows were not yet invented. Children nursed on mother’s milk, but as they aged, the gene that encodes the enzyme lactase, which enables children to digest milk sugar (lactose), was switched off. Once weaned, a child in pre-agricultural societies no longer needed the lactase enzyme, and so the lactase gene had a “switch” or regulatory element that turned it off during late childhood.

With the origin of agriculture, cattle were domesticated from wild aurochs. The early farmers may have kept cattle as a source of meat at first. After milking was invented, milk offered another source of food. But there was a problem. Although children in these ancient societies could digest milk sugar, the adults could not.
Adults could consume milk, but since they could not digest the lactose, they would experience bloating, cramps, and diarrhea. Adults who experience these symptoms from drinking milk are lactose intolerant. Importantly, because they could not digest milk sugar, they were not utilizing this source of nutrition. In ancient societies, where food could be scarce at times, the difference between life and death could hinge on making the best use of all available food sources. Yet, because the lactase gene is switched off in adults, adults could not digest milk sugar.

Some human populations have lactase gene variants expressed in adults
FIGURE 1-24 Simplified diagram of the lactase gene showing a regulatory element and protein coding region. OCT1 is a protein thought to regulate expression of the lactase gene. SNP variants in the regulatory element are found in some parts of the world. These SNPs are associated with OCT1 binding to the element and expression of the lactase gene in adults. Diagram labels: lactase gene; regulatory element (bound by OCT1); site where the RNA polymerase complex binds; direction of transcription; protein coding sequence; DNA sequence of the regulatory element in most people around the world; SNPs in the regulatory element found in several regions where adults drink milk (Arabia, Europe, Ethiopia).
Now, suppose a new mutation entered the population and that this mutation allowed the lactase gene to be expressed in adults. Adults with this new mutation or variant could then benefit from drinking milk in a way that adults who lacked this variant could not. Such a benefit could increase their chances to survive and have children, and over time the variant that provides lactase persistence into adulthood would become more common in the population.

The scenario just described is what appears to have happened during human history in several areas of the world where people kept cattle (or camels) and used them for milk. It happened in Europe, the Middle East, and Africa. In Europe, some people have a variant of the lactase gene that has a “T” at a particular SNP, whereas people from other regions of the world have a “C” at this SNP. Recently, geneticists discovered that the “T” appears to be located in a regulatory element that controls when the lactase gene is turned on (Figure 1-24). People with the “T” variant have persistent expression of the lactase gene into adulthood, whereas people with the “C” variant have their lactase gene switched off after childhood. The “T” seems to enable a regulatory protein called OCT1 to bind near the lactase gene and thereby cause its expression in adults. Other variants that have the same effect appear to have arisen independently in the Middle East and Africa.

As shown in Figure 1-25, in northern Europe where cattle farming and dairy consumption are prominent, both lactase persistence and the “T” lactase variant that produces it are common, while these features are much less common in southern Europe. Geneticists infer that the early cattle farmers of northern Europe who had the “T” variant benefited from milk consumption, enabling them to survive and produce more offspring, and so this variant became more common in the population over time. Today, the “T” variant is at a frequency of 90 percent in northern Europe. Since milk was not as important a part of the diet in southern Europe, the “T” variant offered no special benefit and thus remained at a lower frequency (about 10 percent).

People in Europe are adapted to drink milk as adults
FIGURE 1-25 (a) Frequency in Europe of lactase persistence, the expression of the lactase enzyme in adults. (b) Frequency in Europe of the “T” variant in the lactase gene that appears to control lactase persistence. Both maps are shaded on a frequency scale from low to high. [ (a) Adapted from Y. Itan et al., BMC Evolutionary Biology 10, 2010, 36. (b) Adapted from A. Beja-Pereira et al. Nature Genetics 35, 2003, 311–313.]

These two examples highlight how human populations have evolved in recent times in response to the conditions of life such as the available food and climate. In the last three chapters of this text, you will learn the theory and methods used by geneticists to understand how populations evolve in response to their environment. You’ll learn how SNP data are gathered, how frequencies of variants are calculated, and how comparisons are made to understand the forces that have influenced the types of gene variants that occur in different populations. Through this type of analysis, evolutionary geneticists have learned a vast amount about how different species of plants, animals, fungi, and microbes have evolved and continue to evolve in response to the conditions in which they live.

KEY CONCEPT Evolutionary genetics provides the tools to document how gene variants that provide a beneficial effect can rise in frequency in a population and make individuals in the population better adapted to the environment in which they live.
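The frequency comparisons described in this section (for example, the “T” lactase variant at about 90 percent in northern Europe versus about 10 percent in southern Europe) rest on a simple count of allele copies in a sample of people. The Python sketch below shows one way such a calculation might be set up. The genotype counts are invented for illustration and were chosen only so that the resulting frequencies match the approximate values quoted in the text; the function name allele_frequency is likewise an assumption and does not come from the studies cited.

```python
def allele_frequency(n_hom_variant, n_het, n_hom_other):
    """Return the frequency of a variant allele in a sample of diploid people.

    Each homozygote for the variant carries two copies of it,
    each heterozygote carries one, and the other homozygotes carry none.
    """
    n_people = n_hom_variant + n_het + n_hom_other
    if n_people == 0:
        raise ValueError("sample must contain at least one individual")
    variant_copies = 2 * n_hom_variant + n_het
    return variant_copies / (2 * n_people)


# Hypothetical genotype counts (T/T, T/C, C/C) for 100 people per region.
# These numbers are made up; they are not data from the cited studies.
north = allele_frequency(81, 18, 1)
south = allele_frequency(1, 18, 81)

print(f"T frequency, northern sample: {north:.2f}")
print(f"T frequency, southern sample: {south:.2f}")
print(f"difference: {north - south:.2f}")
```

A large frequency difference between closely related populations is the kind of signal that made the EPAS1 SNPs stand out in the Tibetan study. Deciding whether a given difference is statistically significant takes more than this subtraction and relies on the methods introduced in the final chapters.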
SUMMARY

As you begin your study of genetics, imagine yourself as a person at halftime on an amazing journey of discovery. The last 100 years have witnessed a remarkable revolution in human knowledge about how biological systems are put together and how they work. Genetics has been at the epicenter of that revolution. Genetic analysis has answered many fundamental questions about the transmission of genetic information within families, inside cells, and over the eons of evolutionary time. Yet, as you will learn, the discovery process in genetics has never been more dynamic and the pace of growth in knowledge never greater. Unanswered questions abound.

• How do all the genes in the genome work together to transform a fertilized egg into an adult organism?
• How do cells manage to seamlessly orchestrate the incredibly complex array of interacting genes and biochemical reactions that are found within them?
• How do genetic variants at hundreds or even thousands of genes control the yield of crop plants?
• How can genetics guide both the prevention and treatment of cancer, autism, and other diseases?
• How do genes give humans the capacity for language and consciousness?

Genetic analysis over the next 100 years promises to help answer many questions like these.
KEY TERMS

adenine (A) (p. 7)
alleles (p. 4)
blending theory (p. 2)
chromosome theory (p. 5)
codon (p. 10)
complementary (p. 7)
cytosine (C) (p. 7)
DNA polymerase (p. 12)
DNA sequencing (p. 13)
dominant (p. 4)
DNA replication (p. 9)
gametes (p. 4)
gene (p. 4)
gene expression (p. 7)
genetically modified organism (GMO) (p. 13)
genetics (p. 5)
genomics (p. 13)
guanine (G) (p. 7)
ligase (p. 12)
messenger RNA (mRNA) (p. 10)
model organism (p. 10)
multifactorial hypothesis (p. 7)
nuclease (p. 12)
one-gene–one-enzyme hypothesis (p. 7)
point mutation (p. 17)
quantitative trait locus (QTL) (p. 21)
regulatory element (p. 7)
single nucleotide polymorphism (SNP) (p. 16)
somatic cells (p. 4)
thymine (T) (p. 7)
transcription (p. 10)
transformation (p. 13)
translation (p. 10)
PROBLEMS

Most of the problems are also available for review/grading through http://www.whfreeman.com/launchpad/iga11e.

Working with the Figures

1. If the white-flowered parental variety in Figure 1-3 were crossed to the first-generation hybrid plant in that figure, what types of progeny would you expect to see and in what proportions?
2. In Mendel’s 1866 publication as shown in Figure 1-4, he reports 705 purple-flowered (violet) offspring and 224 white-flowered offspring. The ratio he obtained is 3.15 : 1 for purple : white. How do you think he explained the fact that the ratio is not exactly 3 : 1?
3. In Figure 1-6, the students have 1 of 15 different heights, plus there are two height classes (4′11″ and 5′0″) for which there are no observed students. That is a total of 17 height classes. If a single Mendelian gene can account for only two classes of a trait (such as purple or white flowers), how many Mendelian genes would be minimally required to explain the observation of 17 height classes?
4. Figure 1-7 shows a simplified pathway for arginine synthesis in Neurospora. Suppose you have a special strain of Neurospora that makes citrulline but not arginine. Which gene(s) are likely mutant or missing in your special strain? You have a second strain of Neurospora that makes neither citrulline nor arginine but does make ornithine. Which gene(s) are mutant or missing in this strain?
5. Consider Figure 1-8a.
a. What do the small, blue spheres represent?
b. What do the brown slabs represent?
c. Do you agree with the analogy that DNA is structured like a ladder?
6. In Figure 1-8b, can you tell if the number of hydrogen bonds between adenine and thymine is the same as that between cytosine and guanine? Do you think that a DNA molecule with a high content of A + T would be more stable than one with high content of G + C?
7. Which of three major groups (domains) of life in Figure 1-11 is not represented by a model organism?
8. Figure 1-13b shows the human chromosomes in a single cell. The green dots show the location of a gene called BAPX1. Is the cell in this figure a sex cell (gamete)? Explain your answer.
9. Figure 1-15 shows the family tree, or pedigree, for Louise Benge (Individual VI-1) who suffers from the disease ACDC because she has two mutant copies of the CD73 gene. She has four siblings (VI-2, VI-3, VI-4, and VI-5) who have this disease for the same reason. Do all of the 10 children of Louise and her siblings have the same number of mutant copies of the CD73 gene, or might this number be different for some of the 10 children?

Basic Problems
10. Below is the sequence of a single strand of a short DNA molecule. On a piece of paper, rewrite this sequence and then write the sequence of the complementary strand below it.
GTTCGCGGCCGCGAAC
Comparing the top and bottom strands, what do you notice about the relationship between them?
11. Mendel studied a tall variety of pea plants with stems that are 20 cm long and a dwarf variety with stems that are only 12 cm long.
a. Under blending theory, how long would you expect the stems of first and second hybrids to be?
b. Under Mendelian rules and assuming stem length is controlled by a single gene, what would you expect to observe in the second-generation hybrids if all the first-generation hybrids were tall?
12. If a DNA double helix that is 100 base pairs in length has 32 adenines, how many cytosines, guanines, and thymines must it have?
13. The complementary strands of DNA in the double helix are held together by hydrogen bonds: G ≡ C or A = T. These bonds can be broken (denatured) in aqueous solutions by heating to yield two single strands of DNA (see Figure 1-13a). How would you expect the relative amounts of GC versus AT base pairs in a DNA double helix to affect the amount of heat required to denature it? How would you expect the length of a DNA double helix in base pairs to affect the amount of heat required to denature it?
14. The figure below shows the DNA sequence of a portion of one of the chromosomes from a trio (mother, father, and child). Can you spot any new point mutations in the child that are not in either parent? In which parent did the mutation arise?

Mother   Copy M1   ••CAGCAGATTGCTGCTTTGTATGAG••
         Copy M2   ••CAGCTGATTGCTGCTTTGTATGAG••
Father   Copy F1   ••CAGCTGATTGCTGCTTTGTAGGAG••
         Copy F2   ••CAACTGATTGCTGCTTTGTATGAG••
Child              ••CAGCAGATTGCTGCTTTGTCTGAG••
                   ••CAGCTGATTGCTGCTTTGTAGGAG••

Challenging Problems

15. a. There are three nucleotides in each codon, and each of these nucleotides can have one of four different bases. How many possible unique codons are there?
b. If DNA had only two types of bases instead of four, how long would codons need to be to specify all 20 amino acids?
16. Fathers contribute more new point mutations to their children than mothers. You may know from general biology that people have sex chromosomes: two X chromosomes in females and an X plus a Y chromosome in males. Both sexes have the autosomes (A’s).
a. On which type of chromosome (A, X, or Y) would you expect the genes to have the greatest number of new mutations per base pair over many generations in a population? Why?
b. On which type of chromosome would you expect the least number of new mutations per base pair? Why?
c. Can you calculate the expected number of new mutations per base pair for a gene on the X and Y chromosomes for every one new mutation in a gene on an autosome if the mutation rate in males is twice that in females?
17. For young men of age 20, there have been 150 rounds of DNA replication during sperm production as compared to only 23 rounds for a woman of age 20. That is a 6.5-fold greater number of cell divisions and proportionately greater opportunity for new point mutations. Yet, on average, 20-year-old men contribute only about twice as many new point mutations to their offspring as do women. How can you explain this discrepancy?
18. In computer science, a bit stores one of two states, 0 or 1. A byte is a group of 8 bits that has 2⁸ = 256 possible states. Modern computer files are often megabytes (10⁶ bytes) or even gigabytes (10⁹ bytes) in size. The human genome is approximately 3 billion base pairs in size. How many nucleotides are needed to encode a single byte? How large of a computer file would it take to store the same amount of information as a single human genome?
19. The human genome is approximately 3 billion base pairs in size. a. Using standard 8.5″ × 11″ paper with one-inch margins, a 12-point font size, and single-spaced lines, how many sheets of paper printed on one side would be required to print out the human genome? b. A ream of 500 sheets of paper is about 5 cm thick. How tall would the stack of paper with the entire human genome be? c. Would you want a backpack, shopping cart, or a semitrailer truck to haul around this stack?
Chapter 2
Single-Gene Inheritance
Learning Outcomes

After completing this chapter, you will be able to
• Discover a set of genes affecting a specific biological property of interest, by observing single-gene inheritance ratios of mutants affecting that property.
• In the progeny of controlled crosses, recognize phenotypic ratios diagnostic of single-gene inheritance (1 : 1 in haploids, and 3 : 1, 1 : 2 : 1, and 1 : 1 in diploids).
• Explain single-gene inheritance ratios in terms of chromosome behavior at meiosis.
• Predict phenotypic ratios among descendants from crosses of parents differing at a single gene.
• Propose reasonable hypotheses to explain dominance and recessiveness of specific alleles at the molecular level.
• Apply the rules of single-gene inheritance to pedigree analysis in humans, and recognize patterns diagnostic of autosomal dominant, autosomal recessive, X-linked dominant, and X-linked recessive conditions.
• Calculate risk of descendants inheriting a condition caused by a mutant allele in one or more specific ancestors.

[Chapter-opening photograph] The monastery of the father of genetics, Gregor Mendel. A statue of Mendel is visible in the background. Today, this part of the monastery is a museum, and the curators have planted red and white begonias in a grid that graphically represents the type of inheritance patterns obtained by Mendel with peas. [ Anthony Griffiths.]

Outline

2.1 Single-gene inheritance patterns
2.2 The chromosomal basis of single-gene inheritance patterns
2.3 The molecular basis of Mendelian inheritance patterns
2.4 Some genes discovered by observing segregation ratios
2.5 Sex-linked single-gene inheritance patterns
2.6 Human pedigree analysis
Genetic analysis begins with mutants
FIGURE 2-1 These photographs show the range of mutant phenotypes typical of those obtained in the genetic dissection of biological properties. These cases are from the dissection of floral development in Arabidopsis thaliana (a) and hyphal growth in Neurospora crassa, a mold (b). WT = wild type. [ (a) George Haughn ; (b) Anthony Griffiths/Olivera Gavric.]

What kinds of research do biologists do? One central area of research in the biology of all organisms is the attempt to understand how an organism develops from a fertilized egg into an adult; in other words, what makes an organism the way it is. Usually, this overall goal is broken down into the study of individual biological properties such as the development of plant flower color, or animal locomotion, or nutrient uptake, although biologists also study some general areas such as how a cell works.

How do geneticists analyze biological properties? The genetic approach to understanding any biological property is to find the subset of genes in the genome that influence that property, a process sometimes referred to as gene discovery. After these genes have been identified, their cellular functions can be elucidated through further research. There are several different types of analytical approaches to gene discovery, but one widely used method relies on the detection of single-gene inheritance patterns, and that is the topic of this chapter.

All of genetics, in one aspect or another, is based on heritable variants. The basic approach of genetics is to compare and contrast the properties of variants, and from these comparisons make deductions about genetic function. It is similar to the way in which you could make inferences about how an unfamiliar machine works by changing the composition or positions of the working parts, or even by removing parts one at a time. Each variant represents a “tweak” of the biological machine, from which its function can be deduced.

In genetics, the most common form of any property of an organism is called the wild type, that which is found “in the wild,” or in nature. The heritable variants observed in an organism that differs from the wild type are mutants, individual organisms having some abnormal form of a property. As examples, the wild type and some mutants in two model organisms are shown in Figure 2-1.
The alternative forms of the property are called phenotypes. In this analysis we distinguish a wild-type phenotype and a mutant phenotype. Compared to wild type, mutants are rare. We know that they arise from wild types by a process called mutation, which results in a heritable change in the DNA of a gene. The changed form of the gene is also called a mutation. Mutations are not always detrimental to an organism; sometimes they can be advantageous, but most often they have no observable effect. A great deal is known about the mechanisms of mutation (see Chapter 16), but generally it can be said that they arise from mistakes in cellular processing of DNA. Most natural populations also show polymorphisms, defined as the coexistence of two or more reasonably common phenotypes of a biological property,
such as the occurrence of both red- and orange-fruited plants in a population of wild raspberries. Genetic analysis can (and does) use polymorphisms, but polymorphisms have the disadvantage that they generally do not involve the specific property of interest to the researcher. Mutants are much more useful because they allow the researcher to zero in on any property.

Simply stated, the general steps of functional analysis by gene discovery are as follows:
1. Amass mutants affecting the biological property of interest.
2. Cross (mate) the mutants to wild type to see if their descendants show ratios of wild to mutant that are characteristic of single-gene inheritance.
3. Deduce the functions of the gene at the molecular level.
4. Deduce how the gene interacts with other genes to produce the property in question.
Of these steps, only 1 and 2 will be covered in the present chapter.

Gene discovery starts with a “hunt” to amass mutants in which the biological function under investigation is altered or destroyed. Even though mutants are individually rare, there are ways of enhancing their recovery. One widely used method is to treat the organism with radiation or chemicals that increase the mutation rate. After treatment, the most direct way to identify mutants is to visually screen a very large number of individuals, looking for a chance occurrence of mutants in that population. Also, various selection methods can be devised to enrich for the types sought.

Armed with a set of mutants affecting the property of interest, one hopes that each mutant represents a lesion in one of a set of genes that control the property. Hence, the hope is that a reasonably complete gene pathway or network is represented. However, not all mutants are caused by lesions within one gene (some have far more complex determination), so first each mutant has to be tested to see if indeed it is caused by a single-gene mutation.
The test for single-gene inheritance is to mate individuals showing the mutant property with wild-type and then analyze the first and second generation of descendants. As an example, a mutant plant with white flowers would be crossed to the wild type showing red flowers. The progeny of this cross are analyzed, and then they themselves are interbred to produce a second generation of descendants. In each generation, the diagnostic ratios of plants with red flowers to those with white flowers will reveal whether a single gene controls flower color. If so, then by inference, the wild type would be encoded by the wild-type form of the gene and the mutant would be encoded by a form of the same gene in which a mutation event has altered the DNA sequence in some way. Other mutations affecting flower color (perhaps mauve, blotched, striped, and so on) would be analyzed in the same way, resulting overall in a set of defined “flower-color genes.” The use of mutants in this way is sometimes called genetic dissection, because the biological property in question (flower color in this case) is picked apart to reveal its underlying genetic program, not with a scalpel but with mutants. Each mutant potentially identifies a separate gene affecting that property. After a set of key genes has been defined in this way, several different molecular methods can be used to establish the functions of each of the genes. These methods will be covered in later chapters. Hence, genetics has been used to define the set of gene functions that interact to produce the property we call flower color (in this example). This type of approach to gene discovery is sometimes called forward genetics, a strategy to understanding biological function starting with random single-gene mutants and ending with their DNA sequence and biochemical function. (We shall
see reverse genetics at work in later chapters. In brief, it starts with genomic analysis at the DNA level to identify a set of genes as candidates for encoding the biological property of interest, then induces mutants targeted specifically to those genes, and then examines the mutant phenotypes to see if they indeed affect the property under study.)

KEY CONCEPT The genetic approach to understanding a biological property is to discover the genes that control it. One approach to gene discovery is to isolate mutants and check each one for single-gene inheritance patterns (specific ratios of normal and mutant expression of the property in descendants).
Gene discovery is important not only in experimental organisms but also in applied studies. One crucial area is in agriculture, where gene discovery can be used to understand a desirable commercial property of an organism, such as its protein content. Human genetics is another important area: to know which gene functions are involved in a specific disease or condition is useful information in finding therapies. The rules for single-gene inheritance were originally elucidated in the 1860s by the monk Gregor Mendel, who worked in a monastery in the town of Brno, now part of the Czech Republic. Mendel’s analysis is the prototype of the experimental approach to single-gene discovery still used today. Indeed, Mendel was the first person to discover any gene! Mendel did not know what genes were, how they influenced biological properties, or how they were inherited at the cellular level. Now we know that genes work through proteins, a topic that we shall return to in later chapters. We also know that single-gene inheritance patterns are produced because genes are parts of chromosomes, and chromosomes are partitioned very precisely down through the generations, as we shall see later in the chapter.
2.1 Single-Gene Inheritance Patterns

Recall that the first step in genetic dissection is to obtain variants that differ in the property under scrutiny. With the assumption that we have acquired a collection of relevant mutants, the next question is whether each of the mutations is inherited as a single gene.

Mendel’s pioneering experiments

The first-ever analysis of single-gene inheritance as a pathway to gene discovery was carried out by Gregor Mendel. His is the analysis that we shall follow as an example. Mendel chose the garden pea, Pisum sativum, as his research organism. The choice of organism for any biological research is crucial, and Mendel’s choice proved to be a good one because peas are easy to grow and breed. Note, however, that Mendel did not embark on a hunt for mutants of peas; instead, he made use of mutants that had been found by others and had been used in horticulture. Moreover, Mendel’s work differs from most genetics research undertaken today in that it was not a genetic dissection; he was not interested in the properties of peas themselves, but rather in the way in which the hereditary units that influenced those properties were inherited from generation to generation. Nevertheless, the laws of inheritance deduced by Mendel are exactly those that we use today in modern genetics in identifying single-gene inheritance patterns.

Mendel chose to investigate the inheritance of seven properties of his chosen pea species: pea color, pea shape, pod color, pod shape, flower color, plant height, and position of the flowering shoot. In genetics, the terms character and trait are used more or less synonymously; they roughly mean “property.” For each of these seven characters, he obtained from his horticultural supplier two lines that showed distinct and contrasting phenotypes. These contrasting phenotypes are illustrated
in Figure 2-2. His results were substantially the same for each character, and so we can use one character, pea seed color, as an illustration. All of the lines used by Mendel were pure lines, meaning that, for the phenotype in question, all offspring produced by matings within the members of that line were identical. For example, within the yellow-seeded line, all the progeny of any mating were yellow seeded.

Mendel’s analysis of pea heredity made extensive use of crosses. To make a cross in plants such as the pea, pollen is simply transferred from the anthers of one plant to the stigmata of another. A special type of mating is a self (self-pollination), which is carried out by allowing pollen from a flower to fall on its own stigma. Crossing and selfing are illustrated in Figure 2-3.

The first cross made by Mendel mated plants of the yellow-seeded lines with plants of the green-seeded lines. In his overall breeding program, these lines constituted the parental generation, abbreviated P. In Pisum sativum, the color of the seed (the pea) is determined by the seed’s own genetic makeup; hence, the peas resulting from a cross are effectively progeny and can be conveniently classified for phenotype without the need to grow them into plants. The progeny peas from the cross between the different pure lines were found to be all yellow, no matter which parent (yellow or green) was used as male or female. This progeny generation is called the first filial generation, or F1. The word filial comes from the Latin words filia (daughter) and filius (son). Hence, the results of these two reciprocal crosses were as follows, where × represents a cross:
female from yellow line × male from green line → F1 peas all yellow
female from green line × male from yellow line → F1 peas all yellow

[Figure 2-2 panel title: The seven phenotypic pairs studied by Mendel. Round or wrinkled ripe seeds; yellow or green seeds; axial or terminal flowers; purple or white petals; inflated or pinched ripe pods; long or short stems; green or yellow unripe pods.]
The results observed in the descendants of both reciprocal crosses were the same, and so we will treat them as one cross. Mendel grew F1 peas into plants, and he selfed these plants to obtain the second filial generation, or F2. The F2 was composed of 6022 yellow peas and 2001 green peas. In summary,
yellow F1 × yellow F1 → F2 comprised of
6022 yellow
2001 green
Total 8023
Mendel noted that this outcome was very close to a mathematical ratio of three-fourths (75%) yellow and one-fourth (25%) green. A simple calculation shows us that 6022/8023 = 0.751 or 75.1%, and 2001/8023 = 0.249 or 24.9%. Hence, there was a 3 : 1 ratio of yellow to green. Interestingly, the green phenotype, which had disappeared in the F1, had reappeared in one-fourth of the F2 individuals, showing that the genetic determinants for green must have been present in the yellow F1, although unexpressed. To further investigate the nature of the F2 plants, Mendel selfed plants grown from the F2 seeds. He found three different types of results. The plants grown from the F2 green seeds, when selfed, were found to bear only green peas.
Figure 2-2 For each character, Mendel studied two contrasting phenotypes.
[Figure 2-3 panel title: Cross-pollination and selfing are two types of crosses. Cross-pollination: removal of anthers, then transfer of pollen to the stigma with a brush. Selfing: transfer of pollen to the stigma of the same plant. Both panels end in progeny.]
Figure 2-3 In a cross of a pea plant (left), pollen from the anthers of one plant is transferred to the stigma of another. In a self (right), pollen is transferred from the anthers to the stigmata of the same plant.
However, plants grown from the F2 yellow seeds, when selfed, were found to be of two types: one-third of them were pure breeding for yellow seeds, but two-thirds of them gave mixed progeny: three-fourths yellow seeds and one-fourth green seeds, just as the F1 plants had. In summary,
1/4 of the F2 were green, which when selfed gave all greens
3/4 of the F2 were yellow; of these
    1/3 when selfed gave all yellows
    2/3 when selfed gave 3/4 yellow and 1/4 green
Hence, looked at another way, the F2 was comprised of
1/4 pure-breeding greens
1/4 pure-breeding yellows
1/2 F1-like yellows (mixed progeny)
Thus, the 3 : 1 ratio at a more fundamental level is a 1 : 2 : 1 ratio. Mendel made another informative cross between the F1 yellow-seeded plants and any green-seeded plant. In this cross, the progeny showed the proportions of one-half yellow and one-half green. In summary:
F1 yellow × green → 1/2 yellow, 1/2 green
These two types of matings, the F1 self and the cross of the F1 with any green-seeded plant, both gave yellow and green progeny, but in different ratios. These two ratios are represented in Figure 2-4. Notice that the ratios are seen only when the peas in several pods are combined. The 3 : 1 and 1 : 1 ratios found for pea color were also found for comparable crosses for the other six characters that Mendel studied. The actual numbers for the 3 : 1 ratios for those characters are shown in Table 2-1.
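The arithmetic behind these ratios is easy to check. The following is a minimal sketch in Python that recomputes the F2 fractions and ratios from the counts listed in Table 2-1; the function name f2_summary and the dictionary layout are illustrative choices, not part of Mendel’s analysis.

```python
# F2 counts (dominant phenotype, recessive phenotype) as listed in Table 2-1.
F2_COUNTS = {
    "round x wrinkled seeds": (5474, 1850),
    "yellow x green seeds": (6022, 2001),
    "purple x white petals": (705, 224),
    "inflated x pinched pods": (882, 299),
    "green x yellow pods": (428, 152),
    "axial x terminal flowers": (651, 207),
    "long x short stems": (787, 277),
}


def f2_summary(dominant, recessive):
    """Return (fraction dominant, fraction recessive, dominant : recessive ratio)."""
    total = dominant + recessive
    return dominant / total, recessive / total, dominant / recessive


if __name__ == "__main__":
    for cross, (dom, rec) in F2_COUNTS.items():
        frac_dom, frac_rec, ratio = f2_summary(dom, rec)
        print(f"{cross:26s} {frac_dom:.3f} {frac_rec:.3f} {ratio:.2f} : 1")
```

For the seed-color cross, the sketch gives fractions of 0.751 and 0.249 and a ratio of 3.01 : 1, matching the values quoted in the text.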
[Figure 2-4 panel title: Mendel’s crosses resulted in specific phenotypic ratios. Left: F1 yellow selfed; the example progeny seeds total 21 yellow and 7 green. Right: F1 yellow × green; the example progeny seeds total 11 yellow and 11 green.]
Figure 2-4 Mendel obtained a 3 : 1 phenotypic ratio in his self-pollination of the F1 (left) and a 1 : 1 phenotypic ratio in his cross of F1 yellow with green (right). Sample sizes are arbitrary.

Table 2-1 Results of All Mendel’s Crosses in Which Parents Differed in One Character
Parental phenotypes            F1             F2                            F2 ratio
1. round × wrinkled seeds      All round      5474 round; 1850 wrinkled     2.96 : 1
2. yellow × green seeds        All yellow     6022 yellow; 2001 green       3.01 : 1
3. purple × white petals       All purple     705 purple; 224 white         3.15 : 1
4. inflated × pinched pods     All inflated   882 inflated; 299 pinched     2.95 : 1
5. green × yellow pods         All green      428 green; 152 yellow         2.82 : 1
6. axial × terminal flowers    All axial      651 axial; 207 terminal       3.14 : 1
7. long × short stems          All long       787 long; 277 short           2.84 : 1

Mendel’s law of equal segregation Initially, the meaning of these precise and repeatable mathematical ratios must have been unclear to Mendel, but he was able to devise a brilliant model that not only accounted for all the results, but also represented the historical birth of the science of genetics. Mendel’s model for the pea-color example, translated into modern terms, was as follows:
1. A hereditary factor called a gene is necessary for producing pea color.
2. Each plant has a pair of this type of gene.
3. The gene comes in two forms called alleles. If the gene is phonetically called a “wye” gene, then the two alleles can be represented by Y (standing for the yellow phenotype) and y (standing for the green phenotype).
4. A plant can be either Y/Y, y/y, or Y/y. The slash shows that the alleles are a pair.
5. In the Y/y plant, the Y allele dominates, and so the phenotype will be yellow. Hence, the phenotype of the Y/y plant defines the Y allele as dominant and the y allele as recessive.
6. In meiosis, the members of a gene pair separate equally into the cells that become eggs and sperm, the gametes. This equal separation has become known as Mendel’s first law or as the law of equal segregation. Hence, a single gamete contains only one member of the gene pair.
7. At fertilization, gametes fuse randomly, regardless of which of the alleles they bear.
Here, we introduce some terminology. A fertilized egg, the first cell that develops into a progeny individual, is called a zygote. A plant with a pair of identical alleles is called a homozygote (adjective homozygous), and a plant in which the alleles of the pair differ is called a heterozygote (adjective heterozygous). Sometimes a heterozygote for one gene is called a monohybrid. An individual can be classified as either homozygous dominant (such as Y/Y), heterozygous (Y/y), or homozygous recessive (y/y). In genetics generally, allelic combinations underlying phenotypes are called genotypes. Hence, Y/Y, Y/y, and y/y are all genotypes.
Figure 2-5 shows how Mendel’s postulates explain the progeny ratios illustrated in Figure 2-4. The pure-breeding lines are homozygous, either Y/Y or y/y. Hence, each line produces only Y gametes or only y gametes and thus can only breed true. When crossed with each other, the Y/Y and the y/y lines produce an F1 generation composed of all heterozygous individuals (Y/y). Because Y is dominant, all F1 individuals are yellow in phenotype. Selfing the F1 individuals can be thought of as a cross of the type Y/y × Y/y, which is sometimes called a monohybrid cross. Equal segregation of the Y and y alleles in the heterozygous F1 results in gametes, both male and female, half of which are Y and half of which are y. Male and female gametes fuse randomly at fertilization, with the results shown in the grid in Figure 2-5. The composition of the F2 is three-fourths yellow seeds and one-fourth green, a 3 : 1 ratio. The one-fourth of the F2 seeds that are green breed true as expected of the genotype y/y. However, the yellow F2 seeds (totaling three-fourths) are of two genotypes: two-thirds of them are clearly heterozygotes Y/y, and one-third are homozygous dominant Y/Y. Hence, we see that underlying the 3 : 1 phenotypic ratio in the F2 is a 1 : 2 : 1 genotypic ratio:
1/4 Y/Y yellow
2/4 Y/y yellow
(these two classes together: 3/4 yellow, Y/−)
1/4 y/y green
The general depiction of an individual expressing the dominant allele is Y/−; the dash represents a slot that can be filled by either another Y or a y. Note that equal segregation is detectable only in the meiosis of a heterozygote. Hence, Y/y produces one-half Y gametes and one-half y gametes. Although equal segregation is taking place in homozygotes too, neither segregation 1/2 Y : 1/2 Y nor segregation 1/2 y : 1/2 y is meaningful or detectable at the genetic level. We can now also explain results of the cross between the plants grown from F1 yellow seeds (Y/y) and the plants grown from green seeds (y/y). In this case, equal segregation in the yellow heterozygous F1 gives gametes with a 1/2 Y : 1/2 y ratio. The y/y parent can make only y gametes, however; so the phenotype of the progeny depends only on which allele they inherit from the Y/y parent. Thus, the 1/2 Y : 1/2 y gametic ratio from the heterozygote is converted into a 1/2 Y/y : 1/2 y/y genotypic ratio, which corresponds to a 1 : 1 phenotypic ratio of yellow-seeded to green-seeded plants. This is illustrated in the right-hand panel of Figure 2-5.
[Figure 2-5 panel title: A single-gene model explains Mendel’s ratios. Left: Mendel’s results (pure P lines, F1, F2 from the self and from the cross with green). Right: Mendel’s explanation (Y/Y × y/y, F1 Y/y with equal segregation; selfed F2 of 1/4 Y/Y, 1/4 Y/y, 1/4 Y/y, 1/4 y/y; cross with y/y giving 1/2 Y/y and 1/2 y/y).]
Figure 2-5 Mendel’s results (left) are explained by a single-gene model (right) that postulates the equal segregation of the members of a gene pair into gametes.
Note that, in defining the allele pairs that underlay his phenotypes, Mendel had identified a gene that radically affects pea color. This identification was not his prime interest, but we can see how finding single-gene inheritance patterns is a process of gene discovery, identifying individual genes that influence a biological property.
Key Concept All 1 : 1, 3 : 1, and 1 : 2 : 1 genetic ratios are diagnostic of single-gene inheritance and are based on equal segregation in a heterozygote.
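The reasoning of the single-gene model can be sketched as a short program. The following Python example is illustrative only: the function names gametes, cross, and phenotypes are hypothetical, and the sketch assumes exactly what the model assumes, namely equal segregation of the two alleles and random fusion of gametes, with Y dominant to y.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """Equal segregation: each member of the gene pair goes to half of the gametes."""
    return genotype.split("/")


def cross(parent1, parent2):
    """Genotype fractions among progeny, assuming random fusion of gametes."""
    combos = list(product(gametes(parent1), gametes(parent2)))
    # Sort each pair so that y/Y and Y/y are counted as the same genotype.
    counts = Counter("/".join(sorted(pair)) for pair in combos)
    return {genotype: Fraction(n, len(combos)) for genotype, n in counts.items()}


def phenotypes(genotype_fractions, dominant="Y"):
    """Collapse genotypes into phenotypes; one dominant allele is enough for yellow."""
    result = Counter()
    for genotype, fraction in genotype_fractions.items():
        label = "yellow" if dominant in genotype.split("/") else "green"
        result[label] += fraction
    return dict(result)


if __name__ == "__main__":
    f2 = cross("Y/y", "Y/y")          # F1 self: 1/4 Y/Y, 1/2 Y/y, 1/4 y/y
    print(f2, phenotypes(f2))         # phenotypes: 3/4 yellow, 1/4 green
    backcross = cross("Y/y", "y/y")   # F1 x green: 1/2 Y/y, 1/2 y/y
    print(backcross, phenotypes(backcross))
```

Enumerating the four equally likely gamete combinations is the same bookkeeping as the grid in Figure 2-5; exact fractions are used so that the 1 : 2 : 1 and 1 : 1 ratios appear without rounding.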
Mendel’s research in the mid-nineteenth century was not noticed by the international scientific community until similar observations were independently published by several other researchers in 1900. Soon research in many species of plants, animals, fungi, and algae showed that Mendel’s law of equal segregation was applicable to all sexual eukaryotes and, in all cases, was based on the chromosomal segregations taking place in meiosis, a topic that we turn to in the next section.
2.2 The Chromosomal Basis of Single-Gene Inheritance Patterns Mendel’s view of equal segregation was that the members of a gene pair segregated equally in gamete formation. He did not know about the subcellular events that take place when cells divide in the course of gamete formation. Now we understand that gene pairs are located on chromosome pairs and that it is the members of a chromosome pair that actually segregate, carrying the genes with them. The members of a gene pair are segregated as an inevitable consequence.
Single-gene inheritance in diploids When cells divide, so must the nucleus and its main contents, the chromosomes. To understand gene segregation, we must first understand and contrast the two types of nuclear divisions that take place in eukaryotic cells. When somatic (body) cells divide to increase their number, the accompanying nuclear division is called mitosis, a programmed stage of all eukaryotic cell-division cycles (Figure 2-6). Mitosis can take place in diploid or haploid cells. As a result, one progenitor cell becomes two genetically identical cells. Hence,
either 2n → 2n + 2n
or n → n + n
This “trick” of constancy is accomplished when each chromosome replicates to make two identical copies of itself, with underlying DNA replication. The two identical copies, which are often visually discernible, are called sister chromatids. Then, each copy is pulled to opposite ends of the cell. When the cell divides, each daughter cell has the same chromosomal set as its progenitor.
In addition, most eukaryotes have a sexual cycle, and, in these organisms, specialized diploid cells called meiocytes are set aside that divide to produce sex cells such as sperm and egg in plants and animals, or sexual spores in fungi or algae. Two sequential cell divisions take place, and the two nuclear divisions that accompany them are called meiosis. Because there are two divisions, four cells are produced from each progenitor cell. Meiosis takes place only in diploid cells, and the resulting gametes (sperm and eggs in animals and plants) are haploid. Hence, the net result of meiosis is
2n → n + n + n + n
This overall halving of chromosome number during meiosis is achieved through one replication and two divisions. As with mitosis, each chromosome replicates once, but in meiosis the replicated chromosomes (sister chromatids) remain attached. One of each of the replicated chromosome pairs is pulled to opposite ends of the cell, and division occurs. At the second division, the sister chromatids separate and are pulled to opposite ends of the cell.
[Figure 2-6 panel title: Stages of the asexual cell cycle. An original cell gives rise to daughter cells. Stages of the cell cycle: M = mitosis, S = DNA synthesis, G = gap.]
[Figure 2-7 panel title: Cell division in common life cycles. Panels: animal, plant, fungus. Each panel marks the diploid (2n) meiocytes, meiosis, the haploid (n) tetrads and gametes or sexual spores, the 2n zygote, and the mitotic divisions.]
The location of the meiocytes in animal, plant, and fungal life cycles is shown in Figure 2-7. The basic genetic features of mitosis and meiosis are summarized in Figure 2-8. To make comparison easier, both processes are shown in a diploid cell. Notice, again, that mitosis takes place in one cell division, and the two resulting “daughter” cells have the same genomic content as that of the “mother” (progenitor) cell. The first key process to note is a premitotic chromosome replication. At the DNA level, this stage is the synthesis, or S, phase (see Figure 2-6), at which the DNA is replicated. The replication produces pairs of identical sister chromatids, which become visible at the beginning of mitosis. When a cell divides, each member of a pair of sister chromatids is pulled into a daughter cell, where it assumes the role of a fully fledged chromosome. Hence, each daughter cell has the same chromosomal content as the original cell.
Before meiosis, as in mitosis, chromosome replication takes place to form sister chromatids, which become visible at meiosis. The centromere appears not to divide at this stage, whereas it does in mitosis. Also in contrast with mitosis, the homologous pairs of sister chromatids now unite to form a bundle of four homologous chromatids. This joining of the homologous pairs is called synapsis, and it relies on the properties of a macromolecular assemblage called the synaptonemal complex (SC), which runs down the center of the pair. Replicate sister chromosomes are together called a dyad (from the Greek word for “two”). The unit comprising the pair of synapsed dyads is called a bivalent. The four chromatids that make up a bivalent are called a tetrad (Greek for “four”), to indicate that there are four homologous units in the bundle.
[Diagram: a bivalent, consisting of two dyads with the SC between them; the four chromatids together are labeled as the tetrad.]
Figure 2-7 The life cycles of humans, plants, and fungi, showing the points at which mitosis and meiosis take place. Note that in the females of humans and many plants, three cells of the meiotic tetrad abort. The abbreviation n indicates a haploid cell, 2n a diploid cell; gp stands for gametophyte, the name of the small structure composed of haploid cells that will produce gametes. In many plants such as corn, a nucleus from the male gametophyte fuses with two nuclei from the female gametophyte, giving rise to a triploid (3n) cell, which then replicates to form the endosperm, a nutritive tissue that surrounds the embryo (which is derived from the 2n zygote).
Figure 2-8 Simplified representation of mitosis and meiosis in diploid cells (2n, diploid; n, haploid). (Detailed versions are shown in Appendix 2-1, page 83.) [Panel stages shown here. Mitosis: interphase (2n), replication (4n), prophase, metaphase. Meiosis: interphase (2n), replication (4n), prophase I, pairing, metaphase I.]
(A parenthetical note. The process of crossing over takes place at this tetrad stage. Crossing over changes the combinations of alleles of several different genes but does not directly affect single-gene inheritance patterns; therefore, we will postpone its detailed coverage until Chapter 4. For the present, it is worth noting that, apart from its allele-combining function, crossing over is also known to be a crucial event that must take place in order for proper chromosome segregation in the first meiotic division.) The bivalents of all chromosomes move to the cell’s equator, and, when the cell divides, one dyad moves into each new cell, pulled by spindle fibers attached near the centromeres. In the second cell division of meiosis, the centromeres divide and each member of a dyad (each member of a pair of chromatids) moves into a daughter cell. Hence, although the process starts with the same genomic content as that for mitosis, the two successive segregations result in four haploid cells. Each of the four haploid cells that constitute the four products of meiosis contains one member of a tetrad; hence, the group of four cells is sometimes called a tetrad, too. Meiosis can be summarized as follows:
Start: → two homologs
Replication: → two dyads
Pairing: → tetrad
First division: → one dyad to each daughter cell
Second division: → one chromatid to each daughter cell
[Figure 2-8, continued. Panel title: Key stages of meiosis and mitosis. Mitosis: anaphase, telophase, segregation into two 2n daughter cells. Meiosis: anaphase I, telophase I, prophase II, metaphase II, anaphase II, telophase II, with two segregations giving four n products of meiosis.]
Research in cell biology has shown that the spindle fibers that pull apart chromosomes are polymers of the molecule tubulin. The pulling apart is caused mainly by a depolymerization and hence shortening of the fibers at the point where they are attached to the chromosomes.
The behavior of chromosomes during meiosis clearly explains Mendel’s law of equal segregation. Consider a heterozygote of general type A/a. We can simply follow the preceding summary while considering what happens to the alleles of this gene:
Start: one homolog carries A and one carries a
Replication: one dyad is AA and one is aa
Pairing: tetrad is A/A/a/a
First-division products: one cell AA, the other cell aa (crossing over can mix these types of products up, but the overall ratio is not changed)
Second-division products: four cells, two of type A and two of type a
Hence, the products of meiosis from a heterozygous meiocyte A/a are 1/2 A and 1/2 a, precisely the equal ratio that is needed to explain Mendel’s first law.
[Diagram: meiosis of an A/a meiocyte, ending in products of which 1/2 are A and 1/2 are a.]
Note that we have focused on the broad genetic aspects of meiosis necessary to explain single-gene inheritance. More complete descriptions of the detailed stages of mitosis and meiosis are presented in Appendices 2-1 and 2-2 at the end of this chapter.
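The stepwise summary of meiosis for an A/a heterozygote can be traced in a few lines of Python. This is a minimal sketch rather than a model of real chromosome mechanics: the function name meiosis is hypothetical, only one gene is followed, and crossing over is ignored (the text notes that it does not change the overall ratio).

```python
def meiosis(meiocyte):
    """Follow the alleles of one gene through meiosis; return the four products."""
    # Start: two homologs, each carrying one allele of the gene pair.
    homolog1, homolog2 = meiocyte.split("/")
    # Replication: each homolog becomes a dyad of two identical sister chromatids.
    dyads = [(homolog1, homolog1), (homolog2, homolog2)]
    # Pairing: the two dyads together form a tetrad of four chromatids.
    tetrad = dyads[0] + dyads[1]
    assert len(tetrad) == 4
    # First division: one dyad goes to each daughter cell.
    first_division_cells = [dyads[0], dyads[1]]
    # Second division: the sister chromatids of each dyad separate.
    return [chromatid for cell in first_division_cells for chromatid in cell]


if __name__ == "__main__":
    products = meiosis("A/a")
    print(products)  # two products carry A and two carry a
    print(products.count("A") / len(products))
```

Running the same function on a homozygote such as A/A returns four identical products, which is why equal segregation cannot be detected genetically in homozygotes.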
Single-gene inheritance in haploids We have seen that the cellular basis of the law of equal segregation is the segregation of chromosomes in the first division of meiosis. In the discussion so far, the evidence for the equal segregation of alleles in meiocytes of both plants and animals is indirect, based on the observation that crosses show the appropriate ratios of progeny expected under equal segregation. Recognize that the gametes in these studies (such as Mendel’s) must have come from many different meiocytes. However, in some organisms, their special life cycle allows the examination of the products of one single meiocyte. These organisms are called haploids, of which good examples are most fungi and algae. They spend most of their lives in the haploid state but can mate, in the process forming a transient diploid cell that becomes the meiocyte. In some species, the four products of a single meiosis are temporarily held together in a type of sac. Baker’s yeast, Saccharomyces cerevisiae (a fungus), provides a good example (see the yeast Model Organism box in Chapter 12). In fungi, there are simple forms of sexes called mating types. In S. cerevisiae, there are two mating types, and a successful cross can only occur between strains of different mating types. Let’s look at a cross that includes a yeast mutant. Normal wild-type yeast colonies are white, but, occasionally, red mutants arise owing to a mutation in a gene in the biochemical pathway that synthesizes adenine. Let’s use the red mutant to investigate equal segregation in a single meiocyte. We can call the mutant allele r for red. What symbol can we use for the normal, or wild-type, allele? In experimental genetics, the wild-type allele for any gene is generally designated by a plus sign, +. This sign is attached as a superscript to the symbol invented for the mutant allele. Hence, the wild-type allele in this example would be designated r+, but a simple + is often used as shorthand. 
To see single-gene segregation, the red mutant is crossed with wild type. The cross would be
r+ × r
When two cells of opposite mating type fuse, a diploid cell is formed, and it is this cell that becomes the meiocyte. In the present example, the diploid meiocyte would be heterozygous, r+/r. Replication and segregation of r+ and r would give a tetrad of two meiotic products (spores) of genotype r+ and two of r, all contained within a membranous sac called an ascus. Hence,
r+/r → r+, r+, r, r (tetrad in ascus)
The details of the process are shown in Figure 2-9. If the four spores from one ascus are isolated (representing a tetrad of chromatids) and used to generate four yeast cultures, then equal segregation within one meiocyte is revealed directly as two white cultures and two red. If we analyzed the random spores from many meiocytes, we would find about 50 percent red and 50 percent white. Note the simplicity of haploid genetics: a cross requires the analysis of only one meiosis; in contrast, a diploid cross requires a consideration of meiosis in both the male and the female parent. This simplicity is an important reason for using haploids as model organisms. Another reason is that, in haploids, all alleles are expressed in the phenotype because there is no masking of recessives by dominant alleles on the other homolog.
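The contrast between analyzing one ascus and analyzing random spores can be illustrated with a small simulation. The sketch below is written in Python under stated assumptions: the names ascus, random_spores, and COLOR are hypothetical, every ascus is taken to contain exactly two r+ and two r spores, and one spore is drawn at random from each of many asci.

```python
import random

# Phenotype of a haploid culture grown from each spore genotype.
COLOR = {"r+": "white", "r": "red"}


def ascus(meiocyte="r+/r"):
    """Four spores from one meiosis of a heterozygous meiocyte: always 2 : 2."""
    allele1, allele2 = meiocyte.split("/")
    return [allele1, allele1, allele2, allele2]


def random_spores(n_meiocytes, rng):
    """Pool one randomly chosen spore from each of many asci."""
    return [rng.choice(ascus()) for _ in range(n_meiocytes)]


if __name__ == "__main__":
    # One ascus: equal segregation is seen directly as two white and two red cultures.
    print([COLOR[spore] for spore in ascus()])
    # Random spores from many meiocytes: the red fraction is only approximately one-half.
    rng = random.Random(0)
    spores = random_spores(10_000, rng)
    print(spores.count("r") / len(spores))
```

The single ascus always shows exactly 2 : 2, whereas the pooled sample fluctuates around 50 percent red, which is the difference between direct and indirect evidence for equal segregation described in the text.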
2.3 The Molecular Basis of Mendelian Inheritance Patterns
Of course, Mendel had no idea of the molecular nature of the concepts he was working with. In this section, we can begin putting some of Mendel’s concepts into a molecular context. Let’s begin with alleles. We have used the concept of alleles without defining them at the molecular level. What are the structural differences between wild-type and mutant alleles at the DNA level of a gene? What are the functional differences at the protein level? Mutant alleles can be used to study single-gene inheritance without needing to understand their structural or functional nature. However, because a primary reason for embarking on single-gene inheritance is ultimately to investigate a gene’s function, we must come to grips with the molecular nature of wild-type and mutant alleles at both the structural and the functional level.
[Figure 2-9 panel title: Demonstration of equal segregation within one meiocyte in the yeast S. cerevisiae. Haploid (n) r+ and r cultures are mixed to make the cross; the diploid (2n) meiocyte r+/r undergoes chromosome replication and two divisions; the four products of meiosis, in a 1 : 1 ratio of r+ : r, are held within the ascus wall; cells are inoculated to form colonies, which demonstrate single-gene segregation in one meiocyte.]
Figure 2-9 One ascus isolated from the cross + × r leads to two cultures of + and two of r.
Structural differences between alleles at the molecular level Mendel proposed that genes come in different forms we now call alleles. What are alleles at the molecular level? When alleles such as A and a are examined at the DNA level by using modern technology, they are generally found to be identical in most of their sequences and differ only at one or several nucleotides of the hundreds or thousands of nucleotides that make up the gene. Therefore, we see that the alleles are truly different versions of the same gene. The following diagram represents the DNA of two alleles of one gene; the letter x represents a difference in the nucleotide sequence:
[Diagram: the DNA of Allele 1 and Allele 2 drawn as parallel lines, with an x marking the site of the nucleotide difference.]
If the nucleotide sequence of an allele changes as the result of a rare chemical “accident,” a new mutant allele is created. Such changes can occur anywhere along the nucleotide sequence of a gene. For example, a mutation could be a change in the identity of a single nucleotide or the deletion of one or more nucleotides or even the addition of one or more nucleotides. There are many ways that a gene can be changed by mutation. For one thing, the mutational damage can occur at any one of many different sites. We can represent the situation as follows, where dark blue indicates the normal wild-type DNA sequence and red with the letter x represents the altered sequence:
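[Diagram: the wild-type allele A shown above several mutant alleles a, each with one or more sites marked x at different positions along the gene.]
[Figure 2-10 panel title: DNA molecules replicate to form identical chromatids. Chromatid formation is shown for a homozygous diploid b+/b+, a heterozygous diploid b+/b, a homozygous diploid b/b, a haploid b+, and a haploid b; b+ carries a GC base pair where b carries AT.]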
Molecular aspects of gene transmission Replication of alleles during the S phase What happens to alleles at the molecular level during cell division? We know that the primary genomic component of each chromosome is a DNA molecule. This DNA molecule is replicated during the S phase, which precedes both mitosis and meiosis. As we will see in Chapter 7, replication is an accurate process and so all the genetic information is duplicated, whether wild type or mutant. For example, if a mutation is the result of a change in a single nucleotide pair—say, from GC (wild type) to AT (mutant)—then in a heterozygote, replication will be as follows:
homolog GC → replication → chromatid GC and chromatid GC
homolog AT → replication → chromatid AT and chromatid AT
DNA replication before mitosis in a haploid and a diploid is shown in Figure 2-10. This type of illustration serves to remind us that, in our considerations of the mechanisms of inheritance, it is essentially DNA molecules that are being moved around in the dividing cells.
Figure 2-10 Each chromosome divides longitudinally into two chromatids (left); at the molecular level (right), the single DNA molecule of each chromosome replicates, producing two DNA molecules, one for each chromatid. Also shown are various combinations of a gene with wild-type allele b+ and mutant form b, caused by the change in a single base pair from GC to AT. Notice that, at the DNA level, the two chromatids produced when a chromosome replicates are always identical with each other and with the original chromosome.
Meiosis and mitosis at the molecular level The replication of DNA during the S phase produces two copies of each allele, A and a, that can now be segregated into separate cells. Nuclear division visualized at the DNA level is shown in Figure 2-11.
[Figure 2-11 panel title: Nuclear division at the DNA level. Left: mitosis in a haploid cell (S phase, chromatid formation, alignment on equator, chromatid segregation; daughter cells all A). Middle: mitosis in a diploid cell (same stages; daughter cells all A/a). Right: meiosis (S phase, chromatid formation, pairing of homologs at equator to form a tetrad, chromosome segregation at the end of the first division, chromatid segregation at the end of the second division; sex cells 1/2 A and 1/2 a).]
Figure 2-11 DNA and gene transmission in mitosis and meiosis in eukaryotes. The S phase and the main stages of mitosis and meiosis are shown. Mitotic divisions (left and middle) conserve the genotype of the original cell. At the right, the two successive meiotic divisions that take place during the sexual stage of the life cycle have the net effect of halving the number of chromosomes. The alleles A and a of one gene are used to show how genotypes are transmitted in cell division.
Demonstrating chromosome segregation at the molecular level We have interpreted single-gene phenotypic inheritance patterns in relation to the segregation of chromosomal DNA at meiosis. Is there any way to show DNA segregation directly (as opposed to phenotypic segregation)? The most straightforward approach would be to sequence the alleles (say, A and a) in the parents and the meiotic products: the result would be that one-half of the products would have the A DNA sequence and one-half would have the a DNA sequence. The same would be true for any DNA sequence that differed in the inherited chromosomes, including those not necessarily inside alleles correlated with known phenotypes such as red and white flowers. Thus, we see the rules of segregation enunciated by Mendel apply not only to genes but to any stretch of DNA along a chromosome.
Key Concept Mendelian inheritance is shown by any segment of DNA on a chromosome: by genes and their alleles and by molecular markers not necessarily associated with any biological function.
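The idea of following segregation at the DNA level can be made concrete with a toy example. The Python sketch below compares two allele sequences and then checks which sequence each meiotic product carries. The sequences WILD_TYPE and MUTANT are invented for illustration and do not come from any real gene, and the function names are hypothetical; the sketch handles only single-nucleotide substitutions such as the GC to AT change discussed above.

```python
def differences(allele1, allele2):
    """Positions (0-based) at which two equal-length allele sequences differ."""
    if len(allele1) != len(allele2):
        raise ValueError("this sketch compares substitutions only, not insertions or deletions")
    return [(i, a, b) for i, (a, b) in enumerate(zip(allele1, allele2)) if a != b]


def meiotic_products(homolog1, homolog2):
    """Each homolog replicates once, so each sequence ends up in two of the four products."""
    return [homolog1, homolog1, homolog2, homolog2]


# Hypothetical sequences, invented for illustration; they differ at one nucleotide.
WILD_TYPE = "ATGGCCTTGACG"
MUTANT = "ATGGCCTTAACG"

if __name__ == "__main__":
    print(differences(WILD_TYPE, MUTANT))  # one difference: G in wild type, A in mutant
    products = meiotic_products(WILD_TYPE, MUTANT)
    print(products.count(WILD_TYPE) / len(products))  # half the products carry the wild-type sequence
```

The same comparison would apply to any stretch of DNA that differs between the two homologs, whether or not the difference lies in a gene with a known phenotype.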
Alleles at the molecular level At the molecular level, the primary phenotype of a gene is the protein it produces. What are the functional differences between proteins that explain the different effects of wild-type and mutant alleles on the properties of an organism? Let’s explore the topic by using the human disease phenylketonuria (PKU). We shall see in a later section on pedigree analysis that the PKU phenotype is inherited as a Mendelian recessive. The disease is caused by a defective allele of the gene that encodes the liver enzyme phenylalanine hydroxylase (PAH). This enzyme normally converts phenylalanine in food into the amino acid tyrosine:
phenylalanine → tyrosine (reaction catalyzed by phenylalanine hydroxylase)
However, a mutation in the gene encoding this enzyme may alter the amino acid sequence in the vicinity of the enzyme’s active site. In this case, the enzyme cannot bind phenylalanine (its substrate) or convert it into tyrosine. Therefore, phenylalanine builds up in the body and is converted instead into phenylpyruvic acid. This compound interferes with the development of the nervous system, leading to mental retardation.
[Diagram: phenylalanine is converted into tyrosine by phenylalanine hydroxylase; by the alternative route, phenylalanine is converted into phenylpyruvic acid.]
Babies are now routinely tested for this processing deficiency at birth. If the deficiency is detected, phenylalanine can be withheld with the use of a special diet and the development of the disease arrested. The PAH enzyme is made up of a single type of protein. What changes have occurred in the mutant form of the PKU gene’s DNA, and how can such change at the DNA level affect protein function and produce the disease phenotype? Sequencing of the mutant alleles from many PKU patients has revealed a plethora of mutations at different sites along the gene, mainly in the protein-encoding regions, or the exons; the results are summarized in Figure 2-12. They represent a range of DNA changes, but most are small changes affecting only one nucleotide pair among the thousands that constitute the gene. What these alleles have in common is that they encode a defective protein that no longer has normal PAH activity. By changing one or more amino acids, the mutations all inactivate some essential part of the protein encoded by the gene. The effect of the mutation on the function of the gene depends on where within the gene the mutation occurs. An important functional region of the gene is that encoding an enzyme’s active site; so this region is very sensitive to mutation. In addition, a minority of mutations are found to be in introns, and these mutations often prevent the normal processing of the primary RNA transcript.
[Figure 2-12 diagram: Mutant sites in the PKU gene, with exon mutation counts shown above the gene and intron mutation counts below it.]
Some of the general consequences of mutation at the protein level are shown in Figure 2-13. Many of the mutant alleles are of a type generally called null alleles: the proteins encoded by them completely lack PAH function. Other mutant alleles reduce the level of enzyme function; they are sometimes called leaky mutations, because some wild-type function seems to “leak” into the mutant phenotype. DNA sequencing often detects changes that have no functional impact at all, so they are functionally wild type. Hence, we see that the terms wild type and mutant sometimes have to be used carefully.

Key Concept Most mutations that alter phenotype alter the amino acid sequence of the gene’s protein product, resulting in reduced or absent function.

Figure 2-12 Many mutations of the human phenylalanine hydroxylase gene that cause enzyme malfunction are known. The number of mutations in the exons, or protein-encoding regions (black), are listed above the gene. The number of mutations in the intron regions (green, numbered 1 through 13) that alter splicing are listed below the gene. [Data from C. R. Scriver, Ann. Rev. Genet. 28, 1994, 141–165.]
Figure 2-13 Gene sites sensitive to mutation. Mutations in the parts of a gene encoding enzyme active sites lead to enzymes that do not function (null mutations). Mutations elsewhere in the gene may have no effect on enzyme function (silent mutations). Promoters are sites important in transcription initiation. The diagram shows a wild-type gene (5′ promoter, exons, intron, 3′), the components of the protein active site that it encodes, and six mutant sites: m1, m2, m3, and m6 are null, m4 is leaky, and m5 is silent.

We have been pursuing the idea that finding a set of genes that impinge on the biological property under investigation is an important goal of genetics, because it defines the components of the system. However, finding the precise way in which mutant alleles lead to mutant phenotypes is often challenging, requiring not only the identification of the protein products of these genes, but also detailed cellular and physiological studies to measure the effects of the mutations. Furthermore,
finding how the set of genes interacts is a second level of challenge and a topic that we will pursue later, starting in Chapter 6. Dominance and recessiveness With an understanding of how genes function through their protein products, we can better understand dominance and recessiveness. Dominance was defined earlier in this chapter as the phenotype shown by a heterozygote. Hence, formally, it is the phenotype that is dominant or recessive, but, in practice, geneticists more often apply the term to alleles. This formal definition has no molecular content, but both dominance and recessiveness can have simple explanations at the molecular level. We introduce the topic here, to be revisited in Chapter 6. How can alleles be dominant? How can they be recessive? Recessiveness is observed in null mutations in genes that are functionally haplosufficient, loosely meaning that one gene copy has enough function to produce a wild-type phenotype. Although a wild-type diploid cell normally has two fully functional copies of a gene, one copy of a haplosufficient gene provides enough gene product (generally a protein) to carry out the normal transactions of the cell. In a heterozygote (say, +/m, where m is a null), the single functional copy encoded by the + allele provides enough protein product for normal cellular function. In a simple example, assume a cell needs a minimum of 10 protein units to function normally. Each wild-type allele can produce 12 units. Hence, a homozygous wild type +/+ will produce 24 units. The heterozygote +/m will produce 12 units, in excess of the 10-unit minimum, and hence the mutant allele is recessive as it has no impact in the heterozygote. Other genes are haploinsufficient. In such cases, a null mutant allele will be dominant because, in a heterozygote (+/P ), the single wild-type allele cannot provide enough product for normal function. 
As another example, let’s assume the cell needs a minimum of 20 units of this protein, and the wild-type allele produces only 12 units. A homozygous wild type +/+ makes 24 units, which is over the minimum. However, a heterozygote involving a null mutation (+/P ) produces only 12; hence, the presence of the mutant allele in the heterozygote results in an inadequate supply of product and a mutant phenotype ensues. In some cases, mutation results in a new function for the gene. Such mutations can be dominant because, in a heterozygote, the wild-type allele cannot mask this new function. From the above brief considerations, we see that phenotype, the description or measurement that we track during Mendelian inheritance, is an emergent property based on the nature of alleles and the way in which the gene functions normally and abnormally. The same can be said for the descriptions dominant and recessive that we apply to a phenotype.
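The two numerical examples above (a 10-unit minimum and a 20-unit minimum, with 12 units per wild-type allele) follow the same threshold rule, which can be written out as a minimal sketch. The function name and the assumptions that a null allele contributes 0 units and that output is simply additive are illustrative simplifications, not part of the text's formal definitions.

```python
def null_allele_dominance(units_per_wild_type_allele, minimum_units_needed):
    """Classify a null mutant allele under a simple additive threshold model.

    Assumptions (illustrative only): a null allele makes 0 units of protein,
    and the two alleles of a diploid contribute additively.
    """
    homozygous_wild_type = 2 * units_per_wild_type_allele  # +/+
    heterozygote = units_per_wild_type_allele              # +/null

    if homozygous_wild_type < minimum_units_needed:
        raise ValueError("even +/+ is below the minimum; the model does not apply")

    if heterozygote >= minimum_units_needed:
        # One functional copy is enough: the heterozygote looks wild type.
        return "recessive (haplosufficient gene)"
    # One functional copy is not enough: the heterozygote looks mutant.
    return "dominant (haploinsufficient gene)"


first_example = null_allele_dominance(12, 10)   # 12 units >= 10 needed
second_example = null_allele_dominance(12, 20)  # 12 units < 20 needed
print(first_example)
print(second_example)
```

Under these assumptions the same allele (a null) is classified differently depending only on how much product the cell needs, which is the point the paragraph makes about dominance being an emergent property.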
2.4 Some Genes Discovered by Observing Segregation Ratios Recall that one general aim of genetic analysis today is to dissect a biological property by discovering the set of single genes that affect it. We learned that an important way to identify these genes is by the phenotypic segregation ratios generated by their mutations—most often 1 : 1 and 3 : 1 ratios, both of which are based on equal segregation as defined by Gregor Mendel. Let’s look at some examples that extend the Mendelian approach into a modern experimental setting. Typically, the researcher is confronted by an array of interesting mutant phenotypes that affect the property of interest (such as those depicted in Figure 2-1) and now needs to know whether they are inherited as single-mutant alleles. Mutant alleles can be either dominant or recessive, depending on their action; so the question of dominance also needs to be considered in the analysis.
The standard procedure is to cross the mutant with wild type. (If the mutant is sterile, then another approach is needed.) First, we will consider three simple cases that cover most of the possible outcomes:

1. A fertile flower mutant with no pigment in the petals (for example, white petaled in contrast with the normal red)
2. A fertile fruit-fly mutant with short wings
3. A fertile mold mutant that produces excess hyphal branches (hyperbranching)
A gene active in the development of flower color

To begin the process, the white-flowered plant is crossed with the normal wild-type red. All the F1 plants are red flowered, and, of 500 F2 plants sampled, 378 are red flowered and 122 are white flowered. If we acknowledge the existence of sampling error, these F2 numbers are very close to a 3/4 : 1/4, or 3 : 1, ratio. Because this ratio indicates single-gene inheritance, we can conclude that the mutant is caused by a recessive alteration in a single gene. According to the general rules of gene nomenclature, the mutant allele for white petals might be called alb for albino and the wild-type allele would be alb+ or just +. (The conventions for allele nomenclature vary somewhat among organisms: some of the variations are shown in Appendix A on nomenclature.) We surmise that the wild-type allele plays an essential role in producing the colored petals of the plant, a property that is almost certainly necessary for attracting pollinators to the flower. The gene might be implicated in the biochemical synthesis of the pigment or in the part of the signaling system that tells the cells of the flower to start making pigment or in a number of other possibilities that require further investigation. At the purely genetic level, the crosses made would be represented symbolically as

P    +/+ × alb/alb
F1   all +/alb
F2   1/4 +/+
     1/2 +/alb
     1/4 alb/alb

or graphically as Punnett-square grids (see also Figure 2-5). This type of grid, showing gametes and gametic fusions, is called a Punnett square, named after an early geneticist, Reginald C. Punnett. Punnett squares are useful devices for explaining genetic ratios. We shall encounter more in later discussions.

[Punnett squares for this cross: all F1 are red; 3/4 of F2 are red and 1/4 are white.]
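The Punnett-square reasoning for the flower cross can be enumerated mechanically. The sketch below is one way to do so, assuming genotypes are written as `allele/allele` strings and that each parent passes on either allele with probability 1/2 (equal segregation); the function name `cross` and the string convention are illustrative choices, not notation from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Return {genotype: expected fraction} for a single-gene cross.

    Each parent genotype is a string such as '+/alb'. Every pairing of one
    gamete from each parent is one cell of the Punnett square.
    """
    gametes1 = parent1.split("/")
    gametes2 = parent2.split("/")
    cells = Counter()
    for g1, g2 in product(gametes1, gametes2):
        # Sort so that '+/alb' and 'alb/+' are counted as the same genotype.
        cells["/".join(sorted([g1, g2]))] += 1
    total = sum(cells.values())
    return {genotype: Fraction(n, total) for genotype, n in cells.items()}


f1 = cross("+/+", "alb/alb")   # parental cross
f2 = cross("+/alb", "+/alb")   # F1 selfed or intercrossed

# A plant is red if it carries at least one wild-type (+) allele.
red_fraction = sum(frac for g, frac in f2.items() if "+" in g.split("/"))
expected_red = red_fraction * 500          # compare with 378 observed
expected_white = (1 - red_fraction) * 500  # compare with 122 observed
print(f1, f2, red_fraction, expected_red, expected_white)
```

With 500 F2 plants, the expected counts under this model are 375 red and 125 white, against the 378 and 122 reported in the text.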
A gene for wing development

In the fruit-fly example, the cross of the mutant short-winged fly with wild-type long-winged stock yielded 788 progeny, classified as follows:

196 short-winged males
194 short-winged females
197 long-winged males
201 long-winged females

In total, there are 390 short- and 398 long-winged progeny, very close to a 1 : 1 ratio. The ratio is the same within males and females, again within the bounds of sampling error. Hence, from these results, the “short wings” mutant was very likely produced by a dominant mutation. Note that, for a dominant mutation to be expressed, only a single “dose” of mutant allele is necessary; so, in most cases,
when the mutant first shows up in the population, it will be in the heterozygous state. (This is not true for a recessive mutation such as that in the preceding plant example, which must be homozygous to be expressed and must have come from the selfing of an unidentified heterozygous plant in the preceding generation.) When long-winged progeny were interbred, all of their progeny were long winged, as expected of a recessive wild-type allele. When the short-winged progeny were interbred, their progeny showed a ratio of three-fourths short to one-fourth long. Dominant mutations are represented by uppercase letters or words: in the present example, the mutant allele might be named SH, standing for “short.” Then the crosses would be represented symbolically as

P    +/+ × SH/+
F1   1/2 +/+
     1/2 SH/+

F1   +/+ × +/+
     all +/+

F1   SH/+ × SH/+
     1/4 SH/SH
     1/2 SH/+
     1/4 +/+

or graphically as Punnett squares. This analysis of the fly mutant identifies a gene that is part of a subset of genes that, in wild-type form, are crucial for the normal development of a wing. Such a result is the starting point of further studies that would focus on the precise developmental and cellular ways in which the growth of the wing is arrested, which, once identified, reveal the time of action of the wild-type allele in the course of development.
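The text judges 390 : 398 to be "very close" to 1 : 1 by eye. One common way to put a number on that closeness, which this section itself does not use, is the chi-square goodness-of-fit statistic. The sketch below is an illustration under that assumption; the function name is hypothetical, and 3.841 is the standard 5 percent critical value for one degree of freedom.

```python
def chi_square_statistic(observed, expected_ratio):
    """Sum of (observed - expected)^2 / expected over the phenotype classes.

    expected_ratio is given in ratio parts, e.g. [1, 1] or [3, 1].
    """
    total = sum(observed)
    parts = sum(expected_ratio)
    expected = [total * r / parts for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 5 percent critical value for 1 degree of freedom (two phenotype classes).
CRITICAL_VALUE_1DF = 3.841

fly_stat = chi_square_statistic([390, 398], [1, 1])     # short : long wings
flower_stat = chi_square_statistic([378, 122], [3, 1])  # red : white petals

for name, stat in [("fly 1:1", fly_stat), ("flower 3:1", flower_stat)]:
    verdict = "consistent" if stat < CRITICAL_VALUE_1DF else "not consistent"
    print(f"{name}: chi-square = {stat:.3f}, {verdict} with the expected ratio")
```

Both statistics fall far below the critical value, which agrees with the text's informal judgment that the deviations are within sampling error.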
A gene for hyphal branching
A hyperbranching fungal mutant (such as the button-like colony in Figure 2-1) was crossed with a wild-type fungus with normal sparse branching. In a sample of 300 progeny, 152 were wild type and 148 were hyperbranching, very close to a 1 : 1 ratio. We infer from this single-gene inheritance ratio that the hyperbranching mutation is of a single gene. In haploids, assigning dominance is usually not possible, but, for convenience, we can call the hyperbranching allele hb and the wild type hb+ or +. The cross must have been

P                   + × hb
Diploid meiocyte    +/hb
F1                  1/2 +
                    1/2 hb
The mutation and inheritance analysis has uncovered a gene whose wild-type allele is essential for normal control of branching, a key function in fungal dispersal and nutrient acquisition. Now the mutant needs to be investigated to see the location in the normal developmental sequence at which the mutant produces a block. This information will reveal the time and place in the cells at which the normal allele acts. Sometimes, the severity of a mutant phenotype renders the organism sterile, unable to go through the sexual cycle. How can the single-gene inheritance of
sterile mutants be demonstrated? In a diploid organism, a sterile recessive mutant can be propagated as a heterozygote and then the heterozygote can be selfed to produce the expected 25 percent homozygous recessive mutants for study. A sterile dominant mutant is a genetic dead end and cannot be propagated sexually, but, in plants and fungi, such a mutant can be easily propagated asexually. What if a cross between a mutant and a wild type does not produce a 3 : 1 or a 1 : 1 ratio as discussed here, but some other ratio? Such a result can be due to the interactions of several genes or to an environmental effect. Some of these possibilities are discussed in Chapter 6.
Predicting progeny proportions or parental genotypes by applying the principles of single-gene inheritance

We can summarize the direction of analysis of gene discovery as follows:

Observe phenotypic ratios in progeny → Deduce genotypes of parents (A/A, A/a, or a/a)

However, the same principle of inheritance (essentially Mendel’s law of equal segregation) can also be used to predict phenotypic ratios in the progeny of parents of known genotypes. These parents would be from stocks maintained by the researcher. The types and proportions of the progeny of crosses such as A/A × A/a, A/A × a/a, A/a × A/a, and A/a × a/a can be easily predicted. In summary,

Cross parents of known genotypes → Predict phenotypic ratios in progeny

This type of analysis is used in general breeding to synthesize genotypes for research or for agriculture. It is also useful in predicting likelihoods of various outcomes in human matings in families with histories of single-gene diseases.

After single-gene inheritance has been established, an individual showing the dominant phenotype but of unknown genotype can be tested to see if the genotype is homozygous or heterozygous. Such a test can be performed by crossing the individual (of phenotype A/?) with a recessive tester strain a/a. If the individual is heterozygous, a 1 : 1 ratio will result (1/2 A/a and 1/2 a/a); if the individual is homozygous, all progeny will show the dominant phenotype (all A/a). In general, the cross of an individual of unknown heterozygosity (for one gene or more) with a fully recessive parent is called a testcross, and the recessive individual is called a tester. We will encounter testcrosses many times throughout subsequent chapters; they are very useful in deducing the meiotic events taking place in more complex genotypes such as dihybrids and trihybrids. The use of a fully recessive tester means that meiosis in the tester parent can be ignored because all of its gametes are recessive and do not contribute to the phenotypes of the progeny. An alternative test for heterozygosity (useful if a recessive tester is not available and the organism can be selfed) is simply to self the unknown: if the organism being tested is heterozygous, a 3 : 1 ratio will be found in the progeny. Such tests are useful and common in routine genetic analysis.

Key Concept The principles of inheritance (such as the law of equal segregation) can be applied in two directions: (1) inferring genotypes from phenotypic ratios and (2) predicting phenotypic ratios from parents of known genotypes.
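Both directions of analysis can be sketched in a few lines. The first function predicts phenotype fractions from known parental genotypes; the second reads a testcross result backward to the genotype of the unknown parent. The function names are hypothetical, A is assumed fully dominant to a, and the "chance of being misled" calculation is an added illustration: a heterozygote could by chance produce no recessive progeny, with probability (1/2)^n for n progeny.

```python
from fractions import Fraction


def phenotype_fractions(parent1, parent2, dominant_allele="A"):
    """Predict phenotype fractions for a single-gene cross (A dominant to a)."""
    result = {"dominant": Fraction(0), "recessive": Fraction(0)}
    for gamete1 in parent1.split("/"):
        for gamete2 in parent2.split("/"):
            # Each of the four gamete pairings has probability 1/4.
            shows_dominant = dominant_allele in (gamete1, gamete2)
            result["dominant" if shows_dominant else "recessive"] += Fraction(1, 4)
    return result


def interpret_testcross(n_dominant, n_recessive):
    """Infer the genotype of an A/? parent from its testcross (x a/a) progeny."""
    if n_recessive > 0:
        # Any recessive progeny proves the unknown parent carried an a allele.
        return "A/a", Fraction(0)
    # No recessive progeny: consistent with A/A, but an A/a parent would give
    # this result by chance with probability (1/2)^n.
    n = n_dominant
    return "A/A", Fraction(1, 2) ** n


testcross_het = phenotype_fractions("A/a", "a/a")  # heterozygous unknown
testcross_hom = phenotype_fractions("A/A", "a/a")  # homozygous unknown
selfed_het = phenotype_fractions("A/a", "A/a")     # alternative test: selfing
print(testcross_het, testcross_hom, selfed_het)
print(interpret_testcross(10, 0))
```

The residual probability returned in the second case shrinks quickly as more progeny are scored, which is why a testcross with a reasonable number of progeny is treated as decisive.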
2.5 Sex-Linked Single-Gene Inheritance Patterns The chromosomes that we have been analyzing so far are autosomes, the “regular” chromosomes that form most of the genomic set. However, many animals and plants have a special pair of chromosomes associated with sex. The sex
chromosomes also segregate equally, but the phenotypic ratios seen in progeny are often different from the autosomal ratios.
Sex chromosomes

Most animals and many plants show sexual dimorphism; in other words, individuals are either male or female. In most of these cases, sex is determined by a special pair of sex chromosomes. Let’s look at humans as an example. Human body cells have 46 chromosomes: 22 homologous pairs of autosomes plus 2 sex chromosomes. Females have a pair of identical sex chromosomes called the X chromosomes. Males have a nonidentical pair, consisting of one X and one Y. The Y chromosome is considerably shorter than the X. Hence, if we let A represent autosomal chromosomes, we can write

females = 44A + XX
males = 44A + XY

At meiosis in females, the two X chromosomes pair and segregate like autosomes, and so each egg receives one X chromosome. Hence, with regard to sex chromosomes, the gametes are of only one type and the female is said to be the homogametic sex. At meiosis in males, the X and the Y chromosomes pair over a short region, which ensures that the X and Y separate so that there are two types of sperm, half with an X and the other half with a Y. Therefore, the male is called the heterogametic sex.

The inheritance patterns of genes on the sex chromosomes are different from those of autosomal genes. Sex-chromosome inheritance patterns were first investigated in the early 1900s in the laboratory of the great geneticist Thomas Hunt Morgan, using the fruit fly Drosophila melanogaster (see the Model Organism box on page 56). This insect has been one of the most important research organisms in genetics; its short, simple life cycle contributes to its usefulness in this regard. Fruit flies have three pairs of autosomes plus a pair of sex chromosomes, again referred to as X and Y. As in mammals, Drosophila females have the constitution XX and males are XY. However, the mechanism of sex determination in Drosophila differs from that in mammals.
In Drosophila, the number of X chromosomes in relation to the autosomes determines sex: two X’s result in a female, and one X results in a male. In mammals, the presence of the Y chromosome determines maleness and the absence of a Y determines femaleness. However, it is important to note that, despite this somewhat different basis for sex determination, the single-gene inheritance patterns of genes on the sex chromosomes are remarkably similar in Drosophila and mammals. Vascular plants show a variety of sexual arrangements. Dioecious species are those showing animal-like sexual dimorphism, with female plants bearing flowers containing only ovaries and male plants bearing flowers containing only anthers (Figure 2-14). Some, but not all, dioecious plants have a nonidentical pair of chromosomes associated with (and almost certainly determining) the sex of the plant. Of the species with nonidentical sex chromosomes, a large proportion have an XY system. For example, the dioecious plant Melandrium album has 22 chromosomes per cell: 20 autosomes plus 2 sex chromosomes, with XX females and XY males. Other dioecious plants have no visibly different pair of chromosomes; they may still have sex chromosomes but not visibly distinguishable types.
Sex-linked patterns of inheritance Cytogeneticists divide the X and Y chromosomes into homologous and differential regions. Again, let’s use humans as an example (Figure 2-15). The differential regions, which contain most of the genes, have no counterparts on the other sex
chromosome. Hence, in males, the genes in the differential regions are said to be hemizygous (“half zygous”). The differential region of the X chromosome contains many hundreds of genes; most of these genes do not take part in sexual function, and they influence a great range of human properties. The Y chromosome contains only a few dozen genes. Some of these genes have counterparts on the X chromosome, but most do not. The latter type take part in male sexual function. One of these genes, SRY, determines maleness itself. Several other genes are specific for sperm production in males. In general, genes in the differential regions are said to show inheritance patterns called sex linkage. Mutant alleles in the differential region of the X chromosome show a single-gene inheritance pattern called X linkage. Mutant alleles of the few genes in the differential region of the Y chromosome show Y linkage. A gene that is sex linked can show phenotypic ratios that are different in each sex. In this respect, sex-linked inheritance patterns contrast with the inheritance patterns of genes in the autosomes, which are the same in each sex. If the genomic location of a gene is unknown, a sex-linked inheritance pattern indicates that the gene lies on a sex chromosome. The human X and Y chromosomes have two short homologous regions, one at each end (see Figure 2-15). In the sense that these regions are homologous, they are autosomal-like, and so they are called pseudoautosomal regions 1 and 2. One or both of these regions pairs in meiosis and undergoes crossing over (see Chapter 4 for details of crossing over). For this reason, the X and the Y chromosomes can act as a pair and segregate into equal numbers of sperm.
Figure 2-14 Male and female plants. Examples of two dioecious plant species are (a) Osmaronia dioica and (b) Aruncus dioicus. Female flowers contain ovaries only; male flowers contain anthers only. [(a) Leslie Bohm; (b) Anthony Griffiths.]

Figure 2-15 Human sex chromosomes. Human sex chromosomes contain a differential region and two pairing regions. The regions were located by observing where the chromosomes paired up in meiosis and where they did not. Labeled on the X and Y chromosomes are pseudoautosomal region 1, the maleness gene SRY, the differential region of the X (X-linked genes), the differential region of the Y (Y-linked genes), the centromere, and pseudoautosomal region 2.

Model Organism: Drosophila

Drosophila melanogaster was one of the first model organisms to be used in genetics. It is readily available from ripe fruit, has a short life cycle, and is simple to culture and cross. Sex is determined by X and Y sex chromosomes (XX = female, XY = male), and males and females are easily distinguished. Mutant phenotypes regularly arise in lab populations, and their frequency can be increased by treatment with mutagenic radiation or chemicals. It is a diploid organism, with four pairs of homologous chromosomes (2n = 8). In salivary glands and certain other tissues, multiple rounds of DNA replication without chromosomal division result in “giant chromosomes,” each with a unique banding pattern that provides geneticists with landmarks for the study of chromosome mapping and rearrangement. There are many species and races of Drosophila, which have been important raw material for the study of evolution.

Time flies like an arrow; fruit flies like a banana. (Groucho Marx)

[Figure: Life cycle of Drosophila melanogaster, running from egg through the first, second, and third instars and the pupa to the adult.]

[Photo: Drosophila melanogaster, the common fruit fly. © blickwinkel/Alamy.]

X-linked inheritance

For our first example of X linkage, we turn to eye color in Drosophila. The wild-type eye color of Drosophila is dull red, but pure lines with white eyes are available (Figure 2-16). This phenotypic difference is determined by two alleles of a gene located on the differential region of the X chromosome. The mutant allele in the present case is w for white eyes (the lowercase letter indicates that the allele is recessive), and the corresponding wild-type allele is w+. When white-eyed males are crossed with red-eyed females, all the F1 progeny have red eyes, suggesting that the allele for white eyes is recessive. Crossing these red-eyed F1 males and females produces a 3 : 1 F2 ratio of red-eyed to white-eyed flies, but all the white-eyed flies are males. This inheritance pattern, which shows a clear difference between the sexes, is explained in Figure 2-17. The basis of the inheritance pattern is that all the F1 flies receive a wild-type allele from their mothers, but the F1 females also receive a white-eye allele from their fathers. Hence, all F1 females are heterozygous wild type (w+/w), and the F1 males are hemizygous wild type (w+). The F1 females pass on the white-eye allele to half their sons, who express it, and to half their daughters, who do not express it, because they must inherit the wild-type allele from their fathers.

Figure 2-16 White-eyed and red-eyed Drosophila. The red-eyed fly is wild type, and the white-eyed fly is a mutant. [Science Source/Getty Images.]

The reciprocal cross gives a different result; that is, the cross between white-eyed females and red-eyed males gives an F1 in which all the females are red eyed but all the males are white eyed. In this case, every female inherited the dominant w+ allele from the father’s X chromosome, whereas every male inherited the recessive w allele from its mother.
Figure 2-17 An example of X-linked inheritance. Reciprocal crosses between red-eyed (red) and white-eyed (white) Drosophila give different results. The alleles are X linked, and the inheritance of the X chromosome explains the phenotypic ratios observed, which are different from those of autosomal genes. (In Drosophila and many other experimental systems, a superscript plus sign is used to designate the normal, or wild-type, allele. Here, w+ encodes red eyes and w encodes white eyes.)

First cross: P, red female × white male; F1, red females and red males; F2, 1/2 red females, 1/4 red males, and 1/4 white males.
Second cross: P, white female × red male; F1, red females and white males; F2, 1/4 red females, 1/4 white females, 1/4 red males, and 1/4 white males.
The F2 consists of one-half red-eyed and one-half white-eyed flies of both sexes. Hence, in sex linkage, we see examples not only of different ratios in different sexes, but also of differences between reciprocal crosses.

Note that Drosophila eye color has nothing to do with sex determination, and so we have an illustration of the principle that genes on the sex chromosomes are not necessarily related to sexual function. The same is true in humans: in the discussion of pedigree analysis later in this chapter, we shall see many X-linked genes, yet few could be construed as being connected to sexual function.

The abnormal allele associated with white eye color in Drosophila is recessive, but abnormal alleles of genes on the X chromosome that are dominant also arise, such as the Drosophila mutant hairy wing (Hw). In such cases, the wild-type allele (Hw+) is recessive. The dominant abnormal alleles show the inheritance pattern corresponding to that of the wild-type allele for red eyes in the preceding example. The ratios obtained are the same.

Key Concept Sex-linked inheritance regularly shows different phenotypic ratios in the two sexes of progeny, as well as different ratios in reciprocal crosses.
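The reciprocal-cross results for white eyes can be reproduced by following the X chromosomes through each generation. The sketch below is illustrative: the function name and the way genotypes are encoded (a pair of X alleles for the mother, a single X allele for the hemizygous father) are assumptions made for the example, and w+ is taken to be fully dominant to w.

```python
from collections import Counter
from fractions import Fraction


def x_linked_cross(mother, father_x):
    """Return {(sex, eye color): fraction} for an X-linked eye-color cross.

    mother   : tuple of her two X-linked alleles, e.g. ('w+', 'w')
    father_x : the single allele on his X chromosome, e.g. 'w'
    """
    outcome = Counter()
    for egg in mother:                    # each maternal X: probability 1/2
        for sperm in (father_x, "Y"):     # X-bearing or Y-bearing: 1/2 each
            if sperm == "Y":
                sex, alleles = "male", (egg,)        # sons are hemizygous
            else:
                sex, alleles = "female", (egg, sperm)
            color = "red" if "w+" in alleles else "white"
            outcome[(sex, color)] += Fraction(1, 4)
    return dict(outcome)


# First cross: red female x white male, then F1 female (w+/w) x F1 male (w+).
first_f1 = x_linked_cross(("w+", "w+"), "w")
first_f2 = x_linked_cross(("w+", "w"), "w+")

# Reciprocal cross: white female x red male, then F1 female (w+/w) x F1 male (w).
second_f1 = x_linked_cross(("w", "w"), "w+")
second_f2 = x_linked_cross(("w+", "w"), "w")

for label, result in [("first F1", first_f1), ("first F2", first_f2),
                      ("reciprocal F1", second_f1), ("reciprocal F2", second_f2)]:
    print(label, result)
```

The enumeration gives a 3 : 1 F2 with all white-eyed flies male in the first cross, and equal numbers of red and white in both sexes in the reciprocal cross, matching the pattern described in the text.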
Historically, in the early decades of the twentieth century, the demonstration by Morgan of X-linked inheritance of white eyes in Drosophila was a key piece of evidence that suggested that genes are indeed located on chromosomes, because an inheritance pattern was correlated with one specific chromosome pair. The idea became known as “the chromosome theory of inheritance.” At that period in history, it had recently been shown that, in many organisms, sex is determined by an X and a Y chromosome and that, in males, these chromosomes segregate equally at meiosis to regenerate equal numbers of males and females in the next generation. Morgan recognized that the inheritance of alleles of the eye-color gene is exactly parallel to the inheritance of X chromosomes at meiosis; hence, the gene was likely to be on the X chromosome. The inheritance of white eyes was extended to Drosophila lines that had abnormal numbers of sex chromosomes. With the use of this novel situation, it was still possible to predict gene-inheritance patterns from the segregation of the abnormal chromosomes. That these predictions proved correct was a convincing test of the chromosome theory. Other genetic analyses revealed that, in chickens and moths, sex-linked inheritance could be explained only if the female was the heterogametic sex. In these organisms, the female sex chromosomes were designated ZW and males were designated ZZ.
2.6 Human Pedigree Analysis Human matings, like those of experimental organisms, provide many examples of single-gene inheritance. However, controlled experimental crosses cannot be made with humans, and so geneticists must resort to scrutinizing medical records in the hope that informative matings have been made (such as monohybrid crosses) that could be used to infer single-gene inheritance. Such a scrutiny of records of matings is called pedigree analysis. A member of a family who first comes to the attention of a geneticist is called the propositus. Usually, the phenotype of the propositus is exceptional in some way; for example, the propositus might have some type of medical disorder. The investigator then traces the history of the phenotype through the history of the family and draws a family tree, or pedigree, by using the standard symbols given in Figure 2-18. To see single-gene inheritance, the patterns in the pedigree have to be interpreted according to Mendel’s law of equal segregation, but humans usually have few children and so, because of this small progeny sample size, the expected
3 : 1 and 1 : 1 ratios are usually not seen unless many similar pedigrees are combined. The approach to pedigree analysis also depends on whether one of the contrasting phenotypes is a rare disorder or both phenotypes of a pair are common (in which case they are said to be “morphs” of a polymorphism). Most pedigrees are drawn for medical reasons and therefore concern medical disorders that are almost by definition rare. In this case, we have two phenotypes: the presence and the absence of the disorder. Four patterns of single-gene inheritance are revealed in pedigrees. Let’s look, first, at recessive disorders caused by recessive alleles of single autosomal genes.
Pedigree symbols Male Female
2
3
Number of children of sex indicated Affected individuals
Mating Parents and children: 1 boy; 1 girl (in order of birth)
Autosomal recessive disorders
Heterozygotes for autosomal recessive Carrier of sex-linked recessive Death
The affected phenotype of an autosomal recessive disorder Abortion or stillbirth is inherited as a recessive allele; hence, the corresponding Dizygotic (sex unspecified) unaffected phenotype must be inherited as the correspond(nonidentical twins) ing dominant allele. For example, the human disease phePropositus nylketonuria (PKU), discussed earlier, is inherited in a simple Mendelian manner as a recessive phenotype, with PKU determined by the allele p and the normal condition Method of identifying l determined by P. Therefore, people with this disease are of 1 2 persons in a pedigenotype p/p, and people who do not have the disease are gree: here the proMonozygotic positus is child 2 in either P/P or P/p. Recall that the term wild type and its allele ll (identical twins) 1 2 3 generation ll, or II-2 symbols are not used in human genetics because wild type is impossible to define. Consanguineous What patterns in a pedigree would reveal autosomal Sex unspecified marriage recessive inheritance? The two key points are that (1) generally the disorder appears in the progeny of unaffected parents F i g u r e 2 -18 A variety of symbols are and (2) the affected progeny include both males and females. When we know that used in human pedigree analysis. both male and female progeny are affected, we can infer that we are most likely dealing with simple Mendelian inheritance of a gene on an autosome, rather than a gene on a sex chromosome. The following typical pedigree illustrates the key point that affected children are born to unaffected parents:
From this pattern, we can deduce a simple monohybrid cross, with the recessive allele responsible for the exceptional phenotype (indicated in black). Both parents must be heterozygotes (say, A/a); both must have an a allele because each contributed an a allele to each affected child, and both must have an A allele because they are phenotypically normal. We can identify the genotypes of the children (shown left to right) as A/−, a/a, a/a, and A/−. Hence, the pedigree can be rewritten as follows:
Parents: A/a × A/a
Children (left to right): A/−, a/a, a/a, A/−
This pedigree does not support the hypothesis of X-linked recessive inheritance, because, under that hypothesis, an affected daughter must have a heterozygous mother (possible) and a hemizygous father, which is clearly
impossible because the father would have expressed the phenotype of the disorder.
Notice that, even though Mendelian rules are at work, Mendelian ratios are not necessarily observed in single families because of small sample size, as predicted earlier. In the preceding example, we observe a 1 : 1 phenotypic ratio in the progeny of a monohybrid cross. If the couple were to have, say, 20 children, the ratio would be something like 15 unaffected children and 5 with PKU (a 3 : 1 ratio), but, in a small sample of 4 children, any ratio is possible, and all ratios are commonly found.
The family pedigrees of autosomal recessive disorders tend to look rather bare, with few black symbols. A recessive condition shows up in groups of affected siblings, and the people in earlier and later generations tend not to be affected. To understand why this is so, it is important to have some understanding of the genetic structure of populations underlying such rare conditions. By definition, if the condition is rare, most people do not carry the abnormal allele. Furthermore, most of those people who do carry the abnormal allele are heterozygous for it rather than homozygous. The basic reason that heterozygotes are much more common than recessive homozygotes is that, to be a recessive homozygote, a person must inherit the a allele from both parents, but, to be a heterozygote, a person need inherit it from only one.
The birth of an affected person usually depends on the rare chance union of unrelated heterozygous parents. However, inbreeding (mating between relatives, sometimes referred to as consanguinity in humans) increases the chance that two heterozygotes will mate. An example of a marriage between cousins is shown in Figure 2-19. Individuals III-5 and III-6 are first cousins and produce two homozygotes for the rare allele. You can see from Figure 2-19 that an ancestor who is a heterozygote may produce many descendants who also are heterozygotes. Hence, two cousins can carry the same rare recessive allele inherited from a common ancestor. For two unrelated persons to be heterozygous, they would have to inherit the rare allele from both their families. Thus, matings between relatives generally run a higher risk of producing recessive disorders than do matings between nonrelatives. For this reason, first-cousin marriages contribute a large proportion of people with recessive diseases in the population.
Homozygous recessives from inbreeding
Figure 2-19 Pedigree of a rare recessive phenotype determined by a recessive allele a. Gene symbols are normally not included in pedigree charts, but genotypes are inserted here for reference. Persons II-1 and II-5 marry into the family; they are assumed to be normal because the heritable condition under scrutiny is rare. Note also that it is not possible to be certain of the genotype in some persons with normal phenotype; such persons are indicated by A/−. Persons III-5 and III-6, who generate the recessives in generation IV, are first cousins. They both obtain their recessive allele from a grandparent, either I-1 or I-2.
Some other examples of human recessive disorders are shown in Figure 2-20. Cystic fibrosis is a disease inherited on chromosome 7 according to Mendelian rules as an autosomal recessive phenotype. Its most important symptom is the secretion of large amounts of mucus into the lungs, resulting in death from a combination of effects but usually precipitated by infection of the respiratory tract. The mucus can be dislodged by mechanical chest thumpers, and pulmonary infection can be prevented by antibiotics; thus, with treatment, cystic fibrosis patients can live to adulthood. The cystic fibrosis gene (and its mutant allele) was one of the first human disease genes to be isolated at the DNA level, in 1989. This line of research eventually revealed that the disorder is caused by a defective protein that normally transports chloride ions across the cell membrane. The resultant alteration of the salt balance changes the constitution of the lung mucus. This new understanding of gene function in affected and unaffected persons has given hope for more effective treatment.
Human albinism also is inherited in the standard autosomal recessive manner. The mutant allele is of a gene whose product is normally needed to synthesize the brown or black pigment melanin, which is found in skin, hair, and the retina of the eye (Figure 2-21).
Key Concept In human pedigrees, an autosomal recessive disorder is generally revealed by the appearance of the disorder in the male and female progeny of unaffected parents.
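The earlier remark that, in a family of 4 children, any ratio is possible can be made concrete with the binomial distribution. The sketch below is illustrative only: it assumes both parents are heterozygotes (so each child is affected with probability 1/4, independently of the others), and the function name `affected_distribution` is a hypothetical helper, not something from the text.

```python
from fractions import Fraction
from math import comb


def affected_distribution(n_children, p_affected=Fraction(1, 4)):
    """Return {k: P(exactly k affected children)} for a family of n_children.

    Assumes each child is affected independently with probability
    p_affected (1/4 for the progeny of an A/a x A/a mating).
    """
    q = 1 - p_affected
    return {
        k: comb(n_children, k) * p_affected**k * q ** (n_children - k)
        for k in range(n_children + 1)
    }


dist = affected_distribution(4)
for k, prob in dist.items():
    print(f"{k} affected of 4: {prob} (about {float(prob):.3f})")
```

Under these assumptions, the "textbook" 3 : 1 family (exactly 1 affected child of 4) has probability 27/64, so fewer than half of such families show the expected ratio, and the 1 : 1 family seen in the pedigree above (2 affected of 4) has probability 27/128.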
Many human diseases are caused by mutations in single genes
Early-onset Parkinson’s disease (PARK7), autosomal recessive. Neurodegeneration.
Male infertility (USP9Y), Y-linked. Defect of sperm cells.
Hemophilia (F8), X-linked recessive. Inactive blood clotting factor.
Ehlers-Danlos syndrome type IV (COL3A1), autosomal dominant. Stretchy collagen.
Alkaptonuria (HGD), autosomal recessive. Black urine.
Neurofibromatosis type 2 (NF2), autosomal dominant. Noncancerous tumors of the nervous system.
Huntington disease (HTT), autosomal dominant. Late-onset neurodegeneration.
Lou Gehrig’s disease (SOD1), autosomal dominant. Progressive muscle degeneration.
Cockayne syndrome (ERCC8), autosomal recessive. Short stature, premature aging.
Creutzfeldt-Jakob (prion) disease (PRNP), autosomal dominant. Renegade protein causing neurodegeneration.
Pseudoachondroplasia (COMP), autosomal dominant. A type of dwarfism.
Maple syrup urine disease (BCKDH), autosomal recessive. Metabolic disorder.
Hereditary hemorrhagic telangiectasia (MADH4), autosomal dominant. Dilation of capillaries causing bleeding.
Cystic fibrosis (CFTR), autosomal recessive. Abnormal chloride and sodium transport; mucus in the lungs interferes with breathing.
Werner syndrome (WRN), autosomal recessive. Premature aging.
Canavan disease (ASPA), autosomal recessive. Damage to nerve cells and brain.
Nail–patella syndrome (LMX1B), autosomal dominant. Disorder includes poorly developed nails and kneecaps.
Polycystic kidney disease (PKD1), autosomal dominant. Kidney cysts leading to multiple symptoms.
Crouzon syndrome (FGFR2), autosomal dominant. Disorder of pharynx.
Tay-Sachs disease (HEXA), autosomal recessive. Neurodegenerative disorder often occurring in Ashkenazi Jews and French Canadians.
Hypertrophic cardiomyopathy (MYH7), autosomal dominant. Heart muscle defect.
Sickle-cell anemia (HBB), autosomal recessive. Hemoglobin defect affecting red blood cell function.
Breast cancer (BRCA2), autosomal dominant. Tumor suppressor defect giving predisposition to breast and other cancers.
Phenylketonuria (PAH), autosomal recessive. Inability to metabolize phenylalanine, leading to impaired mental function.
Figure 2-20 The positions of the genes mutated in some single-gene diseases, shown in the 23 pairs of chromosomes in a human being. Each chromosome has a characteristic banding pattern. X and Y are the sex chromosomes (XX in women and XY in men). Genes associated with each disease are shown in parentheses.
A mutant gene causes albinism
Figure 2-21 A nonfunctional version of a skin-pigment gene results in lack of pigment. In this case, both members of the gene pair are mutated. [Yves GELLIE/Gamma-Rapho/Getty Images.]
Autosomal dominant disorders
What pedigree patterns are expected from autosomal dominant disorders? Here, the normal allele is recessive, and the defective allele is dominant. It may seem paradoxical that a rare disorder can be dominant, but remember that dominance and recessiveness are simply properties of how alleles act in heterozygotes and are not defined in reference to how common they are in the population. A good example of a rare dominant phenotype that shows single-gene inheritance is pseudoachondroplasia, a type of dwarfism (Figure 2-22). In regard to this gene, people with normal stature are genotypically d/d, and the dwarf phenotype could be in principle D/d or D/D. However, the two “doses” of the D allele in the D/D genotype are believed to produce such a severe effect that this genotype is lethal. If this belief is generally true, all dwarf individuals are heterozygotes.
Pseudoachondroplasia phenotype
Figure 2-22 The human pseudoachondroplasia phenotype is illustrated here by a family of five sisters and two brothers. The phenotype is determined by a dominant allele, which we can call D, that interferes with the growth of long bones during development. This photograph was taken when the family arrived in Israel after the end of World War II. [Bettmann/CORBIS.]
In pedigree analysis, the main clues for identifying an autosomal dominant disorder with Mendelian inheritance are that the phenotype tends to appear in every generation of the pedigree and that affected fathers or mothers transmit the phenotype to both sons and daughters. Again, the equal representation of both sexes among the affected offspring rules out inheritance through the sex chromosomes. The phenotype appears in every generation because, generally, the abnormal allele carried by a person must have come from a parent in the preceding generation. (Abnormal alleles can also arise de novo by mutation. This possibility must be kept in mind for disorders that interfere with reproduction because, here, the condition is unlikely to have been inherited from an affected parent.) A typical pedigree for a dominant disorder is shown in Figure 2-23. Once again, notice that Mendelian ratios are not necessarily observed in families. As with recessive disorders, persons bearing one copy of the rare A allele (A/a) are much more common than those bearing two copies (A/A); so most affected people are heterozygotes, and virtually all matings that produce progeny with dominant disorders are A/a × a/a. Therefore, if the progeny of such matings are totaled, a 1 : 1 ratio is expected of unaffected (a/a) to affected (A/a) persons.
Inheritance of an autosomal dominant disorder
Figure 2-23 Pedigree of a dominant phenotype determined by a dominant allele A. In this pedigree, all the genotypes have been deduced.
Huntington disease is an example of a disease inherited as a dominant phenotype determined by an allele of a single gene. The phenotype is one of neural degeneration, leading to convulsions and premature death. Folk singer Woody Guthrie suffered from Huntington disease. The disease is rather unusual in that it shows late onset, the symptoms generally not appearing until after the person has reached reproductive age (Figure 2-24). When the disease has been diagnosed in a parent, each child already born knows that he or she has a 50 percent chance of inheriting the allele and the associated disease. This tragic pattern has inspired a great effort to find ways of identifying people who carry the abnormal allele before they experience the onset of the disease. Now there are molecular diagnostics for identifying people who carry the Huntington allele.
Some other rare dominant conditions are polydactyly (extra digits), shown in Figure 2-25, and piebald spotting, shown in Figure 2-26.
Key Concept Pedigrees of Mendelian autosomal dominant disorders show affected males and females in each generation; they also show affected men and women transmitting the condition to equal proportions of their sons and daughters.
Autosomal polymorphisms
The alternative phenotypes of a polymorphism (the morphs) are often inherited as alleles of a single autosomal gene in the standard Mendelian manner. Among the many human examples are the following dimorphisms (with two morphs, the simplest polymorphisms): brown versus blue eyes, pigmented versus blond hair, ability to smell Freesias (a fragrant type of flower) versus inability, widow’s peak versus none, sticky versus dry earwax, and attached versus free earlobes. In each example, the morph determined by the dominant allele is written first.
Late onset of Huntington disease
Figure 2-24 The graph shows that people carrying the allele generally do not express the disease until after childbearing age. [Graph: of all persons carrying the allele, percentage affected with the disease (0 to 100) plotted against age in years (0 to 80).]
Polydactyly
Figure 2-25 Polydactyly is a rare dominant phenotype of the human hands and feet. (a) Polydactyly, characterized by extra fingers, toes, or both, is determined by an allele P. The numbers in the pedigree (b) give the number of fingers in the upper lines and the number of toes in the lower. (Note the variation in expression of the P allele.) [(a) © Biophoto Associates/Science Source.]
The interpretation of pedigrees for polymorphisms is somewhat different from that of rare disorders because, by definition, the morphs are common. Let’s look at a pedigree for an interesting human case. Most human populations are dimorphic for the ability to taste the chemical phenylthiocarbamide (PTC); that is, people can either detect it as a foul, bitter taste or, to the great surprise and disbelief of tasters, cannot taste it at all. From the pedigree in Figure 2-27, we can see that two tasters sometimes produce nontaster children, which makes it clear that the allele that confers the ability to taste is dominant and that the allele for nontasting is recessive. Notice in Figure 2-27 that almost all people who marry into this family carry the recessive allele either in heterozygous or in homozygous condition. Such a pedigree thus differs from those of rare recessive disorders, for which the conventional assumption is that all who marry into a family are homozygous normal. Because both PTC alleles are common, it is not surprising that all but one of the family members in this pedigree married persons with at least one copy of the recessive allele.
Polymorphism is an interesting genetic phenomenon. Population geneticists have been surprised at discovering how much polymorphism there is in natural populations of plants and animals generally. Furthermore, even though the genetics of polymorphisms is straightforward, there are very few polymorphisms for which there are satisfactory explanations for the coexistence of the morphs. But polymorphism is rampant at every level of genetic analysis, even at the DNA level; indeed, polymorphisms observed at the DNA level have been invaluable as landmarks to help geneticists find their way around the chromosomes of complex organisms, as will be described in Chapter 4. The population and evolutionary genetics of polymorphisms is considered in Chapters 17 and 19.
Key Concept Populations of plants and animals (including humans) are highly polymorphic. Contrasting morphs are often inherited as alleles of a single gene.
Dominant piebald spotting
Figure 2-26 Piebald spotting is a rare dominant human phenotype. Although the phenotype is encountered sporadically in all races, the patterns show up best in those with dark skin. (a) The photographs show front and back views of affected persons IV-1, IV-3, III-5, III-8, and III-9 from (b) the family pedigree. Notice the variation in expression of the piebald gene among family members. The patterns are believed to be caused by the dominant allele interfering with the migration of melanocytes (melanin-producing cells) from the dorsal to the ventral surface in the course of development. The white forehead blaze is particularly characteristic and is often accompanied by a white forelock in the hair. Piebaldism is not a form of albinism; the cells in the light patches have the genetic potential to make melanin, but, because they are not melanocytes, they are not developmentally programmed to do so. In true albinism, the cells lack the potential to make melanin. (Piebaldism is caused by mutations in c-kit, a type of gene called a proto-oncogene; see Chapter 16.) [Photos (a) and data (b) from I. Winship, K. Young, R. Martell, R. Ramesar, D. Curtis, and P. Beighton, “Piebaldism: An Autonomous Autosomal Dominant Entity,” Clin. Genet. 39, 1991, 330. © Reproduced with permission of John Wiley & Sons, Inc.]
X-linked recessive disorders
Let’s look at the pedigrees of disorders caused by rare recessive alleles of genes located on the X chromosome. Such pedigrees typically show the following features:
1. Many more males than females show the rare phenotype under study. The reason is that a female can inherit the genotype only if both her mother and her father bear the allele (for example, XA/Xa × Xa/Y), whereas a male can inherit the phenotype when only the mother carries the allele (XA/Xa × XA/Y). If the recessive allele is very rare, almost all persons showing the phenotype are male.
2. None of the offspring of an affected male show the phenotype, but all his daughters are “carriers,” who bear the recessive allele masked in the heterozygous condition. In the next generation, half the sons of these carrier daughters show the phenotype (Figure 2-28).
3. None of the sons of an affected male show the phenotype under study, nor will they pass the condition to their descendants. The reason behind this lack of male-to-male transmission is that a son obtains his Y chromosome from his father; so he cannot normally inherit the father’s X chromosome, too. Conversely, male-to-male transmission of a disorder is a useful diagnostic for an autosomally inherited condition.
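The three features listed above follow from enumerating the gametes in each type of mating. The sketch below is a minimal illustration under simple assumptions: eggs carry either of the mother's X alleles with equal chance, sperm carry either the father's X or his Y with equal chance, and `x_linked_cross` is a hypothetical helper name, not notation from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def x_linked_cross(mother, father):
    """Enumerate the offspring of an X-linked cross.

    mother: tuple of her two X-linked alleles, e.g. ("A", "a")
    father: his single X-linked allele, e.g. "A" (his other sperm type carries Y)
    Returns {(sex, genotype): probability}.
    """
    counts = Counter()
    for egg, sperm in product(mother, (father, "Y")):
        if sperm == "Y":
            counts[("male", f"X{egg}/Y")] += 1
        else:
            # Sort so that XA/Xa and Xa/XA are counted as the same genotype.
            first, second = sorted((egg, sperm))
            counts[("female", f"X{first}/X{second}")] += 1
    total = sum(counts.values())
    return {key: Fraction(n, total) for key, n in counts.items()}


# Affected male x homozygous normal female: all daughters carriers, no son affected.
affected_father = x_linked_cross(("A", "A"), "a")
# Carrier female x normal male: half the sons are affected.
carrier_mother = x_linked_cross(("A", "a"), "A")

for label, cross in (("Xa/Y father", affected_father), ("XA/Xa mother", carrier_mother)):
    for (sex, genotype), prob in sorted(cross.items()):
        print(f"{label}: {sex} {genotype} with probability {prob}")
```

Because sons receive only the Y from their father, no `Xa/Y` son can appear in the first cross, which is the absence of male-to-male transmission described in feature 3.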
In the pedigree analysis of rare X-linked recessives, a normal female of unknown genotype is assumed to be homozygous unless there is evidence to the contrary.
Inheritance of a dimorphism
Figure 2-27 Pedigree for the ability to taste the chemical phenylthiocarbamide. [Symbol key: tasters (T/T or T/t); nontasters (t/t).]
Perhaps the most familiar example of X-linked recessive inheritance is red–green color blindness. People with this condition are unable to distinguish red from green. The genes for color vision have been characterized at the molecular level. Color vision is based on three different kinds of cone cells in the retina, each sensitive to red, green, or blue wavelengths. The genetic determinants for the red and green cone cells are on the X chromosome. Red–green color-blind people have a mutation in one of these two genes. As with any X-linked recessive disorder, there are many more males with the phenotype than females.
Another familiar example is hemophilia, the failure of blood to clot. Many proteins act in sequence to make blood clot. The most common type of hemophilia is caused by the absence or malfunction of one of these clotting proteins, called factor VIII. A well-known pedigree of hemophilia is of the interrelated royal families in Europe (Figure 2-29). The original hemophilia allele in the pedigree possibly arose spontaneously as a mutation in the reproductive cells of either Queen Victoria’s parents or Queen Victoria herself. However, some have proposed that the origin of the allele was a secret lover of Victoria’s mother. Alexis, the son of the last czar of Russia, inherited the hemophilia allele ultimately from Queen Victoria, who was the grandmother of his mother, Alexandra. Nowadays, hemophilia can be treated medically, but it was formerly a potentially fatal condition. It is interesting to note that the Jewish Talmud contains rules about exemptions to male circumcision clearly showing that the mode of transmission of the disease through unaffected carrier females was well understood in ancient times. For example, one exemption was for the sons of women whose sisters’ sons had bled profusely when they were circumcised. Hence, abnormal bleeding was known to be transmitted through the females of the family but expressed only in their male children.
Inheritance of an X-linked recessive disorder
Figure 2-28 As is usually the case, expression of the X-linked recessive alleles is only in males. These alleles are carried unexpressed by daughters in the next generation, to be expressed again in sons. Note that III-3 and III-4 cannot be distinguished phenotypically. [Genotypes shown, by generation: I, XA/XA and Xa/Y; II, XA/Y, XA/Xa, and XA/Y; III, Xa/Y, XA/Y, XA/Xa, and XA/XA.]
Inheritance of hemophilia in European royalty
Figure 2-29 A pedigree for the X-linked recessive condition hemophilia in the royal families of Europe. A recessive allele causing hemophilia (failure of blood clotting) arose through mutation in the reproductive cells of Queen Victoria or one of her parents. This hemophilia allele spread into other royal families by intermarriage. (a) This partial pedigree shows affected males and carrier females (heterozygotes). Most spouses marrying into the families have been omitted from the pedigree for simplicity. Can you deduce the likelihood of the present British royal family’s harboring the recessive allele? (b) A painting showing Queen Victoria surrounded by her numerous descendants. [(b) © Lebrecht Music and Arts Photo Library/Alamy.]
Duchenne muscular dystrophy is a fatal X-linked recessive disease. The phenotype is a wasting and atrophy of muscles. Generally, the onset is before the age of 6, with confinement to a wheelchair by age 12 and death by age 20. The gene for Duchenne muscular dystrophy encodes the muscle protein dystrophin. This knowledge holds out hope for a better understanding of the physiology of this condition and, ultimately, a therapy.
A rare X-linked recessive phenotype that is interesting from the point of view of sexual differentiation is a condition called testicular feminization syndrome, which has a frequency of about 1 in 65,000 male births. People with this syndrome are chromosomally males, having 44 autosomes plus an X and a Y chromosome, but they develop as females (Figure 2-30). They have female external genitalia, a blind vagina, and no uterus. Testes may be present either in the labia or in the abdomen. Although many such persons marry, they are sterile. The condition is not reversed by treatment with the male hormone androgen, and so it is sometimes called androgen insensitivity syndrome. The reason for the insensitivity is that a mutation in the androgen-receptor gene causes the receptor to malfunction, and so the male hormone can have no effect on the target organs that contribute to maleness. In humans, femaleness results when the male-determining system is not functional.
Testicular feminization phenotype
Figure 2-30 An XY individual with testicular feminization syndrome, caused by the recessive X-linked allele for androgen insensitivity. [© Wellcome Photo Library/Custom Medical Stock.]
X-linked dominant disorders
The inheritance patterns of X-linked dominant disorders have the following characteristics in pedigrees (Figure 2-31):
1. Affected males pass the condition to all their daughters but to none of their sons.
2. Affected heterozygous females married to unaffected males pass the condition to half their sons and daughters.
This mode of inheritance is not common. One example is hypophosphatemia, a type of vitamin D–resistant rickets. Some forms of hypertrichosis (excess body and facial hair) show X-linked dominant inheritance.
Inheritance of an X-linked dominant disorder
Figure 2-31 All the daughters of a male expressing an X-linked dominant phenotype will show the phenotype. Females heterozygous for an X-linked dominant allele will pass the condition on to half their sons and daughters.
Y-linked inheritance
Only males inherit genes in the differential region of the human Y chromosome, with fathers transmitting the genes to their sons. The gene that plays a primary role in maleness is the SRY gene, sometimes called the testis-determining factor. Genomic analysis has confirmed that, indeed, the SRY gene is in the differential region of the Y chromosome. Hence, maleness itself is Y linked and shows the expected pattern of exclusively male-to-male transmission. Some cases of male sterility have been shown to be caused by deletions of Y-chromosome regions containing sperm-promoting genes. Male sterility is not heritable, but, interestingly, the fathers of these men have normal Y chromosomes, showing that the deletions are new.
There have been no convincing cases of nonsexual phenotypic variants associated with the Y chromosome. Hairy ear rims (Figure 2-32) have been proposed as a possibility, although disputed. The phenotype is extremely rare among the populations of most countries but more common among the populations of India. In some families, hairy ear rims have been shown to be transmitted exclusively from fathers to sons.
Key Concept Inheritance patterns with an unequal representation of phenotypes in males and females can locate the genes concerned to one of the sex chromosomes.
Hairy ears: a phenotype proposed to be Y linked
Figure 2-32 Hairy ear rims have been proposed to be caused by an allele of a Y-linked gene. [© Mark Collinson/Alamy.]
Calculating risks in pedigree analysis
When a disorder with well-documented single-gene inheritance is known to be present in a family, knowledge of transmission patterns can be used to calculate the probability of prospective parents’ having a child with the disorder. For example, consider a case in which a newly married husband and wife find out that each had an uncle with Tay-Sachs disease, a severe autosomal recessive disease caused by malfunction of the enzyme hexosaminidase A. The defect leads to the buildup of fatty deposits in nerve cells, causing paralysis followed by an early death. The pedigree is as follows:
[Pedigree diagram: the husband and the wife each have an uncle with Tay-Sachs disease; the couple’s prospective child is marked “?”.]
The probability of the couple’s first child having Tay-Sachs can be calculated in the following way. Because neither of the couple has the disease, each can only be an unaffected homozygote or heterozygote. If both are heterozygotes, then they each stand a chance of passing the recessive allele on to a child, who would then have Tay-Sachs disease. Hence, we must calculate the probability of their both being heterozygotes, and then, if so, the probability of passing the deleterious allele on to a child.
1. The husband’s grandparents must have both been heterozygotes (T/t) because they produced a t/t child (the uncle). Therefore, they effectively constituted a monohybrid cross. The husband’s father could be T/T or T/t, but within the 3/4 of unaffected progeny we know that the relative probabilities of these genotypes must be 1/4 and 1/2, respectively (the expected progeny ratio in a monohybrid cross is 1/4 T/T, 1/2 T/t, 1/4 t/t). Therefore, there is a 2/3 probability that the father is a heterozygote (two-thirds is the proportion of unaffected progeny who are heterozygotes: that is, the ratio of 2/4 to 3/4).
2. The husband’s mother is assumed to be T/T, because she married into the family and disease alleles are generally rare. Thus, if the father is T/t, then the mating with the mother was a cross T/t × T/T and the expected proportions in the progeny (which includes the husband) are 1/2 T/T, 1/2 T/t.
3. The overall probability of the husband’s being a heterozygote must be calculated with the use of a statistical rule called the product rule, which states that
The probability of two independent events both occurring is the product of their individual probabilities.
Because gene transmissions in different generations are independent events, we can calculate that the probability of the husband’s being a heterozygote is the probability of his father’s being a heterozygote (2/3) times the probability of his father having a heterozygous son (1/2), which is 2/3 × 1/2 = 1/3.
4. Likewise, the probability of the wife’s being heterozygous is also 1/3.
5. If they are both heterozygous (T/t), their mating would be a standard monohybrid cross and so the probability of their having a t/t child is 1/4.
6. Overall, the probability of the couple’s having an affected child is the probability of them both being heterozygous and then both transmitting the recessive allele to a child. Again, these events are independent, and so we can calculate the overall probability as 1/3 × 1/3 × 1/4 = 1/36. In other words, there is a 1 in 36 chance of them having a child with Tay-Sachs disease.
In some Jewish communities, the Tay-Sachs allele is not as rare as it is in the general population. In such cases, unaffected people who marry into families with a history of Tay-Sachs cannot be assumed to be T/T. If the frequency of T/t heterozygotes in the community is known, this frequency can be factored into the product-rule calculation. Nowadays, molecular diagnostic tests for Tay-Sachs alleles are available, and the judicious use of these tests has drastically reduced the frequency of the disease in some communities.
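The six steps above can be retraced with exact fractions. The sketch below is one possible way to organize the arithmetic, not a standard tool: the function names are hypothetical, and the optional second argument of `carrier_prob_child` (a carrier probability for the parent who married in) is an added assumption showing one way a known heterozygote frequency might be factored in, by conditioning on the child being unaffected.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def carrier_prob_sib_of_affected():
    """P(T/t | unaffected) for a child of T/t x T/t parents: (1/2) / (3/4)."""
    return Fraction(1, 2) / Fraction(3, 4)


def carrier_prob_child(p_parent1_carrier, p_parent2_carrier=Fraction(0)):
    """P(an unaffected child is T/t), given each parent's chance of being T/t.

    Any parent who is not T/t is taken to be T/T. The default of 0 for the
    second parent reproduces the text's assumption that a person marrying
    into the family is T/T.
    """
    t1 = p_parent1_carrier * HALF  # chance that parent 1 transmits t
    t2 = p_parent2_carrier * HALF  # chance that parent 2 transmits t
    p_affected = t1 * t2
    p_carrier = t1 * (1 - t2) + (1 - t1) * t2
    return p_carrier / (1 - p_affected)


def risk_affected_child(p_husband_carrier, p_wife_carrier):
    """Product rule: both are T/t (independent), then a t/t child (1/4)."""
    return p_husband_carrier * p_wife_carrier * Fraction(1, 4)


p_father = carrier_prob_sib_of_affected()   # step 1: 2/3
p_husband = carrier_prob_child(p_father)    # steps 2 and 3: 2/3 x 1/2 = 1/3
p_wife = p_husband                          # step 4: same reasoning
risk = risk_affected_child(p_husband, p_wife)  # steps 5 and 6
print(p_father, p_husband, risk)            # 2/3 1/3 1/36
```

With a purely illustrative carrier probability of 1/30 for the parent who married in, `carrier_prob_child(Fraction(2, 3), Fraction(1, 30))` gives 61/179, slightly above the 1/3 obtained when that parent is assumed to be T/T.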
s u m m a ry In somatic cell division, the genome is transmitted by mitosis, a nuclear division. In this process, each chromosome replicates into a pair of chromatids and the chromatids are pulled apart to produce two identical daughter cells. (Mitosis can take place in diploid or haploid cells.) At meiosis, which takes place in the sexual cycle in meiocytes, each homolog replicates to form a dyad of chromatids; then, the dyads pair to form a tetrad, which segregates at each of the two cell divisions. The result is four haploid cells, or gametes. Meiosis can take place only in a diploid cell; hence, haploid organisms unite to form a diploid meiocyte. An easy way to remember the main events of meiosis, by using your fingers to represent chromosomes, is shown in Figure 2-33. Genetic dissection of a biological property begins with a collection of mutants. Each mutant has to be tested to see if it is inherited as a single-gene change. The procedure followed is essentially unchanged from the time of Mendel, who performed the prototypic analysis of this type. The analysis is based on observing specific phenotypic ratios in the progeny of controlled crosses. In a typical case, a cross of A/A × a /a produces an F1 that is all A/a. When the F1 is selfed or intercrossed, a genotypic ratio of 41 A/A : 21 A/a : 41 a /a is produced in the F2. (At the phenotypic level, this ratio is 43 A/− : 41 a /a .) The three single-gene genotypes are homozygous dominant, heterozygous (monohybrid), and homozygous recessive. If an A/a individual is crossed with a/a (a testcross), a 1 : 1 ratio is produced in the progeny. The 1 : 1, 3 : 1, and 1 : 2 : 1 ratios stem from the principle of equal segregation, which is that the haploid products of meiosis from A/a will be 21 A and 21 a. The cellular basis of the equal segregation of alleles is the segregation of homologous chromosomes at meiosis. 
Haploid fungi can be used to show equal segregation at the level of a single meiosis (a 1 : 1 ratio in an ascus).
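The 1 : 2 : 1, 3 : 1, and 1 : 1 ratios described in this summary follow mechanically from equal segregation, and a short enumeration makes that visible. This is a minimal sketch, assuming a single gene with a fully dominant allele written "A" and a recessive allele written "a"; the function names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def cross(parent1, parent2):
    """Genotype frequencies among progeny of a single-gene cross.

    Each parent is a two-character string such as "Aa". By equal
    segregation, each of its two alleles goes to half of its gametes,
    so every pairing of one allele from each parent is equally likely.
    """
    counts = Counter()
    for allele1, allele2 in product(parent1, parent2):
        # Sort so that "aA" and "Aa" are counted as the same genotype.
        counts["".join(sorted(allele1 + allele2))] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

def phenotype_ratio(genotype_freqs):
    """Collapse genotypes to phenotypes, assuming A is fully dominant."""
    dominant = sum((f for g, f in genotype_freqs.items() if "A" in g), Fraction(0))
    return {"A/-": dominant, "a/a": 1 - dominant}

f2 = cross("Aa", "Aa")            # F1 selfed or intercrossed
print(f2)                         # 1/4 AA, 1/2 Aa, 1/4 aa
print(phenotype_ratio(f2))        # 3/4 A/-, 1/4 a/a
print(cross("Aa", "aa"))          # testcross: 1/2 Aa, 1/2 aa
```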
[Figure 2-33. Using fingers to remember the main events of mitosis and meiosis. Labels: pair of homologous chromosomes; chromatid formation; alignment at equator (mitosis); pairing at equator (meiosis); anaphase (mitosis); anaphase I and anaphase II (meiosis).]
The molecular basis for chromatid production in meiosis is DNA replication. Segregation at meiosis can be observed directly at the molecular (DNA) level. The molecular force of segregation is the depolymerization and subsequent shortening of microtubules that are attached to the centromeres. Recessive mutations are generally in genes that are haplosufficient, whereas dominant mutations are often due to gene haploinsufficiency. In many organisms, sex is determined chromosomally, and, typically, XX is female and XY is male. Genes on the X chromosome (X-linked genes) have no counterparts on the Y chromosome and show a single-gene inheritance
pattern that differs in the two sexes, often resulting in different ratios in the male and female progeny. Mendelian single-gene segregation is useful in identifying mutant alleles underlying many human disorders. Analyses of pedigrees can reveal autosomal or X-linked disorders of both dominant and recessive types. The logic of Mendelian genetics has to be used with caution, taking into account that human progeny sizes are small and phenotypic ratios are not necessarily typical of those expected from larger sample sizes. If a known single-gene disorder is present in a pedigree, Mendelian logic can be used to predict the likelihood of children inheriting the disease.
key terms allele (p. 37) ascus (p. 44) bivalent (p. 41) character (p. 34) chromatid (p. 40) cross (p. 33) dimorphism (p. 63) dioecious species (p. 54) dominant (p. 38) dyad (p. 41) first filial generation (F1) (p. 35) forward genetics (p. 33) gene (p. 37) gene discovery (p. 32) genetic dissection (p. 33) genotype (p. 38) haploinsufficient (p. 50) haplosufficient (p. 50) hemizygous (p. 55) heterogametic sex (p. 54) heterozygote (p. 38) heterozygous (p. 38)
homogametic sex (p. 54) homozygote (p. 38) homozygous dominant (p. 38) homozygous recessive (p. 38) law of equal segregation (Mendel’s first law) (p. 37) leaky mutation (p. 49) meiocyte (p. 40) meiosis (p. 40) mitosis (p. 40) monohybrid (p. 38) monohybrid cross (p. 38) morph (p. 63) mutant (p. 32) mutation (p. 32) null allele (p. 49) parental generation (P) (p. 35) pedigree analysis (p. 58) phenotype (p. 32) polymorphism (p. 32) product of meiosis (p. 42) product rule (p. 69)
property (p. 32) propositus (p. 58) pseudoautosomal regions 1 and 2 (p. 55) pure line (p. 35) recessive (p. 37) reverse genetics (p. 34) second filial generation (F2) (p. 35) self (p. 35) sex chromosome (p. 54) sex linkage (p. 55) SRY gene (p. 68) testcross (p. 53) tester (p. 53) tetrad (p. 42) trait (p. 34) wild type (p. 32) X chromosome (p. 54) X linkage (p. 55) Y chromosome (p. 54) Y linkage (p. 55) zygote (p. 38)
SOLVED PROBLEMS
This section in each chapter contains a few solved problems that show how to approach the problem sets that follow. The purpose of the problem sets is to challenge your understanding of the genetic principles learned in the chapter. The best way to demonstrate an understanding of a subject is to be able to use that knowledge in a real or simulated situation. Be forewarned that there is no machine-like way of solving these problems. The three main resources at your disposal are the genetic principles just learned, logic, and trial and error.
Here is some general advice before beginning. First, it is absolutely essential to read and understand all of the problem. Most of the problems use data taken from research that somebody actually carried out: ask yourself why the research might have been initiated and what was the probable goal. Find out exactly what facts are provided, what assumptions have to be made, what clues are given in the problem, and what inferences can be made from the available information. Second, be methodical. Staring at the problem rarely helps. Restate the information in the problem in your own way, preferably using a diagrammatic representation or flowchart to help you think out the problem. Good luck.
SOLVED PROBLEM 1. Crosses were made between two pure lines of rabbits that we can call A and B. A male from line A was mated with a female from line B, and the F1 rabbits were subsequently intercrossed to produce an F2. Three-fourths of the F2 animals were discovered to have white subcutaneous fat, and one-fourth had yellow subcutaneous fat. Later, the F1 was examined and was found to have white fat. Several years later, an attempt was made to repeat the experiment by using the same male from line A and the same female from line B. This time, the F1 and all the F2 (22 animals) had white fat. The only difference between the original experiment and the repeat that seemed relevant was that, in the original, all the animals were fed fresh vegetables, whereas in the repeat, they were fed commercial rabbit chow. Provide an explanation for the difference and a test of your idea.

Solution
The first time that the experiment was done, the breeders would have been perfectly justified in proposing that a pair of alleles determine white versus yellow body fat because the data clearly resemble Mendel’s results in peas. White must be dominant, and so we can represent the white allele as W and the yellow allele as w. The results can then be expressed as follows:

P     W/W × w/w
F1    W/w
F2    1/4 W/W
      1/2 W/w
      1/4 w/w
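The 1/4 w/w class in this diagram also lets us put a number on the "bad luck" possibility that the solution goes on to weigh: that a sample of F2 animals contains no yellow-fatted individual purely by chance. The sketch below assumes that each F2 animal is independently w/w with probability 1/4; the function name is illustrative.

```python
from fractions import Fraction

def prob_no_recessives(n_progeny, p_recessive=Fraction(1, 4)):
    """Chance that none of n independent progeny is homozygous recessive."""
    return (1 - p_recessive) ** n_progeny

p = prob_no_recessives(22)   # the repeat experiment scored 22 F2 animals
print(float(p))              # roughly 0.0018, i.e. (3/4)**22
```

Under these assumptions the chance explanation is improbable but not impossible, which matches the way the solution treats it.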
No doubt, if the parental rabbits had been sacrificed, one parent (we cannot tell which) would have been predicted to have white fat and the other yellow. Luckily, the rabbits were not sacrificed, and the same animals were bred again, leading to a very interesting, different result. Often in science, an unexpected observation can lead to a novel principle, and, rather than moving on to something else, it is useful to try to explain the inconsistency. So why did the 3 : 1 ratio disappear? Here are some possible explanations.
First, perhaps the genotypes of the parental animals had changed. This type of spontaneous change affecting the whole animal, or at least its gonads, is very unlikely, because even common experience tells us that organisms tend to be stable to their type.
Second, perhaps the sample of 22 F2 animals in the repeat contained no yellow-fatted animals simply by chance (“bad luck”). This explanation, again, seems unlikely, because the sample was quite large, but it is a definite possibility.
A third explanation draws on the principle that genes do not act in a vacuum; they depend on the environment for their effects. Hence, the formula “genotype + environment = phenotype” is a useful mnemonic. A corollary of this formula is that genes can act differently in different environments; so

genotype 1 + environment 1 = phenotype 1

and

genotype 1 + environment 2 = phenotype 2

In the present problem, the different diets constituted different environments, and so a possible explanation of the results is that the homozygous recessive w/w produces yellow fat only when the diet contains fresh vegetables. This explanation is testable. One way to test it is to repeat the experiment again and use vegetables as food, but the parents might be dead by this time. A more convincing way is to breed several of the white-fatted F2 rabbits from the second experiment. According to the original interpretation, some of them should be heterozygous, and, if their progeny are raised on vegetables, yellow fat should appear in Mendelian proportions. For example, if a cross happened to be W/w and w/w, the progeny would be 1/2 white fat and 1/2 yellow fat. If this outcome did not happen and no progeny having yellow fat appeared in any of the matings, we would be forced back to the first or second explanation. The second explanation can be tested by using larger numbers, and if this explanation doesn’t work, we are left with the first explanation, which is difficult to test directly.
As you might have guessed, in reality, the diet was the culprit. The specific details illustrate environmental effects beautifully. Fresh vegetables contain yellow substances called xanthophylls, and the dominant allele W gives rabbits the ability to break down these substances to a colorless (“white”) form. However, w/w animals lack this ability, and the xanthophylls are deposited in the fat, making it yellow. When no xanthophylls have been ingested, both W/− and w/w animals end up with white fat.

SOLVED PROBLEM 2. Phenylketonuria (PKU) is a human hereditary disease resulting from the inability of the body to process the chemical phenylalanine, which is contained in the protein that we eat. PKU is manifested in early infancy and, if it remains untreated, generally leads to mental retardation. PKU is caused by a recessive allele with simple Mendelian inheritance. A couple intends to have children but consult a genetic counselor because the man has a sister with PKU and the woman has a brother with PKU. There are no other known cases in their families. They ask the genetic counselor to determine the probability that their first child will have PKU. What is this probability?
Solution
What can we deduce? If we let the allele causing the PKU phenotype be p and the respective normal allele be P, then the sister and brother of the man and woman, respectively, must have been p/p. To produce these affected persons, all four grandparents of the prospective child (that is, the parents of the man and the parents of the woman) must have been heterozygous normal. The pedigree can be summarized as follows:

Man's parents:    P/p × P/p          Woman's parents:   P/p × P/p
Their children:   p/p (sister), P/− (man)               P/− (woman), p/p (brother)
Couple:           P/− (man) × P/− (woman)
First child:      ?
When these inferences have been made, the problem is reduced to an application of the product rule. The only way in which the man and woman can have a PKU child is if both of them are heterozygotes (it is obvious that they themselves do not have the disease). Both the grandparental matings are simple Mendelian monohybrid crosses expected to produce progeny in the following proportions:

1/4 P/P and 1/2 P/p    Normal (3/4)
1/4 p/p                PKU (1/4)
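The step from these proportions to the counselor's answer is a conditional probability followed by the product rule. The lines below are a small sketch of that arithmetic using exact fractions; the variable names are illustrative, and the only inputs are the Mendelian proportions listed above.

```python
from fractions import Fraction

# Expected progeny proportions from a P/p x P/p mating.
progeny = {"P/P": Fraction(1, 4), "P/p": Fraction(1, 2), "p/p": Fraction(1, 4)}

# The man and the woman are known to be unaffected, so condition on P/-.
p_normal = progeny["P/P"] + progeny["P/p"]
p_het_given_normal = progeny["P/p"] / p_normal

# Product rule: both are heterozygous AND their first child is p/p.
p_first_child_pku = p_het_given_normal * p_het_given_normal * progeny["p/p"]

print(p_het_given_normal, p_first_child_pku)   # 2/3 1/9
```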
We know that the man and the woman are normal, and so the probability of each being a heterozygote is 2/3 because, within the P/− class, 2/3 are P/p and 1/3 are P/P. The probability of both the man and the woman being heterozygotes is 2/3 × 2/3 = 4/9. If both are heterozygous, then one-quarter of their children would have PKU, and so the probability that their first child will have PKU is 1/4 and the probability of their being heterozygous and of their first child’s having PKU is 4/9 × 1/4 = 4/36 = 1/9, which is the answer. SOLVED PROBLEM 3. A rare human disease is found in a family as shown in the accompanying pedigree.
a. Deduce the most likely mode of inheritance.
b. What would be the outcomes of the cousin marriages 1 × 9, 1 × 4, 2 × 3, and 2 × 8?

[Pedigree figure; the individuals referred to in part b are numbered 1 through 10.]

Solution
a. The most likely mode of inheritance is X-linked dominant. We assume that the disease phenotype is dominant because, after it has been introduced into the pedigree by the male in generation II, it appears in every generation. We assume that the phenotype is X linked because fathers do not transmit it to their sons. If it were autosomal dominant, father-to-son transmission would be common.
In theory, autosomal recessive could work, but it is improbable. In particular, note the marriages between affected members of the family and unaffected outsiders. If the condition were autosomal recessive, the only way in which these marriages could have affected offspring is if each person marrying into the family were a heterozygote; then the matings would be a/a (affected) × A/a (unaffected). However, we are told that the disease is rare; in such a case, heterozygotes are highly unlikely to be so common. X-linked recessive inheritance is impossible, because a mating of an affected woman with a normal man could not produce affected daughters. So we can let A represent the disease-causing allele and a represent the normal allele.
b. 1 × 9: Number 1 must be heterozygous A/a because she must have obtained a from her normal mother. Number 9 must be A/Y. Hence, the cross is A/a × A/Y.

Female gametes    Male gametes    Progeny
1/2 A             1/2 A           1/4 A/A
1/2 A             1/2 Y           1/4 A/Y
1/2 a             1/2 A           1/4 A/a
1/2 a             1/2 Y           1/4 a/Y

1 × 4: Must be A/a × a/Y.

Female gametes    Male gametes    Progeny
1/2 A             1/2 a           1/4 A/a
1/2 A             1/2 Y           1/4 A/Y
1/2 a             1/2 a           1/4 a/a
1/2 a             1/2 Y           1/4 a/Y

2 × 3: Must be a/Y × A/a (same as 1 × 4).
2 × 8: Must be a/Y × a/a (all progeny normal).
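The gamete tables in part b can be reproduced by enumerating eggs and sperm for an X-linked gene. This is a sketch under the solution's own assumptions (a dominant disease allele A and a normal allele a, both on the X chromosome); the function name and the way genotypes are written as strings are choices made here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def x_linked_cross(mother_alleles, father_x_allele):
    """Progeny genotype frequencies for an X-linked gene.

    mother_alleles: the two alleles on the mother's X chromosomes, e.g. ("A", "a").
    father_x_allele: the allele on the father's single X; his other gamete type carries Y.
    """
    counts = Counter()
    for egg, sperm in product(mother_alleles, (father_x_allele, "Y")):
        if sperm == "Y":
            counts[f"{egg}/Y"] += 1              # son: one X from the mother
        else:
            first, second = sorted((egg, sperm)) # daughter: write "A/a", never "a/A"
            counts[f"{first}/{second}"] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

print(x_linked_cross(("A", "a"), "A"))   # 1 x 9: A/A, A/Y, A/a, a/Y at 1/4 each
print(x_linked_cross(("A", "a"), "a"))   # 1 x 4 and 2 x 3: A/a, A/Y, a/a, a/Y at 1/4 each
print(x_linked_cross(("a", "a"), "a"))   # 2 x 8: a/a and a/Y only, all normal
```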
PROBLEMS
Most of the problems are also available for review/grading through the LaunchPad: http://www.whfreeman.com/launchpad/iga11e
Working with the Figures
(The first 14 questions require inspection of text figures.)
1. In the left-hand part of Figure 2-4, the red arrows show selfing as pollination within single flowers of one F1 plant. Would the same F2 results be produced by cross-pollinating two different F1 plants?
2. In the right-hand part of Figure 2-4, in the plant showing a 1 : 1 ratio, do you think it would be possible to find a pod with all yellow peas? All green? Explain.
3. In Table 2-1, state the recessive phenotype in each of the seven cases.
4. Considering Figure 2-8, is the sequence “pairing → replication → segregation → segregation” a good shorthand description of meiosis?
5. Point to all cases of bivalents, dyads, and tetrads in Figure 2-11.
6. In Figure 2-11, assume (as in corn plants) that allele A leads to starch production in pollen and allele a does not. Iodine solution stains starch black. How would you demonstrate Mendel’s first law directly with such a system?
7. Considering Figure 2-13, if you had a homozygous double mutant m3/m3 m5/m5, would you expect it to be mutant in phenotype? (Note: This line would have two mutant sites in the same coding sequence.)
8. In which of the stages of the Drosophila life cycle (represented in the box on page 56) would you find the products of meiosis?
9. If you assume Figure 2-15 also applies to mice and you irradiate male sperm with X rays (known to inactivate genes), what phenotype would you look for in progeny in order to find cases of individuals with an inactivated SRY gene?
10. In Figure 2-17, how does the 3 : 1 ratio in the bottom-left-hand grid differ from the 3 : 1 ratios obtained by Mendel?
11. In Figure 2-19, assume that the pedigree is for mice, in which any chosen cross can be made. If you bred IV-1 with IV-3, what is the probability that the first baby will show the recessive phenotype?
12. Which part of the pedigree in Figure 2-23 in your opinion best demonstrates Mendel’s first law?
13. Could the pedigree in Figure 2-31 be explained as an autosomal dominant disorder? Explain.

Basic Problems
14. Make up a sentence including the words chromosome, genes, and genome.
15. Peas (Pisum sativum) are diploid and 2n = 14. In Neurospora, the haploid fungus, n = 7. If you were to isolate genomic DNA from both species and use electrophoresis to separate DNA molecules by size, how many distinct DNA bands would be visible in each species?
16. The broad bean (Vicia faba) is diploid and 2n = 18. Each haploid chromosome set contains approximately 4 m of DNA. The average size of each chromosome during metaphase of mitosis is 13 μm. What is the average packing ratio of DNA at metaphase? (Packing ratio = length of chromosome/length of DNA molecule therein.) How is this packing achieved?
17. If we call the amount of DNA per genome “x,” name a situation or situations in diploid organisms in which the amount of DNA per cell is a. x b. 2x c. 4x
18. Name the key function of mitosis.
19. Name two key functions of meiosis.
20. Design a different nuclear-division system that would achieve the same outcome as that of meiosis.
21. In a possible future scenario, male fertility drops to zero, but, luckily, scientists develop a way for women to produce babies by virgin birth. Meiocytes are converted directly (without undergoing meiosis) into zygotes, which implant in the usual way. What would be the short- and long-term effects in such a society?
22. In what ways does the second division of meiosis differ from mitosis?
23. Make up mnemonics for remembering the five stages of prophase I of meiosis and the four stages of mitosis.
24. In an attempt to simplify meiosis for the benefit of students, mad scientists develop a way of preventing premeiotic S phase and making do with having just one division, including pairing, crossing over, and segregation. Would this system work, and would the products of such a system differ from those of the present system?
25. Theodor Boveri said, “The nucleus doesn’t divide; it is divided.” What was he getting at?
26. Francis Galton, a geneticist of the pre-Mendelian era, devised the principle that half of our genetic makeup is derived from each parent, one-quarter from each grandparent, one-eighth from each great-grandparent, and so forth. Was he right? Explain.
27. If children obtain half their genes from one parent and half from the other parent, why aren’t siblings identical?
28. State where cells divide mitotically and where they divide meiotically in a fern, a moss, a flowering plant, a pine tree, a mushroom, a frog, a butterfly, and a snail.
29. Human cells normally have 46 chromosomes. For each of the following stages, state the number of nuclear DNA molecules present in a human cell: a. Metaphase of mitosis b. Metaphase I of meiosis c. Telophase of mitosis d. Telophase I of meiosis e. Telophase II of meiosis
30. Four of the following events are part of both meiosis and mitosis, but only one is meiotic. Which one? (1) Chromatid formation, (2) spindle formation, (3) chromosome condensation, (4) chromosome movement to poles, (5) synapsis.
31. In corn, the allele f′ causes floury endosperm and the allele f″ causes flinty endosperm. In the cross f′/f′ × f″/f″, all the progeny endosperms are floury, but, in the reciprocal cross, all the progeny endosperms are flinty. What is a possible explanation? (Check the legend for Figure 2-7.)
32. What is Mendel’s first law?
33. If you had a fruit fly (Drosophila melanogaster) that was of phenotype A, what cross would you make to determine if the fly’s genotype was A/A or A/a?
34. In examining a large sample of yeast colonies on a petri dish, a geneticist finds an abnormal-looking colony that is very small. This small colony was crossed with wild type, and products of meiosis (ascospores) were spread on a plate to produce colonies. In total, there were 188 wild-type (normal-size) colonies and 180 small ones. a. What can be deduced from these results regarding the inheritance of the small-colony phenotype? (Invent genetic symbols.) b. What would an ascus from this cross look like?
35. Two black guinea pigs were mated and over several years produced 29 black and 9 white offspring. Explain these results, giving the genotypes of parents and progeny.
36. In a fungus with four ascospores, a mutant allele lys-5 causes the ascospores bearing that allele to be white, whereas the wild-type allele lys-5+ results in black ascospores. (Ascospores are the spores that constitute the four products of meiosis.) Draw an ascus from each of the following crosses: a. lys-5 × lys-5+ b. lys-5 × lys-5 c. lys-5+ × lys-5+
37. For a certain gene in a diploid organism, eight units of protein product are needed for normal function. Each wild-type allele produces five units. a. If a mutation creates a null allele, do you think this allele will be recessive or dominant? b. What assumptions need to be made to answer part a?
38. A Neurospora colony at the edge of a plate seemed to be sparse (low density) in comparison with the other colonies on the plate. This colony was thought to be a possible mutant, and so it was removed and crossed with a wild type of the opposite mating type. From this cross, 100 ascospore progeny were obtained. None of the colonies from these ascospores was sparse, all appearing to be normal. What is the simplest explanation of this result? How would you test your explanation? (Note: Neurospora is haploid.)
39. From a large-scale screen of many plants of Collinsia grandiflora, a plant with three cotyledons was discovered (normally, there are two cotyledons). This plant was crossed with a normal pure-breeding wild-type plant, and 600 seeds from this cross were planted. There were 298 plants with two cotyledons and 302 with three cotyledons. What can be deduced about the inheritance of three cotyledons? Invent gene symbols as part of your explanation.
40. In the plant Arabidopsis thaliana, a geneticist is interested in the development of trichomes (small projections). A large screen turns up two mutant plants (A and B) that have no trichomes, and these mutants seem to be potentially useful in studying trichome development. (If they were determined by single-gene mutations, then finding the normal and abnormal functions of these genes would be instructive.) Each plant is crossed with wild type; in both cases, the next generation (F1) had normal trichomes. When F1 plants were selfed, the resulting F2’s were as follows:
F2 from mutant A: 602 normal; 198 no trichomes
F2 from mutant B: 267 normal; 93 no trichomes
a. What do these results show? Include proposed genotypes of all plants in your answer. b. Under your explanation to part a, is it possible to confidently predict the F1 from crossing the original mutant A with the original mutant B?
41. You have three dice: one red (R), one green (G), and one blue (B). When all three dice are rolled at the same time, calculate the probability of the following outcomes: a. 6 (R), 6 (G), 6 (B) b. 6 (R), 5 (G), 6 (B) c. 6 (R), 5 (G), 4 (B) d. No sixes at all e. A different number on all dice
42. In the pedigree below, the black symbols represent individuals with a very rare blood disease. If you had no other information to go on, would you think it more likely that the disease was dominant or recessive? Give your reasons.
43. a. The ability to taste the chemical phenylthiocarbamide is an autosomal dominant phenotype, and the inability to taste it is recessive. If a taster woman with a nontaster father marries a taster man who in a previous marriage had a nontaster daughter, what is the probability that their first child will be (1) A nontaster girl (2) A taster girl (3) A taster boy b. What is the probability that their first two children will be tasters of either sex?

Unpacking the Problem
44. John and Martha are contemplating having children, but John’s brother has galactosemia (an autosomal recessive disease) and Martha’s great-grandmother also had galactosemia. Martha has a sister who has three children, none of whom have galactosemia. What is the probability that John and Martha’s first child will have galactosemia?
1. Can the problem be restated as a pedigree? If so, write one.
2. Can parts of the problem be restated by using Punnett squares?
3. Can parts of the problem be restated by using branch diagrams?
4. In the pedigree, identify a mating that illustrates Mendel’s first law.
5. Define all the scientific terms in the problem, and look up any other terms about which you are uncertain.
6. What assumptions need to be made in answering this problem?
7. Which unmentioned family members must be considered? Why?
8. What statistical rules might be relevant, and in what situations can they be applied? Do such situations exist in this problem?
9. What are two generalities about autosomal recessive diseases in human populations?
10. What is the relevance of the rareness of the phenotype under study in pedigree analysis generally, and what can be inferred in this problem?
11. In this family, whose genotypes are certain and whose are uncertain?
12. In what way is John’s side of the pedigree different from Martha’s side? How does this difference affect your calculations?
13. Is there any irrelevant information in the problem as stated?
14. In what way is solving this kind of problem similar to solving problems that you have already successfully solved? In what way is it different?
15. Can you make up a short story based on the human dilemma in this problem?
Now try to solve the problem. If you are unable to do so, try to identify the obstacle and write a sentence or two describing your difficulty. Then go back to the expansion questions and see if any of them relate to your difficulty.

45. Holstein cattle are normally black and white. A superb black-and-white bull, Charlie, was purchased by a farmer for $100,000. All the progeny sired by Charlie were normal in appearance. However, certain pairs of his progeny, when interbred, produced red-and-white progeny at a frequency of about 25 percent. Charlie was soon removed from the stud lists of the Holstein breeders. Use symbols to explain precisely why.
46. Suppose that a husband and wife are both heterozygous for a recessive allele for albinism. If they have dizygotic (two-egg) twins, what is the probability that both the twins will have the same phenotype for pigmentation?
47. The plant blue-eyed Mary grows on Vancouver Island and on the lower mainland of British Columbia. The populations are dimorphic for purple blotches on the leaves: some plants have blotches and others don’t. Near Nanaimo, one plant in nature had blotched leaves. This plant, which had not yet flowered, was dug up and taken to a laboratory, where it was allowed to self. Seeds were collected and grown into progeny. One randomly selected (but typical) leaf from each of the progeny is shown in the accompanying illustration. a. Formulate a concise genetic hypothesis to explain these results. Explain all symbols and show all genotypic classes (and the genotype of the original plant). b. How would you test your hypothesis? Be specific.
48. Can it ever be proved that an animal is not a carrier of a recessive allele (that is, not a heterozygote for a given gene)? Explain.
49. In nature, the plant Plectritis congesta is dimorphic for fruit shape; that is, individual plants bear either wingless or winged fruits, as shown in the illustration.
[Illustration: wingless fruit and winged fruit.]
Plants were collected from nature before flowering and were crossed or selfed with the following results:

                         Number of progeny
Pollination              Winged       Wingless
Winged (selfed)          91           1*
Winged (selfed)          90           30
Wingless (selfed)        4*           80
Winged × wingless        161          [missing]
Winged × wingless        29           31
Winged × wingless        46           [missing]
Winged × winged          44           [missing]
*Phenotype probably has a nongenetic explanation.

Interpret these results, and derive the mode of inheritance of these fruit-shaped phenotypes. Use symbols. What do you think is the nongenetic explanation for the phenotypes marked by asterisks in the table?
50. The accompanying pedigree is for a rare, but relatively mild, hereditary disorder of the skin.
[Pedigree figure: four generations, I through IV, with individuals numbered within each generation.]
a. How is the disorder inherited? State reasons for your answer. b. Give genotypes for as many individuals in the pedigree as possible. (Invent your own defined allele symbols.) c. Consider the four unaffected children of parents III-4 and III-5. In all four-child progenies from parents of these genotypes, what proportion is expected to contain all unaffected children?
51. Four human pedigrees are shown in the accompanying illustration. The black symbols represent an abnormal phenotype inherited in a simple Mendelian manner. a. For each pedigree, state whether the abnormal condition is dominant or recessive. Try to state the logic behind your answer. b. For each pedigree, describe the genotypes of as many persons as possible.
52. Tay-Sachs disease is a rare human disease in which toxic substances accumulate in nerve cells. The recessive allele responsible for the disease is inherited in a simple Mendelian manner. For unknown reasons, the allele is more common in populations of Ashkenazi Jews of eastern Europe. A woman is planning to marry her first cousin, but the couple discovers that their shared grandfather’s sister died in infancy of Tay-Sachs disease. a. Draw the relevant parts of the pedigree, and show all the genotypes as completely as possible. b. What is the probability that the cousins’ first child will have Tay-Sachs disease, assuming that all people who marry into the family are homozygous normal?
53. The pedigree below was obtained for a rare kidney disease.
[Pedigree figure; persons 1 and 2 are marked.]
a. Deduce the inheritance of this condition, stating your reasons. b. If persons 1 and 2 marry, what is the probability that their first child will have the kidney disease?
54. This pedigree is for Huntington disease, a late-onset disorder of the nervous system. The slashes indicate deceased family members.
[Pedigree figure: five generations, I through V; the individuals labeled 1 (Susan) and 2 (Alan) are marked.]
a. Is this pedigree compatible with the mode of inheritance for Huntington disease mentioned in the chapter? b. Consider two newborn children in the two arms of the pedigree, Susan in the left arm and Alan in the right arm. Study the graph in Figure 2-24 and form an opinion on the likelihood that they will develop Huntington disease. Assume for the sake of the discussion that parents have children at age 25.
55. Consider the accompanying pedigree of a rare autosomal recessive disease, PKU.
[Pedigree figure: four generations, I through IV; persons A and B are marked.]
a. List the genotypes of as many of the family members as possible. b. If persons A and B marry, what is the probability that their first child will have PKU? c. If their first child is normal, what is the probability that their second child will have PKU? d. If their first child has the disease, what is the probability that their second child will be unaffected? (Assume that all people marrying into the pedigree lack the abnormal allele.)
56. A man has attached earlobes, whereas his wife has free earlobes. Their first child, a boy, has attached earlobes. a. If the phenotypic difference is assumed to be due to two alleles of a single gene, is it possible that the gene is X linked? b. Is it possible to decide if attached earlobes are dominant or recessive?
57. A rare recessive allele inherited in a Mendelian manner causes the disease cystic fibrosis. A phenotypically normal man whose father had cystic fibrosis marries a phenotypically normal woman from outside the family, and the couple consider having a child. a. Draw the pedigree as far as described. b. If the frequency in the population of heterozygotes for cystic fibrosis is 1 in 50, what is the chance that the couple’s first child will have cystic fibrosis? c. If the first child does have cystic fibrosis, what is the probability that the second child will be normal?
58. The allele c causes albinism in mice (C causes mice to be black). The cross C/c × c/c produces 10 progeny. What is the probability of all of them being black?
59. The recessive allele s causes Drosophila to have small wings, and the s+ allele causes normal wings. This gene is known to be X linked. If a small-winged male is crossed with a homozygous wild-type female, what ratio of normal to small-winged flies can be expected in each sex in the F1? If F1 flies are intercrossed, what F2 progeny ratios are expected? What progeny ratios are predicted if F1 females are backcrossed with their father?
60. An X-linked dominant allele causes hypophosphatemia in humans. A man with hypophosphatemia marries a normal woman. What proportion of their sons will have hypophosphatemia?
61. Duchenne muscular dystrophy is sex linked and usually affects only males. Victims of the disease become progressively weaker, starting early in life. a. What is the probability that a woman whose brother has Duchenne’s disease will have an affected child? b. If your mother’s brother (your uncle) had Duchenne’s disease, what is the probability that you have received the allele? c. If your father’s brother had the disease, what is the probability that you have received the allele?
62. A recently married man and woman discover that each had an uncle with alkaptonuria (black urine disease), a rare disease caused by an autosomal recessive allele of a single gene. They are about to have their first baby. What is the probability that their child will have alkaptonuria?
63. The accompanying pedigree concerns a rare inherited dental abnormality, amelogenesis imperfecta. a. What mode of inheritance best accounts for the transmission of this trait? b. Write the genotypes of all family members according to your hypothesis.
64. A couple who are about to get married learn from studying their family histories that, in both their families, their unaffected grandparents had siblings with cystic fibrosis (a rare autosomal recessive disease). a. If the couple marries and has a child, what is the probability that the child will have cystic fibrosis? b. If they have four children, what is the chance that the children will have the precise Mendelian ratio of 3 : 1 for normal : cystic fibrosis? c. If their first child has cystic fibrosis, what is the probability that their next three children will be normal?
65. A sex-linked recessive allele c produces a red–green color blindness in humans. A normal woman whose father was color blind marries a color-blind man.
a. What genotypes are possible for the mother of the color-blind man? b. What are the chances that the first child from this marriage will be a color-blind boy? c. Of the girls produced by these parents, what proportion can be expected to be color blind? d. Of all the children (sex unspecified) of these parents, what proportion can be expected to have normal color vision? 66. Male house cats are either black or orange; females are black, orange, or calico. a. If these coat-color phenotypes are governed by a sexlinked gene, how can these observations be explained? b. Using appropriate symbols, determine the phenotypes expected in the progeny of a cross between an orange female and a black male. c. Half the females produced by a certain kind of mating are calico, and half are black; half the males are orange, and half are black. What colors are the parental males and females in this kind of mating? d. Another kind of mating produces progeny in the following proportions: one-fourth orange males, onefourth orange females, one-fourth black males, and onefourth calico females. What colors are the parental males and females in this kind of mating? 67. The pedigree below concerns a certain rare disease that is incapacitating but not fatal.
a. Determine the most likely mode of inheritance of this disease. b. Write the genotype of each family member according to your proposed mode of inheritance. c. If you were this family’s doctor, how would you advise the three couples in the third generation about the likelihood of having an affected child? 68. In corn, the allele s causes sugary endosperm, whereas S causes starchy. What endosperm genotypes result from each of the following crosses? a. s/s female × S/S male b. S/S female × s/s male c. S/s female × S/s male
69. A plant geneticist has two pure lines, one with purple petals and one with blue. She hypothesizes that the phenotypic difference is due to two alleles of one gene. To test this idea, she aims to look for a 3 : 1 ratio in the F2. She crosses the lines and finds that all the F1 progeny are purple. The F1 plants are selfed, and 400 F2 plants are obtained. Of these F2 plants, 320 are purple and 80 are blue. Do these results fit her hypothesis well? If not, suggest why.
Unpacking the Problem
70. A man’s grandfather has galactosemia, a rare autosomal recessive disease caused by the inability to process galactose, leading to muscle, nerve, and kidney malfunction. The man married a woman whose sister had galactosemia. The woman is now pregnant with their first child.
a. Draw the pedigree as described. b. What is the probability that this child will have galactosemia? c. If the first child does have galactosemia, what is the probability that a second child will have it?
Challenging Problems
71. A geneticist working on peas has a single monohybrid Y/y (yellow) plant and, from a self of this plant, wants to produce a plant of genotype y/y to use as a tester. How many progeny plants need to be grown to be 95 percent sure of obtaining at least one in the sample? 72. A curious polymorphism in human populations has to do with the ability to curl up the sides of the tongue to make a trough (“tongue rolling”). Some people can do this trick, and others simply cannot. Hence, it is an example of a dimorphism. Its significance is a complete mystery. In one family, a boy was unable to roll his tongue but, to his great chagrin, his sister could. Furthermore, both his parents were rollers, and so were both grandfathers, one paternal uncle, and one paternal aunt. One paternal aunt, one paternal uncle, and one maternal uncle could not roll their tongues. a. Draw the pedigree for this family, defining your symbols clearly, and deduce the genotypes of as many individual members as possible. b. The pedigree that you drew is typical of the inheritance of tongue rolling and led geneticists to come up with the inheritance mechanism that no doubt you came up with. However, in a study of 33 pairs of identical twins, both members of 18 pairs could roll, neither member of 8 pairs could roll, and one of the twins in 7 pairs could roll but the other could not. Because identical twins are derived from the splitting of one fertilized egg into two embryos,
the members of a pair must be genetically identical. How can the existence of the seven discordant pairs be reconciled with your genetic explanation of the pedigree?
[Pedigree for Problem 73. Key: red hair; red beard and body hair.]
73. Red hair runs in families, as the pedigree above shows. (Pedigree data from W. R. Singleton and B. Ellis, Journal of Heredity 55, 1964, 261.) a. Does the inheritance pattern in this pedigree suggest that red hair could be caused by a dominant or a recessive allele of a gene that is inherited in a simple Mendelian manner? b. Do you think that the red-hair allele is common or rare in the population as a whole?
74. When many families were tested for the ability to taste the chemical phenylthiocarbamide, the matings were grouped into three types and the progeny were totaled, with the results shown below:

Parents                   Number of families   Children: tasters   Children: nontasters
Taster × taster           425                  929                 130
Taster × nontaster        289                  483                 278
Nontaster × nontaster     86                   5                   218

With the assumption that PTC tasting is dominant (P) and nontasting is recessive (p), how can the progeny ratios in each of the three types of mating be accounted for?
75. A condition known as ichthyosis hystrix gravior appeared in a boy in the early eighteenth century. His skin became very thick and formed loose spines that were sloughed off at intervals. When he grew up, this “porcupine man” married and had six sons, all of whom had this condition, and several daughters, all of whom were normal. For four generations, this condition was passed from father to son. From this evidence, what can you postulate about the location of the gene?
76. The wild-type (W) Abraxas moth has large spots on its wings, but the lacticolor (L) form of this species has very small spots. Crosses were made between strains differing in this character, with the following results:

Cross   Parents (♀ × ♂)   F1 progeny     F2 progeny
1       L × W             ♀ W; ♂ W       ♀ 1/2 L, 1/2 W; ♂ W
2       W × L             ♀ L; ♂ W       ♀ 1/2 W, 1/2 L; ♂ 1/2 W, 1/2 L

Provide a clear genetic explanation of the results in these two crosses, showing the genotypes of all individual moths.
77. The pedigree above shows the inheritance of a rare human disease. Is the pattern best explained as being caused by an X-linked recessive allele or by an autosomal dominant allele with expression limited to males? (Pedigree data from J. F. Crow, Genetics Notes, 6th ed. Copyright 1967 by Burgess Publishing Co., Minneapolis.) 78. A certain type of deafness in humans is inherited as an X-linked recessive trait. A man with this type of deafness married a normal woman, and they are expecting a child. They find out that they are distantly related. Part of the family tree is shown here.
How would you advise the parents about the probability of their child being a deaf boy, a deaf girl, a normal boy, or a normal girl? Be sure to state any assumptions that you make. 79. The accompanying pedigree shows a very unusual inheritance pattern that actually did exist. All progeny are
shown, but the fathers in each mating have been omitted to draw attention to the remarkable pattern. a. Concisely state exactly what is unusual about this pedigree. b. Can the pattern be explained by Mendelian inheritance?
Appendix 2-1
Stages of Mitosis
Mitosis usually takes up only a small proportion of the cell cycle, approximately 5 to 10 percent. The remaining time is the interphase, composed of G1, S, and G2 stages. The DNA is replicated during the S phase, although the duplicated DNA does not become visible until later in mitosis. The chromosomes cannot be seen during interphase (see below), mainly because they are in an extended state and are intertwined with one another like a tangle of yarn. The photographs below show the stages of mitosis in the nuclei of root-tip cells of the royal lily, Lilium regale. In each stage, a photograph is shown at the left and an interpretive drawing at the right.

[Photographs and interpretive drawings, panels 1 to 6: interphase, early mitotic prophase, late mitotic prophase, mitotic metaphase (poles and spindle labeled), mitotic anaphase, mitotic telophase.]

Early prophase: The chromosomes become distinct for the first time. They condense and become progressively shorter, forming spirals or coils that are more easily moved.

Late prophase: Each chromosome is seen to have become a pair of strands; these are the identical “sister” chromatids formed when the DNA replicated during S phase. The chromatids in each pair are joined at the centromere. The nuclear membrane breaks down.

Metaphase: The nuclear spindle becomes prominent. The spindle is a birdcage-like series of parallel fibers that point to each of two cell poles. The chromosomes move to the equatorial plane of the cell, where the centromeres become attached to a spindle fiber from each pole.

Anaphase: The pairs of sister chromatids separate, one of a pair moving to each pole. The centromeres divide and separate first. As each chromatid moves, its two arms appear to trail its centromere; a set of V-shaped structures results, with the points of the V’s directed at the poles.

Telophase: A nuclear membrane re-forms around each daughter nucleus, the chromosomes uncoil, and the cytoplasm is divided into two by a new cell membrane. The spindle has dispersed.

The photographs show mitosis in the nuclei of root-tip cells of Lilium regale. [J. McLeish and B. Snoad, Looking at Chromosomes. Copyright 1958, St. Martin’s, Macmillan.]
Appendix 2-2
Stages of Meiosis
Meiosis consists of two nuclear divisions distinguished as meiosis I and meiosis II, which take place in consecutive cell divisions. Each meiotic division is formally divided into prophase, metaphase, anaphase, and telophase. Of these stages, the most complex and lengthy is prophase I, which is divided into five stages.
The photographs below show the stages of meiosis and pollen formation in the royal lily, Lilium regale. In each stage, a photograph is shown at the left and an interpretive drawing at the right.
[Photographs and interpretive drawings, panels 1 to 16: leptotene, zygotene, pachytene, diplotene, diakinesis, metaphase I, early anaphase I, later anaphase I, telophase I, interphase, prophase II, metaphase II, anaphase II, telophase II, the tetrad, young pollen grains.]

Prophase I: Leptotene. The chromosomes become visible as long, thin single threads. Chromosomes begin to contract and continue contracting throughout the entire prophase.

Prophase I: Zygotene. The threads form pairs as each chromosome progressively aligns, or synapses, along the length of its homologous partner.

Prophase I: Pachytene. Chromosomes are thick and fully synapsed. Thus, the number of pairs of homologous chromosomes is equal to the number n.

Prophase I: Diplotene. Although the DNA has already replicated during the premeiotic S phase, this fact first becomes manifest only in diplotene as each chromosome is seen to have become a pair of sister chromatids. The synapsed structure now consists of a bundle of four homologous chromatids. The paired homologs separate slightly, and one or more cross-shaped structures called chiasmata (singular, chiasma) appear between nonsister chromatids.

Prophase I: Diakinesis. Further chromosome contraction produces compact units that are very maneuverable.

Metaphase I: The nuclear membrane has disappeared, and each pair of homologs takes up a position in the equatorial plane. At this stage of meiosis, the centromeres do not divide; this lack of division is a major difference from mitosis. The two centromeres of a homologous chromosome pair attach to spindle fibers from opposite poles.

Anaphase I: The members of each homologous pair move to opposite poles.

Telophase I and interphase: The chromosomes elongate and become diffuse, the nuclear membrane re-forms, and the cell divides. After telophase I, there is an interphase, called interkinesis. In many organisms, telophase I and interkinesis do not exist or are brief in duration. In any case, there is never DNA synthesis at this time, and the genetic state of the chromosomes does not change.

Prophase II: The haploid number of sister chromatid pairs is now present in the contracted state.

Metaphase II: The pairs of sister chromatids arrange themselves on the equatorial plane. Here the chromatids often partly dissociate from each other instead of being closely pressed together as they are in mitosis.

Anaphase II: Centromeres split and sister chromatids are pulled to opposite poles by the spindle fibers.

Telophase II: The nuclei re-form around the chromosomes at the poles.

The tetrad and young pollen grains: The cells divide. In the anthers of a flower, the four products of meiosis develop into pollen grains. In other organisms, the products of meiosis differentiate into other kinds of structures, such as sperm cells in animals.

The photographs show meiosis and pollen formation in Lilium regale. Note: For simplicity, multiple chiasmata are drawn between only two chromatids; in reality, all four chromatids can take part. [J. McLeish and B. Snoad, Looking at Chromosomes. Copyright 1958, St. Martin’s, Macmillan.]
Chapter 3
Independent Assortment of Genes
The Green Revolution in agriculture is fostered by the widespread planting of superior lines of crops (such as rice, shown here) made by combining beneficial genetic traits. [Jorgen Schytte.]

Outline
3.1 Mendel’s law of independent assortment
3.2 Working with independent assortment
3.3 The chromosomal basis of independent assortment
3.4 Polygenic inheritance
3.5 Organelle genes: inheritance independent of the nucleus

Learning Outcomes
After completing this chapter, you will be able to
• In diploids, design experiments to make a dihybrid and then self- or testcross it.
• In diploids, analyze the progeny phenotypes of dihybrid selfs and testcrosses and, from these results, assess whether the two genes are assorting independently (which would suggest locations on different chromosomes).
• In haploids, design experiments to make a transient diploid dihybrid AaBb and analyze its haploid progeny to assess whether the two genes are assorting independently.
• In crosses involving independently assorting dihybrids, predict the genotypic ratios in meiotic products, genotypic ratios in progeny, and phenotypic ratios in progeny.
• Use chi-square analysis to test whether observed phenotypic ratios are an acceptable fit to those predicted by independent assortment.
• In diploids, design experiments to synthesize lines that are pure-breeding (homozygous) for two or more genes.
• Interpret two-gene independent assortment ratios in terms of chromosome behavior at meiosis.
• Analyze progeny ratios of dihybrids in terms of recombinant frequency (RF) and apply the diagnostic RF for independent assortment.
• Extend the principles of two-gene independent assortment to heterozygotes for three or more genes.
• Extend the principle of independent assortment to multiple genes that each contribute to a phenotype showing continuous distribution.
• Apply the diagnostic criteria for assessing whether mutations are in genes in cytoplasmic organelles.
This chapter is about the principles at work when two or more cases of single-gene inheritance are analyzed simultaneously. Nowhere have these principles been more important than in plant and animal breeding in agriculture. For example, between the years 1960 and 2000, the world production of food plants doubled, marking a so-called Green Revolution. What made this Green Revolution possible? In part, it was due to improved agricultural practice, but more important was the development of superior crop genotypes by plant geneticists. These breeders are constantly on the lookout for the chance occurrence of single-gene mutations that significantly increase yield or nutrient value. However, such mutations arise in different lines in different parts of the world. For example, in rice, one of the world’s main food crops, the following mutations have been crucial in the Green Revolution:

• sd1. This recessive allele results in short stature, making the plant more resistant to “lodging,” or falling over, in wind and rain; it also increases the relative amount of the plant’s energy that is routed into the seed, the part that we eat.
• se1. This recessive allele alters the plant’s requirement for a specific daylength, enabling it to be grown at different latitudes.
• Xa4. This dominant allele confers resistance to the disease bacterial blight.
• bph2. This allele confers resistance to brown plant hoppers (a type of insect).
• Snb1. This allele confers tolerance to plant submersion after heavy rains.

To make a truly superior genotype, combining such alleles into one line is clearly desirable. To achieve such a combination, mutant lines must be intercrossed two at a time. For instance, a plant geneticist might start by crossing a strain homozygous for sd1 to another homozygous for Xa4. The F1 progeny of this cross would carry both mutations but in a heterozygous state. However, most agriculture uses pure lines, because they can be efficiently propagated and distributed to farmers. To obtain a pure-breeding doubly mutant sd1/sd1·Xa4/Xa4 line, the F1 would have to be bred further to allow the alleles to “assort” into the desirable combination. Some products of such breeding are shown in Figure 3-1.

Rice lines
Figure 3-1 Superior genotypes of crops such as rice have revolutionized agriculture. This photograph shows some of the key genotypes used in rice breeding programs. [Bloomberg/Getty Images.]

What principles are relevant here? It depends very much on whether the two genes are on the same chromosome pair or on different chromosome pairs. In the latter case, the chromosome pairs act independently at meiosis, and the alleles of two heterozygous gene pairs are said to show independent assortment. This chapter explains how we can recognize independent assortment and how the principle of independent assortment can be used in strain construction, both in agriculture and in basic genetic research. (Chapter 4 covers the analogous principles applicable to heterozygous gene pairs on the same chromosome pair.)

We shall also see that independent assortment of an array of genes provides a basic heritable mechanism for continuous phenotypes. These are properties such as height or weight that do not fall into distinct categories but are nevertheless often heavily influenced by multiple genes collectively called “polygenes.” We shall examine the role of independent assortment in the inheritance of continuous phenotypes influenced by such polygenes. We will see that independent assortment of polygenes can produce a continuous phenotypic distribution among progeny.

Lastly, we will introduce a different type of independent inheritance, that of genes in the organelles mitochondria and chloroplasts. Unlike nuclear genes, these genes are inherited cytoplasmically and show patterns of inheritance different from those observed for nuclear genes and chromosomes. This pattern is independent of genes showing nuclear inheritance.

First, we examine the analytical procedures that pertain to the independent assortment of nuclear genes. These were first developed by the father of genetics, Gregor Mendel. So, again, we turn to his work as a prototypic example.
3.1 Mendel’s Law of Independent Assortment

In much of his original work on peas, Mendel analyzed the descendants of pure lines that differed in two characters. The following general symbolism is used to represent genotypes that include two genes. If two genes are on different chromosomes, the gene pairs are separated by a semicolon (for example, A/a ; B/b). If they are on the same chromosome, the alleles on one homolog are written adjacently with no punctuation and are separated from those on the other homolog by a slash (for example, AB/ab or Ab/aB). An accepted symbolism does not exist for situations in which it is not known whether the genes are on the same chromosome or on different chromosomes. For this situation of unknown position in this book, we will use a dot to separate the genes (for example, A/a·B/b). Recall from Chapter 2 that a heterozygote for a single gene (such as A/a) is sometimes called a monohybrid; accordingly, a double heterozygote such as A/a·B/b is sometimes called a dihybrid. From studying dihybrid crosses (A/a·B/b × A/a·B/b), Mendel came up with his second important principle of heredity, the law of independent assortment, sometimes called Mendel’s second law.

The pair of characters that he began working with were seed shape and seed color. We have already followed the monohybrid cross for seed color (Y/y × Y/y), which gave a progeny ratio of 3 yellow : 1 green. The seed shape phenotypes (Figure 3-2) were round (determined by allele R) and wrinkled (determined by allele r). The monohybrid cross R/r × R/r gave a progeny ratio of 3 round : 1 wrinkled as expected (see Table 2-1, page 37).

Round and wrinkled phenotypes
Figure 3-2 Round (R/R or R/r) and wrinkled (r/r) peas are present in a pod of a selfed heterozygous plant (R/r). The phenotypic ratio in this pod happens to be precisely the 3 : 1 ratio expected on average in the progeny of this selfing. (Molecular studies have shown that the wrinkled allele used by Mendel is produced by the insertion of a segment of mobile DNA into the gene; see Chapter 15.) [Madan K. Bhattacharyya.]

To perform a dihybrid cross, Mendel started with two pure parental lines. One line had wrinkled, yellow seeds. Because Mendel had no concept of the chromosomal location of genes, we must use the dot representation to write the combined genotype initially as r/r·Y/Y. The other line had round, green seeds, with genotype R/R·y/y. When these two lines were crossed, they must have produced gametes that were r·Y and R·y, respectively. Hence, the F1 seeds had to be dihybrid, of genotype R/r·Y/y. Mendel discovered that the F1 seeds were round and yellow. This result showed that the dominance of R over r and Y over y was unaffected by the condition of the other gene pair in the R/r·Y/y dihybrid. In other words, R remained dominant over r, regardless of seed color, and Y remained dominant over y, regardless of seed shape. Next, Mendel selfed the dihybrid F1 to obtain the F2 generation. The F2 seeds were of four different types in the following proportions:

9/16 round, yellow
3/16 round, green
3/16 wrinkled, yellow
1/16 wrinkled, green
a result that is illustrated in Figure 3-3 with the actual numbers obtained by Mendel. This initially unexpected 9 : 3 : 3 : 1 ratio for these two characters seems a lot more complex than the simple 3 : 1 ratios of the monohybrid crosses. Nevertheless, the 9 : 3 : 3 : 1 ratio proved to be a consistent inheritance pattern in peas. As evidence, Mendel also made dihybrid crosses that included several other combinations of characters and found that all of the dihybrid F1 individuals produced 9 : 3 : 3 : 1 ratios in the F2. The ratio was another inheritance pattern that required the development of a new idea to explain it.

Mendel’s breeding program that produced a 9 : 3 : 3 : 1 ratio
P: R/R·y/y (round, green) × r/r·Y/Y (wrinkled, yellow)
Gametes: R·y and r·Y
F1: R/r·Y/y (round, yellow); F1 × F1
F2 (556 seeds):
315 round, yellow     ratio 9
108 round, green      ratio 3
101 wrinkled, yellow  ratio 3
32 wrinkled, green    ratio 1
Figure 3-3 Mendel synthesized a dihybrid that, when selfed, produced F2 progeny in the ratio 9 : 3 : 3 : 1.

First, let’s check the actual numbers obtained by Mendel in Figure 3-3 to determine if the monohybrid 3 : 1 ratios can still be found in the F2. In regard to seed shape, there are 423 round seeds (315 + 108) and 133 wrinkled seeds (101 + 32). This result is close to a 3 : 1 ratio (actually 3.2 : 1). Next, in regard to seed color, there are 416 yellow seeds (315 + 101) and 140 green (108 + 32), also very close to a 3 : 1 ratio (almost exactly 3 : 1). The presence of these two 3 : 1 ratios hidden in the 9 : 3 : 3 : 1 ratio was undoubtedly a source of the insight that Mendel needed to explain the 9 : 3 : 3 : 1 ratio, because he realized that it was simply two different 3 : 1 ratios combined at random. One way of visualizing the random combination of these two ratios is with a branch diagram, as follows:

3/4 of the F2 is round
    3/4 of these round seeds will be yellow
    1/4 will be green
1/4 of the F2 is wrinkled
    3/4 of these wrinkled seeds will be yellow
    1/4 will be green
The probabilities of the four possible outcomes are calculated by using the product rule (the probability of two independent events occurring together is the product of their individual probabilities). Hence, we multiply along the branches in the diagram. For example, 3/4 of all seeds will be round, and 3/4 of the round seeds will be yellow, so the probability of a seed being both round and yellow is calculated as 3/4 × 3/4, which equals 9/16. These multiplications give the following four proportions:
3/4 × 3/4 = 9/16 round, yellow
3/4 × 1/4 = 3/16 round, green
1/4 × 3/4 = 3/16 wrinkled, yellow
1/4 × 1/4 = 1/16 wrinkled, green
These proportions constitute the 9 : 3 : 3 : 1 ratio that we are trying to explain. However, is this exercise not merely number juggling? What could the combination of the two 3 : 1 ratios mean biologically? The way that Mendel phrased his explanation does in fact amount to a biological mechanism. In what is now known as the law of independent assortment (Mendel’s second law), he concluded that different gene pairs assort independently during gamete formation. The consequence is that, for two heterozygous gene pairs A/a and B/b, the b allele is just as likely to end up in a gamete with an a allele as with an A allele, and likewise for the B allele.
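The branch-diagram arithmetic can be sketched in a few lines of Python. This is only an illustration, not part of Mendel’s analysis: the function name combine and the dictionary layout are hypothetical choices made here, and the counts are the F2 numbers quoted in the text (556 seeds). The sketch multiplies the two 3 : 1 ratios with the product rule and compares the expected counts with those observed.

```python
from fractions import Fraction
from itertools import product

# F2 phenotype proportions for each character taken alone (3 : 1).
shape = {"round": Fraction(3, 4), "wrinkled": Fraction(1, 4)}
color = {"yellow": Fraction(3, 4), "green": Fraction(1, 4)}

def combine(*characters):
    """Product rule: multiply along each branch of the branch diagram.

    Assumes the characters are inherited independently of one another.
    """
    combined = {}
    for branch in product(*(c.items() for c in characters)):
        names = tuple(name for name, _ in branch)
        prob = Fraction(1)
        for _, p in branch:
            prob *= p
        combined[names] = prob
    return combined

expected = combine(shape, color)

# Mendel's F2 counts as quoted in the text.
observed = {
    ("round", "yellow"): 315,
    ("round", "green"): 108,
    ("wrinkled", "yellow"): 101,
    ("wrinkled", "green"): 32,
}
total = sum(observed.values())

for phenotype, prob in expected.items():
    print(phenotype, prob, "expected:", float(prob * total),
          "observed:", observed[phenotype])
```

Under these assumptions, the expected counts come to about 313, 104, 104, and 35 seeds, close to the 315, 108, 101, and 32 that Mendel counted.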
In hindsight, we now know that, for the most part, this “law” applies to genes on different chromosomes. Genes on the same chromosome generally do not assort independently because they are held together by the chromosome itself.

Key Concept Mendel’s second law (the principle of independent assortment) states that gene pairs on different chromosome pairs assort independently at meiosis.
Mendel’s original statement of this law was that different genes assort independently because he apparently did not encounter (or he ignored) any exceptions that might have led to the concept of linkage.

We have explained the 9 : 3 : 3 : 1 phenotypic ratio as two randomly combined 3 : 1 phenotypic ratios. But can we also arrive at the 9 : 3 : 3 : 1 ratio from a consideration of the frequency of gametes, the actual meiotic products? Let us consider the gametes produced by the F1 dihybrid R/r ; Y/y (the semicolon shows that we are now embracing the idea that the genes are on different chromosomes). Again, we will use the branch diagram to get us started because it visually illustrates independence. Combining Mendel’s laws of equal segregation and independent assortment, we can predict that

1/2 of the gametes will be R
    1/2 of these R gametes will be Y
    1/2 will be y
1/2 of the gametes will be r
    1/2 of these r gametes will be Y
    1/2 will be y

Multiplication along the branches according to the product rule gives us the gamete proportions:

1/4 R ; Y
1/4 R ; y
1/4 r ; Y
1/4 r ; y
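The gamete proportions above can also be reached by brute-force enumeration. The following Python sketch assumes that every gene is on a different chromosome pair, so that the choice of allele at one gene is independent of the choice at any other; the function name gamete_proportions and the list-of-pairs representation of a genotype are hypothetical conveniences, not notation from this book.

```python
from fractions import Fraction
from itertools import product

def gamete_proportions(genotype):
    """Return gamete genotypes and their expected proportions.

    genotype: one (allele, allele) pair per gene, for example
    [("R", "r"), ("Y", "y")] for the dihybrid R/r ; Y/y.
    Equal segregation: each allele of a pair enters half of the gametes.
    Independent assortment (assumed): choices at different genes are
    independent, so probabilities multiply.
    """
    per_choice = Fraction(1, 2) ** len(genotype)
    gametes = {}
    for alleles in product(*genotype):
        key = " ; ".join(alleles)
        # Identical gametes (from homozygous genes) accumulate.
        gametes[key] = gametes.get(key, Fraction(0)) + per_choice
    return gametes

dihybrid = [("R", "r"), ("Y", "y")]
for gamete, proportion in gamete_proportions(dihybrid).items():
    print(proportion, gamete)
```

Passing a fully homozygous genotype such as [("r", "r"), ("y", "y")] yields a single gamete type, r ; y, at a proportion of 1.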
These proportions are a direct result of the application of the two Mendelian laws: of segregation and of independence. However, we still have not arrived at the 9 : 3 : 3 : 1 ratio. The next step is to recognize that, because male and female gametes obey the same laws during formation, both the male and the female gametes will show the same proportions just given. The four female gametic types will be fertilized randomly by the four male gametic types to obtain the F2. The best graphic way of showing the outcomes of the cross is by using a 4 × 4 grid called a Punnett square, which is depicted in Figure 3-4. We have already seen that grids are useful in genetics for providing a visual representation of the data. Their usefulness lies in the fact that their proportions can be drawn according to the genetic proportions or ratios under consideration. In the Punnett square in Figure 3-4, for example, four rows and four columns were drawn to correspond to the four genotypes of female gametes and the four of male gametes. We see that there are 16 boxes representing the various gametic fusions and that each box is 1/16th of the total area of the grid. In accord with the product rule, each 1/16th is a result of the fertilization of one egg type at frequency 1/4 by one sperm type also at frequency 1/4, giving the probability of that fusion as (1/4)² = 1/16. As the Punnett square shows, the F2 contains a variety of genotypes, but there are only four phenotypes and their proportions are in the 9 : 3 : 3 : 1 ratio. So we see that, when we calculate progeny frequencies directly through gamete frequencies, we still arrive at the 9 : 3 : 3 : 1 ratio. Hence, Mendel’s laws explain not only the F2 phenotypes, but also the genotypes of gametes and progeny that underlie the F2 phenotypic ratio.

Mendel went on to test his principle of independent assortment in a number of ways. The most direct way focused on the 1 : 1 : 1 : 1 gametic ratio hypothesized to be produced by the F1 dihybrid R/r ; Y/y, because this ratio sprang directly from his principle of independent assortment and was the biological basis of the 9 : 3 : 3 : 1 ratio in the F2, as shown by the Punnett square. To verify the 1 : 1 : 1 : 1 gametic ratio, Mendel used a testcross. He testcrossed the F1 dihybrid with a tester of genotype r/r ; y/y, which produces only gametes with recessive alleles (genotype r ; y). He reasoned that, if there were in fact a 1 : 1 : 1 : 1 ratio of R ; Y, R ; y, r ; Y, and r ; y gametes, the progeny proportions of this cross should directly correspond to the gametic proportions produced by the dihybrid; in other words,
1/4 R/r ; Y/y → round, yellow
1/4 R/r ; y/y → round, green
1/4 r/r ; Y/y → wrinkled, yellow
1/4 r/r ; y/y → wrinkled, green

[Figure 3-4: Punnett square illustrating the genotypes underlying a 9 : 3 : 3 : 1 ratio]
P: R/R ; y/y (round, green) × r/r ; Y/Y (wrinkled, yellow)
Gametes: R ; y and r ; Y
F1: R/r ; Y/y (round, yellow)
F2 (each F1 gamete type has frequency 1/4; each box has frequency 1/16):

            R ; Y        R ; y        r ; y        r ; Y
R ; Y    R/R ; Y/Y    R/R ; Y/y    R/r ; Y/y    R/r ; Y/Y
R ; y    R/R ; Y/y    R/R ; y/y    R/r ; y/y    R/r ; Y/y
r ; y    R/r ; Y/y    R/r ; y/y    r/r ; y/y    r/r ; Y/y
r ; Y    R/r ; Y/Y    R/r ; Y/y    r/r ; Y/y    r/r ; Y/Y

9 round, yellow : 3 round, green : 3 wrinkled, yellow : 1 wrinkled, green

Figure 3-4 We can use a Punnett square to predict the result of a dihybrid cross. This Punnett square shows the predicted genotypic and phenotypic constitution of the F2 generation from a dihybrid cross.
These proportions were the result that he obtained, perfectly consistent with his expectations. He obtained similar results for all the other dihybrid crosses that he made, and these tests and other types of tests all showed that he had, in fact, devised a robust model to explain the inheritance patterns observed in his various pea crosses.
In the early 1900s, both of Mendel’s laws were tested in a wide spectrum of eukaryotic organisms. The results of these tests showed that Mendelian principles were generally applicable. Mendelian ratios (such as 3 : 1, 1 : 1, 9 : 3 : 3 : 1, and 1 : 1 : 1 : 1) were extensively reported, suggesting that equal segregation and independent assortment are fundamental hereditary processes found throughout nature. Mendel’s laws are not merely laws about peas; they are laws about the genetics of eukaryotic organisms in general.
As an example of the universal applicability of the principle of independent assortment, we can examine its action in haploids. If the principle of independent assortment is valid across the board, then we should be able to observe its action in haploids, given that haploids undergo meiosis. Indeed, independent assortment can be observed in a cross of the type A ; B × a ; b. Fusion of parental cells results in a transient diploid meiocyte that is a dihybrid A/a ; B/b, and the randomly sampled products of meiosis (sexual spores such as ascospores in fungi) will be
1/4 A ; B
1/4 A ; b
1/4 a ; B
1/4 a ; b
Hence, we see the same ratio as in the dihybrid testcross in a diploid organism; again, the ratio is a random combination of two monohybrid 1 : 1 ratios because of independent assortment.
Key Concept  Ratios of 1 : 1 : 1 : 1 and 9 : 3 : 3 : 1 are diagnostic of independent assortment in one and two dihybrid meiocytes, respectively.
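The two diagnostic ratios can be checked by brute-force enumeration. The following is only a minimal sketch, assuming complete dominance of R (round) over r and of Y (yellow) over y as in Mendel’s peas; the function name phenotype and the two-letter gamete strings are illustrative choices, not notation from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Gametes of the F1 dihybrid R/r ; Y/y: one allele of each gene,
# with every combination equally likely under independent assortment.
gametes = [r + y for r, y in product("Rr", "Yy")]  # RY, Ry, rY, ry

def phenotype(egg, sperm):
    """Return the phenotype of a zygote, assuming complete dominance."""
    shape = "round" if "R" in (egg[0], sperm[0]) else "wrinkled"
    color = "yellow" if "Y" in (egg[1], sperm[1]) else "green"
    return f"{shape}, {color}"

# Selfing the F1: each of the 16 egg x sperm boxes has probability
# 1/4 x 1/4 = 1/16 (product rule).
f2 = Counter()
for egg, sperm in product(gametes, repeat=2):
    f2[phenotype(egg, sperm)] += Fraction(1, 4) * Fraction(1, 4)

# Testcross: every tester gamete is r ; y, so progeny phenotypes
# mirror the gametic ratio of the dihybrid.
testcross = Counter(phenotype(g, "ry") for g in gametes)

for pheno in f2:
    print(pheno, "F2:", f2[pheno], "testcross count:", testcross[pheno])
```

Using Fraction keeps the sixteenths exact, so the 9 : 3 : 3 : 1 pattern can be read directly from the numerators.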
3.2 Working with Independent Assortment
In this section, we will examine several analytical procedures that are part of everyday genetic research and are all based on the concept of independent assortment. These procedures are all used to analyze phenotypic ratios.
Predicting progeny ratios
Genetics can work in either of two directions: (1) predicting the unknown genotypes of parents by using phenotype ratios of progeny or (2) predicting progeny phenotype ratios from parents of known genotype. The latter is an important part of genetics concerned with predicting the types of progeny that emerge from a cross and calculating their expected frequencies (in other words, their probabilities). This is useful not only in research on model organisms but also in predicting outcomes of matings in human genetics; for example, in genetic counseling, people appreciate specific risk estimates. We have already examined two methods for prediction: Punnett squares and branch diagrams. Punnett squares can be used to show hereditary patterns based on one gene pair, two gene pairs, or more. Such grids are good graphic devices for representing progeny, but drawing them is time consuming. Even the 16-compartment Punnett square that we used to analyze a dihybrid cross takes a long time to write out, but, for a trihybrid cross, there are 2³, or 8, different gamete types, and the Punnett square has 64 compartments. The branch diagram (shown below) is easier to create and is adaptable for phenotypic, genotypic, or gametic proportions, as illustrated for the dihybrid A/a ; B/b.

Progeny genotypes from a self
1/4 A/A → 1/4 B/B, 1/2 B/b, 1/4 b/b
1/2 A/a → 1/4 B/B, 1/2 B/b, 1/4 b/b
1/4 a/a → 1/4 B/B, 1/2 B/b, 1/4 b/b

Progeny phenotypes from a self
3/4 A/− → 3/4 B/−, 1/4 b/b
1/4 a/a → 3/4 B/−, 1/4 b/b

Gametes
1/2 A → 1/2 B, 1/2 b
1/2 a → 1/2 B, 1/2 b
Note, however, that the “tree” of branches for genotypes is quite unwieldy even in this simple case, which uses two gene pairs, because there are 3² = 9 genotypes. For three gene pairs, there are 3³, or 27, possible genotypes. To simplify this problem, we can use a statistical approach, which constitutes a third method for calculating the probabilities (expected frequencies) of specific phenotypes or genotypes coming from a cross. The two statistical rules needed are the
product rule (introduced in Chapter 2) and the sum rule, which we will now consider together.
Key Concept  The product rule states that the probability of independent events occurring together is the product of their individual probabilities.
The possible outcomes from rolling two dice follow the product rule because the outcome on one die is independent of the other. As an example, let us calculate the probability, p, of rolling a pair of 4’s. The probability of a 4 on one die is 1/6 because the die has six sides and only one side carries the number 4. This probability is written as follows:
p(one 4) = 1/6
Therefore, with the use of the product rule, the probability of a 4 appearing on both dice is 1/6 × 1/6 = 1/36, which is written
p(two 4’s) = 1/6 × 1/6 = 1/36
Now for the sum rule:
Key Concept  The sum rule states that the probability of either one or the other of two mutually exclusive events occurring is the sum of their individual probabilities.
(Note that, in the product rule, the focus is on outcomes A and B. In the sum rule, the focus is on the outcome A′ or A″.) Dice can also be used to illustrate the sum rule. We have already calculated that the probability of two 4’s is 1/36; clearly, with the use of the same type of calculation, the probability of two 5’s will be the same, or 1/36. Now we can calculate the probability of either two 4’s or two 5’s. Because these outcomes are mutually exclusive, the sum rule can be used to tell us that the answer is 1/36 + 1/36, which is 1/18. This probability can be written as follows:
p(two 4’s or two 5’s) = 1/36 + 1/36 = 1/18
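Both dice results can be confirmed by listing all 36 equally likely rolls and counting. This is a minimal sketch for illustration; the helper name prob and the variable names are assumptions of the example, not part of the text.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, found by counting the rolls that satisfy it."""
    favorable = sum(1 for roll in rolls if event(roll))
    return Fraction(favorable, len(rolls))

p_one_4 = Fraction(1, 6)
p_two_4s = prob(lambda roll: roll == (4, 4))
p_two_5s = prob(lambda roll: roll == (5, 5))
p_either = prob(lambda roll: roll in {(4, 4), (5, 5)})

# Product rule: independent events occurring together.
print(p_two_4s == p_one_4 * p_one_4)
# Sum rule: either of two mutually exclusive events.
print(p_either == p_two_4s + p_two_5s)
```

Counting outcomes directly makes no use of either rule, so agreement with the rule-based values is an independent check on them.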
What proportion of progeny will be of a specific genotype?
Now we can turn to a genetic example. Assume that we have two plants of genotypes
A/a ; b/b ; C/c ; D/d ; E/e
and
A/a ; B/b ; C/c ; d/d ; E/e
From a cross between these plants, we want to recover a progeny plant of genotype a/a ; b/b ; c/c ; d/d ; e/e (perhaps for the purpose of acting as the tester strain in a testcross). What proportion of the progeny should we expect to be of that genotype? If we assume that all the gene pairs assort independently, then we can do this calculation easily by using the product rule. The five different gene pairs are considered individually, as if five separate crosses, and then the individual probabilities of obtaining each genotype are multiplied together to arrive at the answer:
From A/a × A/a, one-fourth of the progeny will be a/a.
From b/b × B/b, half the progeny will be b/b.
From C/c × C/c, one-fourth of the progeny will be c/c.
From D/d × d/d, half the progeny will be d/d.
From E/e × E/e, one-fourth of the progeny will be e/e.
Therefore, the overall probability (or expected frequency) of obtaining progeny of genotype a/a ; b/b ; c/c ; d/d ; e/e will be 1/4 × 1/2 × 1/4 × 1/2 × 1/4 = 1/256. This probability calculation can be extended to predict phenotypic frequencies or gametic frequencies. Indeed, there are many other uses for this method in genetic analysis, and we will encounter some in later chapters.
How many progeny do we need to grow?
To take the preceding example a step farther, suppose we need to estimate how many progeny plants need to be grown to stand a reasonable chance of obtaining the desired genotype a/a ; b/b ; c/c ; d/d ; e/e. We first calculate the proportion of progeny that is expected to be of that genotype. As just shown, that proportion is 1/256, so we learn that we need to examine at least 256 progeny to stand an average chance of obtaining one individual plant of the desired genotype. The probability of obtaining one “success” (a fully recessive plant) out of 256 has to be considered more carefully. This is the average probability of success. Unfortunately, if we isolated and tested 256 progeny, there is a substantial chance (about 37 percent) that we would have no successes at all, simply from bad luck. From a practical point of view, a more meaningful question to ask would be, What sample size do we need to be 95 percent confident that we will obtain at least one success? (Note: This 95 percent confidence value is standard in science.) The simplest way to perform this calculation is to approach it by considering the probability of complete failure, that is, the probability of obtaining no individuals of the desired genotype. In our example, for every individual isolated, the probability of its not being the desired type is 1 − (1/256) = 255/256. Extending this idea to a sample of size n, we see that the probability of no successes in a sample of n is (255/256)ⁿ.
(This probability is a simple application of the product rule: 255/256 multiplied by itself n times.) Hence, the probability of obtaining at least one success is the probability of all possible outcomes (this probability is 1) minus the probability of total failure, or (255/256)ⁿ. Hence, the probability of at least one success is 1 − (255/256)ⁿ. To satisfy the 95 percent confidence level, we must put this expression equal to 0.95 (the equivalent of 95 percent). Therefore,
1 − (255/256)ⁿ = 0.95
Solving this equation for n (that is, n = log 0.05 / log (255/256)) gives us a value of about 765, the number of progeny needed to virtually guarantee success. Notice how different this number is from the naïve expectation of success in 256 progeny. This type of calculation is useful in many applications in genetics and in other situations in which a successful outcome is needed from many trials.
How many distinct genotypes will a cross produce?
The rules of probability can be easily used to predict the number of genotypes or phenotypes in the progeny of complex parental strains. (Such calculations are used routinely in research, in progeny analysis, and in strain building.) For example, in a self of the “tetrahybrid” A/a ; B/b ; C/c ; D/d, there will be three genotypes for each gene pair; for example, for the first gene pair, the three genotypes will be A/a, A/A, and a/a. Because there are four gene pairs in total, there will be 3⁴ = 81 different genotypes. In a testcross of such a tetrahybrid, there will be two genotypes for each gene pair (for example, A/a and a/a) and a total of 2⁴ = 16 genotypes in the progeny. Because we are assuming that all the genes are on different chromosomes, all these testcross genotypes will occur at an equal frequency of 1/16.
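The five-gene calculation and the sample-size question above can be sketched in a few lines. This is an illustrative sketch only: the function names p_at_least_one and min_sample_size are invented for the example, and rounding up to a whole plant is an added assumption. The equation has the non-integer solution n ≈ 765.4, so the first whole number of progeny that reaches 95 percent is 766, one more than the rounded value quoted in the text.

```python
import math
from fractions import Fraction

# Probability of the recessive homozygote from each single-gene cross.
per_gene = {
    "A/a x A/a -> a/a": Fraction(1, 4),
    "b/b x B/b -> b/b": Fraction(1, 2),
    "C/c x C/c -> c/c": Fraction(1, 4),
    "D/d x d/d -> d/d": Fraction(1, 2),
    "E/e x E/e -> e/e": Fraction(1, 4),
}

# Product rule across independently assorting gene pairs.
p_success = math.prod(per_gene.values())

def p_at_least_one(n, p):
    """Probability of at least one success among n progeny: 1 - (1 - p)^n."""
    return 1 - (1 - float(p)) ** n

def min_sample_size(p, confidence):
    """Smallest whole n with p_at_least_one(n, p) >= confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - float(p)))

print("p(success) =", p_success)
print("chance of no success in 256 progeny:", round(1 - p_at_least_one(256, p_success), 3))
print("progeny needed for 95 percent confidence:", min_sample_size(p_success, 0.95))
```

The same two functions apply to any per-individual success probability, so they can be reused for other genotypes or confidence levels.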
Using the chi-square test on monohybrid and dihybrid ratios
In genetics generally, a researcher is often confronted with results that are close to an expected ratio but not identical to it. Such ratios can come from monohybrids, dihybrids, or more complex genotypes, with or without independent assortment. But how close to an expected result is close enough? A statistical test is needed to check such numbers against expectations, and the chi-square test, or χ² test, fulfills this role. In which experimental situations is the χ² test generally applicable? The general situation is one in which observed results are compared with those predicted by a hypothesis. In a simple genetic example, suppose you have bred a plant that you hypothesize on the basis of a preceding analysis to be a heterozygote, A/a. To test this hypothesis, you cross this heterozygote with a tester of genotype a/a and count the numbers of phenotypes with genotypes A/− and a/a in the progeny. Then, you must assess whether the numbers that you obtain constitute the expected 1 : 1 ratio. If there is a close match, then the hypothesis is deemed consistent with the result, whereas if there is a poor match, the hypothesis is rejected. As part of this process, a judgment has to be made about whether the observed numbers are close enough to those expected. Very close matches and blatant mismatches generally present no problem, but, inevitably, there are gray areas in which the match is not obvious. The χ² test is simply a way of quantifying the various deviations expected by chance if a hypothesis is true. Take the preceding simple hypothesis predicting a 1 : 1 ratio, for example. Even if the hypothesis were true, we can only rarely expect an exact 1 : 1 ratio. We can model this idea with a barrelful of equal numbers of red and white marbles.
If we blindly remove samples of 100 marbles, on the basis of chance we would expect samples to show small deviations such as 52 red : 48 white quite commonly and to show larger deviations such as 60 red : 40 white less commonly. Even 100 red marbles is a possible outcome, at a very low probability of (1/2)¹⁰⁰. However, if any result is possible at some level of probability even if the hypothesis is true, how can we ever reject a hypothesis? A general scientific convention is that a hypothesis will be rejected as false if there is a probability of less than 5 percent of observing a deviation from expectations at least as large as the one actually observed. The hypothesis might still be true, but we have to make a decision somewhere, and 5 percent is the conventional decision line. The implication is that, although results this far from expectations are expected 5 percent of the time even when the hypothesis is true, we will mistakenly reject the hypothesis in only 5 percent of cases and we are willing to take this chance of error. (This 5 percent is the converse of the 95 percent confidence level used earlier.)
Let’s look at some real data. We will test our earlier hypothesis that a plant is a heterozygote. We will let A stand for red petals and a stand for white. Scientists test a hypothesis by making predictions based on the hypothesis. In the present situation, one possibility is to predict the results of a testcross. Assume that we testcross the presumed heterozygote. On the basis of the hypothesis, Mendel’s law of equal segregation predicts that we should have 50 percent A/a and 50 percent a/a. Assume that, in reality, we obtain 120 progeny and find that 55 are red and 65 are white. These numbers differ from the precise expectations, which would have been 60 red and 60 white. The result seems a bit far off the expected ratio, which raises uncertainty; so we need to use the χ² test.
We calculate χ² by using the following formula:
χ² = Σ (O − E)²/E for all classes
in which E is the expected number in a class, O is the observed number in a class, and Σ means “sum of.” The resulting value, χ², will provide a numerical value
that estimates the degree of agreement between the expected (hypothesized) and observed (actual) results, with the number growing larger as the agreement decreases. The calculation is most simply performed by using a table:

Class    O     E     (O − E)²    (O − E)²/E
Red      55    60    25          25/60 = 0.42
White    65    60    25          25/60 = 0.42
                                 Total = χ² = 0.84

Now we must look up this χ² value in Table 3-1, which will give us the probability value that we want. The rows in Table 3-1 list different values of degrees of freedom (df). The number of degrees of freedom is the number of independent variables in the data. In the present context, the number of independent variables is simply the number of phenotypic classes minus 1. In this case, df = 2 − 1 = 1. So we look only at the 1 df line. We see that our χ² value of 0.84 lies somewhere between the columns marked 0.5 and 0.1; in other words, between 50 percent and 10 percent. This probability value is much greater than the cutoff value of 5 percent, and so we accept the observed results as being compatible with the hypothesis.
Some important notes on the application of this test follow:
1. What does the probability value actually mean? It is the probability of observing a deviation from the expected results at least as large (not exactly this deviation) on the basis of chance if the hypothesis is correct.
2. The fact that our results have “passed” the chi-square test because p > 0.05 does not mean that the hypothesis is true; it merely means that the results are compatible with that hypothesis. However, if we had obtained a value of p < 0.05, we would have been forced to reject the hypothesis. Science is all about falsifiable hypotheses, not “truth.”
Table 3-1 Critical Values of the χ² Distribution

                                     P
df    0.995   0.975   0.9     0.5      0.1      0.05     0.025    0.01     0.005
 1    0.000   0.000   0.016   0.455    2.706    3.841    5.024    6.635    7.879
 2    0.010   0.051   0.211   1.386    4.605    5.991    7.378    9.210   10.597
 3    0.072   0.216   0.584   2.366    6.251    7.815    9.348   11.345   12.838
 4    0.207   0.484   1.064   3.357    7.779    9.488   11.143   13.277   14.860
 5    0.412   0.831   1.610   4.351    9.236   11.070   12.832   15.086   16.750
 6    0.676   1.237   2.204   5.348   10.645   12.592   14.449   16.812   18.548
 7    0.989   1.690   2.833   6.346   12.017   14.067   16.013   18.475   20.278
 8    1.344   2.180   3.490   7.344   13.362   15.507   17.535   20.090   21.955
 9    1.735   2.700   4.168   8.343   14.684   16.919   19.023   21.666   23.589
10    2.156   3.247   4.865   9.342   15.987   18.307   20.483   23.209   25.188
11    2.603   3.816   5.578  10.341   17.275   19.675   21.920   24.725   26.757
12    3.074   4.404   6.304  11.340   18.549   21.026   23.337   26.217   28.300
13    3.565   5.009   7.042  12.340   19.812   22.362   24.736   27.688   29.819
14    4.075   5.629   7.790  13.339   21.064   23.685   26.119   29.141   31.319
15    4.601   6.262   8.547  14.339   22.307   24.996   27.488   30.578   32.801
3. We must be careful about the wording of the hypothesis because tacit assumptions are often buried within it. The present hypothesis is a case in point; if we were to state it carefully, we would have to say that the “individual under test is a heterozygote A/a, these alleles show equal segregation at meiosis, and the A/a and a/a progeny are of equal viability.” We will investigate allele effects on viability in Chapter 6, but, for the time being, we must keep them in mind as a possible complication because differences in survival would affect the sizes of the various classes. The problem is that, if we reject a hypothesis that has hidden components, we do not know which of the components we are rejecting. For example, in the present case, if we were forced to reject the hypothesis as a result of the χ² test, we would not know if we were rejecting equal segregation or equal viability or both.
4. The outcome of the χ² test depends heavily on sample sizes (numbers in the classes). Hence, the test must use actual numbers, not proportions or percentages. Additionally, the larger the samples, the more powerful is the test.
Any of the familiar Mendelian ratios considered in this chapter or in Chapter 2 can be tested by using the χ² test, for example, 3 : 1 (1 df), 1 : 2 : 1 (2 df), 9 : 3 : 3 : 1 (3 df), and 1 : 1 : 1 : 1 (3 df). We will return to more applications of the χ² test in Chapter 4.
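The χ² calculation for the red and white testcross progeny can be written as a short sketch. The function name chi_square and the dictionary CRITICAL_05 are illustrative; the critical values are copied from the 0.05 column of Table 3-1. The p value line relies on a standard identity that holds only for 1 degree of freedom, so it is not a general replacement for the table. Without intermediate rounding the statistic is about 0.83 rather than the 0.84 obtained by summing the rounded entries.

```python
import math

# 5 percent critical values from the 0.05 column of Table 3-1 (df 1 to 3).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}

def chi_square(observed, ratio):
    """Return (chi-square statistic, degrees of freedom) for observed counts
    tested against an expected ratio such as [1, 1] or [9, 3, 3, 1]."""
    total = sum(observed)
    expected = [total * part / sum(ratio) for part in ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1

# 55 red and 65 white progeny tested against a 1 : 1 expectation.
statistic, df = chi_square([55, 65], [1, 1])
reject = statistic > CRITICAL_05[df]

# For df = 1 only, the probability of a deviation at least this large
# is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

print(f"chi-square = {statistic:.2f}, df = {df}, reject at 5 percent: {reject}")
print(f"p = {p_value:.2f}")
```

Because the function takes counts rather than percentages, it respects the point made in note 4 that the test must use actual numbers.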
Synthesizing pure lines
Pure lines are among the essential tools of genetics. For one thing, only these fully homozygous lines will express recessive alleles, but the main need for pure lines is in the maintenance of stocks for research. The members of a pure line can be left to interbreed over time and thereby act as a constant source of the genotype for use in experiments. Hence, for most model organisms, there are international stock centers that are repositories of pure lines for use in research. Similar stock centers provide lines of plants and animals for use in agriculture.
Pure lines of plants or animals are made through repeated generations of selfing. (In animals, selfing is accomplished by mating animals of identical genotype.) Selfing a monohybrid plant shows the principle at work. Suppose we start with a population of individuals that are all A/a and allow them to self. We can apply Mendel’s first law to predict that, in the next generation, there will be 1/4 A/A, 1/2 A/a, and 1/4 a/a. Note that the heterozygosity (the proportion of heterozygotes) has halved, from 1 to 1/2. If we repeat this process of selfing for another generation, all descendants of homozygotes will be homozygous, but, again, the heterozygotes will halve their proportion to a quarter. The process is shown in the following display:

Start:                 All A/a
After one selfing:     1/4 A/A              1/2 A/a    1/4 a/a
After two selfings:    1/4 A/A + 1/8 A/A    1/4 A/a    1/8 a/a + 1/4 a/a
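The halving of heterozygosity can be followed generation by generation with exact fractions. This is a minimal sketch for one gene pair, assuming that every plant selfs and that all genotypes are equally viable; the function name self_generation is illustrative.

```python
from fractions import Fraction

def self_generation(freqs):
    """Apply one round of selfing to the frequencies (A/A, A/a, a/a).

    Homozygotes breed true; each heterozygote yields
    1/4 A/A, 1/2 A/a, and 1/4 a/a (Mendel's first law).
    """
    hom_dom, het, hom_rec = freqs
    return (hom_dom + het / 4, het / 2, hom_rec + het / 4)

# Start with a population that is entirely A/a.
freqs = (Fraction(0), Fraction(1), Fraction(0))
history = [freqs]
for _ in range(8):
    freqs = self_generation(freqs)
    history.append(freqs)

for generation, (hom_dom, het, hom_rec) in enumerate(history):
    print(generation, "A/A:", hom_dom, "A/a:", het, "a/a:", hom_rec)
```

Each row sums to 1, and the middle column shrinks by half at every step while the two homozygous classes grow equally.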
After, say, eight generations of selfing, the proportion of heterozygotes is reduced to (1/2)⁸, which is 1/256, or about 0.4 percent. Let’s look at this process in a slightly different way: we will assume that we start such a program with a genotype that is heterozygous at 256 gene pairs. If we also assume independent assortment, then, after selfing for eight generations, we would end up with an array of genotypes, each having on average only one heterozygous gene (that is, 1/256). In other words, we are well on our way to creating a number of pure lines.
Let us apply this principle to the selection of agricultural lines, the topic with which we began the chapter. We can use as our example the selection of Marquis
wheat by Charles Saunders in the early part of the twentieth century. Saunders’s goal was to develop a productive wheat line that would have a shorter growing season and hence open up large areas of terrain in northern countries such as Canada and Russia for growing wheat, another of the world’s staple foods. He crossed a line having excellent grain quality called Red Fife with a line called Hard Red Calcutta, which, although its yield and quality were poor, matured 20 days earlier than Red Fife. The F1 produced by the cross was presumably heterozygous for multiple genes controlling the wheat qualities. From this F1, Saunders made selfings and selections that eventually led to a pure line that had the combination of favorable properties needed: good-quality grain and early maturation. This line was called Marquis. It was rapidly adopted in many parts of the world.
A similar approach can be applied to the rice lines with which we began the chapter. All the single-gene mutations are crossed in pairs, and then their F1 plants are selfed or intercrossed with other F1 plants. As a demonstration, let’s consider just four mutations, 1 through 4. A breeding program might be as follows, in which the mutant alleles and their wild-type counterparts are always listed in the same order (recall that the + sign designates wild type):

First cross:   1/1 ; +/+ ; +/+ ; +/+  ×  +/+ ; 2/2 ; +/+ ; +/+
  F1:          1/+ ; 2/+ ; +/+ ; +/+
  Self, then select the homozygote 1/1 ; 2/2 ; +/+ ; +/+

Second cross:  +/+ ; +/+ ; 3/3 ; +/+  ×  +/+ ; +/+ ; +/+ ; 4/4
  F1:          +/+ ; +/+ ; 3/+ ; 4/+
  Self, then select the homozygote +/+ ; +/+ ; 3/3 ; 4/4

Cross these homozygotes
  F1:          1/+ ; 2/+ ; 3/+ ; 4/+
  Self, then select the homozygote 1/1 ; 2/2 ; 3/3 ; 4/4

This type of breeding has been applied to many other crop species. The colorful and diverse pure lines of tomatoes used in commerce are shown in Figure 3-5.
Note that, in general, when a multiple heterozygote is selfed, a range of different homozygotes is produced. For example, from A/a ; B/b ; C/c, there are two homozygotes for each gene pair (that is, for the first gene, the homozygotes are A/A and a/a), and so there are 2³ = 8 different homozygotes possible: A/A ; b/b ; C/C, and a/a ; B/B ; c/c, and so on. Each distinct homozygote can be the start of a new pure line.
Key Concept  Repeated selfing leads to an increased proportion of homozygotes, a process that can be used to create pure lines for research or other applications.

[Figure 3-5: Representatives of many tomato lines]
Figure 3-5 Tomato breeding has resulted in a wide range of lines of different genotypes and phenotypes. [© Mascarucci/Corbis.]

Hybrid vigor
We have been considering the synthesis of superior pure lines for research and for agriculture. Pure lines are convenient in that propagation of the genotype from year to year is fairly easy. However, a large proportion of commercial seed that farmers (and gardeners) use is called hybrid seed. Curiously, in many cases in which two disparate lines of plants (and animals) are united in an F1 hybrid (presumed heterozygote), the hybrid shows greater size and vigor than do the two contributing lines (Figure 3-6). This general superiority of multiple heterozygotes is called hybrid vigor. The molecular reasons for hybrid vigor are mostly unknown and still hotly debated, but the phenomenon is undeniable and has made large contributions to agriculture.

[Figure 3-6: Hybrid vigor in corn]
Figure 3-6 Multiple heterozygous hybrid flanked by the two pure lines crossed to make it. (a) The plants. (b) Cobs from the same plants. [(a) Photo courtesy of Jun Cao, Schnable Laboratory, Iowa State University; (b) Deana Namuth-Covert, PhD, Univ. of Nebraska.]

A negative aspect of using hybrids is that, every season, the two parental lines must be grown separately and then intercrossed to make hybrid seed for sale. This process is much more inconvenient than maintaining pure lines, which requires only letting plants self; consequently, hybrid seed is more expensive than seed from pure lines.
From the user’s perspective, there is another negative aspect of using hybrids. After a hybrid plant has grown and produced its crop for sale, it is not realistic to keep some of the seeds that it produces and expect this seed to be equally vigorous the next year. The reason is that, when the hybrid undergoes meiosis, independent assortment of the various mixed gene pairs will form many different allelic combinations, and very few of these combinations will be that of the original hybrid. For example, the earlier described tetrahybrid, when selfed, produces 81 different genotypes, of which only a minority will be tetrahybrid. If we assume independent assortment, then, for each gene pair, selfing will produce one-half heterozygotes (A/a → 1/4 A/A, 1/2 A/a, and 1/4 a/a). Because there are four gene pairs in this tetrahybrid, the proportion of progeny that will be like the original hybrid A/a ; B/b ; C/c ; D/d will be (1/2)⁴ = 1/16.
Key Concept  Some hybrids between genetically different lines show hybrid vigor. However, gene assortment when the hybrid undergoes meiosis breaks up the favorable allelic combination, and thus few members of the next generation have it.
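The breakup of the tetrahybrid on selfing can be enumerated by multiplying per-gene outcomes, in the manner of a branch diagram. This is a sketch under the chapter’s assumption that all four gene pairs assort independently; the helper name self_outcomes and the string labels are illustrative.

```python
from fractions import Fraction
from itertools import product

def self_outcomes(gene):
    """Genotypes and frequencies from selfing a heterozygote for one gene."""
    dom, rec = gene.upper(), gene.lower()
    return [
        (f"{dom}/{dom}", Fraction(1, 4)),
        (f"{dom}/{rec}", Fraction(1, 2)),
        (f"{rec}/{rec}", Fraction(1, 4)),
    ]

# Multiply along every path of the four-gene branch diagram (product rule).
progeny = {}
for combo in product(*(self_outcomes(gene) for gene in "ABCD")):
    genotype = " ; ".join(name for name, _ in combo)
    frequency = Fraction(1)
    for _, p in combo:
        frequency *= p
    progeny[genotype] = frequency

hybrid = "A/a ; B/b ; C/c ; D/d"
# A pure-line candidate is homozygous at every gene pair, e.g. "A/A" or "a/a".
homozygotes = [g for g in progeny
               if all(pair[0] == pair[2] for pair in g.split(" ; "))]

print("distinct genotypes:", len(progeny))
print("frequency of the original hybrid:", progeny[hybrid])
print("distinct full homozygotes:", len(homozygotes))
```

The same enumeration also shows why saved seed is disappointing: the original hybrid genotype is only one of many classes, and its share falls by half with each additional heterozygous gene pair.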
3.3 The Chromosomal Basis of Independent Assortment
Like equal segregation, the independent assortment of gene pairs on different chromosomes is explained by the behavior of chromosomes during meiosis. Consider a chromosome that we might call number 1; its two homologs could be named 1′ and 1″. If the chromosomes align on the equator, then 1′ might go “north” and 1″ “south,” or vice versa. Similarly, for a chromosome 2 with homologs 2′ and 2″, 2′ might go north and 2″ south, or vice versa. Hence, chromosome 1′ could end up packaged with either chromosome 2′ or 2″, depending on which chromosomes were pulled in the same direction.
Independent assortment is not easy to demonstrate by observing segregating chromosomes under the microscope because homologs such as 1′ and 1″ do not usually look different, although they might carry minor sequence variation. However, independent assortment can be observed in certain specialized cases. One case was instrumental in the historical development of the chromosome theory.
In 1913, Elinor Carothers found an unusual chromosomal situation in a certain species of grasshopper, a situation that permitted a direct test of whether different chromosome pairs do indeed segregate independently. Studying meioses in the testes of grasshoppers, she found a grasshopper in which one chromosome “pair” had nonidentical members. Such a pair is called a heteromorphic pair; presumably, the chromosomes show only partial homology. In addition, the same grasshopper had another chromosome (unrelated to the heteromorphic pair) that had no pairing partner at all. Carothers was able to use these unusual chromosomes as visible cytological markers of the behavior of chromosomes during meiosis. She visually screened many meioses and found that there were two distinct patterns, which are shown in Figure 3-7. In addition, she found that the two patterns were equally frequent.
To summarize, if we hold the segregation of the heteromorphic pair constant (brown in the figure), then the unpaired (purple) chromosome can go to either pole equally frequently, half the time with the long form and half the time with the short form. In other words, the purple and brown sets were segregating independently. Although these are obviously not typical chromosomes, the results do strongly suggest that different chromosomes assort independently at the first division of meiosis.
Independent assortment in diploid organisms
The chromosomal basis of the law of independent assortment is formally diagrammed in Figure 3-8, which illustrates how the separate behavior of two different chromosome pairs gives rise to the 1 : 1 : 1 : 1 Mendelian ratios of gametic types expected from independent assortment. The hypothetical cell has four chromosomes: a pair of homologous long chromosomes (yellow) and a pair of homologous short ones (blue). The genotype of the meiocytes is A/a ; B/b, and the two allelic pairs, A/a and B/b, are shown on two different chromosome pairs. Parts 4 and 4′ of Figure 3-8 show the key step in independent assortment: there are two equally frequent allelic segregation patterns, one shown in 4 and the other in 4′. In one case, the A/A and B/B alleles are pulled together into one cell, and the a/a and b/b are pulled into the other cell. In the other case, the alleles A/A and b/b are united in the same cell and the alleles a/a and B/B also are united in the same cell. The two patterns result from two equally frequent spindle attachments to the centromeres in the first anaphase. Meiosis then produces four cells of the indicated genotypes from each of these segregation patterns. Because segregation patterns 4 and 4′ are equally common, the meiotic product cells of genotypes A ; B, a ; b, A ; b, and a ; B are produced in equal frequencies. In other words, the frequency of each of the four genotypes is 1/4. This gametic distribution is that postulated by Mendel for a dihybrid, and it is the one that we inserted along one edge of the Punnett square in Figure 3-4. The random fusion of these gametes results in the 9 : 3 : 3 : 1 F2 phenotypic ratio.

Figure 3-7 Different chromosomes segregate independently. Carothers observed these two equally frequent patterns by which a heteromorphic pair (brown) and an unpaired chromosome (purple) move into gametes at meiosis.

Figure 3-8 Independent assortment of chromosomes at meiosis explains Mendel’s ratio. Meiosis in a diploid cell of genotype A/a ; B/b. The diagram shows how the segregation and assortment of different chromosome pairs give rise to the 1 : 1 : 1 : 1 Mendelian gametic ratio. Stage labels, in order: Interphase. Chromosomes are unpaired. Prophase. Chromosomes and centromeres have replicated, but centromeres have not split. Prophase. Homologs synapse. Anaphase. Centromeres attach to spindle and are pulled to poles of cell (part 4′ shows the other, equally frequent, alignment). Telophase. Two cells form. Second anaphase. New spindles form, and centromeres finally divide. End of meiosis. Four cells produced from each meiosis; each of the four product genotypes has a frequency of 1/4. ANIMATED ART: Meiotic recombination between unlinked genes by independent assortment
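The gametic frequencies of 1/4 and the resulting 9 : 3 : 3 : 1 ratio can be checked by brute-force enumeration. The following is a minimal sketch, not a method from the text: the function names are illustrative, and it assumes that uppercase alleles are fully dominant and that every listed gene assorts independently.

```python
from fractions import Fraction
from itertools import product

def gametes(genotype):
    """Gamete types and frequencies for a genotype given as allele pairs,
    one pair per independently assorting gene (an assumption of this sketch)."""
    combos = list(product(*genotype))
    freq = Fraction(1, len(combos))
    result = {}
    for combo in combos:
        result[combo] = result.get(combo, 0) + freq
    return result

def f2_phenotypes(genotype):
    """Phenotypic classes from random fusion of gametes in a self.
    Uppercase alleles are treated as fully dominant (assumption)."""
    g = gametes(genotype)
    classes = {}
    for (egg, p_egg), (sperm, p_sperm) in product(g.items(), repeat=2):
        phenotype = tuple(
            "dominant" if (a.isupper() or b.isupper()) else "recessive"
            for a, b in zip(egg, sperm)
        )
        classes[phenotype] = classes.get(phenotype, 0) + p_egg * p_sperm
    return classes

dihybrid = [("A", "a"), ("B", "b")]
for gamete, freq in gametes(dihybrid).items():
    print(" ; ".join(gamete), freq)
for phenotype, freq in f2_phenotypes(dihybrid).items():
    print(phenotype, freq)
```

Each of the four gamete types comes out at 1/4, and the four F2 phenotypic classes come out at 9/16, 3/16, 3/16, and 1/16.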
Independent assortment in haploid organisms
In the ascomycete fungi, we can actually inspect the products of a single meiocyte to show independent assortment directly. Let’s use the filamentous fungus Neurospora crassa to illustrate this point (see the Model Organism box on Neurospora). As we have seen from earlier fungal examples, a cross in Neurospora is made by mixing two parental haploid strains of opposite mating type. In a manner similar to that of yeast, mating type is determined by two “alleles” of one gene, called MAT-A and MAT-a in this species. The way in which a cross is made is shown in Figure 3-9.

The products of meiosis in fungi are sexual spores. Recall that the ascomycetes (which include Neurospora and Saccharomyces) are unique in that, for any given meiocyte, the spores are held together in a membranous sac called an ascus. Thus, for these organisms, the products of a single meiosis can be recovered and tested. In the orange bread mold Neurospora, the nuclear spindles of meioses I and II do not overlap within the cigar-shaped ascus, and so the four products of a single meiocyte lie in a straight row (Figure 3-10a). Furthermore, for some reason not understood, there is a postmeiotic mitosis, which also shows no spindle overlap. Hence, meiosis and the extra mitosis result in a linear ascus containing eight ascospores, or an octad. In a heterozygous meiocyte A/a, if there are no crossovers between the gene and its centromere (see Chapter 4), then there will be two adjacent blocks of ascospores, four of A and four of a (Figure 3-10b).

Now we can examine a dihybrid. Let’s make a cross between two distinct mutants having mutations in different genes on different chromosomes. By assuming that the loci of the mutated genes are both very close to their respective centromeres, we avoid complications due to crossing over between the loci and the centromeres. The first mutant is albino (a), contrasting with the normal pink wild type (a+). The second mutant is biscuit (b), which has a very compact colony shaped like a biscuit in contrast with the flat, spreading colony of wild type (b+). We will assume that the two mutants are of opposite mating type. Hence, the cross is

    a ; b+  ×  a+ ; b

Because of random spindle attachment, the following two octad types will be equally frequent:

    a+ ; b      a ; b
    a+ ; b      a ; b
    a+ ; b      a ; b
    a+ ; b      a ; b
    a ; b+      a+ ; b+
    a ; b+      a+ ; b+
    a ; b+      a+ ; b+
    a ; b+      a+ ; b+
    50%         50%

The equal frequency of these two types is a convincing demonstration of independent assortment occurring in individual meiocytes.

Figure 3-9 Stages of a Neurospora cross. The life cycle of Neurospora crassa, the orange bread mold. Self-fertilization is not possible in this species: there are two mating types, determined by the alleles A and a of one gene, and either can act as “female.” An asexual spore from the opposite mating type fuses with a receptive hair, and a nucleus from the asexual spore travels down the hair to pair with a female nucleus in the knot of cells. The A and a pair then undergo synchronous mitoses, finally fusing to form diploid meiocytes. Diagram labels: maternal nucleus (mating type A); maternal nucleus (mating type a); cross-fertilization; synchronous division and fusion to form diploid meiocytes; meiosis; asci; sexual spores grow to adults.
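The two equally frequent octad types can be reproduced with a small simulation. This is an illustrative sketch only: the function names and the seed are assumptions, and the model assumes no crossovers between either gene and its centromere, as the text does.

```python
import random
from collections import Counter

def octad(rng):
    """One linear octad from the cross a ; b+  x  a+ ; b, assuming no
    crossovers between either gene and its centromere."""
    a_poles = ["a", "a+"]   # albino-gene alleles going to the two poles at meiosis I
    b_poles = ["b+", "b"]   # biscuit-gene alleles, in the parental orientation
    if rng.random() < 0.5:  # random spindle attachment: the other alignment
        b_poles.reverse()
    # Each meiosis I pole gives four identical spores
    # (meiosis II followed by the postmeiotic mitosis).
    return tuple(f"{a} ; {b}" for a, b in zip(a_poles, b_poles) for _ in range(4))

def octad_type(spores):
    """Label an octad by whether its spores carry parental or new combinations."""
    return "parental" if set(spores) == {"a ; b+", "a+ ; b"} else "recombinant"

rng = random.Random(0)  # fixed seed so that the run is repeatable
counts = Counter(octad_type(octad(rng)) for _ in range(10_000))
print(counts)
```

With many simulated meiocytes, the two octad types each approach 50 percent, which is the pattern the text describes for independent assortment.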
Independent assortment of combinations of autosomal and X-linked genes
Figure 3-10 The linear meiosis of Neurospora. Neurospora is an ideal model system for studying allelic segregation at meiosis. (a) Nuclear divisions: the four products of meiosis (tetrad) undergo mitosis to produce an octad. The products are contained within an ascus. Stages shown: 2n meiocyte; first meiotic division; second meiotic division; four meiotic product nuclei (tetrad); postmeiotic mitotic division; octad of four spore pairs; development of sexual spores (ascospores) around nuclei. (b) Allele segregation: an A/a meiocyte undergoes meiosis followed by mitosis, resulting in equal numbers of A and a products and demonstrating the principle of equal segregation. Stages shown: meiocyte after chromatid formation; first meiotic division; second meiotic division; tetrad; mitosis; octad.

The principle of independent assortment is also useful in analyzing genotypes that are heterozygous for both autosomal and X-linked genes. The autosomes and the sex chromosomes are moved independently by spindle fibers attached randomly to their centromeres, just as with two different pairs of autosomes. Some interesting dihybrid ratios are produced. Let’s look at an example from Drosophila. The cross is between a female with vestigial wings (autosomal recessive, vg) and a male with white eyes (X-linked recessive, w). Symbolically, the cross is

    vg/vg ; +/+ (female)  ×  +/+ ; w/Y (male)

The F1 will be:

    Males of genotype      +/vg ; +/Y
    Females of genotype    +/vg ; +/w

These F1 flies must be interbred to obtain an F2. Because the cross is a monohybrid cross for the autosomal vestigial gene, both sexes of the F2 will show

    3/4  +/− (wild type)
    1/4  vg/vg (vestigial)

For the X-linked white eye gene, the ratios will be as follows:

    Females   1/2 +/+ and 1/2 +/w (all wild type)
    Males     1/2 +/Y (wild type) and 1/2 w/Y (white)

If the autosomal and X-linked genes are combined, the F2 phenotypic ratios will be

    Females   3/4 fully wild type
              1/4 vestigial
    Males     3/8 fully wild type (3/4 × 1/2)
              3/8 white (3/4 × 1/2)
              1/8 vestigial (1/4 × 1/2)
              1/8 vestigial, white (1/4 × 1/2)
Hence, we see a progeny ratio that reveals clear elements of both autosomal and X-linked inheritance.
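The combined F2 ratios follow from multiplying the autosomal and X-linked probabilities, which can be verified by enumerating the F1 gametes. The sketch below is illustrative only; the constant and function names are assumptions and are not taken from the text.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
AUTOSOMAL = {"+": HALF, "vg": HALF}   # gametes of either +/vg F1 parent
EGG_X = {"+": HALF, "w": HALF}        # the F1 female is +/w
SPERM_SEX = {"+": HALF, "Y": HALF}    # the F1 male is +/Y

def f2_ratios_by_sex():
    """F2 phenotypic ratios within each sex for the F1 interbreeding."""
    joint = {}
    for (egg_a, p1), (sperm_a, p2), (egg_x, p3), (sperm_s, p4) in product(
        AUTOSOMAL.items(), AUTOSOMAL.items(), EGG_X.items(), SPERM_SEX.items()
    ):
        sex = "male" if sperm_s == "Y" else "female"
        traits = []
        if egg_a == "vg" and sperm_a == "vg":
            traits.append("vestigial")       # autosomal recessive needs vg/vg
        if sex == "male" and egg_x == "w":
            traits.append("white")           # a male's single X carries w
        phenotype = ", ".join(traits) or "fully wild type"
        key = (sex, phenotype)
        joint[key] = joint.get(key, 0) + p1 * p2 * p3 * p4
    # Each sex makes up half of the progeny, so divide by 1/2
    # to express the ratios within a sex.
    return {key: p / HALF for key, p in joint.items()}

for (sex, phenotype), ratio in sorted(f2_ratios_by_sex().items()):
    print(sex, phenotype, ratio)
```

The enumeration gives 3/4 and 1/4 for females and 3/8, 3/8, 1/8, and 1/8 for males, matching the ratios listed above.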
Recombination
The independent assortment of genes at meiosis is one of the main ways by which an organism produces new combinations of alleles. The production of new allele combinations is formally called recombination.
Model Organism
Neurospora
Neurospora crassa was one of the first eukaryotic microbes to be adopted by geneticists as a model organism. It is a haploid fungus (n = 7) found growing on dead vegetation in many parts of the world. When an asexual spore (haploid) germinates, it produces a tubular structure that extends rapidly by tip growth and throws off multiple side branches. The result is a mass of branched threads (called hyphae), which constitute a colony. Hyphae have no crosswalls, and so a colony is essentially one cell containing many haploid nuclei. A colony buds off millions of asexual spores, which can disperse and repeat the asexual cycle. Asexual colonies are easily and inexpensively maintained in the laboratory on a defined medium of inorganic salts plus an energy source such as sugar. (An inert gel such as agar is added to provide a firm surface.) The fact that Neurospora can chemically synthesize all its essential molecules from such a simple medium led biochemical geneticists (beginning with George Beadle and Edward Tatum; see Chapter 6) to choose it for studies of synthetic pathways. Geneticists worked out the steps in these pathways by introducing mutations and observing their effects. The haploid state of Neurospora is ideal for such mutational analysis because mutant alleles are always expressed directly in the phenotype. Neurospora has two mating types, MAT-A and MAT-a, which can be regarded as simple “sexes.” When colonies of different mating type come into contact, their cell walls and nuclei fuse, resulting in many transient diploid nuclei, each of which undergoes meiosis. The four haploid products of one meiosis stay together in a sac called an ascus. Each of these products of meiosis undergoes a further mitotic division, resulting in eight ascospores within each ascus. Ascospores germinate and produce colonies exactly like those produced by asexual spores. 
Hence, such ascomycete fungi are ideal for the study of the segregation and recombination of genes in individual meioses.
The fungus Neurospora crassa. (a) Orange colonies of Neurospora growing on sugarcane. In nature, Neurospora colonies are most often found after fire, which activates dormant ascospores. (Fields of sugarcane are burned to remove foliage before harvesting the cane stalks.) (b) Developing Neurospora octads from a cross of wild type to a strain carrying an engineered allele of jellyfish green fluorescent protein fused to histone. The octads show the expected 4 : 4 Mendelian segregation of fluorescence. In some spores, the nucleus has divided mitotically to form two; eventually, each spore will contain several nuclei. [ (a) David J. Jacobson, Ph.D.; (b) Namboori B. Raju, Stanford University.]
There is general agreement that the evolutionary advantage of producing new combinations of alleles is that it provides variation as the raw material for natural selection. Recombination is a crucial principle in genetics, partly because of its relevance to evolution but also because of its use in genetic analysis. It is particularly useful for analyzing inheritance patterns of multigene genotypes. In this section, we define recombination in such a way that we would recognize it in experimental results, and we lay out the way in which recombination is analyzed and interpreted. Recombination is observed in a variety of biological situations, but, for the present, we define it in relation to meiosis. Meiotic recombination is any meiotic process that generates a haploid product with new combinations of the alleles carried by the haploid genotypes that united to form the meiocyte.
This seemingly wordy definition is actually quite simple; it makes the important point that we detect recombination by comparing the inputs into meiosis with the outputs (Figure 3-11). The inputs are the two haploid genotypes that combine to form the meiocyte, the diploid cell that undergoes meiosis. For humans, the inputs are the parental egg and sperm. They unite to form a diploid zygote, which divides to yield all the body cells, including the meiocytes that are set aside within the gonads. The output genotypes are the haploid products of meiosis. In humans, these haploid products are a person’s own eggs or sperm. Any meiotic product that has a new combination of the alleles provided by the two input genotypes is by definition a recombinant.

Key Concept Meiosis generates recombinants, which are haploid meiotic products with new combinations of the alleles carried by the haploid genotypes that united to form the meiocyte.
First, let us look at how recombinants are detected experimentally. The detection of recombinants in organisms with haploid life cycles such as fungi or algae is straightforward. The input and output types in haploid life cycles are the genotypes of individuals rather than gametes and may thus be inferred directly from phenotypes. Figure 3-11 can be viewed as summarizing the simple detection of recombinants in organisms with haploid life cycles. Detecting recombinants in organisms with diploid life cycles is trickier. The input and output types in diploid cycles are gametes. Thus, we must know the genotypes of both input and output gametes to detect recombinants in an organism with a diploid cycle. Though we cannot detect the genotypes of input or output gametes directly, we can infer these genotypes by using the appropriate techniques:

• To know the input gametes, we use pure-breeding diploid parents because they can produce only one gametic type.
• To detect recombinant output gametes, we testcross the diploid individual and observe its progeny (Figure 3-12).

A testcross offspring that arises from a recombinant product of meiosis also is called a recombinant. Notice, again, that the testcross allows us to concentrate on one meiosis and prevent ambiguity. From a self of the F1 in Figure 3-12, for example, a recombinant A/A·B/b offspring could not be distinguished from A/A·B/B without further crosses.

Figure 3-11 Recombinants are meiotic output different from meiotic input. Input: haploid (n) genotypes A·B and a·b, which unite to form the meiotic diploid (2n) A/a·B/b. Output of meiosis (n): A·B and a·b (parental, or input, types); A·b and a·B (recombinants). Recombinants (blue) are those products of meiosis with allele combinations different from those of the haploid cells that formed the meiotic diploid (yellow). Note that genes A/a and B/b are shown separated by a dot because they may be on the same chromosome or on different chromosomes.

Figure 3-12 In diploids, recombinants are best detected in a testcross. P (input): A/A·B/B × a/a·b/b (2n), producing gametes A·B and a·b (n). Meiotic diploid (F1): A/a·B/b (2n), crossed to the tester a/a·b/b. Output gametes of the F1 (n): A·B and a·b (parental type); A·b and a·B (recombinant). Each fuses with a tester gamete a·b at fertilization. Progeny (2n): A/a·B/b (parental type); a/a·b/b (parental type); A/a·b/b (recombinant); a/a·B/b (recombinant). Recombinant products of a diploid meiosis are most readily detected in a cross of a heterozygote and a recessive tester. Note that Figure 3-11 is repeated as part of this diagram.

A central part of recombination analysis is recombinant frequency. One reason for focusing on recombinant frequency is that its numerical value is a convenient test for whether two genes are on different chromosomes. Recombinants are produced by two different cellular processes: the independent assortment of genes on different chromosomes (this chapter) and crossing over between genes on the same chromosome (see Chapter 4). The proportion of recombinants is the key idea here because the diagnostic value can tell us whether genes are on different chromosomes. We will deal with independent assortment here. For genes on separate chromosomes, recombinants are produced by independent assortment, as shown in Figure 3-13. Again, we see the 1 : 1 : 1 : 1 ratio that we have seen before, but now the progeny of the testcross are classified as either
recombinant or resembling the P (parental) input types. Set up in this way, the proportion of recombinants is clearly 1/4 + 1/4 = 1/2, or 50 percent of the total progeny. Hence, we see that independent assortment at meiosis produces a recombinant frequency of 50 percent. If we observe a recombinant frequency of 50 percent in a testcross, we can infer that the two genes under study assort independently. The simplest and most likely interpretation of independent assortment is that the two genes are on separate chromosome pairs. (However, we must note that genes that are very far apart on the same chromosome pair can assort virtually independently and produce the same result; see Chapter 4.)

Key Concept A recombinant frequency of 50 percent indicates that the genes are independently assorting and are most likely on different chromosomes.

Figure 3-13 Independent assortment produces 50 percent recombinants. This diagram shows two chromosome pairs of a diploid organism with A and a on one pair and B and b on the other. The meiotic diploid (F1) is crossed to a tester; the four classes of testcross progeny (two parental types and two recombinants) each have a frequency of 1/4. Independent assortment produces a recombinant frequency of 50 percent. Note that we could represent the haploid situation by removing the parental (P) generations and the tester. ANIMATED ART: Meiotic recombination between unlinked genes by independent assortment
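Recombinant frequency is simply the share of testcross progeny whose genotype differs from both parental types. The sketch below shows the arithmetic; the progeny counts are hypothetical numbers invented for illustration and do not come from the text, and the function name is likewise an assumption.

```python
def recombinant_frequency(progeny_counts, parental_types):
    """Fraction of testcross progeny that are not of a parental type."""
    total = sum(progeny_counts.values())
    recombinants = sum(
        n for genotype, n in progeny_counts.items()
        if genotype not in parental_types
    )
    return recombinants / total

# Hypothetical testcross progeny counts (illustrative only).
counts = {
    "A/a ; B/b": 252,   # parental type
    "a/a ; b/b": 248,   # parental type
    "A/a ; b/b": 246,   # recombinant
    "a/a ; B/b": 254,   # recombinant
}
rf = recombinant_frequency(counts, {"A/a ; B/b", "a/a ; b/b"})
print(f"Recombinant frequency: {rf:.1%}")
```

A value close to 50 percent, as here, is what independent assortment predicts. Deciding whether an observed value is close enough to 50 percent would need a statistical test, which this sketch does not attempt.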
3.4 Polygenic Inheritance
So far, our analysis in this book has focused on single-gene differences, with the use of sharply contrasting phenotypes such as red versus white petals, smooth versus wrinkled seeds, and long- versus vestigial-winged Drosophila. However, much of the variation found in nature is continuous, in which a phenotype can take any measurable value between two extremes. Height, weight, and color intensity are examples of such metric, or quantitative, phenotypes. Typically, when the metric value is plotted against frequency in a natural population, the distribution curve is shaped like a bell (Figure 3-14). The bell shape is due to the fact that average values in the middle are the most common, whereas extreme values are rare. At first it is difficult to see how continuous distributions can be influenced by genes inherited in a Mendelian manner; after all, Mendelian analysis is facilitated by using clearly distinguishable categories. However, we shall see in this section that the independent assortment of several-to-many heterozygous genes affecting a continuous trait can produce a bell curve.

Of course, many cases of continuous variation have a purely environmental basis, little affected by genetics. For example, a population of genetically homozygous plants grown in a plot of ground often shows a bell-shaped curve for height, with the smaller plants around the edges of the plot and the larger plants in the middle. This variation can be explained only by environmental factors such as moisture and amount of fertilizer applied. However, many cases of continuous variation do have a genetic basis. Human skin color is an example: all degrees of skin darkness can be observed in populations from different parts of the world, and this variation clearly has a genetic component. In such cases, from several to many alleles interact with a more or less additive effect.
The interacting genes underlying hereditary continuous variation are called polygenes or quantitative trait loci (QTLs). (The term quantitative trait locus needs some definition: quantitative is more or less synonymous with continuous; trait is more or less synonymous with character or property; locus, which literally means place on a chromosome, is more or less synonymous with gene.) The polygenes, or QTLs, for the same trait are distributed throughout the genome; in many cases, they are on different chromosomes and show independent assortment. Let’s see how the inheritance of several heterozygous polygenes (even as few as two) can generate a bell-shaped distribution curve. We can consider a simple
model that was originally used to explain continuous variation in the degree of redness in wheat seeds. The work was done by Hermann Nilsson-Ehle in the early twentieth century. We will assume two independently assorting gene pairs R1/r1 and R2/r2. Both R1 and R2 contribute to wheat-seed redness. Each “dose” of an R allele of either gene is additive, meaning that it increases the degree of redness proportionately. An illustrative cross is a self of a dihybrid R1/r1 ; R2/r2. Both male and female gametes will show the genotypic proportions as follows:

    R1 ; R2    2 doses of redness
    R1 ; r2    1 dose of redness
    r1 ; R2    1 dose of redness
    r1 ; r2    0 doses of redness

Overall, in this gamete population, one-fourth have two doses, one-half have one dose, and one-fourth have zero doses. The union of male and female gametes both showing this array of R doses is illustrated in Figure 3-15. The number of doses in the progeny ranges from four (R1/R1 ; R2/R2) down to zero (r1/r1 ; r2/r2), with all values between. The proportions in the grid of Figure 3-15 can be drawn as a histogram, as shown in Figure 3-16. The shape of the histogram can be thought of as a scaffold that could be the underlying basis for a bell-shaped distribution curve. When this analysis of redness in wheat seeds was originally done, variation was found within the classes that allegedly represented one polygene “dose” level. Presumably, this variation within a class is the result of environmental differences. Hence, the environment can be seen to contribute in a way that rounds off the sharp shoulders of the histogram bars, resulting in a smooth bell-shaped curve (the red line in the histogram). If the number of polygenes is increased, the histogram more closely approximates a smooth continuous distribution. For example, for a characteristic determined by three polygenes, the histogram is as shown in Figure 3-17.

Figure 3-14 Continuous variation in a natural population. Axes: frequency against metric character (e.g., color intensity). In a population, a metric character such as color intensity can take on many values. Hence, the distribution is in the form of a smooth curve, with the most common values representing the high point of the curve. If the curve is symmetrical, it is bell shaped, as shown.

Figure 3-15 Polygenes in progeny of a dihybrid self. Self of R1/r1 ; R2/r2. The progeny of a dihybrid self for two polygenes can be expressed as numbers of additive allelic “doses.”

                                     gametes
                     2 doses (1/4)   1 dose (1/2)    0 doses (1/4)
    gametes
    2 doses (1/4)    4 doses 1/16    3 doses 2/16    2 doses 1/16
    1 dose  (1/2)    3 doses 2/16    2 doses 4/16    1 dose  2/16
    0 doses (1/4)    2 doses 1/16    1 dose  2/16    0 doses 1/16

    Overall in progeny: 4 doses 1/16; 3 doses 4/16; 2 doses 6/16; 1 dose 4/16; 0 doses 1/16

Figure 3-16 Histogram of polygenes from a dihybrid self. The progeny shown in Figure 3-15 can be represented as a frequency histogram of contributing polygenic alleles (“doses”). Axes: frequency in 1/16ths against number of contributing polygenic alleles, or “doses” (0 to 4); bar heights 1, 4, 6, 4, 1. The overlaid curve is a continuous distribution that might result from the effects of environmental variation.

Figure 3-17 Histogram of polygenes from a trihybrid self. The progeny of a polygene trihybrid can be graphed as a frequency histogram of contributing polygenic alleles (“doses”). Axes: frequency in 1/64ths against number of contributing polygenic alleles, or “doses” (0 to 6); bar heights 1, 6, 15, 20, 15, 6, 1. The overlaid curve shows possible effects of environmental variation.

In our illustration, we used a dihybrid self to show how the histogram is produced. But how is our example relevant to what is going on in natural populations? After all, not all crosses could be of this type. Nevertheless, if the alleles at each gene pair are approximately equal in frequency in the population (for example, R1 is about as common as r1), then the dihybrid cross can be said to represent an average cross for a population in which two polygenes are segregating.

Identifying polygenes and understanding how they act and interact are important challenges for geneticists in the twenty-first century. Identifying polygenes will be especially important in medicine. Many common human diseases such as atherosclerosis (hardening of the arteries) and hypertension (high blood pressure) are thought to have a polygenic component. If so, a full understanding of these conditions, which affect large proportions of human populations, requires an understanding of these polygenes, their inheritance, and their function. Today, several molecular approaches can be applied to the job of finding polygenes, and we will consider some in subsequent chapters. Note that polygenes are not considered a special functional class of genes. They are identified as a group only in the sense that they have alleles that contribute to continuous variation.

Key Concept Variation and assortment of polygenes can contribute to continuous variation in a population.
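The histogram heights for the dihybrid and trihybrid selfs are binomial coefficients, because each of the 2n alleles a progeny individual receives is a contributing allele with probability 1/2. The sketch below assumes, as the wheat model does, that the polygenes are additive and assort independently; the function name is illustrative.

```python
from fractions import Fraction
from math import comb

def dose_distribution(n_genes):
    """Distribution of contributing-allele "doses" among the progeny of a
    self of an individual heterozygous at n_genes polygenes.
    Assumes additive, independently assorting genes."""
    n_alleles = 2 * n_genes
    return {
        doses: Fraction(comb(n_alleles, doses), 2 ** n_alleles)
        for doses in range(n_alleles + 1)
    }

for n_genes in (2, 3):
    dist = dose_distribution(n_genes)
    denominator = 2 ** (2 * n_genes)
    heights = [int(p * denominator) for p in dist.values()]
    print(f"{n_genes} polygenes, frequencies in 1/{denominator}ths: {heights}")
```

For two polygenes the heights are 1, 4, 6, 4, 1 (in sixteenths), and for three they are 1, 6, 15, 20, 15, 6, 1 (in sixty-fourths). Raising `n_genes` further gives a histogram that looks increasingly like a bell curve.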
3.5 Organelle Genes: Inheritance Independent of the Nucleus
So far, we have considered how nuclear genes assort independently by virtue of their loci on different chromosomes. However, although the nucleus contains most of a eukaryotic organism’s genes, a distinct and specialized subset of the genome is found in the mitochondria, and, in plants, also in the chloroplasts. These subsets are inherited independently of the nuclear genome, and so they constitute a special case of independent inheritance, sometimes called extranuclear inheritance.

Mitochondria and chloroplasts are specialized organelles located in the cytoplasm. They contain small circular chromosomes that carry a defined subset of the total cell genome. Mitochondrial genes are concerned with the mitochondrion’s task of energy production, whereas chloroplast genes are needed for the chloroplast to carry out its function of photosynthesis. However, neither organelle is functionally autonomous because each relies to a large extent on nuclear genes for its function. Why some of the necessary genes are in the organelles themselves and others are in the nucleus is still something of a mystery, which will not be addressed here.

Another peculiarity of organelle genes is the large number of copies present in a cell. Each organelle is present in many copies per cell, and, furthermore, each organelle contains many copies of its chromosome. Hence, each cell can contain hundreds or thousands of organelle chromosomes. Consider chloroplasts, for example. Any green cell of a plant has many chloroplasts, and each chloroplast contains many identical circular DNA molecules, the so-called chloroplast chromosomes. Hence, the number of chloroplast chromosomes per cell can be in the thousands, and the number can even vary somewhat from cell to cell. The DNA is sometimes seen to be packaged into suborganellar structures called nucleoids, which become visible if stained with a DNA-binding dye (Figure 3-18). The DNA is folded within the nucleoid but does not have the type of histone-associated coiling shown by nuclear chromosomes. The same arrangement is true for the DNA in mitochondria. For the time being, we will assume that all copies of an organelle chromosome within a cell are identical, but we will have to relax this assumption later.

Many organelle chromosomes have now been sequenced. Examples of relative gene size and spacing in mitochondrial DNA (mtDNA) and chloroplast DNA (cpDNA) are shown in Figure 3-19. Organelle genes are very closely spaced, and, in some organisms, organelle genes can contain untranslated segments called introns. Note how most genes concern the chemical reactions taking place within the organelle itself: photosynthesis in chloroplasts and oxidative phosphorylation in mitochondria.

Figure 3-18 Cell showing nucleoids within mitochondria. Fluorescent staining of a cell of Euglena gracilis. With the dyes used, the nucleus appears red because of the fluorescence of large amounts of nuclear DNA. The mitochondria fluoresce green, and, within mitochondria, the concentrations of mitochondrial DNA (nucleoids) fluoresce yellow. [From Y. Hayashi and K. Ueda, “The shape of mitochondria and the number of mitochondrial nucleoids during the cell cycle of Euglena gracilis,” J. Cell Sci. 93, 1989, 565. Photo by The Company of Biologists Ltd.]
Patterns of inheritance in organelles
Organelle genes show their own special mode of inheritance called uniparental inheritance: progeny inherit organelle genes exclusively from one parent but not the other. In most cases, that parent is the mother, a pattern called maternal inheritance. Why only the mother? The answer lies in the fact that the organelle chromosomes are located in the cytoplasm and the male and female gametes do not contribute cytoplasm equally to the zygote. In regard to nuclear genes, both parents contribute equally to the zygote. However, the egg contributes the bulk of the cytoplasm, whereas the sperm contributes essentially none. Therefore, because organelles reside in the cytoplasm, the female parent contributes the organelles along with the cytoplasm, and essentially none of the organelle DNA in the zygote is from the male parent.
Figure 3-19 Organelle genomes. DNA maps for mitochondria and chloroplasts. Many of the organelle genes encode proteins that carry out the energy-producing functions of these organelles (green), whereas others (red and orange) function in protein synthesis. (a) Maps of yeast mitochondrial DNA (~78 kb) and human mitochondrial DNA (~17 kb). (Note that the human map is not drawn at the same scale as the yeast map.) (b) The 121-kb chloroplast genome of the liverwort Marchantia polymorpha. Genes shown inside the map are transcribed clockwise, and those outside are transcribed counterclockwise. IRA and IRB indicate inverted repeats. The upper drawing in the center of the map depicts a male Marchantia plant; the lower drawing depicts a female. Legend categories: energy production; tRNAs for protein synthesis; ribosomal RNAs; nongenic; introns. [Data from K. Umesono and H. Ozeki, Trends Genet. 3, 1987.]
Some phenotypic variants are caused by a mutant allele of an organelle gene, and we can use these mutants to track patterns of organelle inheritance. We will temporarily assume that the mutant allele is present in all copies of the organelle chromosome, a situation that is indeed often found. In a cross, the variant phenotype will be transmitted to progeny if the variant used is the female parent, but not if it is the male parent. Hence, generally, cytoplasmic inheritance shows the following pattern:
    mutant female × wild-type male → progeny all mutant
    wild-type female × mutant male → progeny all wild type
Indeed, this inheritance pattern is diagnostic of organelle inheritance in cases in which the genomic location of a mutant allele is not known. Maternal inheritance can be clearly demonstrated in certain mutants of fungi. For example, in the fungus Neurospora, a mutant called poky has a slow-growth phenotype. Neurospora can be crossed in such a way that one parent acts as the maternal parent, contributing the cytoplasm (see Figure 3-9). The results of the following reciprocal crosses suggest that the mutant gene resides in the mitochondria (fungi have no chloroplasts):
    poky female × wild-type male → progeny all poky
    wild-type female × poky male → progeny all wild type
Sequencing has shown that the poky phenotype is caused by a mutation of a ribosomal RNA gene in mtDNA. Its inheritance is shown diagrammatically in Figure 3-20. The cross includes an allelic difference (ad and ad+) in a nuclear gene in addition to poky; notice how the Mendelian inheritance of the nuclear gene is independent of the maternal inheritance of the poky phenotype.

Key Concept Variant phenotypes caused by mutations in cytoplasmic organelle DNA are generally inherited maternally and independent of the Mendelian patterns shown by nuclear genes.

Figure 3-20 Maternal inheritance of mitochondrial mutant phenotype poky. Reciprocal crosses of poky and wild-type Neurospora produce different results because a different parent contributes the cytoplasm. The female parent contributes most of the cytoplasm of the progeny cells. Brown shading represents cytoplasm with mitochondria containing the poky mutation, and green shading represents cytoplasm with wild-type mitochondria. (a) Poky female × normal male: progeny are poky, ad+ and poky, ad−. (b) Normal female × poky male: progeny are normal, ad+ and normal, ad−. Note that all the progeny in part a are poky, whereas all the progeny in part b are normal. Hence, both crosses show maternal inheritance. The nuclear gene with the alleles ad+ (black) and ad− (red) is used to illustrate the segregation of the nuclear genes in the 1 : 1 Mendelian ratio expected for this haploid organism.
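The contrast between maternal inheritance of the mitochondrial phenotype and 1 : 1 segregation of the nuclear ad gene can be sketched as a toy model of the reciprocal crosses. Everything here is an illustrative assumption: the function name, the tuple representation of a strain, which parent carries which ad allele, and the premise that every mitochondrial chromosome in a parent is of one type.

```python
import random

def cross(female, male, n_progeny, rng):
    """Haploid progeny of a cross. Each parent is a tuple
    (mitochondrial_type, nuclear_allele). Progeny take their mitochondria
    from the female parent and receive either parent's nuclear allele
    with equal probability."""
    return [
        (female[0], rng.choice([female[1], male[1]]))
        for _ in range(n_progeny)
    ]

rng = random.Random(0)  # fixed seed so that the run is repeatable
poky = ("poky", "ad+")        # allele assignment is an assumption of this sketch
normal = ("normal", "ad-")

for female, male in ((poky, normal), (normal, poky)):
    progeny = cross(female, male, 1000, rng)
    mito_types = {mito for mito, _ in progeny}
    n_ad_plus = sum(1 for _, allele in progeny if allele == "ad+")
    print(f"{female[0]} female x {male[0]} male: "
          f"mitochondrial types {mito_types}, ad+ progeny {n_ad_plus}/1000")
```

In both reciprocal crosses the progeny share the female parent's mitochondrial phenotype, while the nuclear alleles appear in roughly equal numbers.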
Cytoplasmic segregation
In some cases, cells contain mixtures of mutant and normal organelles. These cells are called cytohets, or heteroplasmons. In these mixtures, a type of cytoplasmic segregation can be detected, in which the two types apportion themselves into different daughter cells. The process most likely stems from chance partitioning of the multiple organelles in the course of cell division. Plants provide a good example. Many cases of white leaves are caused by mutations in chloroplast genes that control the production and deposition of the green pigment chlorophyll. Because chlorophyll is necessary for a plant to live, this type of mutation is lethal, and white-leaved plants cannot be obtained for experimental crosses. However, some plants are variegated, bearing both green and white patches, and these plants are viable. Thus, variegated plants provide a way of demonstrating cytoplasmic segregation. The four-o’clock plant in Figure 3-21 shows a commonly observed variegated leaf and branch phenotype that demonstrates the inheritance of a mutant allele of a chloroplast gene.
Figure 3-21 Variegated leaves caused by a mutation in cpDNA. Leaf variegation in Mirabilis jalapa, the four-o’clock plant. The main shoot is variegated, with all-white and all-green branches. Flowers can form on any branch (variegated, green, or white), and these flowers can be used in crosses.
The mutant allele causes chloroplasts to be white; in turn, the color of the chloroplasts determines the color of cells and hence the color of the branches composed of those cells. Variegated branches are mosaics of all-green and all-white cells. Flowers can develop on green, white, or variegated branches, and the chloroplast genes of a flower’s cells are those of the branch on which it grows. Hence, in a cross (Figure 3-22), the maternal gamete within the flower (the egg cell) determines the color of the leaves and branches of the progeny plant. For example, if an egg cell is from a flower on a green branch, all the progeny will be green, regardless of the origin of the pollen. A white branch will have white chloroplasts, and the resulting progeny plants will be white. (Because of lethality, white descendants would not live beyond the seedling stage.) The variegated zygotes (bottom of Figure 3-22) demonstrate cytoplasmic segregation. These variegated progeny come from eggs that are cytohets. Interestingly, when such a zygote divides, the white and green chloroplasts often segregate; that is, they sort themselves into separate cells, yielding the distinct green and white sectors that cause the variegation in the branches. Here, then, is a direct demonstration of cytoplasmic segregation. Given that a cell is a population of organelle molecules, how is it ever possible to obtain a “pure” mutant cell, containing only mutant chromosomes? Most likely, pure mutants are created in asexual cells as follows. The variants arise by mutation of a single gene in a single chromosome. Then, in some cases, the mutation-bearing chromosome may by chance increase in frequency in the population
Figure 3-22 Crosses using flowers from a variegated plant. The results of the Mirabilis jalapa crosses can be explained by autonomous chloroplast inheritance. The large, dark spheres represent nuclei. The smaller bodies represent chloroplasts, either green or white. Each egg cell is assumed to contain many chloroplasts, and each pollen cell is assumed to contain no chloroplasts. The first two crosses exhibit strict maternal inheritance. If, however, the maternal branch is variegated, three types of zygotes can result, depending on whether the egg cell contains only white, only green, or both green and white chloroplasts. In the last case, the resulting zygote can produce both green and white tissue, and so a variegated plant results.
within the cell. This process is called random genetic drift. A cell that is a cytohet may have, say, 60 percent A chromosomes and 40 percent a chromosomes. When this cell divides, sometimes all the A chromosomes go into one daughter, and all the a chromosomes into the other (again, by chance). More often, this partitioning requires several subsequent generations of cell division to be complete (Figure 3-23). Hence, as a result of these chance events, both alleles are expressed in different daughter cells, and this separation will continue through the descendants of these cells. Note that cytoplasmic segregation is not a mitotic process; it does take place in dividing asexual cells, but it is unrelated to mitosis. In chloroplasts, cytoplasmic segregation is a common mechanism for producing variegated (green-and-white) plants, as already mentioned. In fungal mutants such as the poky mutant of Neurospora, the original mutation in one mtDNA molecule must have accumulated and undergone cytoplasmic segregation to produce the strain expressing the poky symptoms.
Key Concept Organelle populations that contain mixtures of two genetically distinct chromosomes often show segregation of the two types into the daughter cells at cell division. This process is called cytoplasmic segregation.
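The chance partitioning described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not part of the text: every organelle chromosome replicates exactly once before division, each daughter cell receives exactly half of the doubled set at random, and the cell holds just 10 organelle chromosomes. The function names `divide` and `simulate_lineage` are hypothetical.

```python
import random

def divide(cell, rng):
    """Split one cell into two daughters by chance partitioning.

    `cell` is a list of organelle genotypes ('A' or 'a'). Every organelle
    is duplicated, the doubled set is shuffled, and each daughter receives
    half, so the number of organelles per cell stays constant.
    """
    doubled = cell * 2
    rng.shuffle(doubled)
    half = len(doubled) // 2
    return doubled[:half], doubled[half:]

def simulate_lineage(n_organelles=10, frac_A=0.6, max_divisions=200, seed=1):
    """Follow one daughter per division; return the A fraction over time."""
    rng = random.Random(seed)
    n_A = round(n_organelles * frac_A)
    cell = ["A"] * n_A + ["a"] * (n_organelles - n_A)
    history = [cell.count("A") / len(cell)]
    for _ in range(max_divisions):
        if len(set(cell)) == 1:      # the cell is now "pure" (all A or all a)
            break
        cell, _sister = divide(cell, rng)
        history.append(cell.count("A") / len(cell))
    return history

if __name__ == "__main__":
    for seed in range(3):
        trace = simulate_lineage(seed=seed)
        print(f"seed {seed}: {len(trace) - 1} divisions, final A fraction {trace[-1]}")
```

Each run starts from the 60 percent A and 40 percent a cytohet used in the text and follows a single line of descent. Different seeds give different outcomes, which is the point: the drift toward a pure cell is a matter of chance rather than a directed process.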
In certain special systems such as in fungi and algae, cytohets that are “dihybrid” have been obtained (say, AB in one organelle chromosome and ab in another). In such cases, rare crossover-like processes can occur, but such an occurrence must be considered a minor genetic phenomenon.
Key Concept Alleles on organelle chromosomes
1. in sexual crosses are inherited from one parent only (generally the maternal parent) and hence show no segregation ratios of the type nuclear genes do.
2. in asexual cells can show cytoplasmic segregation.
3. in asexual cells can occasionally show processes analogous to crossing over.
Cytoplasmic mutations in humans
Are there cytoplasmic mutations in humans? Some human pedigrees show the transmission of rare disorders only through females and never through males. This pattern strongly suggests cytoplasmic inheritance and points to a mutation in mtDNA as the reason for the phenotype. The disease MERRF (myoclonic epilepsy and ragged red fiber) is such a phenotype, resulting from a single base change in mtDNA. It is a disease that affects muscles, but the symptoms also include eye and hearing disorders. Another example is Kearns–Sayre syndrome, a constellation of symptoms affecting the eyes, heart, muscles, and brain that is caused by the loss of part of the mtDNA. In some of these cases, the cells of a sufferer contain mixtures of normal and mutant chromosomes, and the proportions of each passed on to progeny can vary as a result of cytoplasmic segregation. The proportions in one person can also vary in different tissues or over time. The accumulation of certain types of mitochondrial mutations over time has been proposed as a possible cause of aging. Figure 3-24 shows some of the mutations in human mitochondrial genes that can lead to disease when, by random drift and cytoplasmic segregation, they rise in frequency to such an extent that cell function is impaired.
Figure 3-23 Model for cytoplasmic segregation. By chance, genetically distinct organelles may segregate into separate cells in a number of successive cell divisions. Red and blue dots represent genetically distinguishable organelles (those carrying the A allele and those carrying the a allele), such as mitochondria with and without a mutation.
Figure 3-24 Sites of mtDNA mutations in certain human diseases. This map of human mtDNA (16,569 bp) shows loci of mutations leading to cytopathies. The transfer RNA genes are represented by single-letter amino acid abbreviations; ND = NADH dehydrogenase; COX = cytochrome c oxidase; and 12S and 16S refer to ribosomal RNAs.
Diseases: MERRF = myoclonic epilepsy and ragged red fiber disease; LHON = Leber hereditary optic neuropathy; NARP = neurogenic muscle weakness, ataxia, and retinitis pigmentosa; MELAS = mitochondrial encephalomyopathy, lactic acidosis, and strokelike symptoms; MMC = maternally inherited myopathy and cardiomyopathy; PEO = progressive external ophthalmoplegia; KSS = Kearns–Sayre syndrome; MILS = maternally inherited Leigh syndrome.
[ Data from S. DiMauro et al., “Mitochondria in Neuromuscular Disorders,” Biochim. Biophys. Acta 1366, 1998, 199–210.]
The inheritance of a human mitochondrial disease is shown in Figure 3-25. Note that the condition is always passed to offspring by mothers and never fathers. Occasionally, a mother will produce an unaffected child (not shown), probably owing to cytoplasmic segregation in the gamete-forming tissue.
mtDNA in evolutionary studies
Differences and similarities of homologous mtDNA sequences between species have been used extensively to construct evolutionary trees. Furthermore, it has been possible to introduce some extinct organisms into evolutionary trees using mtDNA sequences obtained from the remains of extinct organisms, such as skins and bones in museums. mtDNA evolves relatively rapidly, so this approach has been most useful in plotting recent evolution such as the evolution of humans and other primates.
One key finding is that the “root” of the human mtDNA tree is in Africa, suggesting that Homo sapiens originated in Africa and from there dispersed throughout the world (see Chapter 18).
Figure 3-25 Pedigree of a mitochondrial disease. This pedigree shows that a human mitochondrial disease is inherited only from the mother.
Summary
Genetic research and plant and animal breeding often necessitate the synthesis of genotypes that are complex combinations of alleles from different genes. Such genes can be on the same chromosome or on different chromosomes; the latter is the main subject of this chapter. In the simplest case (a dihybrid for which the two gene pairs are on different chromosome pairs), each individual gene pair shows equal segregation at meiosis as predicted by Mendel’s first law. Because nuclear spindle fibers attach randomly to centromeres at meiosis, the two gene pairs are partitioned independently into the meiotic products. This principle of independent assortment is called Mendel’s second law because Mendel was the first to observe it. From a dihybrid A/a ; B/b, four genotypes of meiotic products are produced, A ; B, A ; b, a ; B, and a ; b, all at an equal frequency of 25 percent each. Hence, in a testcross of a dihybrid with a double recessive, the phenotypic proportions of the progeny also are 25 percent each (a 1 : 1 : 1 : 1 ratio). If such a dihybrid is selfed, the phenotypic classes in the progeny are 9/16 A/- ; B/-, 3/16 A/- ; b/b, 3/16 a/a ; B/-, and 1/16 a/a ; b/b. The 1 : 1 : 1 : 1 and 9 : 3 : 3 : 1 ratios are both diagnostic of independent assortment. More complex genotypes composed of independently assorting genes can be treated as extensions of the case for single-gene segregation. Overall genotypic, phenotypic, or gametic ratios are calculated by applying the product rule, that is, by multiplying the proportions relevant to the individual genes. The probability of the occurrence of any of several categories of progeny is calculated by applying the sum rule, that is, by adding their individual probabilities.
In mnemonic form, the product rule deals with “A AND B,” whereas the sum rule deals with “A′ OR A″.” The χ² test can be used to test whether the observed proportions of classes in genetic analysis conform to the expectations of a genetic hypothesis, such as a hypothesis of single- or two-gene inheritance. If a probability value of less than 5 percent is calculated, the hypothesis must be rejected.
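The product and sum rules summarized above can be checked with exact fractions. The following is a minimal sketch, assuming complete dominance at each gene and independent assortment; the names `SELF`, `TESTCROSS`, and `combined_classes` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Phenotypic proportions for ONE gene, assuming complete dominance.
SELF = {"dominant": Fraction(3, 4), "recessive": Fraction(1, 4)}       # A/a x A/a
TESTCROSS = {"dominant": Fraction(1, 2), "recessive": Fraction(1, 2)}  # A/a x a/a

def combined_classes(*genes):
    """Product rule: multiply the per-gene proportions for every combination."""
    classes = {}
    for combo in product(*(g.items() for g in genes)):
        labels = tuple(label for label, _ in combo)
        p = Fraction(1)
        for _, frac in combo:
            p *= frac
        classes[labels] = p
    return classes

dihybrid_self = combined_classes(SELF, SELF)                # 9 : 3 : 3 : 1
dihybrid_testcross = combined_classes(TESTCROSS, TESTCROSS)  # 1 : 1 : 1 : 1

# Sum rule: probability of showing the dominant phenotype for exactly one gene.
exactly_one_dominant = (dihybrid_self[("dominant", "recessive")]
                        + dihybrid_self[("recessive", "dominant")])

if __name__ == "__main__":
    for labels, p in dihybrid_self.items():
        print(labels, p)
    print("exactly one dominant:", exactly_one_dominant)
```

Passing a third per-gene dictionary to `combined_classes` extends the same calculation to a trihybrid, which is the sense in which more complex genotypes are extensions of the single-gene case.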
Sequential generations of selfing increase the proportions of homozygotes, according to the principles of equal segregation and independent assortment (if the genes are on different chromosomes). Hence, selfing is used to create complex pure lines with combinations of desirable mutations. The independent assortment of chromosomes at meiosis can be observed cytologically by using heteromorphic chromosome pairs (those that show a structural difference). The X and Y chromosomes are one such case, but other, rarer cases can be found and used for this demonstration. The independent assortment of genes at the level of single meiocytes can be observed in the ascomycete fungi, because the asci show the two alternative types of segregations at equal frequencies. One of the main functions of meiosis is to produce recombinants, new combinations of alleles of the haploid genotypes that united to form the meiocyte. Independent assortment is the main source of recombinants. In a dihybrid testcross showing independent assortment, the recombinant frequency will be 50 percent. Metric characters such as color intensity show a continuous distribution in a population. Continuous distributions can be based on environmental variation or on variant alleles of multiple genes or on a combination of both. A simple genetic model proposes that the active alleles of several genes (called polygenes) contribute more or less additively to the metric character. In an analysis of the progeny from the self of a multiply heterozygous individual, the histogram showing the proportion of each phenotype approximates a bell-shaped curve typical of continuous variation. The small subsets of the genome found in mitochondria and chloroplasts are inherited independently of the nuclear genome. Mutants in these organelle genes often show maternal inheritance, along with the cytoplasm, which is the location of these organelles. In genetically mixed cytoplasms
(cytohets), the two genotypes (say, wild type and mutant) often sort themselves out into different daughter cells by a poorly understood process called cytoplasmic segregation.
Mitochondrial mutation in humans results in diseases that show cytoplasmic segregation in body tissues and maternal inheritance in a mating.
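The bell-shaped histogram mentioned in the summary for polygenes follows from a binomial calculation. The sketch below assumes the simple additive model: every gene is heterozygous for one active and one inactive allele, the genes assort independently, and each active allele contributes equally. The function name `active_allele_distribution` is hypothetical.

```python
from fractions import Fraction
from math import comb

def active_allele_distribution(n_genes):
    """Proportion of selfed progeny carrying k active alleles, k = 0 .. 2n.

    Each progeny receives 2n alleles (one per gene from each gamete), and
    each allele is active with probability 1/2, so the number of active
    alleles is binomial with 2n trials.
    """
    total = 2 * n_genes
    return [Fraction(comb(total, k), 2 ** total) for k in range(total + 1)]

if __name__ == "__main__":
    for k, p in enumerate(active_allele_distribution(3)):
        bar = "#" * int(p * 64)
        print(f"{k} active alleles: {str(p):>5}  {bar}")
```

For three genes the classes fall in the proportions 1 : 6 : 15 : 20 : 15 : 6 : 1 (out of 64), which already approximates a bell-shaped curve; environmental variation would blur the steps further.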
Key Terms
chi-square test (p. 96)
chloroplast DNA (cpDNA) (p. 111)
cytoplasmic segregation (p. 113)
dihybrid (p. 89)
dihybrid cross (p. 89)
hybrid vigor (p. 100)
independent assortment (p. 88)
law of independent assortment (Mendel’s second law) (p. 89)
maternal inheritance (p. 111)
meiotic recombination (p. 105)
mitochondrial DNA (mtDNA) (p. 111)
polygene (quantitative trait locus) (p. 108)
product rule (p. 94)
quantitative trait locus (QTL) (p. 108)
recombinant (p. 106)
recombination (p. 104)
sum rule (p. 94)
uniparental inheritance (p. 111)
Solved Problems
SOLVED PROBLEM 1. Two Drosophila flies that had normal (transparent, long) wings were mated. In the progeny, two new phenotypes appeared: dusky wings (having a semiopaque appearance) and clipped wings (with squared ends). The progeny were as follows:
Females
  179 transparent, long
  58 transparent, clipped
Males
  92 transparent, long
  89 dusky, long
  28 transparent, clipped
  31 dusky, clipped
a. Provide a chromosomal explanation for these results, showing chromosomal genotypes of parents and of all progeny classes under your model.
b. Design a test for your model.
Solution
a. The first step is to state any interesting features of the data. The first striking feature is the appearance of two new phenotypes. We encountered the phenomenon in Chapter 2, where it was explained as recessive alleles masked by their dominant counterparts. So, first, we might suppose that one or both parental flies have recessive alleles of two different genes. This inference is strengthened by the observation that some progeny express only one of the new phenotypes. If the new phenotypes always appeared together, we might suppose that the same recessive allele determines both. However, the other striking feature of the data, which we cannot explain by using the Mendelian principles from Chapter 2, is the obvious difference between the sexes: although there are approximately equal numbers of males and females, the males fall into four phenotypic classes, but the females constitute only two. This fact should immediately suggest some kind of sex-linked inheritance. When we study the data, we see that the long and clipped phenotypes are segregating in both males and females, but only males have the dusky phenotype. This observation suggests that
the inheritance of wing transparency differs from the inheritance of wing shape. First, long and clipped are found in a 3 : 1 ratio in both males and females. This ratio can be explained if both parents are heterozygous for an autosomal gene; we can represent them as L/l, where L stands for long and l stands for clipped. Having done this partial analysis, we see that only the inheritance of wing transparency is associated with sex. The most obvious possibility is that the alleles for transparent (D) and dusky (d) are on the X chromosome, because we have seen in Chapter 2 that gene location on this chromosome gives inheritance patterns correlated with sex. If this suggestion is true, then the parental female must be the one sheltering the d allele, because, if the male had the d, he would have been dusky, whereas we were told that he had transparent wings. Therefore, the female parent would be D/d and the male D. Let’s see if this suggestion works: if it is true, all female progeny would inherit the D allele from their father, and so all would be transparent winged, as was observed. Half the sons would be D (transparent) and half d (dusky), which also was observed. So, overall, we can represent the female parent as D/d ; L/l and the male parent as D ; L/l. Then the progeny would be
Females
  1/2 D/D × 3/4 L/- → 3/8 D/D ; L/-   transparent, long
  1/2 D/D × 1/4 l/l → 1/8 D/D ; l/l   transparent, clipped
  1/2 D/d × 3/4 L/- → 3/8 D/d ; L/-   transparent, long
  1/2 D/d × 1/4 l/l → 1/8 D/d ; l/l   transparent, clipped
  Totals: 3/4 transparent, long; 1/4 transparent, clipped
Males
  1/2 D × 3/4 L/- → 3/8 D ; L/-   transparent, long
  1/2 D × 1/4 l/l → 1/8 D ; l/l   transparent, clipped
  1/2 d × 3/4 L/- → 3/8 d ; L/-   dusky, long
  1/2 d × 1/4 l/l → 1/8 d ; l/l   dusky, clipped
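As a quick check on the model in part a, the observed counts can be compared with the predicted ratios using the χ² test described in the chapter. This is a sketch rather than part of the original solution; the helper name `chi_square` is hypothetical, and the 5 percent critical values quoted in the comments (3.841 for 1 df, 7.815 for 3 df) are the standard tabulated values.

```python
def chi_square(observed, expected_ratio):
    """Return the chi-square statistic for counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Females: transparent long : transparent clipped, model predicts 3 : 1 (1 df).
females = chi_square([179, 58], [3, 1])

# Males: transparent long : dusky long : transparent clipped : dusky clipped,
# model predicts 3 : 3 : 1 : 1 (3 df).
males = chi_square([92, 89, 28, 31], [3, 3, 1, 1])

if __name__ == "__main__":
    print(f"females: chi-square = {females:.3f} (5% critical value, 1 df: 3.841)")
    print(f"males:   chi-square = {males:.3f} (5% critical value, 3 df: 7.815)")
```

Both statistics are far below their critical values, so the data give no reason to reject the model. That is not proof the model is right, which is why part b goes on to design an independent test cross.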
b. Generally, a good way to test such a model is to make a cross and predict the outcome. But which cross? We have to predict some kind of ratio in the progeny, and so it is important to make a cross from which a unique phenotypic ratio can be expected. Notice that using one of the female progeny as a parent would not serve our needs: we cannot say from observing the phenotype of any one of these females what her genotype is. A female with transparent wings could be D/D or D/d, and one with long wings could be L/L or L/l. It would be good to cross the parental female of the original cross with a dusky, clipped son, because the full genotypes of both are specified under the model that we have created. According to our model, this cross is
D/d ; L/l × d ; l/l
From this cross, we predict
Females
  1/2 D/d × 1/2 L/l → 1/4 D/d ; L/l
  1/2 D/d × 1/2 l/l → 1/4 D/d ; l/l
  1/2 d/d × 1/2 L/l → 1/4 d/d ; L/l
  1/2 d/d × 1/2 l/l → 1/4 d/d ; l/l
Males
  1/2 D × 1/2 L/l → 1/4 D ; L/l
  1/2 D × 1/2 l/l → 1/4 D ; l/l
  1/2 d × 1/2 L/l → 1/4 d ; L/l
  1/2 d × 1/2 l/l → 1/4 d ; l/l
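The branch diagrams for this kind of cross can also be produced by enumerating gametes. The sketch below assumes one X-linked gene and one independently assorting autosomal gene, with every gamete combination equally likely; the function name `cross` and its argument layout are inventions for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def cross(mother_x, father_x, mother_auto, father_auto):
    """Enumerate progeny for one X-linked gene plus one autosomal gene.

    mother_x:    the mother's two X alleles, e.g. ("D", "d")
    father_x:    the father's single X allele, e.g. "d"
    mother_auto, father_auto: two autosomal alleles each, e.g. ("L", "l")
    Returns {(sex, x_genotype, autosomal_genotype): proportion WITHIN that sex}.
    """
    progeny = defaultdict(Fraction)
    for m_x, f_sex, m_a, f_a in product(mother_x, ("X", "Y"), mother_auto, father_auto):
        if f_sex == "X":   # X-bearing sperm gives a daughter
            sex, x_genotype = "female", "/".join(sorted((m_x, father_x)))
        else:              # Y-bearing sperm gives a son with only the mother's X
            sex, x_genotype = "male", m_x
        auto = "/".join(sorted((m_a, f_a)))
        progeny[(sex, x_genotype, auto)] += Fraction(1, 8)  # 8 combinations per sex
    return dict(progeny)

# The test cross proposed in part b: D/d ; L/l female x d ; l/l male.
test_cross = cross(("D", "d"), "d", ("L", "l"), ("l", "l"))

if __name__ == "__main__":
    for (sex, x_gt, auto_gt), p in sorted(test_cross.items()):
        print(f"{sex:6} {x_gt:3} ; {auto_gt}  {p}")
```

Calling `cross(("D", "d"), "D", ("L", "l"), ("L", "l"))` reproduces the original cross from part a, with L/L and L/l listed separately rather than pooled as L/-.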
SOLVED PROBLEM 2. Consider three yellow, round peas, labeled A, B, and C. Each was grown into a plant and crossed with a plant grown from a green wrinkled pea. Exactly 100 peas issuing from each cross were sorted into phenotypic classes as follows:
A: 51 yellow, round; 49 green, round
B: 100 yellow, round
C: 24 yellow, round; 26 yellow, wrinkled; 25 green, round; 25 green, wrinkled
What were the genotypes of A, B, and C? (Use gene symbols of your own choosing; be sure to define each one.) Solution Notice that each of the crosses is yellow, round × green, wrinkled → progeny
Because A, B, and C were all crossed with the same plant, all the differences between the three progeny populations must be attributable to differences in the underlying genotypes of A, B, and C. You might remember a lot about these analyses from the chapter, which is fine, but let’s see how much we can deduce from the data. What about dominance? The key cross for deducing dominance is B. Here, the inheritance pattern is yellow, round × green, wrinkled → all yellow, round So yellow and round must be dominant phenotypes because dominance is literally defined by the phenotype of a hybrid. Now we know that the green, wrinkled parent used in each cross must be fully recessive; we have a very convenient situation because it means that each cross is a testcross, which is generally the most informative type of cross. Turning to the progeny of A, we see a 1 : 1 ratio for yellow to green. This ratio is a demonstration of Mendel’s first law (equal segregation) and shows that, for the character of color, the cross must have been heterozygote × homozygous recessive. Letting Y represent yellow and y represent green, we have Y/y × y/y →
1/2 Y/y (yellow) and 1/2 y/y (green)
For the character of shape, because all the progeny are round, the cross must have been homozygous dominant × homozygous recessive. Letting R represent round and r represent wrinkled, we have R/R × r/r → R/r (round) Combining the two characters, we have Y/y ; R/R × y/y ; r/r →
1/2 Y/y ; R/r and 1/2 y/y ; R/r
Now cross B becomes crystal clear and must have been Y/Y ; R/R × y/y ; r/r → Y/y ; R/r because any heterozygosity in pea B would have given rise to several progeny phenotypes, not just one. What about C? Here, we see a ratio of 50 yellow : 50 green (1 : 1) and a ratio of 49 round : 51 wrinkled (also 1 : 1). So both genes in pea C must have been heterozygous, and cross C was Y/y ; R/r × y/y ; r/r which is a good demonstration of Mendel’s second law (independent assortment of different genes). How would a geneticist have analyzed these crosses? Basically, the same way that we just did but with fewer intervening steps. Possibly something like this: “yellow and round dominant; single-gene segregation in A; B homozygous dominant; independent two-gene segregation in C.”
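The reasoning in this solution reduces to one question per gene: did any recessive progeny appear in the testcross? A minimal sketch of that logic follows. It assumes, as the solution does, that the tester parent is y/y ; r/r and that the samples are large enough for a heterozygous parent to yield at least one recessive offspring; `PROGENY` and `infer_genotype` are illustrative names.

```python
# Testcross progeny counts from Solved Problem 2, keyed by (color, shape).
PROGENY = {
    "A": {("yellow", "round"): 51, ("green", "round"): 49},
    "B": {("yellow", "round"): 100},
    "C": {("yellow", "round"): 24, ("yellow", "wrinkled"): 26,
          ("green", "round"): 25, ("green", "wrinkled"): 25},
}

def infer_genotype(counts):
    """Infer the genotype of the yellow, round parent from testcross progeny.

    Any recessive progeny for a gene means the tested parent carried the
    recessive allele and so was heterozygous; otherwise it is scored as
    homozygous dominant.
    """
    green = sum(n for (color, _), n in counts.items() if color == "green")
    wrinkled = sum(n for (_, shape), n in counts.items() if shape == "wrinkled")
    color_genotype = "Y/y" if green > 0 else "Y/Y"
    shape_genotype = "R/r" if wrinkled > 0 else "R/R"
    return f"{color_genotype} ; {shape_genotype}"

genotypes = {pea: infer_genotype(counts) for pea, counts in PROGENY.items()}

if __name__ == "__main__":
    for pea, genotype in genotypes.items():
        print(pea, genotype)
```

With small progeny numbers the "homozygous dominant" call would be less secure, because a heterozygote could by chance produce no recessive offspring; the 100 progeny per cross here make that very unlikely.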
Problems
Most of the problems are also available for review/grading through LaunchPad at http://www.whfreeman.com/launchpad/iga11e.
Working with the Figures
1. Using Table 3-1, answer the following questions about probability values (see p. 97): a. If χ² is calculated to be 17 with 9 df, what is the approximate probability value? b. If χ² is 17 with 6 df, what is the probability value? c. What trend (“rule”) do you see in the previous two calculations?
2. Inspect Figure 3-8: which meiotic stage is responsible for generating Mendel’s second law?
3. In Figure 3-9, a. identify the diploid nuclei. b. identify which part of the figure illustrates Mendel’s first law.
4. Inspect Figure 3-10: what would be the outcome in the octad if on rare occasions a nucleus from the postmeiotic mitotic division of nucleus 2 slipped past a nucleus from the postmeiotic mitotic division of nucleus 3? How could you measure the frequency of such a rare event?
5. In Figure 3-11, if the input genotypes were a B and A b, what would be the genotypes colored blue?
6. In the progeny seen in Figure 3-13, what are the origins of the chromosomes colored dark blue, light blue, and very light blue?
7. In Figure 3-17, in which bar of the histogram would the genotype R1/r1 R2/R2 r3/r3 be found?
8. In examining Figure 3-19, what do you think is the main reason for the difference in size of yeast and human mtDNA?
9. In Figure 3-20, what color is used to denote cytoplasm containing wild-type mitochondria?
10. In Figure 3-21, what would be the leaf types of progeny of the apical (top) flower?
11. From the pedigree in Figure 3-25, what principle can you deduce about the inheritance of mitochondrial disease from affected fathers?
Basic Problems
12. Assume independent assortment and start with a plant that is dihybrid A/a ; B/b: a. What phenotypic ratio is produced from selfing it? b. What genotypic ratio is produced from selfing it? c. What phenotypic ratio is produced from testcrossing it? d. What genotypic ratio is produced from testcrossing it?
13. Normal mitosis takes place in a diploid cell of genotype A/a ; B/b. Which of the following genotypes might represent possible daughter cells? a. A ; B b. a ; b c. A ; b d. a ; B e. A/A ; B/B f. A/a ; B/b g. a/a ; b/b
14. In a diploid organism of 2n = 10, assume that you can label all the centromeres derived from its female parent and all the centromeres derived from its male parent. When this organism produces gametes, how many male- and female-labeled centromere combinations are possible in the gametes?
15. It has been shown that when a thin beam of light is aimed at a nucleus, the amount of light absorbed is proportional to the cell’s DNA content. Using this method, the DNA in the nuclei of several different types of cells in a corn plant were compared. The following numbers represent the relative amounts of DNA in these different types of cells: 0.7, 1.4, 2.1, 2.8, and 4.2. Which cells could have been used for these measurements? (Note: In plants, the endosperm part of the seed is often triploid, 3n.)
16. Draw a haploid mitosis of the genotype a+ ; b.
17. In moss, the genes A and B are expressed only in the gametophyte. A sporophyte of genotype A/a ; B/b is allowed to produce gametophytes. a. What proportion of the gametophytes will be A ; B? b. If fertilization is random, what proportion of sporophytes in the next generation will be A/a ; B/b?
18. When a cell of genotype A/a ; B/b ; C/c having all the genes on separate chromosome pairs divides mitotically, what are the genotypes of the daughter cells?
19. In the haploid yeast Saccharomyces cerevisiae, the two mating types are known as MATa and MATα. You cross a purple (ad-) strain of mating type a and a white (ad+) strain of mating type α. If ad- and ad+ are alleles of one gene, and a and α are alleles of an independently inherited gene on a separate chromosome pair, what progeny do you expect to obtain? In what proportions?
20. In mice, dwarfism is caused by an X-linked recessive allele, and pink coat is caused by an autosomal dominant allele (coats are normally brownish). If a dwarf female from a pure line is crossed with a pink male from a pure line, what will be the phenotypic ratios in the F1 and F2 in each sex? (Invent and define your own gene symbols.)
21. Suppose you discover two interesting rare cytological abnormalities in the karyotype of a human male. (A
karyotype is the total visible chromosome complement.) There is an extra piece (satellite) on one of the chromosomes of pair 4, and there is an abnormal pattern of staining on one of the chromosomes of pair 7. With the assumption that all the gametes of this male are equally viable, what proportion of his children will have the same karyotype that he has?
22. Suppose that meiosis occurs in the transient diploid stage of the cycle of a haploid organism of chromosome number n. What is the probability that an individual haploid cell resulting from the meiotic division will have a complete parental set of centromeres (that is, a set all from one parent or all from the other parent)?
23. Pretend that the year is 1868. You are a skilled young lens maker working in Vienna. With your superior new lenses, you have just built a microscope that has better resolution than any others available. In your testing of this microscope, you have been observing the cells in the testes of grasshoppers and have been fascinated by the behavior of strange elongated structures that you have seen within the dividing cells. One day, in the library, you read a recent journal paper by G. Mendel on hypothetical “factors” that he claims explain the results of certain crosses in peas. In a flash of revelation, you are struck by the parallels between your grasshopper studies and Mendel’s pea studies, and you resolve to write him a letter. What do you write? (Problem 23 is based on an idea by Ernest Kroeker.)
24. From a presumed testcross A/a × a/a, in which A represents red and a represents white, use the χ² test to find out which of the following possible results would fit the expectations: a. 120 red, 100 white b. 5000 red, 5400 white c. 500 red, 540 white d. 50 red, 54 white
25. Look at the Punnett square in Figure 3-4. a. How many different genotypes are shown in the 16 squares of the grid? b. What is the genotypic ratio underlying the 9 : 3 : 3 : 1 phenotypic ratio? c. Can you devise a simple formula for the calculation of the number of progeny genotypes in dihybrid, trihybrid, and so forth crosses? Repeat for phenotypes. d. Mendel predicted that, within all but one of the phenotypic classes in the Punnett square, there should be several different genotypes. In particular, he performed many crosses to identify the underlying genotypes of the round, yellow phenotype. Show two different ways that could be used to identify the various genotypes underlying the round, yellow phenotype. (Remember, all the round, yellow peas look identical.)
26. Assuming independent assortment of all genes, develop formulas that show the number of phenotypic classes and the number of genotypic classes from selfing a plant heterozygous for n gene pairs.
27. Note: The first part of this problem was introduced in Chapter 2. The line of logic is extended here. In the plant Arabidopsis thaliana, a geneticist is interested in the development of trichomes (small projections) on the leaves. A large screen turns up two mutant plants (A and B) that have no trichomes, and these mutants seem to be potentially useful in studying trichome development. (If they are determined by single-gene mutations, then finding the normal and abnormal function of these genes will be instructive.) Each plant was crossed with wild type; in both cases, the next generation (F1) had normal trichomes. When F1 plants were selfed, the resulting F2’s were as follows:
F2 from mutant A: 602 normal; 198 no trichomes
F2 from mutant B: 267 normal; 93 no trichomes
a. What do these results show? Include proposed genotypes of all plants in your answer. b. Assume that the genes are located on separate chromosomes. An F1 is produced by crossing the original mutant A with the original mutant B. This F1 is testcrossed: What proportion of testcross progeny will have no trichomes?
28. In dogs, dark coat color is dominant over albino, and short hair is dominant over long hair. Assume that these effects are caused by two independently assorting genes. Seven crosses were done as shown below, in which D and A stand for the dark and albino phenotypes, respectively, and S and L stand for the short-hair and long-hair phenotypes.
                        Number of progeny
Parental phenotypes     D, S    D, L    A, S    A, L
a. D, S × D, S            88      31      29      12
b. D, S × D, L            19      18       0       0
c. D, S × A, S            21       0      20       0
d. A, S × A, S             0       0      29       9
e. D, L × D, L             0      31       0      11
f. D, S × D, S            45      16       0       0
g. D, S × D, L            31      30      10      10
Write the genotypes of the parents in each cross. Use the symbols C and c for the dark and albino coat-color alleles and the symbols H and h for the short-hair and long-hair alleles, respectively. Assume parents are homozygous unless there is evidence otherwise. 29. In tomatoes, one gene determines whether the plant has purple (P) or green (G) stems, and a separate, independent gene determines whether the leaves are “cut” (C) or “potato” (Po). Five matings of tomato-plant phenotypes give the following results:
                                  Number of progeny
Mating   Parental phenotypes      P, C    P, Po   G, C    G, Po
1        P, C × G, C               323     102     309     106
2        P, C × P, Po              220     206      65      72
3        P, C × G, C               723     229       0       0
4        P, C × G, Po              405       0     389       0
5        P, Po × G, C               90      85      78      71
a. Which alleles are dominant? b. What are the most probable genotypes for the parents in each cross?
30. A mutant allele in mice causes a bent tail. Six pairs of mice were crossed. Their phenotypes and those of their progeny are given in the following table. N is normal phenotype; B is bent phenotype. Deduce the mode of inheritance of bent tail.
         Parents         Progeny
Cross    ♀      ♂        ♀                ♂
1        N      B        All B            All N
2        B      N        1/2 B, 1/2 N     1/2 B, 1/2 N
3        B      N        All B            All B
4        N      N        All N            All N
5        B      B        All B            All B
6        B      B        All B            1/2 B, 1/2 N
a. Is it recessive or dominant?
b. Is it autosomal or sex-linked?
c. What are the genotypes of all parents and progeny?

31. The normal eye color of Drosophila is red, but strains in which all flies have brown eyes are available. Similarly, wings are normally long, but there are strains with short wings. A female from a pure line with brown eyes and short wings is crossed with a male from a normal pure line. The F1 consists of normal females and short-winged males. An F2 is then produced by intercrossing the F1. Both sexes of F2 flies show phenotypes as follows:

3/8 red eyes, long wings
3/8 red eyes, short wings
1/8 brown eyes, long wings
1/8 brown eyes, short wings

Deduce the inheritance of these phenotypes; use clearly defined genetic symbols of your own invention. State the genotypes of all three generations and the genotypic proportions of the F1 and F2.
Unpacking Problem 31

Before attempting a solution to this problem, try answering the following questions:
1. What does the word normal mean in this problem?
2. The words line and strain are used in this problem. What do they mean, and are they interchangeable?
3. Draw a simple sketch of the two parental flies showing their eyes, wings, and sexual differences.
4. How many different characters are there in this problem?
5. How many phenotypes are there in this problem, and which phenotypes go with which characters?
6. What is the full phenotype of the F1 females called “normal”?
7. What is the full phenotype of the F1 males called “short winged”?
8. List the F2 phenotypic ratios for each character that you came up with in answer to question 4.
9. What do the F2 phenotypic ratios tell you?
10. What major inheritance pattern distinguishes sex-linked inheritance from autosomal inheritance?
11. Do the F2 data show such a distinguishing criterion?
12. Do the F1 data show such a distinguishing criterion?
13. What can you learn about dominance in the F1? The F2?
14. What rules about wild-type symbolism can you use in deciding which allelic symbols to invent for these crosses?
15. What does “deduce the inheritance of these phenotypes” mean?

Now try to solve the problem. If you are unable to do so, make a list of questions about the things that you do not understand. Inspect the learning goals at the beginning of the chapter and ask yourself which are relevant to your questions. If this approach doesn’t work, inspect the Key Concepts of this chapter and ask yourself which might be relevant to your questions.

32. In a natural population of annual plants, a single plant is found that is sickly looking and has yellowish leaves. The plant is dug up and brought back to the laboratory. Photosynthesis rates are found to be very low. Pollen from a normal dark-green-leaved plant is used to fertilize emasculated flowers of the yellowish plant. A hundred seeds result, of which only 60 germinate. All the resulting plants are sickly yellow in appearance.
a. Propose a genetic explanation for the inheritance pattern.
b. Suggest a simple test for your model.
c. Account for the reduced photosynthesis, sickliness, and yellowish appearance.
33. What is the basis for the green-and-white color variegation in the leaves of Mirabilis? If the following cross is made,
variegated × green
what progeny types can be predicted? What about the reciprocal cross?

34. In Neurospora, the mutant stp exhibits erratic stop-and-start growth. The mutant site is known to be in the mtDNA. If an stp strain is used as the female parent in a cross with a normal strain acting as the male, what type of progeny can be expected? What about the progeny from the reciprocal cross?

35. Two corn plants are studied. One is resistant (R) and the other is susceptible (S) to a certain pathogenic fungus. The following crosses are made, with the results shown:
S × R → all progeny S
R × S → all progeny R
What can you conclude about the location of the genetic determinants of R and S?

36. A presumed dihybrid in Drosophila, B/b ; F/f, is testcrossed with b/b ; f/f. (B = black body; b = brown body; F = forked bristles; f = unforked bristles.) The results are
black, forked 230
black, unforked 210
brown, forked 240
brown, unforked 250
Use the χ2 test to determine if these results fit the results expected from testcrossing the hypothesized dihybrid.

37. Are the following progeny numbers consistent with the results expected from selfing a plant presumed to be a dihybrid of two independently assorting genes, H/h ; R/r? (H = hairy leaves; h = smooth leaves; R = round ovary; r = elongated ovary.) Explain your answer.
hairy, round 178
hairy, elongated 62
smooth, round 56
smooth, elongated 24

38. A dark female moth is crossed with a dark male. All the male progeny are dark, but half the female progeny are light and the rest are dark. Propose an explanation for this pattern of inheritance.

39. In Neurospora, a mutant strain called stopper (stp) arose spontaneously. Stopper showed erratic “stop and start” growth, compared with the uninterrupted growth of wild-type strains. In crosses, the following results were found:
stopper × wild type → progeny all stopper
wild type × stopper → progeny all wild type
a. What do these results suggest regarding the location of the stopper mutation in the genome?
b. According to your model for part a, what progeny and proportions are predicted in octads from the following cross, including a mutation nic3 located on chromosome VI?
stp ⋅ nic3 × wild type

40. In polygenic systems, how many phenotypic classes corresponding to number of polygene “doses” are expected in selfs
a. of strains with four heterozygous polygenes?
b. of strains with six heterozygous polygenes?

41. In the self of a polygenic trihybrid R1/r1 ; R2/r2 ; R3/r3, use the product and sum rules to calculate the proportion of progeny with just one polygene “dose.”

42. Reciprocal crosses and selfs were performed between the two moss species Funaria mediterranea and F. hygrometrica. The sporophytes and the leaves of the gametophytes are shown in the accompanying diagram. The crosses are written with the female parent first.
[Diagram not reproduced: sporophytes and gametophyte leaves of the parents and of the progeny of each cross.]
a. Describe the results presented, summarizing the main findings.
b. Propose an explanation of the results.
c. Show how you would test your explanation; be sure to show how it could be distinguished from other explanations.

43. Assume that diploid plant A has a cytoplasm genetically different from that of plant B. To study nuclear–cytoplasmic relations, you wish to obtain a plant with the cytoplasm of plant A and the nuclear genome predominantly of plant B. How would you go about producing such a plant?

44. You are studying a plant with tissue comprising both green and white sectors. You wish to decide whether this phenomenon is due (1) to a chloroplast mutation of the type considered in this chapter or (2) to a dominant nuclear mutation that inhibits chlorophyll production and is present only in certain tissue layers of the plant as a mosaic. Outline the experimental approach that you would use to resolve this problem.

Challenging Problems

45. You have three jars containing marbles, as follows:
jar 1: 600 red and 400 white
jar 2: 900 blue and 100 white
jar 3: 10 green and 990 white
a. If you blindly select one marble from each jar, calculate the probability of obtaining
(1) a red, a blue, and a green.
(2) three whites.
(3) a red, a green, and a white.
(4) a red and two whites.
(5) a color and two whites.
(6) at least one white.
b. In a certain plant, R = red and r = white. You self a red R/r heterozygote with the express purpose of obtaining a white plant for an experiment. What minimum number of seeds do you have to grow to be at least 95 percent certain of obtaining at least one white individual?
c. When a woman is injected with an egg fertilized in vitro, the probability of its implanting successfully is 20 percent. If a woman is injected with five eggs simultaneously, what is the probability that she will become pregnant? (Part c is from Margaret Holm.)

46. In tomatoes, red fruit is dominant over yellow, two-loculed fruit is dominant over many-loculed fruit, and tall vine is dominant over dwarf. A breeder has two pure lines: (1) red, two-loculed, dwarf and (2) yellow, many-loculed, tall. From these two lines, he wants to produce a new pure line for trade that is yellow, two-loculed, and tall. How exactly should he go about doing so? Show not only which crosses to make, but also how many progeny should be sampled in each case.

47. We have dealt mainly with only two genes, but the same principles hold for more than two genes. Consider the following cross:
A/a ; B/b ; C/c ; D/d ; E/e × a/a ; B/b ; c/c ; D/d ; e/e
a. What proportion of progeny will phenotypically resemble (1) the first parent, (2) the second parent, (3) either parent, and (4) neither parent?
b. What proportion of progeny will be genotypically the same as (1) the first parent, (2) the second parent, (3) either parent, and (4) neither parent?
Assume independent assortment.

48. The accompanying pedigree shows the pattern of transmission of two rare human phenotypes: cataract and pituitary dwarfism. Family members with cataract are shown with a solid left half of the symbol; those with pituitary dwarfism are indicated by a solid right half.
[Pedigree for Problem 48 not reproduced: generations I through IV.]
a. What is the most likely mode of inheritance of each of these phenotypes? Explain.
b. List the genotypes of all members in generation III as far as possible.
c. If a hypothetical mating took place between IV-1 and IV-5, what is the probability of the first child’s being a dwarf with cataracts? A phenotypically normal child?
(Problem 48 is adapted from J. Kuspira and R. Bhambhani, Compendium of Problems in Genetics. Copyright 1994 by Wm. C. Brown.)

49. A corn geneticist has three pure lines of genotypes a/a ; B/B ; C/C, A/A ; b/b ; C/C, and A/A ; B/B ; c/c. All the phenotypes determined by a, b, and c will increase the market value of the corn; so, naturally, he wants to combine them all in one pure line of genotype a/a ; b/b ; c/c.
a. Outline an effective crossing program that can be used to obtain the a/a ; b/b ; c/c pure line.
b. At each stage, state exactly which phenotypes will be selected and give their expected frequencies.
c. Is there more than one way to obtain the desired genotype? Which is the best way?
Assume independent assortment of the three gene pairs. (Note: Corn will self or cross-pollinate easily.)

50. In humans, color vision depends on genes encoding three pigments. The R (red pigment) and G (green pigment) genes are close together on the X chromosome, whereas the B (blue pigment) gene is autosomal. A recessive mutation in any one of these genes can cause color blindness. Suppose that a color-blind man married a woman with normal color vision. The four sons from this marriage were color-blind, and the five daughters were normal. Specify the most likely genotypes of both parents and their children, explaining your reasoning. (A pedigree drawing will probably be helpful.) (Problem 50 is by Rosemary Redfield.)

51. Consider the accompanying pedigree for a rare human muscle disease.
a. What unusual feature distinguishes this pedigree from those studied earlier in this chapter?
b. Where do you think the mutant DNA responsible for this phenotype resides in the cell?

52. The plant Haplopappus gracilis has a 2n of 4. A diploid cell culture was established and, at premitotic S phase, a radioactive nucleotide was added and was incorporated into newly synthesized DNA. The cells were then removed from the radioactivity, washed, and allowed to proceed through mitosis. Radioactive chromosomes or chromatids can be detected by placing photographic emulsion on the cells; radioactive chromosomes or chromatids appeared covered with spots of silver from the emulsion. (The chromosomes “take their own photograph.”) Draw the chromosomes at prophase and telophase of the first and second mitotic divisions after the radioactive treatment. If they are radioactive, show it in your diagram. If there are several possibilities, show them, too.

53. In the species of Problem 52, you can introduce radioactivity by injection into the anthers at the S phase before meiosis. Draw the four products of meiosis with their chromosomes, and show which are radioactive.

54. The plant Haplopappus gracilis is diploid and 2n = 4. There are one long pair and one short pair of chromosomes. The diagrams below (numbered 1 through 12) represent anaphases (“pulling apart” stages) of individual cells in meiosis or mitosis in a plant that is genetically a dihybrid (A/a ; B/b) for genes on different chromosomes. The lines represent chromosomes or chromatids, and the points of the V’s represent centromeres. In each case, indicate if the diagram represents a cell in meiosis I, meiosis II, or mitosis. If a diagram shows an impossible situation, say so.
[Diagrams 1 through 12 for Problem 54 not reproduced: anaphase figures with chromosomes labeled A, a, B, and b.]
55. The pedigree below shows the recurrence of a rare neurological disease (large black symbols) and spontaneous fetal abortion (small black symbols) in one family. (A slash means that the individual is deceased.) Provide an explanation for this pedigree in regard to the cytoplasmic segregation of defective mitochondria.
56. A man is brachydactylous (very short fingers; rare autosomal dominant), and his wife is not. Both can taste the chemical phenylthiocarbamide (autosomal dominant; common allele), but their mothers could not. a. Give the genotypes of the couple. If the genes assort independently and the couple has four children, what is the probability of b. all of them being brachydactylous? c. none being brachydactylous? d. all of them being tasters? e. all of them being nontasters? f. all of them being brachydactylous tasters? g. none being brachydactylous tasters? h. at least one being a brachydactylous taster? 57. One form of male sterility in corn is maternally transmitted. Plants of a male-sterile line crossed with normal
pollen give male-sterile plants. In addition, some lines of corn are known to carry a dominant nuclear restorer allele (Rf ) that restores pollen fertility in male-sterile lines. a. Research shows that the introduction of restorer alleles into male-sterile lines does not alter or affect the maintenance of the cytoplasmic factors for male sterility. What kind of research results would lead to such a conclusion? b. A male-sterile plant is crossed with pollen from a plant homozygous for Rf. What is the genotype of the F1? The phenotype? c. The F1 plants from part b are used as females in a testcross with pollen from a normal plant (rf/rf ). What are the results of this testcross? Give genotypes and phenotypes, and designate the kind of cytoplasm. d. The restorer allele already described can be called Rf-1. Another dominant restorer, Rf-2, has been found. Rf-1 and Rf-2 are located on different chromosomes. Either or both of the restorer alleles will give pollen fertility. With the use of a male-sterile plant as a tester, what will be the result of a cross in which the male parent is (1) heterozygous at both restorer loci? (2) homozygous dominant at one restorer locus and homozygous recessive at the other? (3) heterozygous at one restorer locus and homozygous recessive at the other? (4) heterozygous at one restorer locus and homozygous dominant at the other?
Chapter 4
Mapping Eukaryote Chromosomes by Recombination
Learning Outcomes
After completing this chapter, you will be able to
• Perform a quantitative analysis of the progeny of a dihybrid testcross to assess whether or not the two genes are linked on the same chromosome.
• Extend the same type of analysis to several loci to produce a map of the relative positions of loci on a chromosome.
• In ascomycete fungi, map the centromeres to other linked loci.
• In asci, predict allele ratios stemming from specific steps in the heteroduplex model of crossing over.

At the left is a recombination-based map of one of the chromosomes of Drosophila (the organism in the image above), showing the loci of genes whose mutations produce known phenotypes. [© David Scharf/Corbis]

Map positions and loci shown in the figure:
0.0   Yellow body, Scute bristles
1.5   White eyes
3.0   Facet eyes
5.5   Echinus eyes
7.5   Ruby eyes
13.7  Crossveinless wings
20.0  Cut wings
21.0  Singed bristles
27.7  Lozenge eyes
33.0  Vermilion eyes
36.1  Miniature wings
43.0  Sable body
44.0  Garnet eyes
56.7  Forked bristles
57.0  Bar eyes
59.5  Fused veins
62.5  Carnation eyes
66.0  Bobbed hairs

Outline
4.1 Diagnostics of linkage
4.2 Mapping by recombinant frequency
4.3 Mapping with molecular markers
4.4 Centromere mapping with linear tetrads
4.5 Using the chi-square test to infer linkage
4.6 Accounting for unseen multiple crossovers
4.7 Using recombination-based maps in conjunction with physical maps
4.8 The molecular mechanism of crossing over
Some of the questions that geneticists want to answer about the genome are, What genes are present in the genome? What functions do they have? What positions do they occupy on the chromosomes? Their pursuit of the third question is broadly called mapping. Mapping is the main focus of this chapter, but all three questions are interrelated, as we will see.

We all have an everyday feeling for the importance of maps in general, and, indeed, we have all used them at some time in our lives to find our way around. Relevant to the focus of this chapter is that, in some situations, several maps need to be used simultaneously. A good example in everyday life is in navigating the dense array of streets and buildings of a city such as London, England. A street map that shows the general layout is one necessity. However, the street map is used by tourists and Londoners alike in conjunction with another map, that of the underground railway system. The underground system is so complex and spaghetti-like that, in 1933, an electrical circuit engineer named Harry Beck drew up the streamlined (although distorted) map that has remained to this day an icon of London. The street and underground maps of London are compared in Figure 4-1. Note that the positions of the underground stations and the exact distances between them are of no interest in themselves, except as a way of getting to a destination of interest such as Westminster Abbey.

We will see three parallels with the London maps when chromosome maps are used to zero in on individual “destinations,” or specific genes. First, several different types of chromosome maps are often necessary and must be used in conjunction; second, maps that contain distortions are still useful; and third, many sites on a chromosome map are charted only because they are useful in trying to zero in on other sites that are the ones of real interest.

Two maps are better than one
Figure 4-1 These London maps illustrate the principle that, often, several maps are needed to get to a destination of interest. The map of the underground railway (“the Tube”) is used to get to a destination of interest such as a street address, shown on the street map. In genetics, two different kinds of genome maps are often useful in locating a gene, leading to an understanding of its structure and function. [(left) © MAPS.com/Corbis; (right) Transport for London.]

Obtaining a map of gene positions on the chromosomes is an endeavor that has occupied thousands of geneticists for the past 80 years or so. Why is it so important? There are several reasons:
1. Gene position is crucial information needed to build complex genotypes required for experimental purposes or for commercial applications. For example, in Chapter 6, we will see cases in which special allelic combinations must be put together to explore gene interaction.
2. Knowing the position occupied by a gene provides a way of discovering its structure and function. A gene’s position can be used to define it at the DNA level. In turn, the DNA sequence of a wild-type gene or its mutant allele is a necessary part of deducing its underlying function.
3. The genes present and their arrangement on chromosomes are often slightly different in related species. For example, the rather long human chromosome number 2 is split into two shorter chromosomes in the great apes. By comparing such differences, geneticists can deduce the evolutionary genetic mechanisms through which these genomes diverged. Hence, chromosome maps are useful in interpreting mechanisms of evolution.

The arrangement of genes on chromosomes is represented diagrammatically as a unidimensional chromosome map, showing gene positions, known as loci (sing., locus), and the distances between the loci based on some kind of scale. Two basic types of chromosome maps are currently used in genetics; they are assembled by quite different procedures yet are used in a complementary way. Recombination-based maps, which are the topic of this chapter, map the loci of genes that have been identified by mutant phenotypes showing single-gene inheritance. Physical maps (see Chapter 14) show the genes as segments arranged along the long DNA molecule that constitutes a chromosome.
These maps show different views of the genome, but, just like the maps of London, they can be used together to arrive at an understanding of what a gene’s function is at the molecular level and how that function influences phenotype.

Key Concept Genetic maps are useful for strain building, for interpreting evolutionary mechanisms, and for discovering a gene’s unknown function. Discovering a gene’s function is facilitated by integrating information on recombination-based and physical maps.
4.1 Diagnostics of Linkage

Recombination maps of chromosomes are usually assembled two or three genes at a time, with the use of a method called linkage analysis. When geneticists say that two genes are linked, they mean that the loci of those genes are on the same chromosome, and, hence, the alleles on any one homolog are physically joined (linked) by the DNA between them. The way in which early geneticists deduced linkage is a useful means of introducing most of the key ideas and procedures in the analysis.

Using recombinant frequency to recognize linkage

In the early 1900s, William Bateson and R. C. Punnett (for whom the Punnett square was named) were studying the inheritance of two genes in sweet peas. In a standard self of a dihybrid F1, the F2 did not show the 9 : 3 : 3 : 1 ratio predicted by the principle of independent assortment. In fact, Bateson and Punnett noted that certain combinations of alleles showed up more often than expected, almost as though they were physically attached in some way. However, they had no explanation for this discovery.

Later, Thomas Hunt Morgan found a similar deviation from Mendel’s second law while studying two autosomal genes in Drosophila. Morgan proposed linkage as a hypothesis to explain the phenomenon of apparent allele association. Let’s look at some of Morgan’s data. One of the genes affected eye color (pr, purple, and pr+, red), and the other gene affected wing length (vg, vestigial, and vg+, normal). (Vestigial wings are very small compared to wild type.) The wild-type alleles of both genes are dominant. Morgan performed a cross to obtain dihybrids and then followed with a testcross:

P              pr/pr · vg/vg × pr+/pr+ · vg+/vg+
                        ↓
Gametes        pr · vg        pr+ · vg+
                        ↓
F1 dihybrid    pr+/pr · vg+/vg

Testcross:     pr+/pr · vg+/vg (F1 dihybrid female) × pr/pr · vg/vg (tester male)

Morgan’s use of the testcross is important. Because the tester parent contributes gametes carrying only recessive alleles, the phenotypes of the offspring directly reveal the alleles contributed by the gametes of the dihybrid parent, as described in Chapters 2 and 3. Hence, the analyst can concentrate on meiosis in one parent (the dihybrid) and essentially forget about meiosis in the other (the tester). In contrast, from an F1 self, there are two sets of meioses to consider in the analysis of progeny: one in the male parent and the other in the female.

Morgan’s testcross results were as follows (listed as the gametic classes from the dihybrid):

pr+ · vg+    1339
pr · vg      1195
pr+ · vg      151
pr · vg+      154
Total        2839

Obviously, these numbers deviate drastically from the Mendelian prediction of a 1 : 1 : 1 : 1 ratio expected from independent assortment (approximately 710 in each of the four classes). In Morgan’s results, we see that the first two allele combinations are in the great majority, clearly indicating that they are associated, or “linked.”

Another useful way of assessing the testcross results is by considering the percentage of recombinants in the progeny. By definition, the recombinants in the present cross are the two types pr+ · vg and pr · vg+ because they are clearly not the two input genotypes contributed to the F1 dihybrid by the original homozygous parental flies (more precisely, by their gametes). We see that the two recombinant types are approximately equal in frequency (151 ≈ 154). Their total is 305, which is a frequency of (305/2839) × 100, or 10.7 percent. We can make sense of these data, as Morgan did, by postulating that the genes were linked on the same chromosome, and so the parental allelic combinations are held together in the majority of progeny. In the dihybrid, the allelic conformation must have been as follows:

pr+    vg+
pr     vg
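The arithmetic in this analysis can be checked with a short script. The following is a minimal sketch in Python, not part of Morgan's analysis; the dictionary layout and the function name recombinant_frequency are illustrative choices, and the counts are the testcross numbers quoted in the text.

```python
# Morgan's first testcross, listed as gametic classes from the dihybrid.
counts = {
    ("pr+", "vg+"): 1339,
    ("pr", "vg"): 1195,
    ("pr+", "vg"): 151,
    ("pr", "vg+"): 154,
}

# Input (parental) gametes that formed the F1 dihybrid in this cross.
parental = {("pr+", "vg+"), ("pr", "vg")}


def recombinant_frequency(counts, parental):
    """Return recombinant progeny as a percentage of all progeny."""
    total = sum(counts.values())
    recombinants = sum(n for cls, n in counts.items() if cls not in parental)
    return 100.0 * recombinants / total


total = sum(counts.values())
expected_if_independent = total / 4  # 1 : 1 : 1 : 1 expectation per class
rf = recombinant_frequency(counts, parental)

print(f"total progeny: {total}")
print(f"expected per class under independent assortment: {expected_if_independent:.0f}")
print(f"recombinant frequency: {rf:.1f} percent")
```

Keeping the parental classes as an explicit input mirrors the definition used in the text: recombinants are whatever classes were not contributed to the dihybrid by the original parents.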
The tendency of linked alleles to be inherited as a package is illustrated in Figure 4-2.

Linked alleles tend to be inherited together
Figure 4-2 Simple inheritance of two genes located on the same chromosome pair. The same genes are present together on a chromosome in both parents and progeny.

Now let’s look at another cross that Morgan made with the use of the same alleles but in a different combination. In this cross, each parent is homozygous for the wild-type allele of one gene and the mutant allele of the other. Again, F1 females were testcrossed:

P              pr+/pr+ · vg/vg × pr/pr · vg+/vg+
                        ↓
Gametes        pr+ · vg        pr · vg+
                        ↓
F1 dihybrid    pr+/pr · vg+/vg

Testcross:     pr+/pr · vg+/vg (F1 dihybrid female) × pr/pr · vg/vg (tester male)

The following progeny were obtained from the testcross:

pr+ · vg+     157
pr · vg       146
pr+ · vg      965
pr · vg+     1067
Total        2335

Again, these results are not even close to a 1 : 1 : 1 : 1 Mendelian ratio. Now, however, the recombinant classes are the converse of those in the first analysis, pr+ vg+ and pr vg. But notice that their frequency is approximately the same: (157 + 146)/2335 × 100 = 13.0 percent. Again, linkage is suggested, but, in this case, the F1 dihybrid must have been as follows:

pr+    vg
pr     vg+

Dihybrid testcross results like those just presented are commonly encountered in genetics. They follow the general pattern:

Two equally frequent nonrecombinant classes totaling in excess of 50 percent
Two equally frequent recombinant classes totaling less than 50 percent
Key Concept When two genes are close together on the same chromosome pair (that is, they are linked), they do not assort independently but produce a recombinant frequency of less than 50 percent. Hence, conversely, a recombinant frequency of less than 50 percent is a diagnostic for linkage.
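When the parental genotypes are not known in advance, the general pattern can be applied in reverse. The sketch below is an illustration only: the function analyze_testcross and its return format are hypothetical, and it assumes that the two most frequent classes are the parental ones and that wild-type alleles are written with a trailing "+". Because the recombinant classes are defined here as the two rarest, the computed frequency can never exceed 50 percent, so deciding whether a value close to 50 percent really indicates linkage needs a statistical test such as the chi-square test of Section 4.5.

```python
def analyze_testcross(counts):
    """Infer parental classes, RF, and conformation from four testcross classes.

    counts maps (allele_of_gene_1, allele_of_gene_2) -> number of progeny.
    Assumption: the two most frequent classes are the parental ones.
    """
    ranked = sorted(counts, key=counts.get, reverse=True)
    parental, recombinant = ranked[:2], ranked[2:]
    total = sum(counts.values())
    rf = 100.0 * sum(counts[c] for c in recombinant) / total
    # Cis: one parental gamete carries both wild-type ("+") alleles.
    is_cis = any(all(a.endswith("+") for a in cls) for cls in parental)
    return {
        "parental": set(parental),
        "recombinant": set(recombinant),
        "rf_percent": rf,
        "conformation": "cis" if is_cis else "trans",
    }


# Morgan's second testcross, with counts taken from the text.
second_cross = {
    ("pr+", "vg+"): 157,
    ("pr", "vg"): 146,
    ("pr+", "vg"): 965,
    ("pr", "vg+"): 1067,
}
result = analyze_testcross(second_cross)
print(result["conformation"], f"{result['rf_percent']:.1f} percent")
```

Ranking by frequency is a convenience for this sketch; in a designed cross the parental classes are known from the parents, which is the more reliable route.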
How crossovers produce recombinants for linked genes

The linkage hypothesis explains why allele combinations from the parental generations remain together: the genes are physically attached by the segment of chromosome between them. But exactly how are any recombinants produced when genes are linked? Morgan suggested that, when homologous chromosomes pair at meiosis, the chromosomes occasionally break and exchange parts in a process called crossing over. Figure 4-3 illustrates this physical exchange of chromosome segments. The two new combinations are called crossover products.

Crossing over produces new allelic combinations
Figure 4-3 The exchange of parts by crossing over may produce gametic chromosomes whose allelic combinations differ from the parental combinations.
[Diagram labels: parental chromosomes (pr vg and pr+ vg+); crossover between chromatids at meiosis; crossover chromosomes (pr+ vg and pr vg+).]

Is there any microscopically observable process that could account for crossing over? At meiosis, when duplicated homologous chromosomes pair with each other (in genetic terms, when the two dyads unite as a bivalent), a cross-shaped structure called a chiasma (pl., chiasmata) often forms between two nonsister chromatids. Chiasmata are shown in Figure 4-4. To Morgan, the appearance of the chiasmata visually corroborated the concept of crossing over. (Note that the chiasmata seem to indicate that chromatids, not unduplicated chromosomes, participate in a crossover. We will return to this point later.)

Key Concept For linked genes, recombinants are produced by crossovers. Chiasmata are the visible manifestations of crossovers.
Chiasmata are the sites of crossing over
Figure 4-4 Several chiasmata appear in this photograph taken in the course of meiosis in a grasshopper testis. [G. H. Jones and F. C. H. Franklin, “Meiotic Crossing-over: Obligation and Interference,” Cell 126:2 (28 July 2006), 246–248. © Elsevier.]

Linkage symbolism and terminology

The work of Morgan showed that linked genes in a dihybrid may be present in one of two basic conformations. In one, the two dominant, or wild-type, alleles are present on the same homolog (as in Figure 4-3); this arrangement is called a cis conformation (cis means “adjacent”). In the other, they are on different homologs, in what is called a trans conformation (trans means “opposite”). The two conformations are written as follows:

Cis      A B/a b    or    + +/a b
Trans    A b/a B    or    + b/a +

Note the following conventions that pertain to linkage symbolism:
1. Alleles on the same homolog have no punctuation between them.
2. A slash symbolically separates the two homologs.
3. Alleles are always written in the same order on each homolog.
4. As in earlier chapters, genes known to be on different chromosomes (unlinked genes) are shown separated by a semicolon (for example, A/a ; C/c).
5. In this book, genes of unknown linkage are shown separated by a dot, A/a · D/d.
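The slash notation can be read mechanically. As a minimal sketch, assuming single-character allele symbols written without spaces (for example "AB/ab") and using a hypothetical helper named gametes, the following Python lists the parental gametes and the products of a single crossover between the two loci for each conformation.

```python
def gametes(dihybrid):
    """Split a linked dihybrid such as "AB/ab" into parental and recombinant gametes.

    Follows the conventions listed above: a slash separates the two homologs,
    and alleles are written in the same order on each homolog. Each allele is
    assumed to be one character (for example "A", "a", or "+").
    """
    top, bottom = dihybrid.split("/")
    if len(top) != 2 or len(bottom) != 2:
        raise ValueError("expected two single-character alleles per homolog")
    parental = [top, bottom]
    # A single crossover between the loci swaps the second-locus alleles.
    recombinant = [top[0] + bottom[1], bottom[0] + top[1]]
    return parental, recombinant


for label, genotype in [("cis", "AB/ab"), ("trans", "Ab/aB")]:
    parental, recombinant = gametes(genotype)
    print(label, genotype, "parental:", parental, "recombinant:", recombinant)
```

The output makes the relation between the two conformations explicit: the recombinant gametes of the cis dihybrid are the parental gametes of the trans dihybrid, and vice versa.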
Evidence that crossing over is a breakage-and-rejoining process

The idea that recombinants are produced by some kind of exchange of material between homologous chromosomes was a compelling one. But experimentation was necessary to test this hypothesis. A first step was to find a case in which the exchange of parts between chromosomes would be visible under the microscope. Several investigators approached this problem in the same way, and one of their analyses follows.

In 1931, Harriet Creighton and Barbara McClintock were studying two genes of corn that they knew were both located on chromosome 9. One affected seed color (C, colored; c, colorless), and the other affected endosperm composition (Wx, starchy; wx, waxy). The plant was a dihybrid in cis conformation. However, in one plant, the chromosome 9 carrying the alleles C and Wx was unusual in that it also carried a large, densely staining element (called a knob) on the C end and a longer piece of chromosome on the Wx end; thus, the heterozygote was

Wx    C     (knob at the C end, extra piece at the Wx end)
wx    c     (cytologically normal)

In the progeny of a testcross of this plant, they compared recombinants and parental genotypes. They found that all the recombinants inherited one or the other of the two following chromosomes, depending on their recombinant makeup:

wx    C     (knob, no extra piece)
Wx    c     (extra piece, no knob)

Thus, there was a precise correlation between the genetic event of the appearance of recombinants and the chromosomal event of crossing over. Consequently, the chiasmata appeared to be the sites of exchange, although what was considered to be the definitive test was not undertaken until 1978.

What can we say about the molecular mechanism of chromosome exchange in a crossover event? The short answer is that a crossover results from the breakage and reunion of DNA. Two parental chromosomes break at the same position, and then each piece joins up with the neighboring piece from the other chromosome. In Section 4.8, we will see a model of the molecular processes that allow DNA to break and rejoin in a precise manner such that no genetic material is lost or gained.

Key Concept A crossover is the breakage of two DNA molecules at the same position and their rejoining in two reciprocal recombinant combinations.
Evidence that crossing over takes place at the four-chromatid stage

As already noted, the diagrammatic representation of crossing over in Figure 4-3 shows a crossover taking place at the four-chromatid stage of meiosis; in other words, crossovers are between nonsister chromatids. However, it was theoretically possible that crossing over took place before replication, at the two-chromosome stage. This uncertainty was resolved through the genetic analysis of organisms whose four products of meiosis remain together in groups of four called tetrads. These organisms, which we met in Chapters 2 and 3, are fungi and unicellular
algae. The products of meiosis of a single tetrad can be isolated, which is equivalent to isolating all four chromatids from a single meiocyte. Tetrad analyses of crosses in which genes are linked show many tetrads that contain four different allele combinations. For example, from the cross

AB × ab

some (but not all) tetrads contain four genotypes:

AB
Ab
aB
ab

[Figure 4-5 artwork: "Crossing over is between chromatids, not chromosomes." Panels: two-chromosome stage; four-chromatid stage.]

Figure 4-5 Crossing over takes place at the four-chromatid stage. Because more than two different products of a single meiosis can be seen in some tetrads, crossing over cannot take place at the two-strand stage (before DNA replication). The white circle designates the position of the centromere. When sister chromatids are visible, the centromere appears unreplicated.
This result can be explained only if crossovers take place at the four-chromatid stage because, if crossovers took place at the two-chromosome stage, there could only ever be a maximum of two different genotypes in an individual tetrad. This reasoning is illustrated in Figure 4-5.
Multiple crossovers can include more than two chromatids
Tetrad analysis can also show two other important features of crossing over. First, in some individual meiocytes, several crossovers can occur along a chromosome pair. Second, in any one meiocyte, these multiple crossovers can exchange material between more than two chromatids. To think about this matter, we need to look at the simplest case: double crossovers. To study double crossovers, we need three linked genes. For example, if the three loci are all linked in a cross such as

ABC × abc

many different tetrad types are possible, but some types are informative in the present connection because they can be accounted for only by double crossovers in which more than two chromatids take part. Consider the following tetrad as an example:

ABc
AbC
aBC
abc

This tetrad must be explained by two crossovers in which three chromatids take part, as shown in Figure 4-6a. Furthermore, the following type of tetrad shows that all four chromatids can participate in crossing over in the same meiosis (Figure 4-6b):

ABc
Abc
aBC
abC

Therefore, for any pair of homologous chromosomes, two, three, or four chromatids can take part in crossing-over events in a single meiocyte. Note, however, that any single crossover is between two chromatids.
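The two tetrads above can be reproduced with a small model. The following Python sketch is illustrative only: the function name, the strand numbering (0 and 1 are sisters, 2 and 3 are sisters), and the particular crossover positions are assumptions chosen to reproduce the tetrads in the text, not a transcription of Figure 4-6.

```python
def tetrad_products(strands, crossovers):
    """Return the four meiotic products of one meiocyte.

    strands    -- four allele strings; strands 0 and 1 are sister
                  chromatids, as are strands 2 and 3.
    crossovers -- {interval: (i, j)}: in the interval between locus
                  `interval` and locus `interval + 1`, strands i and j
                  exchange material (at most one crossover per interval).
    """
    products = []
    for start in range(4):
        current = start
        alleles = []
        for locus in range(len(strands[0])):
            alleles.append(strands[current][locus])
            pair = crossovers.get(locus)
            if pair and current in pair:
                # Follow the exchange onto the partner strand.
                current = pair[0] if current == pair[1] else pair[1]
        products.append("".join(alleles))
    return products


parents = ["ABC", "ABC", "abc", "abc"]

# Double crossover using three chromatids (strands 0, 1, and 2).
three_strand = tetrad_products(parents, {0: (1, 2), 1: (0, 2)})

# Double crossover using all four chromatids.
four_strand = tetrad_products(parents, {0: (1, 2), 1: (0, 3)})

# A single crossover between two linked loci already gives four genotypes.
single = tetrad_products(["AB", "AB", "ab", "ab"], {0: (1, 2)})

print(three_strand, four_strand, single)
```

Each individual crossover in the model still involves only two strands; it is the combination of two crossovers that brings three or four chromatids into play.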
[Figure 4-6 artwork: "Multiple crossovers can include more than two chromatids." Panels (a) and (b) each show the position of the crossovers and the resulting tetrad genotypes.]
You might be wondering about crossovers between sister chromatids. They do occur but are rare. They do not produce new allele combinations and so are not usually considered.
Figure 4-6 Double crossovers can include (a) three chromatids or (b) four chromatids.
4.2 Mapping by Recombinant Frequency

The frequency of recombinants produced by crossing over is the key to chromosome mapping. Fungal tetrad analysis has shown that, for any two specific linked genes, crossovers take place between them in some, but not all, meiocytes (Figure 4-7). The farther apart the genes are, the more likely that a crossover will take place and the higher the proportion of recombinant products will be. Thus, the proportion of recombinants is a clue to the distance separating two gene loci on a chromosome map. As stated earlier in regard to Morgan’s data, the recombinant frequency was significantly less than 50 percent, specifically 10.7 percent. Figure 4-8 shows the general situation for linkage in which recombinants are less than 50 percent. Recombinant frequencies for different linked genes range from 0 to 50 percent, depending on their closeness. The farther apart genes are, the more closely their recombinant frequencies approach 50 percent, and, in such cases, one cannot decide whether genes are linked or are on different chromosomes. What about recombinant frequencies greater than 50 percent? The answer is that such frequencies are never observed, as will be proved later.

[Figure 4-7 artwork: "Recombinants are produced by crossovers." Meioses with no crossover between the genes yield four parental products; meioses with a crossover between the genes yield two parental and two recombinant products.]

Figure 4-7 Recombinants arise from meioses in which a crossover takes place between nonsister chromatids.
[Figure 4-8 artwork: "For linked genes, recombinant frequencies are less than 50 percent." The diagram follows the parents (P), their gametes, the meiotic diploid (F1), and the testcross progeny obtained with an a b tester, which fall into two parental types and two recombinant types.]

Figure 4-8 A testcross reveals that the frequencies of recombinants arising from crossovers between linked genes are less than 50 percent.
Note in Figure 4-7 that a single crossover generates two reciprocal recombinant products, which explains why the reciprocal recombinant classes are generally approximately equal in frequency. The corollary of this point is that the two parental nonrecombinant types also must be equal in frequency, as also observed by Morgan.
Map units

The basic method of mapping genes with the use of recombinant frequencies was devised by a student of Morgan’s. As Morgan studied more and more linked genes, he saw that the proportion of recombinant progeny varied considerably, depending on which linked genes were being studied, and he thought that such variation in recombinant frequency might somehow indicate the actual distances separating genes on the chromosomes. Morgan assigned the quantification of this process to an undergraduate student, Alfred Sturtevant, who also became one of the great geneticists. Morgan asked Sturtevant to try to make some sense of the data on crossing over between different linked genes. In one evening, Sturtevant developed a method for mapping genes that is still used today. In Sturtevant’s own words, “In the latter part of 1911, in conversation with Morgan, I suddenly realized that the variations in strength of linkage, already attributed by Morgan to differences in the spatial separation of genes, offered the possibility of determining sequences in the linear dimension of a chromosome. I went home and spent most of the night (to the neglect of my undergraduate homework) in producing the first chromosome map.” As an example of Sturtevant’s logic, consider Morgan’s testcross results with the pr and vg genes, from which he calculated a recombinant frequency of 10.7 percent. Sturtevant suggested that we can use this percentage of recombinants as a quantitative index of the linear distance between two genes on a genetic map, or linkage map, as it is sometimes called. The basic idea here is quite simple. Imagine two specific genes positioned a certain fixed distance apart. Now imagine random crossing over along the paired homologs. In some meioses, nonsister chromatids cross over by chance in the chromosomal region between these genes; from these meioses, recombinants are produced.
In other meiotic divisions, there are no crossovers between these genes; no recombinants result from these meioses. (See Figure 4-7 for a diagrammatic illustration.) Sturtevant postulated a rough proportionality: the greater the distance between the linked genes, the greater the chance of crossovers in the region between the genes and, hence, the greater the proportion of recombinants that would be produced. Thus, by determining the frequency of recombinants, we can obtain a measure of the map distance between the genes. In fact, Sturtevant defined one genetic map unit (m.u.) as that distance between genes for which 1 product of meiosis in 100 is recombinant. For example, the recombinant frequency (RF) of 10.7 percent obtained by Morgan is defined as 10.7 m.u. A map unit is sometimes referred to as a centimorgan (cM) in honor of Thomas Hunt Morgan. Does this method produce a linear map corresponding to chromosome linearity? Sturtevant predicted that, on a linear map, if 5 map units (5 m.u.) separate genes A and B, and 3 m.u. separate genes A and C, then the distance separating B and C should be either 8 or 2 m.u. (Figure 4-9). Sturtevant found his prediction to be the case. In other words, his analysis strongly suggested that genes are arranged in some linear order, making map distances additive. (There are some minor but not insignificant exceptions, as we will see later.) Since we now know from
[Figure 4-9 artwork: "Map distances are generally additive." Panels: map based on A–B recombination (5 m.u.); map based on A–C recombination (3 m.u.); possible combined maps (B–C distance of 8 m.u. or 2 m.u.).]
molecular analysis that a chromosome is a single DNA molecule with the genes arranged along it, it is no surprise for us today to learn that recombination-based maps are linear because they reflect a linear array of genes. How is a map represented? As an example, in Drosophila, the locus of the eye-color gene and the locus of the wing-length gene are approximately 11 m.u. apart, as mentioned earlier. The relation is usually diagrammed in the following way:

pr |------ 11.0 m.u. ------| vg
Generally, we refer to the locus of this eye-color gene in shorthand as the “pr locus,” after the first discovered mutant allele, but we mean the place on the chromosome where any allele of this gene will be found, mutant or wild type. As stated in Chapters 2 and 3, genetic analysis can be applied in two opposite directions. This principle is applicable to recombinant frequencies. In one direction, recombinant frequencies can be used to make maps. In the other direction, when given an established map with genetic distance in map units, we can predict the frequencies of progeny in different classes. For example, the genetic distance between the pr and vg loci in Drosophila is approximately 11 m.u. So knowing this value, we know that there will be 11 percent recombinants in the progeny from a testcross of a female dihybrid heterozygote in cis conformation (pr vg/pr+ vg+). These recombinants will consist of two reciprocal recombinants of equal frequency: thus, 5.5 percent will be pr vg+ and 5.5 percent will be pr+ vg. We also know that 100 − 11 = 89 percent will be nonrecombinant in two equal classes, 44.5 percent pr+ vg+ and 44.5 percent pr vg. (Note that the tester contribution pr vg was ignored in writing out these genotypes.) There is a strong implication that the “distance” on a linkage map is a physical distance along a chromosome, and Morgan and Sturtevant certainly intended to imply just that. But we should realize that the linkage map is a hypothetical entity constructed from a purely genetic analysis. The linkage map could have been derived without even knowing that chromosomes existed. Furthermore, at this point in our discussion, we cannot say whether the “genetic distances” calculated by means of recombinant frequencies in any way represent actual physical distances on chromosomes. However, physical mapping has shown that physical
Figure 4-9 A chromosome region containing three linked genes. Because map distances are additive, calculation of A–B and A–C distances leaves us with the two possibilities shown for the B–C distance.
distances are, in fact, roughly proportional to recombination-based distances. There are exceptions caused by recombination hotspots, places in the genome where crossing over takes place more frequently than usual. The presence of hotspots causes proportional expansion of some regions of the map. Recombination blocks, which have the opposite effect, also are known. A summary of the way in which recombinants from crossing over are used in mapping is shown in Figure 4-10. Crossovers occur more or less randomly along the chromosome pair. In general, in longer regions, the average number of crossovers is higher and, accordingly, recombinants are more frequently obtained, translating into a longer map distance.
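The "map to progeny" direction described earlier, in which a known map distance is used to predict testcross classes, can be sketched in a few lines of Python. The function name and argument layout are assumptions made for illustration, and the sketch only applies the definition 1 m.u. = 1 percent recombinants, so it is meaningful only for distances below 50 m.u.

```python
def predict_testcross_classes(map_units, parentals, recombinants):
    """Expected percentage of each gametic class from a dihybrid testcross.

    map_units    -- genetic distance between the two loci (1 m.u. = 1% RF)
    parentals    -- the two parental (nonrecombinant) gamete genotypes
    recombinants -- the two reciprocal recombinant gamete genotypes
    """
    if not 0 <= map_units < 50:
        raise ValueError("recombinant frequency must lie between 0 and 50 percent")
    expected = {}
    for genotype in parentals:
        # Nonrecombinants split equally between the two parental classes.
        expected[genotype] = (100 - map_units) / 2
    for genotype in recombinants:
        # Recombinants split equally between the two reciprocal classes.
        expected[genotype] = map_units / 2
    return expected


# cis dihybrid pr vg / pr+ vg+ with pr and vg 11 m.u. apart (values from the text)
predicted = predict_testcross_classes(
    11,
    parentals=("pr+ vg+", "pr vg"),
    recombinants=("pr vg+", "pr+ vg"),
)
print(predicted)
```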
[Figure 4-10 artwork: "Longer regions have more crossovers and thus higher recombinant frequencies." An unseen distribution of crossovers across meiocytes 1, 2, 3, 4, etc. yields few recombinants over a short map distance and numerous recombinants over a long map distance on the chromosome map of the A, B, and C loci.]

Figure 4-10 Crossovers produce recombinant chromatids whose frequency can be used to map genes on a chromosome. Longer regions produce more crossovers. Brown shows recombinants for that interval.
Key Concept Recombination between linked genes can be used to map their distance apart on a chromosome. The unit of mapping (1 m.u.) is defined as a recombinant frequency of 1 percent.
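Sturtevant's additivity argument (Figure 4-9) can also be written out as a short Python sketch. The helper names are hypothetical; the logic is only that, on a linear map, the third distance is the sum or the difference of the other two, and that the pair of loci with the largest distance must be the outer pair.

```python
def possible_third_distances(d_ab, d_ac):
    """B–C distances compatible with a linear map, given A–B and A–C."""
    return (d_ab + d_ac, abs(d_ab - d_ac))


def middle_locus(distances):
    """Return the locus lying between the other two.

    distances -- {(locus1, locus2): map units} for all three pairs.
    """
    outer_pair = max(distances, key=distances.get)  # farthest-apart loci
    all_loci = {locus for pair in distances for locus in pair}
    (middle,) = all_loci - set(outer_pair)
    return middle


options = possible_third_distances(5, 3)

# If B–C turns out to be 8 m.u., A lies in the middle; if 2 m.u., C does.
order_if_8 = middle_locus({("A", "B"): 5, ("A", "C"): 3, ("B", "C"): 8})
order_if_2 = middle_locus({("A", "B"): 5, ("A", "C"): 3, ("B", "C"): 2})
print(options, order_if_8, order_if_2)
```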
Three-point testcross

So far, we have looked at linkage in crosses of dihybrids (double heterozygotes) with doubly recessive testers. The next level of complexity is a cross of a trihybrid (triple heterozygote) with a triply recessive tester. This kind of cross, called a three-point testcross or a three-factor cross, is commonly used in linkage analysis. The goal is to deduce whether the three genes are linked and, if they are, to deduce their order and the map distances between them. Let’s look at an example, also from Drosophila. In our example, the mutant alleles are v (vermilion eyes), cv (crossveinless, or absence of a crossvein on the wing), and ct (cut, or snipped, wing edges). The analysis is carried out by performing the following crosses:

P              v+/v+ ⋅ cv/cv ⋅ ct/ct  ×  v/v ⋅ cv+/cv+ ⋅ ct+/ct+
Gametes        v+ ⋅ cv ⋅ ct              v ⋅ cv+ ⋅ ct+
F1 trihybrid   v/v+ ⋅ cv/cv+ ⋅ ct/ct+

Trihybrid females are testcrossed with triple recessive males:

v/v+ ⋅ cv/cv+ ⋅ ct/ct+ (F1 trihybrid female)  ×  v/v ⋅ cv/cv ⋅ ct/ct (tester male)
From any trihybrid, only 2 × 2 × 2 = 8 gamete genotypes are possible. They are the genotypes seen in the testcross progeny. The following chart shows the number of each of the eight gametic genotypes in a sample of 1448 progeny flies. The columns alongside show which genotypes are recombinant (R) for the loci taken two at a time. We must be careful in our classification of parental and recombinant types. Note that the parental input genotypes for the triple heterozygotes are v+ ⋅ cv ⋅ ct and v ⋅ cv+ ⋅ ct+; any combination other than these two constitutes a recombinant.

                               Recombinant for loci
Gametes            Number   v and cv   v and ct   cv and ct
v ⋅ cv+ ⋅ ct+         580
v+ ⋅ cv ⋅ ct          592
v ⋅ cv ⋅ ct+           45       R                     R
v+ ⋅ cv+ ⋅ ct          40       R                     R
v ⋅ cv ⋅ ct            89       R          R
v+ ⋅ cv+ ⋅ ct+         94       R          R
v ⋅ cv+ ⋅ ct            3                  R          R
v+ ⋅ cv ⋅ ct+           5                  R          R
Total                1448      268        191         93
Let’s analyze the loci two at a time, starting with the v and cv loci. In other words, we look at just the first two columns under “Gametes” and cover up the third one. Because the parentals for this pair of loci are v+ ⋅ cv and v ⋅ cv+, we know that the recombinants are by definition v ⋅ cv and v+ ⋅ cv+. There are 45 + 40 + 89 + 94 = 268 of these recombinants. Of a total of 1448 flies, this number gives an RF of 18.5 percent. For the v and ct loci, the recombinants are v ⋅ ct and v+ ⋅ ct+. There are 89 + 94 + 3 + 5 = 191 of these recombinants among 1448 flies, and so the RF = 13.2 percent. For ct and cv, the recombinants are cv ⋅ ct+ and cv+ ⋅ ct. There are 45 + 40 + 3 + 5 = 93 of these recombinants among the 1448, and so the RF = 6.4 percent. Clearly, all the loci are linked, because the RF values are all considerably less than 50 percent. Because the v and cv loci have the largest RF value, they must be farthest apart; therefore, the ct locus must lie between them. A map can be drawn as follows:

v |------ 13.2 m.u. ------| ct |--- 6.4 m.u. ---| cv
The testcross can be rewritten as follows, now that we know the linkage arrangement: v+ ct cv/v ct+ cv+ × v ct cv/v ct cv Note several important points here. First, we have deduced a gene order that is different from that used in our list of the progeny genotypes. Because the point of the exercise was to determine the linkage relation of these genes, the original listing was of necessity arbitrary; the order was simply not known before the data were analyzed. Henceforth, the genes must be written in correct order. Second, we have definitely established that ct is between v and cv. In the diagram, we have arbitrarily placed v to the left and cv to the right, but the map could equally well be drawn with the placement of these loci inverted. Third, note that linkage maps merely map the loci in relation to one another, with the use of standard map units. We do not know where the loci are on a chromosome—or even which specific chromosome they are on. In subsequent analyses, as more loci are mapped in relation to these three, the complete chromosome map would become “fleshed out.” Key Concept Three-point (and higher) testcrosses enable geneticists to evaluate linkage between three (or more) genes and to determine gene order, all in one cross.
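The two-at-a-time analysis above is mechanical enough to sketch in Python. The counts and parental genotypes are those given in the text; the data layout and the function name `pairwise_rf` are assumptions made for this illustration.

```python
from itertools import combinations

# Testcross progeny counts from the text, keyed by the (v, cv, ct) alleles
# contributed by the trihybrid parent; "+" marks the wild-type allele.
counts = {
    ("v", "cv+", "ct+"): 580,
    ("v+", "cv", "ct"): 592,
    ("v", "cv", "ct+"): 45,
    ("v+", "cv+", "ct"): 40,
    ("v", "cv", "ct"): 89,
    ("v+", "cv+", "ct+"): 94,
    ("v", "cv+", "ct"): 3,
    ("v+", "cv", "ct+"): 5,
}
loci = ("v", "cv", "ct")
parentals = [("v+", "cv", "ct"), ("v", "cv+", "ct+")]


def pairwise_rf(counts, parentals, loci):
    """Recombinant frequency for every pair of loci, ignoring the third."""
    total = sum(counts.values())
    rf = {}
    for i, j in combinations(range(len(loci)), 2):
        # Allele combinations that count as parental for this pair of loci.
        parental_pairs = {(p[i], p[j]) for p in parentals}
        recombinants = sum(
            n for genotype, n in counts.items()
            if (genotype[i], genotype[j]) not in parental_pairs
        )
        rf[(loci[i], loci[j])] = recombinants / total
    return rf


rf = pairwise_rf(counts, parentals, loci)
for pair, value in rf.items():
    print(pair, f"{100 * value:.1f} percent")

# The pair with the largest RF is the outer pair; the remaining locus is in the middle.
outer = max(rf, key=rf.get)
middle = (set(loci) - set(outer)).pop()
```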
A final point to note is that the two smaller map distances, 13.2 m.u. and 6.4 m.u., add up to 19.6 m.u., which is greater than 18.5 m.u., the distance calculated for v and cv. Why? The answer to this question lies in the way in which we have treated the two rarest classes of progeny (totaling 8) with respect to the recombination of v and cv. Now that we have the map, we can see that these two rare classes are in fact double recombinants, arising from two crossovers (Figure 4-11). However, when we calculated the RF value for v and cv, we did not count the v ct cv+ and v+ ct+ cv genotypes; after all, with regard to v and cv, they are parental combinations (v cv+ and v+ cv). In light of our map, however, we see that this oversight led us to underestimate the distance between the v and the cv loci. Not only should we have counted the two rarest classes, we should have counted each of them twice because each represents double recombinants. Hence, we can correct the value by adding the numbers 45 + 40 + 89 + 94 + 3 + 3 + 5 + 5 = 284. Of the total of 1448, this number is 19.6 percent, which is identical with the sum of the two component values. (In practice, we do not need to do this calculation, because the sum of the two shorter distances gives us the best estimate of the overall distance.)

[Figure 4-11 artwork: "Double recombinants arising from two crossovers."]

Figure 4-11 Example of a double crossover between two chromatids. Notice that a double crossover produces double-recombinant chromatids that have the parental allele combinations at the outer loci. The position of the centromere cannot be determined from the data. It has been added for completeness.
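The correction for double recombinants described above amounts to the following arithmetic, shown here as a minimal Python sketch. The variable names are illustrative; the counts are those from the text.

```python
total = 1448
single_v_ct = 89 + 94    # crossover between v and ct only
single_ct_cv = 45 + 40   # crossover between ct and cv only
doubles = 3 + 5          # one crossover in each interval

# Uncorrected: double recombinants look parental for v and cv, so they are missed.
uncorrected_count = single_v_ct + single_ct_cv
uncorrected_percent = 100 * uncorrected_count / total

# Corrected: each double recombinant represents two crossovers between v and cv.
corrected_count = single_v_ct + single_ct_cv + 2 * doubles
corrected_percent = 100 * corrected_count / total

# Same value obtained by summing the two shorter intervals.
summed_percent = 100 * (single_v_ct + doubles) / total \
    + 100 * (single_ct_cv + doubles) / total

print(round(uncorrected_percent, 1), round(corrected_percent, 1))
```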
Deducing gene order by inspection

Now that we have had some experience with the three-point testcross, we can look back at the progeny listing and see that, for trihybrids of linked genes, gene order can usually be deduced by inspection, without a recombinant frequency analysis. Typically, for linked genes, we have the eight genotypes at the following frequencies:

two at high frequency
two at intermediate frequency
two at a different intermediate frequency
two rare

Only three gene orders are possible, each with a different gene in the middle position. It is generally true that the double-recombinant classes are the smallest ones, as listed last here. Only one order is compatible with the smallest classes’ having been formed by double crossovers, as shown in Figure 4-12; that is, only one order gives double recombinants of genotype v ct cv+ and v+ ct+ cv. A simple rule of thumb for deducing the gene in the middle is that it is the allele pair that has “flipped” position in the double-recombinant classes.
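The rule of thumb can be sketched in Python as follows. This is an illustration under stated assumptions: the function name is hypothetical, and the sketch takes the most frequent class as parental and the rarest class as a double recombinant, which is the typical pattern described above but is not guaranteed for every data set.

```python
# Progeny counts from the three-point testcross in the text, in (v, cv, ct) order.
counts = {
    ("v", "cv+", "ct+"): 580,
    ("v+", "cv", "ct"): 592,
    ("v", "cv", "ct+"): 45,
    ("v+", "cv+", "ct"): 40,
    ("v", "cv", "ct"): 89,
    ("v+", "cv+", "ct+"): 94,
    ("v", "cv+", "ct"): 3,
    ("v+", "cv", "ct+"): 5,
}
loci = ("v", "cv", "ct")


def middle_gene_by_inspection(counts, loci):
    """Return the locus whose allele is 'flipped' in a double-recombinant class."""
    ranked = sorted(counts, key=counts.get)
    double_recombinant = ranked[0]   # one of the two rarest classes
    for parental in ranked[-2:]:     # the two most frequent classes
        flipped = [
            loci[k] for k in range(len(loci))
            if parental[k] != double_recombinant[k]
        ]
        # A double crossover changes only the middle locus relative to one parental.
        if len(flipped) == 1:
            return flipped[0]
    return None


middle = middle_gene_by_inspection(counts, loci)
print(middle)
```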
[Figure 4-12 artwork: "Different gene orders give different double recombinants." Left: the three possible gene orders. Right: the double-recombinant chromatids produced by each order.]

Figure 4-12 The three possible gene orders shown on the left yield the six products of a double crossover shown on the right. Only the first possibility is compatible with the data in the text. Note that only the nonsister chromatids taking part in the double crossover are shown.

Interference

Knowing the existence of double crossovers permits us to ask questions about their possible interdependence. We can ask, Are the crossovers in adjacent chromosome regions independent events or does a crossover in one region affect the likelihood of there being a crossover in an adjacent region? The answer is that, generally, crossovers inhibit each other somewhat in an interaction called interference. Double-recombinant classes can be used to deduce the extent of this interference. Interference can be measured in the following way. If the crossovers in the two regions are independent, we can use the product rule (see page 94) to predict the frequency of double recombinants: that frequency would equal the product of the recombinant frequencies in the adjacent regions. In the v-ct-cv recombination data, the v-ct RF value is 0.132 and the ct-cv value is 0.064; so, if there is no interference, double recombinants might be expected at the frequency 0.132 × 0.064 = 0.0084 (0.84 percent). In the sample of 1448 flies, 0.0084 × 1448 = 12 double recombinants are expected. But the data show that only 8 were actually observed. If this deficiency of double recombinants were consistently observed, it would show us that the two regions are not independent and suggest that the distribution of crossovers favors singles at the expense of doubles. In other words, there is some kind of interference: a crossover does reduce the probability of a crossover in an adjacent region. Interference is quantified by first calculating a term called the coefficient of coincidence (c.o.c.), which is the ratio of observed to expected double recombinants. Interference (I) is defined as 1 − c.o.c. Hence,

\[ I = 1 - \frac{\text{observed frequency, or number, of double recombinants}}{\text{expected frequency, or number, of double recombinants}} \]

In our example,

\[ I = 1 - \frac{8}{12} = \frac{4}{12} = \frac{1}{3}, \text{ or 33 percent} \]
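The same calculation can be sketched in Python. The function name and return layout are assumptions for illustration. Note that the text rounds the expected number of double recombinants to 12 before dividing, which gives 1/3; carrying the unrounded expectation (about 12.2) gives an interference of roughly 0.35.

```python
def interference(rf_region1, rf_region2, observed_doubles, total_progeny):
    """Return (expected doubles, coefficient of coincidence, interference)."""
    # Product rule: expected double-recombinant frequency if crossovers are independent.
    expected = rf_region1 * rf_region2 * total_progeny
    coc = observed_doubles / expected
    return expected, coc, 1 - coc


# v-ct = 0.132, ct-cv = 0.064, 8 double recombinants among 1448 flies (from the text)
expected, coc, i_value = interference(0.132, 0.064, 8, 1448)
print(f"expected = {expected:.1f}, c.o.c. = {coc:.2f}, I = {i_value:.2f}")

# Using the text's rounded expectation of 12 instead:
i_rounded = 1 - 8 / 12
```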
In some regions, there are never any observed double recombinants. In these cases, c.o.c. = 0, and so I = 1 and interference is complete. Interference values anywhere between 0 and 1 are found in different regions and in different organisms. You may have wondered why we always use heterozygous females for testcrosses in Drosophila. The explanation lies in an unusual feature of Drosophila males. When, for example, pr vg / pr+ vg+ males are crossed with pr vg / pr vg females, only pr vg / pr+ vg+ and pr vg / pr vg progeny are recovered. This result shows that there is no crossing over in Drosophila males. However, this absence of crossing over in one sex is limited to certain species; it is not the case for males of all species (or for the heterogametic sex). In other organisms, there is crossing over in XY males and in WZ females. The reason for the absence of crossing over in Drosophila males is that they have an unusual prophase I, with no synaptonemal complexes. Incidentally, there is a recombination difference between human sexes as well. Women show higher recombinant frequencies for the same autosomal loci than do men. With the use of a reiteration of the preceding recombination-based techniques, maps have been produced of thousands of genes for which variant (mutant) phenotypes have been identified. A simple illustrative example from the tomato is shown in Figure 4-13. The tomato chromosomes are shown in Figure 4-13a, their numbering in Figure 4-13b, and recombination-based gene maps in Figure 4-13c. The chromosomes are shown as they appear under the microscope, together with chromosome maps based on linkage analysis of various allelic pairs shown with their phenotypes.
Using ratios as diagnostics The analysis of ratios is one of the pillars of genetics. In the text so far, we have encountered many different ratios whose derivations are spread out over several chapters. Because recognizing ratios and using them in diagnosis of the genetic system under study are part of everyday genetics, let’s review the main ratios that we have covered so far. They are shown in Figure 4-14. You can read the ratios from the relative widths of the colored boxes in a row. Figure 4-14 deals with selfs and testcrosses of monohybrids, dihybrids (with independent assortment and linkage), and trihybrids (also with independent assortment and linkage of all genes). One situation not represented is a trihybrid in which only two of the three genes are linked; as an exercise, you might like to deduce the general pattern that would have to be included in such a diagram from this situation. Note that, in regard to linkage, the sizes of the classes depend on map distances. A geneticist deduces unknown genetic states in something like the following way: “a 9 : 3 : 3 : 1 ratio tells me that this ratio was very likely produced by a selfed dihybrid in which the genes are on different chromosomes.”
[Figure 4-13 artwork: "A map of the 12 tomato chromosomes." Panels: (a) photomicrograph; (b) drawing of the 12 numbered chromosomes; (c) linkage map in which each locus is labeled with its normal and variant phenotypes and the interlocus distances in map units.]

Figure 4-13 (a) Photomicrograph of a meiotic prophase I (pachytene) from anthers, showing the 12 pairs of chromosomes. (b) Illustration of the 12 chromosomes shown in part a. The chromosomes are identified by the currently used chromosome-numbering system. The centromeres are shown in orange, and the flanking, densely staining regions (heterochromatin) in green. (c) 1952 linkage map. Each locus is flanked by drawings of the normal and variant phenotypes. Interlocus map distances are shown in map units. [(a and b) From C. M. Rick, “The Tomato,” Scientific American, 1978. (c) Data from L. A. Butler.]
Phenotypic ratios in progeny reveal the type of cross

Cross                                             Phenotypic ratio
Monohybrid testcrossed                            1:1
Monohybrid selfed                                 3:1
Dihybrid testcrossed (independent assortment)     1:1:1:1
Dihybrid selfed (independent assortment)          9:3:3:1
Dihybrid testcrossed (linked)                     P:R:R:P (example only)
Trihybrid testcrossed (independent assortment)    1:1:1:1:1:1:1:1
Trihybrid testcrossed (all linked)                P:P:SCO:SCO:SCO:SCO:DCO:DCO (example only)

Figure 4-14 P = parental, R = recombinant, SCO = single crossover, DCO = double crossover.
4.3 Mapping with Molecular Markers So far in this chapter we have mapped gene loci using RF values by counting visible phenotypes produced by the various alleles involved. However, there are also differences in the DNA between two chromosomes that do not produce visibly different phenotypes, either because these DNA differences are not located in genes or they are located in genes but do not alter the product protein. Such sequence differences can be thought of as molecular alleles or molecular markers. Their loci can be mapped by RF values in the same way as alleles producing visible phenotypes. Molecular markers are extremely numerous and hence are very useful as genomic landmarks that can be used to locate genes of interest. The two main types of molecular markers used in mapping are single nucleotide polymorphisms and simple sequence length polymorphisms.
Single nucleotide polymorphisms

Sequencing has shown that, as expected, the genomic sequences of individuals in a species are mostly identical. For example, comparisons of the sequences of different individuals have revealed that we are about 99.9 percent identical. Almost all of the 0.1 percent difference turns out to be based on single-nucleotide differences. As an example, in one individual, a localized sequence might be

....AAGGCTCAT....
....TTCCGAGTA....

and, in another, it might be

....AAAGCTCAT....
....TTTCGAGTA....
Furthermore, a large proportion of these localized sequences are found to be polymorphic, meaning that both molecular “alleles” are quite common in the population. Overall, such differences between individuals are called single nucleotide polymorphisms, abbreviated as SNPs and pronounced “snips.” In humans, there are thought to be about 3 million SNPs distributed more or less randomly at a frequency of 1 in every 300 to 1000 bases. Some of these SNPs lie within genes; many do not. In Chapter 2, we saw cases where the change in a single nucleotide pair could produce a new allele, causing a mutant phenotype. The two nucleotide pairs, wild type and mutant, are examples of a SNP. Most SNPs, though, do not produce different phenotypes, either because they do not lie in a gene or because they lie in a gene but both versions of the gene produce the same protein product. There are two ways to detect a SNP. The first is to sequence a segment of DNA in homologous chromosomes and compare the homologous segments to spot differences. A second way is possible in the case of SNPs located at a restriction enzyme’s target site: these SNPs are restriction fragment length polymorphisms (RFLPs). In such cases, there will be two RFLP “alleles,” or morphs, one of which has the restriction enzyme target and the other of which does not. The restriction enzyme will cut the DNA at the SNP containing the target and ignore the other SNP. The SNPs are then detected as different bands on an electrophoretic gel. RFLP sites can be between or within genes.
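Both detection routes can be sketched in Python using the example sequences given earlier. The function names are hypothetical, and the recognition sequence "AAGG" is a made-up site chosen only because it spans the SNP in this example; it is not claimed to be the target of any real restriction enzyme.

```python
def find_snps(seq_a, seq_b):
    """List (position, base_a, base_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (position, a, b)
        for position, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


def has_site(sequence, recognition_site):
    """True if the (hypothetical) restriction target occurs in the sequence."""
    return recognition_site in sequence


individual_1 = "AAGGCTCAT"
individual_2 = "AAAGCTCAT"

# Route 1: sequence both homologous segments and compare them.
snps = find_snps(individual_1, individual_2)

# Route 2: a SNP inside a restriction target creates two RFLP morphs,
# one that is cut and one that is not.
HYPOTHETICAL_SITE = "AAGG"
cut_1 = has_site(individual_1, HYPOTHETICAL_SITE)
cut_2 = has_site(individual_2, HYPOTHETICAL_SITE)
print(snps, cut_1, cut_2)
```

Positions are zero-based offsets within the short example segment, not genomic coordinates.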
Simple sequence length polymorphisms One of the surprises from molecular genomic analysis is that most genomes contain a great deal of repetitive DNA. Furthermore, there are many types of repetitive DNA. At one end of the spectrum are adjacent multiple repeats of short, simple DNA sequences. The origin of these repeats is not clear, but the feature that makes them useful is that, in different individuals, there are often different numbers of copies. Hence, these repeats are called simple sequence length polymorphisms (SSLPs). They are also sometimes called variable number tandem repeats, or VNTRs. SSLPs commonly have multiple alleles; as many as 15 alleles have been found for an SSLP locus. As a consequence, sometimes 4 alleles (2 from each parent) can be tracked in a pedigree. Two types of SSLPs are useful in mapping and other genome analysis: minisatellite and microsatellite markers. (The word satellite in this connection refers to the observation that, when genomic DNA is isolated and fractionated with the use of physical techniques, the repetitive sequences often form a fraction that is physically separate from the rest; that is, it is a satellite fraction in the sense that it is apart from the bulk.) Minisatellite markers A minisatellite marker is based on variation in the number of tandem repeats of a repeating unit from 15 to 100 nucleotides long. In humans, the total length of the unit is from 1 to 5 kb. Minisatellite loci having the same repeating unit but different numbers of repeats are dispersed throughout the genome. Microsatellite markers A microsatellite marker is based on variable numbers of tandem repeats of an even simpler sequence, generally a small number of nucleotides such as a dinucleotide. The most common type is a repeat of CA and its complement GT, as in the following example: 5′ C-A-C-A-C-A-C-A-C-A-C-A-C-A-C-A 3′ 3′ G-T-G-T-G-T-G-T-G-T-G-T-G-T-G-T 5′
Detecting simple sequence length polymorphisms Simple sequence length polymorphisms are detected by taking advantage of the fact that homologous regions bearing different numbers of tandem repeats will be of different lengths. A commonly used procedure for getting at these differences is to use flanking regions as primers in a PCR analysis (see Chapter 10). PCR replicates the DNA sequences until they are available in enough bulk for further analysis. The different lengths of the amplified PCR products can be detected by the different mobilities of the sequences on an electrophoretic gel. In the case of minisatellites, the patterns produced on the gel are sometimes called DNA fingerprints. (These fingerprints are highly individualistic and, hence, have great value in forensics, as detailed in Chapter 18.)
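The link between repeat number and band position can be sketched in a few lines of Python. This is only an illustration: the flanking-region lengths (60 and 55 bp), the allele names, and the repeat counts are hypothetical values chosen for the example, not data from the text.

```python
def pcr_product_length(left_flank_bp, right_flank_bp, repeat_unit, n_repeats):
    """Length of a PCR product spanning a tandem-repeat locus.

    The product includes both flanking regions (which contain the primer
    sites) plus n_repeats copies of the repeat unit.
    """
    return left_flank_bp + len(repeat_unit) * n_repeats + right_flank_bp


# Hypothetical microsatellite alleles: name -> number of CA repeats.
alleles = {"M1": 12, "M2": 16, "M3": 21}

# Hypothetical flank sizes of 60 bp and 55 bp on either side of the repeats.
sizes = {name: pcr_product_length(60, 55, "CA", n) for name, n in alleles.items()}

# Shorter fragments migrate farther on an electrophoretic gel, so sorting
# by size gives the order of bands from the fastest-moving to the slowest.
gel_order = sorted(sizes, key=sizes.get)

for name in gel_order:
    print(name, sizes[name], "bp")
```

Because the flanking regions are the same for every allele, any difference in product length comes entirely from the number of repeats, which is why each allele gives its own band.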
Recombination analysis using molecular markers When we map the position of a gene whose phenotypes are determined by a single nucleotide difference, we are effectively mapping a SNP. The same technique used to map gene loci can also be used to map SNPs that do not determine a phenotype. Suppose an individual has a GC base pair at position, say, 5658 on the DNA of one chromosome and an AT at position 5658 on the other chromosome. Such an individual is a molecular heterozygote (“AT/GC”) for that DNA position. This fact is useful in mapping because a molecular heterozygote (“AT/GC”) can be mapped just like a phenotypic heterozygote A/a. The locus of a molecular heterozygote can be inserted into a chromosomal map by analyzing recombination frequency in exactly the same way as the locus of heterozygous “phenotypic” alleles is inserted. This principle holds even though the variation is usually a silent difference (perhaps not in a gene). Acting as important “milestones” on the map, molecular markers are useful in orienting the researcher in a quest to find a gene of interest. To understand this point, consider real milestones: they are of little interest in themselves, but are very useful in telling you how close you are to your destination. In a specific genetic example, let’s assume that we want to know the map position of a disease gene in mice, perhaps as a way of zeroing in on its DNA sequence. We carry out a number of crosses. In each instance, we cross an individual carrying the disease gene with an individual carrying one of a range of different molecular markers whose map positions are already known. Using PCR, parents and progeny are scored for molecular markers of known map position and then recombination analysis is performed to see if the gene of interest is linked to any of them. The result of these crosses might reveal that the disease gene is 2 m.u. from one of these markers, which we will call M. 
The procedure has thus given us an approximate location for the disease gene on the chromosome. The location of the gene for the human disease cystic fibrosis was originally discovered through its linkage to molecular markers known to be located on chromosome 7. This discovery led to the isolation and sequencing of the gene, resulting in the further discovery that it encodes the protein now called cystic fibrosis transmembrane conductance regulator (CFTR). The gene for Huntington disease was also located in this way, leading to the discovery that it encodes a protein now called huntingtin. The experimental procedure for a hypothetical example might be as follows. Let A and a be the disease-gene alleles and M1 and M2 be alleles of a specific molecular-marker locus. Assume that the cross is A/a · M1/M2 × a/a · M1/M1, a kind of testcross. Progeny would be first scored for the A and a phenotypes, and then DNA would be extracted from each individual and sequenced or otherwise assessed to determine the molecular alleles. Assume that we obtain the following results:
A/a ⋅ M1/M1    49 percent
A/a ⋅ M2/M1     1 percent
a/a ⋅ M2/M1    49 percent
a/a ⋅ M1/M1     1 percent
These results tell us that the testcross must have been in the following conformation: A M1/a M2 × a M1/a M1 and the two rare progeny genotypes in the list (1 percent each) must be recombinants, giving a map distance of 2 m.u. between the A/a locus and the molecular locus M1/M2. Hence, we now know the general location of the gene in the genome and can narrow its location down with more finely scaled approaches. In addition, different molecular markers can be mapped to each other, creating a map that can act like a series of stepping-stones on the way to some gene with an interesting phenotype. Although mapping molecular markers with the use of what are effectively testcrosses is the simplest type of informative analysis, in many analyses (such as those in humans) the molecular markers cannot be mapped using a testcross. However, because each molecular allele has its own signature, recombinant and nonrecombinant products can be identified from any meiosis, even in crosses that are not testcrosses. Such an analysis is diagrammed in Figure 4-15. Figure 4-16 contains some real data showing how molecular markers can flesh out a map of a human chromosome. You can see that the number of mapped molecular markers greatly exceeds the number of mapped genes with mutant phenotypes. Note that SNPs, because of their even higher density, cannot be represented on a whole-chromosome map such as that in Figure 4-16, inasmuch as there would be thousands of them.
Figure 4-15 A microsatellite locus can show linkage to a disease gene. (a) Parental genotypes. (b) Banding patterns of parents and children. A PCR banding pattern is shown for a family with six children, and this pattern is interpreted at the top of the illustration with the use of four different-size microsatellite “alleles,” M′ through M″″. One of these markers (M″) is probably linked in cis configuration to the disease allele P. (Note: This mating is not a testcross, yet is informative about linkage.) [Key: P, dominant disease allele; M′ through M″″, molecular markers; PCR primers flank the microsatellite repeats.]
Figure 4-16 Phenotypic and molecular markers mapped on human chromosome 1. The diagram shows the distribution of all genetic differences that had been mapped to chromosome 1 at the time at which this diagram was drawn. Some markers are genes of known phenotype (their numbers are shaded in green), but most are polymorphic DNA markers (the numbers shaded in mauve and blue represent two different classes of molecular markers). A linkage map displaying a well-spaced-out set of these markers, based on recombinant frequency analyses of the type described in this chapter, is in the center of the illustration. Map distances are shown in centimorgans (cM). At a total length of 356 cM, chromosome 1 is the longest human chromosome. Some markers have also been localized on the chromosome 1 cytogenetic map (right-hand map, called an idiogram), by using techniques described later in this chapter. Having common landmark markers on the different genetic maps permits the locations of other genes and molecular markers to be estimated on each map. [Panels: locus distribution; linkage map with distances in cM; idiogram. Key: short sequence length polymorphisms; other DNA polymorphisms; genes; genes included on the linkage map.] [ Data from B. R. Jasny et al., Science, September 30, 1994.]
One centimorgan (1 m.u.) of human DNA is a huge segment, estimated as 1 megabase (1 Mb = 1 million base pairs, or 1000 kb). Hence, you can see the need for closely packed molecular markers for a fine-scale analysis that resolves smaller distances. Note that the DNA equivalent of 1 m.u. varies a lot between species; for example, in the malarial parasite Plasmodium falciparum, 1 m.u. = 17 kb.
Key Concept Loci of any DNA heterozygosity can be mapped and used as molecular chromosome markers or milestones.
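The testcross arithmetic used earlier in this section to place the disease gene 2 m.u. from the marker locus can be sketched in Python. The sketch is an illustration only: the function name is hypothetical, and it assumes that the two most frequent progeny classes are the parental (nonrecombinant) classes, which holds only when the loci are linked.

```python
def map_distance_from_testcross(progeny):
    """Return (parental classes, recombinant frequency in percent).

    `progeny` maps each progeny genotype to its count (or percentage).
    Assumption: with linked loci, the two most frequent classes are the
    parental classes and the remaining classes are recombinants.
    """
    ranked = sorted(progeny, key=progeny.get, reverse=True)
    parental = set(ranked[:2])
    total = sum(progeny.values())
    recombinants = sum(n for g, n in progeny.items() if g not in parental)
    return parental, 100 * recombinants / total


# Progeny percentages from the hypothetical A/a · M1/M2 × a/a · M1/M1 cross.
progeny = {
    "A/a M1/M1": 49,
    "A/a M2/M1": 1,
    "a/a M2/M1": 49,
    "a/a M1/M1": 1,
}

parental, rf_percent = map_distance_from_testcross(progeny)
print(sorted(parental), rf_percent)
```

The parental classes identify the heterozygous parent as A M1/a M2, and the recombinant frequency of 2 percent corresponds to the 2 m.u. quoted in the text.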
4.4 Centromere Mapping with Linear Tetrads Centromeres are not genes, but they are regions of DNA on which the orderly reproduction of living organisms absolutely depends and are therefore of great interest in genetics. In most eukaryotes, recombination analysis cannot be used to map the loci of centromeres because they show no heterozygosity that can enable them to be used as markers. However, in the fungi that produce linear tetrads (see Chapter 3, page 103), centromeres can be mapped. We will use the fungus Neurospora as an example. Recall that, in haploid fungi such as Neurospora, haploid nuclei from each parent fuse to form a transient diploid, which undergoes meiotic divisions along the long axis of the ascus, and so each meiocyte produces a linear array of eight ascospores, called an octad. These eight ascospores constitute the four products of meiosis (a tetrad) plus a postmeiotic mitosis. In its simplest form, centromere mapping considers a gene locus and asks how far this locus is from its centromere. The method is based on the fact that a different pattern of alleles will appear in a linear tetrad or octad that arises from a meiosis with a crossover between a gene and its centromere. Consider a cross between two individuals, each having a different allele at a locus (say, A × a). Mendel’s law of equal segregation dictates that, in an octad, there will always be four ascospores of genotype A and four of a, but how will they be arranged? If there has been no crossover in the region between A/a and the centromere, there will be two adjacent blocks of four ascospores in the linear octad (see Figure 3-10, page 104). However, if there has been a crossover in that region, there
will be one of four different patterns in the octad, each pattern showing blocks of two adjacent identical alleles. Some data from an actual cross of A × a are shown in the following table.
Octads
  A     a     A     a     A     a
  A     a     A     a     A     a
  A     a     a     A     a     A
  A     a     a     A     a     A
  a     A     A     a     a     A
  a     A     A     a     a     A
  a     A     a     A     A     a
  a     A     a     A     A     a
 126   132    9    11    10    12     Total = 300
The first two columns on the left are from meioses with no crossover in the region between the A locus and the centromere. The letter M is used to stand for a type of segregation at meiosis. The patterns for the first two columns are called MI patterns, or first-division segregation patterns, because the two different alleles segregate into the two daughter nuclei at the first division of meiosis. The other four columns are all from meiocytes with a crossover. These patterns are called second-division segregation patterns (MII) because, as a result of crossing over in the centromere-to-locus region, the A and a alleles are still together in the nuclei at the end of the first division of meiosis (Figure 4-17). There has been no first-division segregation. However, the second meiotic division does segregate the A and a alleles into separate nuclei. The other patterns are produced similarly; the difference is that the chromatids move in different directions at the second division (Figure 4-18).
Figure 4-17 A second-division segregation pattern in a fungal octad. A and a segregate into separate nuclei at the second meiotic division when there is a crossover between the centromere and the A locus. [Stages shown: first division, second division, and mitosis, giving a second-division segregation pattern, MII.]
Figure 4-18 Four different spindle attachments produce four second-division segregation patterns. In the second meiotic division, the centromeres attach to the spindle at random, producing the four arrangements shown. The four arrangements are equally frequent.
You can see that the frequency of octads with an MII pattern should be proportional to the size of the centromere–A/a region and could be used as a measure of the size of that region. In our example, the MII frequency is (9 + 11 + 10 + 12)/300 = 42/300 = 14 percent. Does this percentage mean that the A/a locus is 14 m.u. from the centromere? The answer is no, but this value can be used to calculate the number of map units. The 14 percent value is a percentage of meioses, which is not the way that map units are defined. Map units are defined as the percentage of recombinant chromatids issuing from meiosis. Because a crossover in any meiosis results in only 50 percent recombinant chromatids (four out of eight; see Figure 4-17), we must divide the 14 percent by 2 to convert the MII frequency (a frequency of meioses) into map units (a frequency of recombinant chromatids). Hence, this region must be 7 m.u. in length, and this measurement can be introduced into the map of that chromosome.
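The conversion from octad counts to a centromere distance can be sketched in Python using the data from the cross above. The function names are hypothetical; the classification rule (an MI octad has all four spores in each half carrying the same allele) is simply a restatement of the two-blocks-of-four pattern described in the text.

```python
# Octad patterns (top to bottom of the ascus) and their counts from the table.
octad_counts = {
    "AAAAaaaa": 126,
    "aaaaAAAA": 132,
    "AAaaAAaa": 9,
    "aaAAaaAA": 11,
    "AAaaaaAA": 10,
    "aaAAAAaa": 12,
}


def is_first_division(octad):
    """MI pattern: each half of the octad carries a single allele."""
    top, bottom = octad[:4], octad[4:]
    return len(set(top)) == 1 and len(set(bottom)) == 1


def centromere_distance(counts):
    """Return (MII octads, MII frequency in percent, distance in m.u.)."""
    total = sum(counts.values())
    mii = sum(n for octad, n in counts.items() if not is_first_division(octad))
    mii_percent = 100 * mii / total
    # Only half of the chromatids from a crossover meiosis are recombinant,
    # so the MII frequency is halved to give map units.
    return mii, mii_percent, mii_percent / 2


mii, mii_percent, map_units = centromere_distance(octad_counts)
print(mii, mii_percent, map_units)
```

The halving step is the whole point of the calculation: the MII frequency counts meioses, whereas map units count recombinant chromatids.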
4.5 Using the Chi-Square Test to Infer Linkage The standard genetic test for linkage is a dihybrid testcross. Consider a general cross of that type, in which it is not known if the genes are linked or not:
A/a·B/b × a/a·b/b
If there is no linkage, that is, the genes assort independently, we have seen from the discussions in this chapter and Chapter 3 that the following phenotypic proportions are expected in progeny:
A B    0.25
A b    0.25
a B    0.25
a b    0.25
A cross of this type was made and the following phenotypes obtained in a progeny sample of 200.
A B    60
A b    37
a B    41
a b    62
There is clearly a deviation from the prediction of no linkage (which would have given the progeny numbers 50 : 50 : 50 : 50). The results suggest that the dihybrid was a cis configuration of linked genes, A B / a b, because the progeny A B and a b are in the majority. The recombinant frequency would be (37 + 41)/200 = 78/200 = 39 percent, or 39 m.u. However, we know that chance deviations due to sampling error can provide results that resemble those produced by genetic processes; hence, we need the χ² (pronounced “chi square”) test to help us calculate the probability of a chance deviation of this magnitude from a 1 : 1 : 1 : 1 ratio. First, let us examine the allele ratios for both loci. These are 97 : 103 for A : a, and 101 : 99 for B : b. Such numbers are close to the 1 : 1 allele ratios expected from Mendel’s first law, so skewed allele ratios cannot be responsible for the quite large deviations from the expected numbers of 50 : 50 : 50 : 50. We must apply the χ² analysis to test a hypothesis of no linkage. If that hypothesis is rejected, we can infer linkage. (We cannot test a hypothesis of linkage directly because we have no way of predicting what recombinant frequency to test.) The calculation for testing lack of linkage is as follows:
Class   Observed (O)   Expected (E)   O − E   (O − E)²   (O − E)²/E
A B         60             50           10      100        2.00
A b         37             50          −13      169        3.38
a B         41             50           −9       81        1.62
a b         62             50           12      144        2.88
$$\chi^2 = \sum \frac{(O - E)^2}{E} \text{ for all classes} = 9.88$$
Since there are four genotypic classes, we must use 4 − 1 = 3 degrees of freedom. Consulting the chi-square table in Chapter 3, we see our values of 9.88 and 3 df give a p value between 0.025 and 0.01, or roughly 2 percent. This is less than the standard cut-off value of 5 percent, so we can reject the hypothesis of no linkage. Hence, we are left with the conclusion that the genes are very likely linked, approximately 39 m.u. apart. Notice, in retrospect, that it was important to make sure alleles were segregating 1 : 1 to avoid a compound hypothesis of 1 : 1 allele ratios and no linkage. If we rejected such a compound hypothesis, we would not know which part of it was responsible for the rejection.
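The χ² calculation for this testcross can be reproduced with a short Python sketch. It is an illustration rather than a general statistics routine: the function names are hypothetical, and the p value uses a closed-form expression for the chi-square distribution that is valid only for 3 degrees of freedom, in place of the table lookup described in the text.

```python
import math


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df3(chi2):
    """Upper-tail probability of the chi-square distribution with 3 df.

    Closed form for 3 degrees of freedom only:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(chi2 / 2)) + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2)


# Progeny classes in the order A B, A b, a B, a b.
observed = [60, 37, 41, 62]
total = sum(observed)

# Hypothesis of no linkage: a 1 : 1 : 1 : 1 ratio.
expected = [total / 4] * 4

chi2 = chi_square(observed, expected)
p = p_value_df3(chi2)

# Recombinant frequency if the dihybrid was A B / a b (cis configuration).
rf_percent = 100 * (observed[1] + observed[2]) / total

print(round(chi2, 2), round(p, 3), rf_percent)
```

Because p falls below 0.05, the hypothesis of no linkage is rejected, in agreement with the table-based conclusion.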
4.6 Accounting for Unseen Multiple Crossovers
Figure 4-19 Demonstration that the average RF is 50 percent for meioses in which the number of crossovers is not zero. Recombinant chromatids are brown. Two-strand double crossovers produce all parental types; so all the chromatids are orange. Note that all crossovers are between nonsister chromatids. Try the triple crossover class yourself.
In the discussion of the three-point testcross, some parental (nonrecombinant) chromatids resulted from double crossovers. These crossovers initially could not be counted in the recombinant frequency, skewing the results. This situation leads to the worrisome notion that all map distances based on recombinant frequency might be underestimations of physical distances because undetected multiple crossovers might have occurred, some of whose products would not be recombinant. Several creative mathematical approaches have been designed to get around the multiple-crossover problem. We will look at two methods. First, we examine a method originally worked out by J. B. S. Haldane in the early years of genetics.
A mapping function The approach worked out by Haldane was to devise a mapping function, a formula that relates an observed recombinant-frequency value to a map distance corrected for multiple crossovers. The approach works by relating RF to the mean number of crossovers, m, that must have taken place in that chromosomal segment per meiosis and then deducing what map distance this m value should have produced. To find the relation of RF to m, we must first think about outcomes of the various crossover possibilities. In any chromosomal region, we might expect meioses with 0, 1, 2, 3, 4, or more crossovers. Surprisingly, the only class that is really crucial is the zero class. To see why, consider the following. It is a curious and nonintuitive fact that any nonzero number of crossovers produces an average frequency of 50 percent recombinants within those meioses. Figure 4-19 proves this statement for single and double crossovers as examples, but it is true for any number of crossovers. Hence, the true determinant of RF is the relative sizes of the classes with no crossovers (the zero class) compared with the classes with any nonzero number of crossovers.
[Figure 4-19 panels, titled “Any number of crossovers gives 50 percent recombinants”: no crossovers, RF = 0/4 = 0%; one crossover (can be between any nonsister pair), RF = 2/4 = 50%; two crossovers (holding one crossover constant and varying the position of the second produces four equally frequent two-crossover meioses): two-strand double crossover, RF = 0/4 = 0%; two kinds of three-strand double crossover, RF = 2/4 = 50% each; four-strand double crossover, RF = 4/4 = 100%; average two-crossover RF = 8/16 = 50%.]
Now the task is to calculate the size of the zero class. The occurrence of crossovers in a specific chromosomal region is well described by a statistical distribution called the Poisson distribution. The Poisson formula in general describes the distribution of “successes” in samples when the average probability of successes is low. An illustrative example is to dip a child’s net into a pond of fish: most dips will produce no fish, a smaller proportion will produce one fish, an even smaller proportion two, and so on. This analogy can be directly applied to a chromosomal region, which will have 0, 1, 2, and so forth, crossover “successes” in different meioses. The Poisson formula, given here, will tell us the proportion of the classes with different numbers of crossovers:
$$f_i = \frac{e^{-m} m^i}{i!}$$
The terms in the formula have the following meanings:
e = the base of natural logarithms (approximately 2.7)
m = the mean number of successes in a defined sample size
i = the actual number of successes in a sample of that size
f_i = the frequency of samples with i successes in them
! = the factorial symbol (for example, 5! = 5 × 4 × 3 × 2 × 1)
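The statement made earlier in this section, that any nonzero number of crossovers gives an average of 50 percent recombinant chromatids, can be checked by brute-force enumeration. The Python sketch below is an illustration under stated assumptions: every crossover lies between the two loci, involves one chromatid from each homolog, and chooses among the four nonsister pairs with equal probability (no chromatid interference). The function name and bookkeeping scheme are hypothetical.

```python
from itertools import product


def average_rf(n_crossovers):
    """Average fraction of recombinant chromatids over all strand choices.

    Strands 0 and 1 carry parental A B DNA; strands 2 and 3 carry a b DNA.
    Each crossover exchanges the distal parts of one A B strand and one
    a b strand, chosen from the four possible nonsister pairs.
    """
    nonsister_pairs = [(p, q) for p in (0, 1) for q in (2, 3)]
    recombinant = 0
    total = 0
    for choice in product(nonsister_pairs, repeat=n_crossovers):
        # strand_of[c] = parental strand that the chromatid starting at
        # centromere-side position c is currently following.
        strand_of = [0, 1, 2, 3]
        for p, q in choice:
            cp = strand_of.index(p)
            cq = strand_of.index(q)
            strand_of[cp], strand_of[cq] = q, p  # reciprocal exchange
        for c, s in enumerate(strand_of):
            # Recombinant if the chromatid ends on the other parent's DNA.
            if (c < 2) != (s < 2):
                recombinant += 1
            total += 1
    return recombinant / total


for n in range(5):
    print(n, average_rf(n))
```

Running the loop shows an RF of 0 for the zero class and 0.5 for every class with one or more crossovers, which is why only the size of the zero class matters for the mapping function.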
The Poisson distribution tells us that the frequency of the i = 0 class (the key one) is
$$f_0 = \frac{e^{-m} m^0}{0!}$$
Because $m^0$ and 0! both equal 1, the formula reduces to $e^{-m}$. Now we can write a function that relates RF to m. The frequency of the class with any nonzero number of crossovers will be $1 - e^{-m}$, and, in these meioses, 50 percent (1/2) of the products will be recombinant; so
$$RF = \tfrac{1}{2}\,(1 - e^{-m})$$
and this formula is the mapping function that we have been seeking. Let’s look at an example in which RF is converted into a map distance corrected for multiple crossovers. Assume that, in one testcross, we obtain an RF value of 27.5 percent (0.275). Plugging this into the function allows us to solve for m:
$$0.275 = \tfrac{1}{2}\,(1 - e^{-m})$$
so
$$e^{-m} = 1 - (2 \times 0.275) = 0.45$$
By using a calculator to find the natural logarithm (ln) of 0.45, we can deduce that m = −ln 0.45 ≈ 0.8. That is, on average, there are 0.8 crossovers per meiosis in that chromosomal region. The final step is to convert this measure of crossover frequency to give a “corrected” map distance. All that we have to do to convert into corrected map units is to multiply the calculated average crossover frequency by 50 because, on average, a crossover produces a recombinant frequency of 50 percent. Hence, in the preceding numerical example, the m value of 0.8 can be converted into a corrected map distance of 0.8 × 50 = 40 corrected m.u. We see that, indeed, this value is substantially larger than the 27.5 m.u. that we would have deduced from the observed RF. Note that the mapping function neatly explains why the maximum RF value for linked genes is 50 percent. As m gets very large, $e^{-m}$ tends to zero and the RF tends to 1/2, or 50 percent.
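The mapping function and its inverse can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes the Poisson model described above. Note that the unrounded result is about 39.9 m.u.; the text's figure of 40 m.u. comes from rounding m to 0.8 before multiplying by 50.

```python
import math


def expected_rf(m):
    """Mapping function: RF = 1/2 * (1 - e^-m) for a mean of m crossovers."""
    return 0.5 * (1 - math.exp(-m))


def corrected_map_units(rf):
    """Invert the mapping function for an observed RF (as a fraction).

    Returns (m, corrected map distance in m.u.). The function is defined
    only for 0 <= RF < 0.5, because RF cannot exceed 50 percent.
    """
    if not 0 <= rf < 0.5:
        raise ValueError("RF must be at least 0 and less than 0.5")
    m = -math.log(1 - 2 * rf)
    return m, 50 * m


m, corrected = corrected_map_units(0.275)
print(round(m, 2), round(corrected, 1))

# As m grows, the expected RF approaches but never exceeds 0.5.
print(expected_rf(0.1), expected_rf(1), expected_rf(10))
```

For small values of m the corrected and observed distances nearly coincide, so the correction matters mainly for loci that are far apart.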
The Perkins formula For fungi and other tetrad-producing organisms, there is another way of compensating for multiple crossovers—specifically, double crossovers (the most common type expected). In tetrad analysis of “dihybrids” generally, only three types of tetrads are possible, when classified on the basis of the presence of parental and
recombinant genotypes in the products. The classification of tetrads is based on whether there are two genotypes present (ditype) or four (tetratype). Within ditypes there are two classes: parental (showing two parental genotypes) and nonparental (showing two nonparental genotypes). From a cross AB × ab, they are
Parental ditype (PD)    Tetratype (T)    Nonparental ditype (NPD)
A⋅B                     A⋅B              A⋅b
A⋅B                     A⋅b              A⋅b
a⋅b                     a⋅B              a⋅B
a⋅b                     a⋅b              a⋅B
The recombinant genotypes are A⋅b and a⋅B. If the genes are linked, a simple approach to mapping their distance apart might be to use the following formula:
$$\text{map distance} = RF = 100\,(NPD + \tfrac{1}{2}T)$$
because this formula gives the percentage of all recombinants. However, in the 1960s, David Perkins developed a formula that compensates for the effects of double crossovers. The Perkins formula thus provides a more accurate estimate of map distance:
$$\text{corrected map distance} = 50\,(T + 6\,NPD)$$
We will not go through the derivation of this formula other than to say that it is based on the totals of the PD, T, and NPD classes expected from meioses with 0, 1, and 2 crossovers (it assumes that higher numbers are vanishingly rare). Let’s look at an example of its use. In our hypothetical cross of A B × a b, the observed frequencies of the tetrad classes are 0.56 PD, 0.41 T, and 0.03 NPD. By using the Perkins formula, we find the corrected map distance between the a and b loci to be
$$50\,[0.41 + (6 \times 0.03)] = 50\,(0.59) = 29.5 \text{ m.u.}$$
Let us compare this value with the uncorrected value obtained directly from the RF. By using the same data, we find
$$\text{uncorrected map distance} = 100\,(\tfrac{1}{2}T + NPD) = 100\,(0.205 + 0.03) = 23.5 \text{ m.u.}$$
This distance is 6 m.u. less than the estimate that we obtained by using the Perkins formula because we did not correct for double crossovers. As an aside, what PD, NPD, and T values are expected when dealing with unlinked genes? The sizes of the PD and NPD classes will be equal as a result of independent assortment. The T class can be produced only from a crossover between either of the two loci and their respective centromeres, and, therefore, the size of the T class will depend on the total size of the two regions lying between locus and centromere. However, the formula $\tfrac{1}{2}T + NPD$ should always yield 0.50, reflecting independent assortment.
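The two tetrad-based estimates can be compared with a short Python sketch that uses the frequencies from the hypothetical cross above. The function name is hypothetical; the inputs may be either counts or frequencies, because they are normalized before use.

```python
def tetrad_map_distances(pd, t, npd):
    """Return (uncorrected, Perkins-corrected) map distances in m.u.

    pd, t, npd are the sizes of the parental ditype, tetratype, and
    nonparental ditype classes (counts or frequencies).
    """
    total = pd + t + npd
    t_freq = t / total
    npd_freq = npd / total
    uncorrected = 100 * (npd_freq + 0.5 * t_freq)  # plain RF
    corrected = 50 * (t_freq + 6 * npd_freq)       # Perkins formula
    return uncorrected, corrected


uncorrected, corrected = tetrad_map_distances(pd=0.56, t=0.41, npd=0.03)
print(round(uncorrected, 1), round(corrected, 1))

# For unlinked genes PD equals NPD, so the plain RF comes out at 50 percent
# whatever the size of the T class (hypothetical frequencies shown here).
unlinked_rf, _ = tetrad_map_distances(pd=0.2, t=0.6, npd=0.2)
print(round(unlinked_rf, 1))
```

The gap between the two estimates grows with the NPD class, because NPD tetrads are the visible trace of four-strand double crossovers.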
Key Concept The inherent tendency of multiple crossovers to lead to an underestimation of map distance can be circumvented by the use of map functions (in any organism) and by the Perkins formula (in tetrad-producing organisms such as fungi).
4.7 Using Recombination-Based Maps in Conjunction with Physical Maps Recombination maps have been the main topic of this chapter. They show the loci of genes for which mutant alleles (and their mutant phenotypes) have been found. The positions of these loci on a map are determined on the basis of the frequency of recombinants at meiosis. The frequency of recombinants is assumed to be proportional to the distance apart of two loci on the chromosome; hence, recombinant frequency becomes the mapping unit. Such recombination-based mapping of genes with known mutant phenotypes has been done for nearly a century. We have seen how sites of molecular heterozygosity (unassociated with mutant phenotypes) also can be incorporated into such recombination maps. Like any heterozygous site, these molecular markers are mapped by recombination and then used to navigate toward a gene of biological interest. We make the perfectly reasonable assumption that a recombination map represents the arrangement of genes on chromosomes, but, as stated earlier, these maps are really hypothetical constructs. In contrast, physical maps are as close to the real genome map as science can get. The topic of physical maps will be examined more closely in Chapter 14, but we can foreshadow it here. A physical map is simply a map of the actual genomic DNA, a very long DNA nucleotide sequence, showing where genes are, their sequence, how big they are, what is between them, and other landmarks of interest. The units of distance on a physical map are numbers of DNA bases; for convenience, the kilobase is the preferred unit. The complete sequence of a DNA molecule is obtained by sequencing large numbers of small genomic fragments and then assembling them into one whole sequence. The sequence is then scanned by a computer, programmed to look for gene-like segments recognized by characteristic base sequences including known signal sequences for the initiation and termination of transcription. 
When the computer’s program finds a gene, it compares its sequence with the public database of other sequenced genes for which functions have been discovered in other organisms. In many cases, there is a “hit”; in other words, the sequence closely resembles that of a gene of known function in another species. In such cases, the functions of the two genes also may be similar. The sequence similarity (often close to 100 percent) is explained by the inheritance of the gene from some common ancestor and the general conservation of functional sequences through evolutionary time. Other genes discovered by the computer show no sequence similarity to any gene of known function. Hence, they can be considered “genes in search of a function.” In reality, of course, it is the researcher, not the gene, who searches and who must find the function. Sequencing different individual members of a population also can yield sites of molecular heterozygosity, which, just as they do in recombination maps, act as orientation markers on the physical map. Because physical maps are now available for most of the main genetic model organisms, is there really any need for recombination maps? Could they be considered outmoded? The answer is that both maps are used in conjunction with each other to “triangulate” in determining gene function, a principle illustrated earlier by the London maps. The general approach is illustrated in Figure 4-20, which shows a physical map and a recombination map of the same region of a genome. Both maps contain genes and molecular markers. In the lower part of Figure 4-20, we see a section of a recombination-based map, with positions of genes for which mutant phenotypes have been found and mapped. Not all the genes in that segment are included. For some of these genes, a function may have been discovered on the basis of biochemical or other studies of mutant strains; genes for proteins A and B are examples. 
The gene in the middle is a “gene of interest” that a researcher has found to affect the aspect of development being studied. To determine its function, the physical map can be useful. The genes in the physical map that are in the general region of the gene of interest on the recombination map become candidate genes, any one of which could be the gene of interest. Further studies are needed to narrow the choice to one. If that single case is a gene whose function is known for other organisms, then a function for the gene of interest is suggested. In this way, the phenotype mapped on the recombination map can be tied to a function deduced from the physical map. Molecular markers on both maps (not shown in Figure 4-20) can be aligned to help in the zeroing-in process. Hence, we see that both maps contain elements of function: the physical map shows a gene’s possible action at the cellular level, whereas the recombination map contains information related to the effect of the gene at the phenotypic level. At some stage, the two have to be melded to understand the gene’s contribution to the development of the organism. There are several other genetic-mapping techniques, some of which we will encounter in Chapters 5, 18, and 19.
[Figure 4-20 diagram, titled “Alignment of physical and recombination maps.” Physical map (scale bar 20 kb): DNA sequence for protein A; candidate genes; DNA sequence for protein B. Recombination map (scale bar 1 map unit; labeled distances of 1.2 m.u. and 3 m.u.): locus of gene with mutant phenotype, known to lack protein A; locus of gene with mutant phenotype, unknown cell function; locus of gene with mutant phenotype, known to lack protein B. Key: function suspected from other organisms; function unknown.]
Key Concept The union of recombination and physical maps can ascribe biochemical function to a gene identified by its mutant phenotype.
4.8 The Molecular Mechanism of Crossing Over In this chapter we have analyzed the genetic consequences of the cytologically visible process of crossing over without worrying about the mechanism of crossing over. However, crossing over is remarkable in itself as a molecular process: how can two large coiled molecules of DNA exchange segments with a precision so exact that no nucleotides are lost or gained? Studies on fungal octads gave a clue. Although most octads show the expected 4 : 4 segregation of alleles such as 4A : 4a, some rare octads show aberrant ratios. There are several types, but as an example we will use 5 : 3 octads (either 5A : 3a or 5a : 3A). Two things are peculiar about this ratio. First, there is one too many spores of one allele and one too few of the other. Second, there is a nonidentical sister-spore pair. Normally, postmeiotic replication gives identical sister-spore pairs as follows: the A A a a tetrad becomes A-A A-A a-a a-a (the hyphens show sister spores). In contrast, an aberrant 5A : 3a octad must be A-A A-A A-a a-a. In other words, there is one nonidentical sister-spore pair (the A-a pair).
Figure 4-20 Comparison of relative positions on physical and recombination maps can connect phenotype with an unknown gene function.
156 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
Figure 4-21 A molecular model of crossing over. Only the two chromatids (blue and red) participating in the crossover are shown. The 3′-to-5′ strand is placed on the inside of both for clarity. The chromatids differ at one site, GC in one allele (perhaps allele A) and AT in the other (perhaps a). Only the outcome with mispaired heteroduplex DNA and a crossover is shown. The final crossover products are shaded in yellow and blue.
[Figure title: Crossing over creates heteroduplex DNA. Panel labels: inner two chromatids; double-strand break; erosion; invasion and displacement; polymerization.]
The observation of a nonidentical sister-spore pair suggests that the DNA of one of the final four meiotic homologs contains heteroduplex DNA. Heteroduplex DNA is DNA in which there is a mismatched nucleotide pair in the gene under study. The logic is as follows. If in a cross of A × a, one allele (A) is G : C and the other allele (a) is A : T, the two alleles would usually replicate faithfully. However, a heteroduplex, which forms only rarely, would be a mismatched nucleotide pair such as G : T or A : C (effectively a DNA molecule bearing both A and a information). Note that a heteroduplex involves only one nucleotide position: the surrounding DNA segment might be as follows, where the heteroduplex site is the mismatched G : T pair at the seventh position:
GCTAATGTTATTAG
CGATTATAATAATC
At replication to form an octad, a G : T heteroduplex would pull apart and replicate faithfully, with G bonding to C and T bonding to A. The result would be a nonidentical spore pair of G : C (allele A) and A : T (allele a). Nonidentical sister spores (and aberrant octads generally) were found to be statistically correlated with crossing over in the region of the gene concerned, providing an important clue that crossing over might be based on the formation of heteroduplex DNA.
In the currently accepted model (follow it in Figure 4-21), the heteroduplex DNA and a crossover are both produced by a double-stranded break in the DNA of one of the chromatids participating in the crossover. Let’s see how that works. Molecular studies show that broken ends of DNA will promote recombination between different chromatids. In step 1, both strands of a chromatid break in the same location. From the break, DNA is eroded at the 5′ end of each broken strand, leaving both 3′ ends single stranded (step 2). One of the single strands “invades” the DNA of the other participating chromatid; that is, it enters the center of the helix and base-pairs with its homologous sequence (step 3), displacing the other strand. Then the tip of the invading strand uses the adjacent sequence as a template for new polymerization, which proceeds by forcing the two resident strands of the helix apart (step 4). The displaced single-stranded loop hydrogen bonds with the other single strand (the blue one in the figure). If the invasion and strand displacement spans a site of heterozygosity (such as A/a), then a region of heteroduplex DNA is formed. Replication also takes place from the other single-stranded end to fill the gap left by the invading strand (also shown on the upper blue strand in step 4 of Figure 4-21). The replicated ends are sealed, and the net result is a strange structure with two single-stranded junctions called Holliday junctions after their original proposer, Robin Holliday. These junctions are potential sites of single-strand breakage and reunion; two such events, shown by the darts in the figure, then lead to a complete double-stranded crossover (step 5).
Note that when the invading strand uses the invaded DNA as a replication template, this automatically results in an extra copy of the invaded sequence at the expense of the invading sequence, thus explaining the departure from the expected 4 : 4 ratio.
This same sort of recombination takes place at many different chromosomal sites where the invasion and strand displacement do not span a heterozygous mutant site. Here DNA would be formed that is heteroduplex in the sense that it is composed of strands of each participating chromatid, but there would not be a mismatched nucleotide pair and the resulting octad would contain only identical spore pairs. Those rare occasions in which the invasion and polymerization do span a heterozygous site are simply lucky cases that provided the clue for the mechanism of crossing over.
KEY CONCEPT A crossover is initiated by a double-stranded break in the DNA of a chromatid at meiosis. A series of molecular events ensues that eventually produces crossover DNA molecules. (In addition, if the site of the crossover happens to be near a site of DNA heterozygosity in meiosis, aberrant non-Mendelian allele ratios for the heterozygous site may be produced.)
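The link between a heteroduplex and a 5 : 3 octad can be traced with a small sketch. It assumes the allele assignment used in the text (G : C is allele A, A : T is allele a) and models each meiotic product as a pair of bases at the site studied. The function names and the data layout are illustrative choices, not part of the model itself.

```python
from collections import Counter

COMPLEMENT = {"G": "C", "C": "G", "A": "T", "T": "A"}
# Allele assignment taken from the text: G:C is allele A, A:T is allele a.
ALLELE = {frozenset("GC"): "A", frozenset("AT"): "a"}


def replicate(duplex):
    """Faithful replication: each strand of the duplex templates its complement."""
    top, bottom = duplex
    return [(top, COMPLEMENT[top]), (COMPLEMENT[bottom], bottom)]


def octad(tetrad):
    """Sister-spore pairs formed when each of four meiotic products replicates once."""
    pairs = []
    for duplex in tetrad:
        sisters = tuple(ALLELE[frozenset(daughter)] for daughter in replicate(duplex))
        pairs.append(sisters)
    return pairs


def allele_counts(pairs):
    """Count the A and a spores in an octad given as sister-spore pairs."""
    return Counter(allele for pair in pairs for allele in pair)


# Four meiotic products, each written as the base pair at the site studied.
normal = [("G", "C"), ("G", "C"), ("A", "T"), ("A", "T")]
# Same tetrad, but one product carries a G:T heteroduplex at the site.
aberrant = [("G", "C"), ("G", "C"), ("G", "T"), ("A", "T")]

print(octad(normal), dict(allele_counts(octad(normal))))
print(octad(aberrant), dict(allele_counts(octad(aberrant))))
```

In this sketch the normal tetrad gives four identical sister pairs and a 4 : 4 ratio, whereas the tetrad with one G : T heteroduplex gives a single nonidentical pair (A-a) and a 5A : 3a ratio.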
SUMMARY
In a dihybrid testcross in Drosophila, Thomas Hunt Morgan found a deviation from Mendel’s law of independent assortment. He postulated that the two genes were located on the same pair of homologous chromosomes. This relation is called linkage. Linkage explains why the parental gene combinations stay together but not how the recombinant (nonparental) combinations arise. Morgan postulated that, in meiosis, there may be a physical exchange of chromosome parts by a process now called crossing over.
A result of the physical breakage and reunion of chromosome parts, crossing over takes place at the four-chromatid stage of meiosis. Thus, there are two types of meiotic recombination. Recombination by Mendelian independent assortment results in a recombinant frequency of 50 percent. Crossing over results in a recombinant frequency generally less than 50 percent.
As Morgan studied more linked genes, he discovered many different values for recombinant frequency and wondered if these values corresponded to the actual distances between genes on a chromosome. Alfred Sturtevant, a student of Morgan’s, developed a method of determining the distance between genes on a linkage map, based on the RF. The easiest way to measure RF is with a testcross of a dihybrid or trihybrid. RF values calculated as percentages can be used as map units to construct a chromosomal map showing the loci of the genes analyzed. In ascomycete fungi, centromeres also can be located on the map by measuring second-division segregation frequencies.
Single nucleotide polymorphisms (SNPs) are single-nucleotide differences in sequences of DNA. Simple sequence length polymorphisms (SSLPs) are differences in the number of repeating units. SNPs and SSLPs can be used as molecular markers for mapping genes.
Although the basic test for linkage is deviation from independent assortment, such a deviation may not be obvious in a testcross, and a statistical test is needed. The χ² test, which tells how often observations deviate from expectations purely by chance, is particularly useful in determining whether loci are linked.
Some multiple crossovers can result in nonrecombinant chromatids, leading to an underestimation of map distance based on RF. The mapping function, applicable in any organism, corrects for this tendency. The Perkins formula has the same use in fungal tetrad analysis.
In genetics generally, the recombination-based map of loci conferring mutant phenotypes is used in conjunction with a physical map such as the complete DNA sequence, which shows all the gene-like sequences. Knowledge of gene position in both maps enables the melding of cellular function with a gene’s effect on phenotype.
The mechanism of crossing over is thought to start with a double-stranded break in one participating chromatid. Erosion leaves the ends single stranded. One single strand invades the double helix of the other participating chromatid, leading to the formation of heteroduplex DNA. Gaps are filled by polymerization. The molecular resolution of this structure becomes a full double-stranded crossover at the DNA level.
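The two corrections named in the summary can be written out as a short sketch. The formulas are not restated in this section, so the forms used here are assumptions: the standard mapping function RF = ½(1 − e^(−m)), with m the mean number of crossovers per meiosis and 50 m.u. per unit of m, and the Perkins formula, map distance = 50(T + 6NPD), with T and NPD as frequencies. The function names and the input values are illustrative only.

```python
import math


def rf_from_mean_crossovers(m):
    """Mapping function (assumed form): RF = 1/2 * (1 - e^(-m))."""
    return 0.5 * (1.0 - math.exp(-m))


def corrected_map_units(rf):
    """Invert the mapping function and convert m to map units (m * 50)."""
    m = -math.log(1.0 - 2.0 * rf)
    return 50.0 * m


def perkins_map_units(t_freq, npd_freq):
    """Perkins formula (assumed form): map distance = 50 * (T + 6 * NPD)."""
    return 50.0 * (t_freq + 6.0 * npd_freq)


# Illustrative inputs, not data from this section.
print(round(corrected_map_units(0.275), 1))      # an RF of 27.5 percent
print(round(perkins_map_units(0.41, 0.03), 1))   # T = 0.41, NPD = 0.03
```

Both corrections return a distance larger than the raw RF, which is the direction the summary describes: multiple crossovers make the uncorrected RF an underestimate.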
KEY TERMS
centimorgan (cM) (p. 136)
chromosome map (p. 129)
cis conformation (p. 132)
coefficient of coincidence (c.o.c.) (p. 142)
crossing over (p. 132)
crossover product (p. 132)
DNA fingerprint (p. 146)
double-stranded break (p. 156)
first-division segregation pattern (MI pattern) (p. 149)
genetic map unit (m.u.) (p. 136)
heteroduplex DNA (p. 156)
interference (p. 141)
linkage map (p. 136)
linked genes (p. 129)
locus (p. 129)
mapping function (p. 151)
microsatellite marker (p. 145)
minisatellite marker (p. 145)
molecular marker (p. 144)
octad (p. 148)
physical map (p. 154)
Poisson distribution (p. 151)
recombinant frequency (RF) (p. 136)
recombination map (p. 129)
restriction fragment length polymorphism (RFLP) (p. 145)
second-division segregation pattern (MII) (p. 149)
simple sequence length polymorphism (SSLP) (p. 145)
single nucleotide polymorphism (SNP) (p. 145)
three-factor cross (p. 139)
three-point testcross (p. 139)
trans conformation (p. 132)
variable number tandem repeat (VNTR) (p. 145)
SOLVED PROBLEMS
SOLVED PROBLEM 1. A human pedigree shows people affected with the rare nail–patella syndrome (misshapen nails and kneecaps) and gives the ABO blood-group genotype of each person. Both loci concerned are autosomal. Study the pedigree below.
[Pedigree: three generations, with the ABO genotype of each person listed here in order of individual number; the symbols marking affected people are not reproduced.
Generation I: 1 i/i; 2 I^B/i
Generation II: 1 i/i; 2 I^B/i; 3 I^B/i; 4 i/i; 5 I^B/i; 6 I^B/i; 7 I^A/i; 8 i/i; 9 i/i; 10 i/i; 11 I^B/i; 12 i/i; 13 i/i; 14 i/i; 15 I^B/i; 16 I^B/i
Generation III: 1 I^B/i; 2 I^B/i; 3 I^B/i; 4 I^A/i; 5 I^A/i]
a. Is the nail–patella syndrome a dominant or recessive phenotype? Give reasons to support your answer.
b. Is there evidence of linkage between the nail–patella gene and the gene for ABO blood type, as judged from this pedigree? Why or why not?
c. If there is evidence of linkage, then draw the alleles on the relevant homologs of the grandparents. If there is no evidence of linkage, draw the alleles on two homologous pairs.
d. According to your model, which generation II descendants are recombinants?
e. What is the best estimate of RF?
f. If man III-1 marries a normal woman of blood type O, what is the probability that their first child will be blood type B with nail–patella syndrome?
Solution
a. Nail–patella syndrome is most likely dominant. We are told that it is a rare abnormality, and so the unaffected people marrying into the family are unlikely to carry a presumptive recessive allele for nail–patella syndrome. Let N be the causative allele. Then all people with the syndrome are heterozygotes N/n because all (probably including the grandmother) result from matings with n/n normal people. Notice that the syndrome appears in all three generations, another indication of dominant inheritance.
b. There is evidence of linkage. Notice that most of the affected people (those who carry the N allele) also carry the I^B allele; most likely, these alleles are linked on the same chromosome.
c. n i/n i × N I^B/n i
(The grandmother must carry both recessive alleles to produce offspring of genotype i/i and n/n.)
d. Notice that the grandparental mating is equivalent to a testcross; so the recombinants in generation II are
II-5 : n I^B/n i and II-8 : N i/n i
whereas all others are nonrecombinants, being either N I^B/n i or n i/n i.
e. Notice that the grandparental cross and the first two crosses in generation II are identical and are testcrosses. Three of the total 16 progeny are recombinant (II-5, II-8, and III-3). The cross of II-6 with II-7 is not a testcross, but the chromosomes donated from II-6 can be deduced to be nonrecombinant. Thus, RF = 3/18, which is 17 percent.
f. (III-1 ♂) N I^B/n i × n i/n i (normal type O ♀)
Gametes from III-1:
N I^B 41.5% (parental; gives nail–patella, blood type B)
n i 41.5% (parental)
N i 8.5% (recombinant)
n I^B 8.5% (recombinant)
Parental gametes total 83.0%, and recombinant gametes total 17.0%. The two parental classes are always equal, and so are the two recombinant classes. Hence, the probability that the first child will have nail–patella syndrome and blood type B is 41.5 percent.

SOLVED PROBLEM 2. The allele b gives Drosophila flies a black body, and b+ gives brown, the wild-type phenotype. The allele wx of a separate gene gives waxy wings, and wx+ gives nonwaxy, the wild-type phenotype. The allele cn of a third gene gives cinnabar eyes, and cn+ gives red, the wild-type phenotype. A female heterozygous for these three genes is testcrossed, and 1000 progeny are classified as follows: 5 wild type; 6 black, waxy, cinnabar; 69 waxy, cinnabar; 67 black; 382 cinnabar; 379 black, waxy; 48 waxy; and 44 black, cinnabar. Note that a progeny group may be specified by listing only the mutant phenotypes.
a. Explain these numbers.
b. Draw the alleles in their proper positions on the chromosomes of the triple heterozygote.
c. If appropriate according to your explanation, calculate interference.
Solution
a. A general piece of advice is to be methodical. Here, it is a good idea to write out the genotypes that may be inferred from the phenotypes. The cross is a testcross of type
b+/b ⋅ wx+/wx ⋅ cn+/cn × b/b ⋅ wx/wx ⋅ cn/cn
Notice that there are distinct pairs of progeny classes in regard to frequency. Already, we can guess that the two largest classes represent parental chromosomes, that the two classes of about 68 represent single crossovers in one region, that the two classes of about 45 represent single crossovers in the other region, and that the two classes of about 5 represent double crossovers. We can write out the progeny as classes derived from the female’s gametes, grouped as follows:
b+ ⋅ wx+ ⋅ cn     382
b ⋅ wx ⋅ cn+      379
b+ ⋅ wx ⋅ cn      69
b ⋅ wx+ ⋅ cn+     67
b+ ⋅ wx ⋅ cn+     48
b ⋅ wx+ ⋅ cn      44
b ⋅ wx ⋅ cn       6
b+ ⋅ wx+ ⋅ cn+    5
Total             1000
Listing the classes in this way confirms that the pairs of classes are in fact reciprocal genotypes arising from zero, one, or two crossovers. At first, because we do not know the parents of the triple heterozygous female, it looks as if we cannot apply the definition of recombination in which gametic genotypes are compared with the two parental genotypes that form an individual fly. But, on reflection, the only parental types that make sense in regard to the data presented are b+/b+ ⋅ wx+/wx+ ⋅ cn/cn and b/b ⋅ wx/wx ⋅ cn+/cn+ because these types represent the most common gametic classes. Now, we can calculate the recombinant frequencies.
For b–wx, RF = (69 + 67 + 48 + 44)/1000 = 22.8%
for b–cn, RF = (48 + 44 + 6 + 5)/1000 = 10.3%
and for wx–cn, RF = (69 + 67 + 6 + 5)/1000 = 14.7%
The map is therefore
b ---10.3 m.u.--- cn ---14.7 m.u.--- wx
b. The parental chromosomes in the triple heterozygote are
b+ cn wx+
b cn+ wx
c. The expected number of double recombinants is 0.103 × 0.147 × 1000 = 15.141. The observed number is 6 + 5 = 11, and so interference can be calculated as
I = 1 − (11/15.141) = 1 − 0.726 = 0.274 = 27.4%

SOLVED PROBLEM 3. A cross is made between a haploid strain of Neurospora of genotype nic+ ad and another haploid strain of genotype nic ad+. From this cross, a total of 1000 linear asci are isolated and categorized as in the table below. Map the ad and nic loci in relation to centromeres and to each other.
Each ascus type is listed as its four sister-spore pairs, from top to bottom, followed by the number of asci.
1: nic+ ⋅ ad, nic+ ⋅ ad, nic ⋅ ad+, nic ⋅ ad+ (808)
2: nic+ ⋅ ad+, nic+ ⋅ ad+, nic ⋅ ad, nic ⋅ ad (1)
3: nic+ ⋅ ad+, nic+ ⋅ ad, nic ⋅ ad+, nic ⋅ ad (90)
4: nic+ ⋅ ad, nic ⋅ ad, nic+ ⋅ ad+, nic ⋅ ad+ (5)
5: nic+ ⋅ ad, nic ⋅ ad+, nic+ ⋅ ad, nic ⋅ ad+ (90)
6: nic+ ⋅ ad+, nic ⋅ ad, nic+ ⋅ ad+, nic ⋅ ad (1)
7: nic+ ⋅ ad+, nic ⋅ ad, nic+ ⋅ ad, nic ⋅ ad+ (5)
Solution
What principles can we draw on to solve this problem? It is a good idea to begin by doing something straightforward, which is to calculate the two locus-to-centromere distances. We do not know if the ad and the nic loci are linked, but we do not need to know. The frequencies of the MII patterns for each locus give the distance from locus to centromere. (We can worry about whether it is the same centromere later.)
Remember that an MII pattern is any pattern that is not two blocks of four. Let’s start with the distance between the nic locus and the centromere. All we have to do is add the ascus types 4, 5, 6, and 7, because all of them are MII patterns for the nic locus. The total is 5 + 90 + 1 + 5 = 101 of 1000, or 10.1 percent. In this chapter, we have seen that to convert this percentage into map units, we must divide by 2, which gives 5.05 m.u.
centromere ---5.05 m.u.--- nic
We do the same for the ad locus. Here, the total of the MII patterns is given by types 3, 5, 6, and 7 and is 90 + 90 + 1 + 5 = 186 of 1000, or 18.6 percent, which is 9.3 m.u.
centromere ---9.30 m.u.--- ad
Now we have to put the two together and decide between the following alternatives, all of which are compatible with the preceding locus-to-centromere distances:
a. nic and ad on different chromosomes: centromere ---5.05 m.u.--- nic; centromere ---9.30 m.u.--- ad
b. nic and ad on opposite sides of the same centromere: nic ---5.05 m.u.--- centromere ---9.30 m.u.--- ad
c. nic and ad on the same side of the same centromere: centromere ---5.05 m.u.--- nic ------ ad, with ad 9.30 m.u. from the centromere
Here, a combination of common sense and simple analysis tells us which alternative is correct. First, an inspection of the asci reveals that the most common single type is the one labeled 1, which contains more than 80 percent of all the asci. This type contains only nic+ ⋅ ad and nic ⋅ ad+ genotypes, and they are parental genotypes. So we know that recombination is quite low and the loci are certainly linked. This rules out alternative a.
Now consider alternative c. If this alternative were correct, a crossover between the centromere and the nic locus would generate not only an MII pattern for that locus, but also an MII pattern for the ad locus, because it is farther from the centromere than nic is. The ascus pattern produced by a crossover between nic and the centromere in alternative c should be
[Diagram: a single crossover between the centromere and nic, involving one nic+ ad chromatid and one nic ad+ chromatid, gives the ascus nic+ ad, nic+ ad, nic ad+, nic ad+, nic+ ad, nic+ ad, nic ad+, nic ad+.]
Remember that the nic locus shows MII patterns in asci types 4, 5, 6, and 7 (a total of 101 asci); of them, type 5 is the very one that we are talking about and contains 90 asci. Therefore, alternative c appears to be correct because ascus type 5 comprises about 90 percent of the MII asci for the nic locus. This relation would not hold if alternative b were correct because crossovers on either side of the centromere would generate the MII patterns for the nic and the ad loci independently.
Is the map distance from nic to ad simply 9.30 − 5.05 = 4.25 m.u.? Close, but not quite. The best way of calculating map distances between loci is always by measuring the recombinant frequency. We could go through the asci and count all the recombinant ascospores, but using the formula RF = ½T + NPD is simpler. The T asci are classes 3, 4, and 7, and the NPD asci are classes 2 and 6. Hence, RF = [½(100) + 2]/1000 = 5.2 percent, or 5.2 m.u., and a better map is
centromere ---5.05 m.u.--- nic ---5.2 m.u.--- ad (10.25 m.u. from the centromere to ad)
The reason for the underestimation of the ad-to-centromere distance calculated from the MII frequency is the occurrence of double crossovers, which can produce an MI pattern for ad, as in ascus type 4:
[Diagram: a double crossover, one exchange between the centromere and nic and one between nic and ad, gives the ascus nic+ ad, nic+ ad, nic ad, nic ad, nic+ ad+, nic+ ad+, nic ad+, nic ad+.]
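The arithmetic in Solved Problems 2 and 3 can be rechecked with a short sketch. The counts and the classification of ascus types come from the solutions above; the function names, the dictionary layout, and the rule of treating the two largest progeny classes as parental and the two smallest as double crossovers are illustrative assumptions that happen to fit these data.

```python
# Solved Problem 2: testcross progeny keyed by the alleles (b, wx, cn)
# carried by the gamete from the heterozygous female ("+" = wild type).
GENES = ("b", "wx", "cn")
PROGENY = {
    ("+", "+", "cn"): 382, ("b", "wx", "+"): 379,   # parental classes
    ("+", "wx", "cn"): 69, ("b", "+", "+"): 67,
    ("+", "wx", "+"): 48, ("b", "+", "cn"): 44,
    ("b", "wx", "cn"): 6, ("+", "+", "+"): 5,       # double crossovers
}


def recombinant_frequency(progeny, gene_1, gene_2):
    """RF for one gene pair; the two largest classes are taken as parental."""
    i, j = GENES.index(gene_1), GENES.index(gene_2)
    parental = sorted(progeny, key=progeny.get, reverse=True)[:2]
    parental_pairs = {(g[i], g[j]) for g in parental}
    recombinants = sum(n for g, n in progeny.items()
                       if (g[i], g[j]) not in parental_pairs)
    return recombinants / sum(progeny.values())


def interference(progeny, rf_left, rf_right):
    """I = 1 - observed/expected doubles; the two smallest classes are the doubles."""
    observed = sum(sorted(progeny.values())[:2])
    expected = rf_left * rf_right * sum(progeny.values())
    return 1 - observed / expected


# Solved Problem 3: ascus counts by type, classified as in the solution.
ASCI = {1: 808, 2: 1, 3: 90, 4: 5, 5: 90, 6: 1, 7: 5}
NIC_MII, AD_MII = (4, 5, 6, 7), (3, 5, 6, 7)
TETRATYPE, NPD = (3, 4, 7), (2, 6)


def centromere_distance(asci, mii_types):
    """Locus-to-centromere distance in m.u.: half the percentage of MII asci."""
    mii = sum(asci[t] for t in mii_types)
    return 100 * mii / sum(asci.values()) / 2


def rf_from_tetrads(asci, t_types, npd_types):
    """RF in percent, from RF = (1/2 T + NPD) / total asci."""
    t = sum(asci[k] for k in t_types)
    npd = sum(asci[k] for k in npd_types)
    return 100 * (0.5 * t + npd) / sum(asci.values())


rf_b_cn = recombinant_frequency(PROGENY, "b", "cn")
rf_cn_wx = recombinant_frequency(PROGENY, "wx", "cn")
print(recombinant_frequency(PROGENY, "b", "wx"), rf_b_cn, rf_cn_wx)
print(round(interference(PROGENY, rf_b_cn, rf_cn_wx), 2))
print(centromere_distance(ASCI, NIC_MII), centromere_distance(ASCI, AD_MII))
print(rf_from_tetrads(ASCI, TETRATYPE, NPD))
```

Computed without intermediate rounding, the interference comes out near 0.27, in line with the 27.4 percent obtained in the solution after rounding the ratio to 0.726.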
PROBLEMS
Most of the problems are also available for review/grading through http://www.whfreeman.com/launchpad/iga11e.
Working with the Figures
1. In Figure 4-3, would there be any meiotic products that did not undergo a crossover in the meiosis illustrated? If so, what colors would they be in the color convention used?
2. In Figure 4-6, why does the diagram not show meioses in which two crossovers occur between the same two chromatids (such as the two inner ones)?
3. In Figure 4-8, some meiotic products are labeled parental. Which parent is being referred to in this terminology?
4. In Figure 4-9, why is only locus A shown in a constant position?
5. In Figure 4-10, what is the mean frequency of crossovers per meiosis in the region A–B? The region B–C?
6. In Figure 4-11, is it true to say that from such a cross the product v cv+ can have two different origins?
7. In Figure 4-14, in the bottom row four colors are labeled SCO. Why are they not all the same size (frequency)?
8. Using the conventions of Figure 4-15, draw parents and progeny classes from a cross P M′′′/p M′ × p M′/p M′′′′
9. In Figure 4-17, draw the arrangements of alleles in an octad from a similar meiosis in which the upper product of the first division segregated in an upside-down manner at the second division.
10. In Figure 4-19, what would be the RF between A/a and B/b in a cross in which purely by chance all meioses had four-strand double crossovers in that region?
11. a. In Figure 4-21, let GC = A and AT = a, then draw the fungal octad that would result from the final structure (5).
b. (Challenging) Insert some closely linked flanking markers into the diagram, say P/p to the left and Q/q to the right (assume either cis or trans arrangements). Assume neither of these loci show non-Mendelian segregation. Then draw the final octad based on the structure in part 5.
Basic Problems
12. A plant of genotype A B/a b is testcrossed with a b/a b. If the two loci are 10 m.u. apart, what proportion of progeny will be AB/ab?
13. The A locus and the D locus are so tightly linked that no recombination is ever observed between them. If Ad/Ad is crossed with aD/aD and the F1 is intercrossed, what phenotypes will be seen in the F2 and in what proportions?
14. The R and S loci are 35 m.u. apart. If a plant of genotype R S/r s is selfed, what progeny phenotypes will be seen and in what proportions?
15. The cross E/E ⋅ F/F × e/e ⋅ f/f is made, and the F1 is then backcrossed with the recessive parent. The progeny genotypes are inferred from the phenotypes. The progeny genotypes, written as the gametic contributions of the heterozygous parent, are in the following proportions:
E ⋅ F  2/6
E ⋅ f  1/6
e ⋅ F  1/6
e ⋅ f  2/6
Explain these results.
16. A strain of Neurospora with the genotype H ⋅ I is crossed with a strain with the genotype h ⋅ i. Half the progeny are H ⋅ I, and the other half are h ⋅ i. Explain how this outcome is possible.
17. A female animal with genotype A/a ⋅ B/b is crossed with a double-recessive male (a/a ⋅ b/b). Their progeny include 442 A/a ⋅ B/b, 458 a/a ⋅ b/b, 46 A/a ⋅ b/b, and 54 a/a ⋅ B/b. Explain these results.
18. If A/A ⋅ B/B is crossed with a/a ⋅ b/b and the F1 is testcrossed, what percentage of the testcross progeny will be a/a ⋅ b/b if the two genes are (a) unlinked; (b) completely linked (no crossing over at all); (c) 10 m.u. apart; (d) 24 m.u. apart?
19. In a haploid organism, the C and D loci are 8 m.u. apart. From a cross C d × c D, give the proportion of each of the following progeny classes: (a) C D; (b) c d; (c) C d; (d) all recombinants combined.
20. A fruit fly of genotype B R/b r is testcrossed with b r/b r. In 84 percent of the meioses, there are no chiasmata between the linked genes; in 16 percent of the meioses, there is one chiasma between the genes. What proportion of the progeny will be B r/b r?
21. A three-point testcross was made in corn. The results and a recombination analysis are shown in the display below, which is typical of three-point testcrosses (p = purple leaves, + = green; v = virus-resistant seedlings, + = sensitive; b = brown midriff to seed, + = plain). Study the display, and answer parts a through c.
P        +/+ ⋅ +/+ ⋅ +/+ × p/p ⋅ v/v ⋅ b/b
Gametes  + ⋅ + ⋅ +         p ⋅ v ⋅ b
Class  Progeny phenotypes  F1 gametes  Numbers  Recombinant for p–b  p–v  v–b
1      gre sen pla         + ⋅ + ⋅ +   3,210
2      pur res bro         p ⋅ v ⋅ b   3,222
3      gre res pla         + ⋅ v ⋅ +   1,024                         R    R
4      pur sen bro         p ⋅ + ⋅ b   1,044                         R    R
5      pur res pla         p ⋅ v ⋅ +   690      R                         R
6      gre sen bro         + ⋅ + ⋅ b   678      R                         R
7      gre res bro         + ⋅ v ⋅ b   72       R                    R
8      pur sen pla         p ⋅ + ⋅ +   60       R                    R
Total                                  10,000   1,500                2,200  3,436
a. Determine which genes are linked.
b. Draw a map that shows distances in map units.
c. Calculate interference, if appropriate.
Unpacking Problem 21
1. Sketch cartoon drawings of the P, F1, and tester corn plants, and use arrows to show exactly how you would perform this experiment. Show where seeds are obtained.
2. Why do all the +’s look the same, even for different genes? Why does this not cause confusion?
3. How can a phenotype be purple and brown, for example, at the same time?
4. Is it significant that the genes are written in the order p-v-b in the problem?
5. What is a tester and why is it used in this analysis?
6. What does the column marked “Progeny phenotypes” represent? In class 1, for example, state exactly what “gre sen pla” means.
7. What does the line marked “Gametes” represent, and how is it different from the column marked “F1 gametes”? In what way is comparison of these two types of gametes relevant to recombination?
8. Which meiosis is the main focus of study? Label it on your drawing.
9. Why are the gametes from the tester not shown?
10. Why are there only eight phenotypic classes? Are there any classes missing?
11. What classes (and in what proportions) would be expected if all the genes are on separate chromosomes?
12. To what do the four pairs of class sizes (very big, two intermediates, very small) correspond?
13. What can you tell about gene order simply by inspecting the phenotypic classes and their frequencies?
14. What will be the expected phenotypic class distribution if only two genes are linked?
15. What does the word “point” refer to in a three-point testcross? Does this word usage imply linkage? What would a four-point testcross be like?
16. What is the definition of recombinant, and how is it applied here?
17. What do the “Recombinant for” columns mean?
18. Why are there only three “Recombinant for” columns?
19. What do the R’s mean, and how are they determined?
20. What do the column totals signify? How are they used?
21. What is the diagnostic test for linkage?
22. What is a map unit? Is it the same as a centimorgan?
23. In a three-point testcross such as this one, why aren’t the F1 and the tester considered to be parental in calculating recombination? (They are parents in one sense.)
24. What is the formula for interference? How are the “expected” frequencies calculated in the coefficient-of-coincidence formula?
25. Why does part c of the problem say “if appropriate”?
26. How much work is it to obtain such a large progeny size in corn? Which of the three genes would take the most work to score? Approximately how many progeny are represented by one corncob?

22. You have a Drosophila line that is homozygous for autosomal recessive alleles a, b, and c, linked in that order. You cross females of this line with males homozygous for the corresponding wild-type alleles. You then cross the F1 heterozygous males with their heterozygous sisters. You obtain the following F2 phenotypes (where letters denote recessive phenotypes and pluses denote wild-type phenotypes): 1364 + + +, 365 a b c, 87 a b +, 84 + + c, 47 a + +, 44 + b c, 5 a + c, and 4 + b +.
a. What is the recombinant frequency between a and b? Between b and c? (Remember, there is no crossing over in Drosophila males.)
b. What is the coefficient of coincidence?
23. R. A. Emerson crossed two different pure-breeding lines of corn and obtained a phenotypically wild-type F1 that was heterozygous for three alleles that determine recessive phenotypes: an determines anther; br, brachytic; and f, fine. He testcrossed the F1 with a tester that was homozygous recessive for the three genes and obtained these progeny phenotypes: 355 anther; 339 brachytic, fine; 88 completely wild type; 55 anther, brachytic, fine; 21 fine; 17 anther, brachytic; 2 brachytic; 2 anther, fine.
a. What were the genotypes of the parental lines?
b. Draw a linkage map for the three genes (include map distances).
c. Calculate the interference value.
24. Chromosome 3 of corn carries three loci (b for plant-color booster, v for virescent, and lg for liguleless). A testcross of triple recessives with F1 plants heterozygous for the three genes yields progeny having the following genotypes: 305 + v lg, 275 b + +, 128 b + lg, 112 + v +, 74 + + lg, 66 b v +, 22 + + +, and 18 b v lg. Give the gene sequence on the chromosome, the map distances between genes, and the coefficient of coincidence.
25. Groodies are useful (but fictional) haploid organisms that are pure genetic tools. A wild-type groody has a fat body, a long tail, and flagella. Mutant lines are known that have thin bodies, are tailless, or do not have flagella. Groodies can mate with one another (although they are so shy that we do not know how) and produce recombinants. A wild-type groody mates with a thin-bodied groody lacking both tail and flagella. The 1000 baby groodies produced are classified as shown in the illustration here. Assign genotypes, and map the three genes. (Problem 25 is from Burton S. Guttman.)
[Illustration: eight phenotypic classes of baby groodies, with counts 370, 398, 72, 67, 44, 35, 5, and 9; the drawings that identify each class are not reproduced.]
26. In Drosophila, the allele dp+ determines long wings and dp determines short (“dumpy”) wings. At a separate locus, e+ determines gray body and e determines ebony body. Both loci are autosomal. The following crosses were made, starting with pure-breeding parents:
P   long, ebony × short, gray
F1  long, gray × short, ebony (pure)
F2  long, ebony   54
    long, gray    47
    short, gray   52
    short, ebony  47
    Total         200
Use the χ² test to determine if these loci are linked. In doing so, indicate (a) the hypothesis, (b) calculation of χ², (c) p value, (d) what the p value means, (e) your conclusion, (f) the inferred chromosomal constitutions of parents, F1, tester, and progeny.
27. The mother of a family with 10 children has blood type Rh+. She also has a very rare condition (elliptocytosis, phenotype E) that causes red blood cells to be oval rather than round in shape but that produces no adverse clinical effects. The father is Rh− (lacks the Rh+ antigen) and has normal red blood cells (phenotype e). The children are 1 Rh+ e, 4 Rh+ E, and 5 Rh− e. Information is available on the mother’s parents, who are Rh+ E and Rh− e. One of the 10 children (who is Rh+ E) marries someone who is Rh+ e, and they have an Rh+ E child.
a. Draw the pedigree of this whole family.
b. Is the pedigree in agreement with the hypothesis that the Rh+ allele is dominant and Rh− is recessive?
c. What is the mechanism of transmission of elliptocytosis?
d. Could the genes governing the E and Rh phenotypes be on the same chromosome? If so, estimate the map distance between them, and comment on your result.
28. From several crosses of the general type A/A ⋅ B/B × a/a ⋅ b/b, the F1 individuals of type A/a ⋅ B/b were testcrossed with a/a ⋅ b/b. The results are as follows: Testcross progeny Testcross of F1 from cross
A/a ⋅
a/a ⋅
A/a ⋅
a/a ⋅
B/b
b/b
b/b
B/b
1 2 3 4
310 36 360 74
315 38 380 72
287 23 230 50
288 23 230 44
For each set of progeny, use the χ2 test to decide if there is evidence of linkage. 29. In the two pedigrees diagrammed here, a vertical bar in a symbol stands for steroid sulfatase deficiency, and a horizontal bar stands for ornithine transcarbamylase deficiency. First pedigree I
1
IV
I
2
II
III
Second pedigree
1
1
2
1
2
1
II
2
3
4
5
III
6
1
1
2
2
2
3
3
a. Is there any evidence in these pedigrees that the genes determining the deficiencies are linked? b. If the genes are linked, is there any evidence in the pedigree of crossing over between them? c. Assign genotypes of these individuals as far as possible. 30. In the accompanying pedigree, the vertical lines stand for protan color blindness, and the horizontal lines stand for deutan color blindness. These are separate conditions causing different misperceptions of colors; each is determined by a separate gene. I
1
II
III
1
1
2
2
2
3
3
4
4
5
5
Problems 16 5
a. Does the pedigree show any evidence that the genes are linked? b. If there is linkage, does the pedigree show any evidence of crossing over? Explain your answers to parts a and b with the aid of the diagram. c. Can you calculate a value for the recombination between these genes? Is this recombination by independent assortment or by crossing over?
31. In corn, a triple heterozygote was obtained carrying the mutant alleles s (shrunken), w (white aleurone), and y (waxy endosperm), all paired with their normal wild-type alleles. This triple heterozygote was testcrossed, and the progeny contained 116 shrunken, white; 4 fully wild type; 2538 shrunken; 601 shrunken, waxy; 626 white; 2708 white, waxy; 2 shrunken, white, waxy; and 113 waxy. a. Determine if any of these three loci are linked and, if so, show map distances. b. Show the allele arrangement on the chromosomes of the triple heterozygote used in the testcross. c. Calculate interference, if appropriate.
32. a. A mouse cross A/a ⋅ B/b × a/a ⋅ b/b is made, and in the progeny there are 25% A/a ⋅ B/b, 25% a/a ⋅ b/b, 25% A/a ⋅ b/b, 25% a/a ⋅ B/b. Explain these proportions with the aid of simplified meiosis diagrams. b. A mouse cross C/c ⋅ D/d × c/c ⋅ d/d is made, and in the progeny there are 45% C/c ⋅ d/d, 45% c/c ⋅ D/d, 5% c/c ⋅ d/d, 5% C/c ⋅ D/d. Explain these proportions with the aid of simplified meiosis diagrams.
33. In the tiny model plant Arabidopsis, the recessive allele hyg confers seed resistance to the drug hygromycin, and her, a recessive allele of a different gene, confers seed resistance to herbicide. A plant that was homozygous hyg/hyg ⋅ her/her was crossed with wild type, and the F1 was selfed. Seeds resulting from the F1 self were placed on petri dishes containing hygromycin and herbicide. a. If the two genes are unlinked, what percentage of seeds are expected to grow? b. In fact, 13 percent of the seeds grew. Does this percentage support the hypothesis of no linkage? Explain. If not, calculate the number of map units between the loci. c. Under your hypothesis, if the F1 is testcrossed, what proportion of seeds will grow on the medium containing hygromycin and herbicide?
34. In a diploid organism of genotype A/a ; B/b ; D/d, the allele pairs are all on different chromosome pairs. The two diagrams below purport to show anaphases (“pulling apart” stages) in individual cells. State whether each drawing represents mitosis, meiosis I, or meiosis II or is impossible for this particular genotype.
[Anaphase diagrams, panels a to j, with chromosomes labeled by alleles of A/a, B/b, and D/d]
35. The Neurospora cross al-2+ × al-2 is made. A linear tetrad analysis reveals that the second-division segregation frequency is 8 percent. a. Draw two examples of second-division segregation patterns in this cross. b. What can be calculated by using the 8 percent value?
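Three-point testcross problems such as Problem 31 follow the same routine each time: identify the parental and double-crossover classes, deduce the middle gene, then sum recombinants for each interval. The sketch below illustrates that routine. The function name `three_point_map`, the gamete-string encoding, and the example counts are all hypothetical (they are not the data of any problem in this set), and the sketch assumes that all eight classes are listed, with a zero entered for any class that was not observed.

```python
def three_point_map(counts):
    """Order three linked genes and estimate map distances from testcross counts.

    counts maps a three-character gamete string (one character per gene,
    '+' for a wild-type allele) to the number of progeny in that class.
    """
    total = sum(counts.values())
    ranked = sorted(counts, key=counts.get, reverse=True)
    parentals = ranked[:2]            # the two most frequent classes
    double_crossovers = ranked[-2:]   # the two rarest classes

    # The middle gene is the single position at which a double-crossover
    # class differs from one of the parental classes.
    middle = None
    for parent in parentals:
        diffs = [i for i in range(3) if parent[i] != double_crossovers[0][i]]
        if len(diffs) == 1:
            middle = diffs[0]
    if middle is None:
        raise ValueError("could not identify the middle gene from these counts")

    def rf(x, y):
        # A gamete is recombinant for genes x and y if its allele pair
        # matches neither parental combination.
        parental_pairs = {(p[x], p[y]) for p in parentals}
        recombinant = sum(n for g, n in counts.items()
                          if (g[x], g[y]) not in parental_pairs)
        return recombinant / total

    left, right = [i for i in range(3) if i != middle]
    rf_left, rf_right = rf(left, middle), rf(middle, right)
    observed_dco = sum(counts[g] for g in double_crossovers) / total
    coincidence = observed_dco / (rf_left * rf_right)
    return {
        "middle": middle,                  # index of the middle gene
        "rf_left": rf_left,                # RF, first outer gene to middle
        "rf_right": rf_right,              # RF, middle to second outer gene
        "coincidence": coincidence,        # observed / expected double crossovers
        "interference": 1 - coincidence,
    }


# Hypothetical counts for genes a, b, c (written in that order in each string).
example_counts = {
    "+++": 406, "abc": 406,
    "a++": 50, "+bc": 50,
    "ab+": 40, "++c": 40,
    "a+c": 4, "+b+": 4,
}
print(three_point_map(example_counts))
```

Multiplying each RF by 100 gives the interval length in map units. With real data, ties among the rare classes or a missing class can mislead the ranking step, so it is worth checking the parental and double-crossover classes by eye first.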
16 6 CHAPTER 4 Mapping Eukaryote Chromosomes by Recombination
36. From the fungal cross arg-6 ⋅ al-2 × arg-6+ ⋅ al-2+, what will the spore genotypes be in unordered tetrads that are (a) parental ditypes? (b) tetratypes? (c) nonparental ditypes?
37. For a certain chromosomal region, the mean number of crossovers at meiosis is calculated to be two per meiosis. In that region, what proportion of meioses are predicted to have (a) no crossovers? (b) one crossover? (c) two crossovers?
38. A Neurospora cross was made between a strain that carried the mating-type allele A and the mutant allele arg-1 and another strain that carried the mating-type allele a and the wild-type allele for arg-1 (+). Four hundred linear octads were isolated, and they fell into the seven classes given in the table below. (For simplicity, they are shown as tetrads.)

Class     1         2         3         4         5         6         7
          A ⋅ arg   A ⋅ +     A ⋅ arg   A ⋅ arg   A ⋅ arg   A ⋅ +     A ⋅ +
          A ⋅ arg   A ⋅ +     A ⋅ +     a ⋅ arg   a ⋅ +     a ⋅ arg   a ⋅ arg
          a ⋅ +     a ⋅ arg   a ⋅ arg   A ⋅ +     A ⋅ arg   A ⋅ +     A ⋅ arg
          a ⋅ +     a ⋅ arg   a ⋅ +     a ⋅ +     a ⋅ +     a ⋅ arg   a ⋅ +
Number    127       125       100       36        2         4         6

a. Deduce the linkage arrangement of the mating-type locus and the arg-1 locus. Include the centromere or centromeres on any map that you draw. Label all intervals in map units. b. Diagram the meiotic divisions that led to class 6. Label clearly.

Unpacking Problem 38
1. Are fungi generally haploid or diploid?
2. How many ascospores are in the ascus of Neurospora? Does your answer match the number presented in this problem? Explain any discrepancy.
3. What is mating type in fungi? How do you think it is determined experimentally?
4. Do the symbols A and a have anything to do with dominance and recessiveness?
5. What does the symbol arg-1 mean? How would you test for this genotype?
6. How does the arg-1 symbol relate to the symbol +?
7. What does the expression wild type mean?
8. What does the word mutant mean?
9. Does the biological function of the alleles shown have anything to do with the solution of this problem?
10. What does the expression linear octad analysis mean?
11. In general, what more can be learned from linear tetrad analysis that cannot be learned from unordered tetrad analysis?
12. How is a cross made in a fungus such as Neurospora? Explain how to isolate asci and individual ascospores. How does the term tetrad relate to the terms ascus and octad?
13. Where does meiosis take place in the Neurospora life cycle? (Show it on a diagram of the life cycle.)
14. What does Problem 38 have to do with meiosis?
15. Can you write out the genotypes of the two parental strains?
16. Why are only four genotypes shown in each class?
17. Why are there only seven classes? How many ways have you learned for classifying tetrads generally? Which of these classifications can be applied to both linear and unordered tetrads? Can you apply these classifications to the tetrads in this problem? (Classify each class in as many ways as possible.) Can you think of more possibilities in this cross? If so, why are they not shown?
18. Do you think there are several different spore orders within each class? Why would these different spore orders not change the class?
19. Why is the following class not listed? A ⋅ arg  a ⋅ +  A ⋅ arg  a ⋅ +
20. What does the expression linkage arrangement mean?
21. What is a genetic interval?
22. Why does the problem state “centromere or centromeres” and not just “centromere”? What is the general method for mapping centromeres in tetrad analysis?
23. What is the total frequency of A ⋅ + ascospores? (Did you calculate this frequency by using a formula or by inspection? Is this a recombinant genotype? If so, is it the only recombinant genotype?)
24. The first two classes are the most common and are approximately equal in frequency. What does this information tell you? What is their content of parental and recombinant genotypes?
39. A geneticist studies 11 different pairs of Neurospora loci by making crosses of the type a ⋅ b × a+ ⋅ b+ and then analyzing 100 linear asci from each cross. For the convenience of making a table, the geneticist organizes the data as if all 11 pairs of genes had the same designation, a and
b, as shown below. For each cross, map the loci in relation to each other and to centromeres.

Number of asci of type
Cross   a ⋅ b     a ⋅ b+    a ⋅ b     a ⋅ b     a ⋅ b     a ⋅ b+    a ⋅ b+
        a ⋅ b     a ⋅ b+    a ⋅ b+    a+ ⋅ b    a+ ⋅ b+   a+ ⋅ b    a+ ⋅ b
        a+ ⋅ b+   a+ ⋅ b    a+ ⋅ b+   a+ ⋅ b+   a+ ⋅ b+   a+ ⋅ b    a+ ⋅ b+
        a+ ⋅ b+   a+ ⋅ b    a+ ⋅ b    a ⋅ b+    a ⋅ b     a ⋅ b+    a ⋅ b
1       34        34        32        0         0         0         0
2       84        1         15        0         0         0         0
3       55        3         40        0         2         0         0
4       71        1         18        1         8         0         1
5       9         6         24        22        8         10        20
6       31        0         1         3         61        0         4
7       95        0         3         2         0         0         0
8       6         7         20        22        12        11        22
9       69        0         10        18        0         1         2
10      16        14        2         60        1         2         5
11      51        49        0         0         0         0         0

40. Three different crosses in Neurospora are analyzed on the basis of unordered tetrads. Each cross combines a different pair of linked genes. The results are shown in the following table:

Cross   Parents            Parental ditypes (%)   Tetratypes (%)   Nonparental ditypes (%)
1       a ⋅ b+ × a+ ⋅ b    51                     45               4
2       c ⋅ d+ × c+ ⋅ d    64                     34               2
3       e ⋅ f+ × e+ ⋅ f    45                     50               5

For each cross, calculate a. the frequency of recombinants (RF). b. the uncorrected map distance, based on RF. c. the corrected map distance, based on tetrad frequencies. d. the corrected map distance, based on the mapping function.
41. On Neurospora chromosome 4, the leu3 gene is just to the left of the centromere and always segregates at the first division, whereas the cys2 gene is to the right of the centromere and shows a second-division segregation frequency of 16 percent. In a cross between a leu3 strain and a cys2 strain, calculate the predicted frequencies of the following seven classes of linear tetrads where l = leu3 and c = cys2. (Ignore double and other multiple crossovers.) (i) l c (ii) l + (iii) l c (iv) l c (v) l c (vi) l + (vii) l + l c l + l + + c + + + c + c + c + + + + + c + + + + + + + + + c + c l + l c l + l c
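The four quantities asked for in unordered-tetrad problems such as Problem 40 can be computed from the three tetrad counts. The following sketch assumes the standard relationships RF = ½T + NPD, the Perkins formula (map distance = 50(T + 6NPD), with T and NPD as fractions), and the mapping function RF = ½(1 − e^(−m)) with corrected distance 50m. The function name `tetrad_distances` and the example counts are hypothetical and are not taken from the table above.

```python
import math


def tetrad_distances(pd, t, npd):
    """Return RF and three map-distance estimates (in m.u.) from tetrad counts.

    pd, t, npd are the numbers (or percentages) of parental ditype,
    tetratype, and nonparental ditype tetrads.
    """
    total = pd + t + npd
    t_frac, npd_frac = t / total, npd / total

    # Half the spores of a tetratype and all spores of an NPD are recombinant.
    rf = 0.5 * t_frac + npd_frac
    if rf >= 0.5:
        raise ValueError("RF of 50% or more: the mapping function is undefined")

    uncorrected = 100 * rf                      # plain RF converted to m.u.
    perkins = 50 * (t_frac + 6 * npd_frac)      # corrects for double crossovers
    m = -math.log(1 - 2 * rf)                   # mean crossovers per meiosis
    mapping_function = 50 * m                   # Haldane-type correction
    return {
        "rf": rf,
        "uncorrected": uncorrected,
        "perkins": perkins,
        "mapping_function": mapping_function,
    }


# Hypothetical tetrad percentages: 70% PD, 28% T, 2% NPD.
print(tetrad_distances(70, 28, 2))
```

For these hypothetical values the uncorrected distance is 16 m.u., whereas both corrections give a larger figure (20 m.u. by the Perkins formula, about 19.3 m.u. by the mapping function), which is the usual pattern once multiple crossovers start to go undetected.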
42. A rice breeder obtained a triple heterozygote carrying the three recessive alleles for albino flowers (al), brown awns (b), and fuzzy leaves (fu), all paired with their normal wild-type alleles. This triple heterozygote was testcrossed. The progeny phenotypes were

170   wild type
150   albino, brown, fuzzy
5     brown
3     albino, fuzzy
710   albino
698   brown, fuzzy
42    fuzzy
38    albino, brown
a. Are any of the genes linked? If so, draw a map labeled with map distances. (Don’t bother with a correction for multiple crossovers.) b. The triple heterozygote was originally made by crossing two pure lines. What were their genotypes?
43. In a fungus, a proline mutant (pro) was crossed with a histidine mutant (his). A nonlinear tetrad analysis gave the following results:

+ +       + +       + his
+ +       + his     + his
pro his   pro +     pro +
pro his   pro his   pro +
6         82        112

a. Are the genes linked or not? b. Draw a map (if linked) or two maps (if not linked), showing map distances based on straightforward recombinant frequency where appropriate. c. If there is linkage, correct the map distances for multiple crossovers (choose one approach only).
44. In the fungus Neurospora, a strain that is auxotrophic for thiamine (mutant allele t) was crossed with a strain that is
auxotrophic for methionine (mutant allele m). Linear asci were isolated and classified into the following groups: Spore pair 1 and 2
Ascus types t+
tm
tm
+m
++
tm
++
++
t+
tm
++
t+
+m
+m
+m
+m
++
+m
260
76
4
54
1
5
t+
t+
3 and 4
t+
tm
5 and 6
+m
7 and 8 Number
t+
a. Determine the linkage relations of these two genes to their centromere(s) and to each other. Specify distances in map units. b. Draw a diagram to show the origin of the ascus type with only one single representative (second from right). 45. A corn geneticist wants to obtain a corn plant that has the three dominant phenotypes: anthocyanin (A), long tassels (L), and dwarf plant (D). In her collection of pure lines, the only lines that bear these alleles are AA LL dd and aa ll DD. She also has the fully recessive line aa ll dd. She decides to intercross the first two and testcross the resulting hybrid to obtain in the progeny a plant of the desired phenotype (which would have to be Aa Ll Dd in this case). She knows that the three genes are linked in the order written, that the distance between the A/a and the L/l loci is 16 m.u., and that the distance between the L/l and the D/d loci is 24 m.u. a. Draw a diagram of the chromosomes of the parents, the hybrid, and the tester. b. Draw a diagram of the crossover(s) necessary to produce the desired genotype. c. What percentage of the testcross progeny will be of the phenotype that she needs? d. What assumptions did you make (if any)? 46. In the model plant Arabidopsis thaliana, the following alleles were used in a cross: T = presence of trichomes t = absence of trichomes D = tall plants
d = dwarf plants
W = waxy cuticle
w = nonwaxy
A = presence of purple anthocyanin pigment
a = absence (white)
The T/t and D/d loci are linked 26 m.u. apart on chromosome 1, whereas the W/w and A/a loci are linked 8 m.u. apart on chromosome 2. A pure-breeding double-homozygous recessive trichomeless nonwaxy plant is crossed with another pure-breeding double-homozygous recessive dwarf white plant. a. What will be the appearance of the F1? b. Sketch the chromosomes 1 and 2 of the parents and the F1, showing the arrangement of the alleles.
c. If the F1 is testcrossed, what proportion of the progeny will have all four recessive phenotypes? 47. In corn, the cross WW ee FF × ww EE ff is made. The three loci are linked as follows: E/e
W/w 8 m.u.
F/f 24 m.u.
Assume no interference. a. If the F1 is testcrossed, what proportion of progeny will be ww ee ff? b. If the F1 is selfed, what proportion of progeny will be ww ee ff?
48. The fungal cross + ⋅ + × c ⋅ m was made, and nonlinear (unordered) tetrads were collected. The results were

        + +     + +     + m
        + +     + m     + m
        c m     c +     c +
        c m     c m     c +
Total   112     82      6
a. From these results, calculate a simple recombinant frequency. b. Compare the Haldane mapping function and the Perkins formula in their conversions of the RF value into a “corrected” map distance. c. In the derivation of the Perkins formula, only the possibility of meioses with zero, one, and two crossovers was considered. Could this limit explain any discrepancy in your calculated values? Explain briefly (no calculation needed).
49. In mice, the following alleles were used in a cross:

W = waltzing gait        w = nonwaltzing gait
G = normal gray color    g = albino
B = bent tail            b = straight tail

A waltzing gray bent-tailed mouse is crossed with a nonwaltzing albino straight-tailed mouse and, over several years, the following progeny totals are obtained:

waltzing      gray     bent       18
waltzing      albino   bent       21
nonwaltzing   gray     straight   19
nonwaltzing   albino   straight   22
waltzing      gray     straight   4
waltzing      albino   straight   5
nonwaltzing   gray     bent       5
nonwaltzing   albino   bent       6
Total                             100

a. What were the genotypes of the two parental mice in the cross? b. Draw the chromosomes of the parents.
c. If you deduced linkage, state the map unit value or values and show how they were obtained.
50. Consider the Neurospora cross + ; + × f ; p. It is known that the +/f locus is very close to the centromere on chromosome 7; in fact, it is so close that there are never any second-division segregations. It is also known that the +/p locus is on chromosome 5, at such a distance that there is usually an average of 12 percent second-division segregations. With this information, what will be the proportion of octads that are a. parental ditypes showing MI patterns for both loci? b. nonparental ditypes showing MI patterns for both loci? c. tetratypes showing an MI pattern for +/f and an MII pattern for +/p? d. tetratypes showing an MII pattern for +/f and an MI pattern for +/p?
51. In a haploid fungus, the genes al-2 and arg-6 are 30 m.u. apart on chromosome 1, and the genes lys-5 and met-1 are 20 m.u. apart on chromosome 6. In a cross
al-2 + ; + met-1 × + arg-6 ; lys-5 +
what proportion of progeny would be prototrophic + + ; + +?
52. The recessive alleles k (kidney-shaped eyes instead of wild-type round), c (cardinal-colored eyes instead of wild-type red), and e (ebony body instead of wild-type gray) identify three genes on chromosome 3 of Drosophila. Females with kidney-shaped, cardinal-colored eyes were mated with ebony males. The F1 was wild type. When F1 females were testcrossed with kk cc ee males, the following progeny phenotypes were obtained:

k   c   e   3
k   c   +   876
k   +   e   67
k   +   +   49
+   c   e   44
+   c   +   58
+   +   e   899
+   +   +   4
Total       2000

a. Determine the order of the genes and the map distances between them. b. Draw the chromosomes of the parents and the F1. c. Calculate interference and say what you think of its significance.
53. From parents of genotypes A/A ⋅ B/B and a/a ⋅ b/b, a dihybrid was produced. In a testcross of the dihybrid, the following seven progeny were obtained: A/a ⋅ B/b, a/a ⋅ b/b, A/a ⋅ B/b, A/a ⋅ b/b, a/a ⋅ b/b, A/a ⋅ B/b, and a/a ⋅ B/b. Do these results provide convincing evidence of linkage?

Challenging Problems
54. Use the Haldane map function to calculate the corrected map distance in cases where the measured RF = 5%, 10%, 20%, 30%, and 40%. Sketch a graph of RF against corrected map distance, and use it to answer the question, When should one use a map function?

Unpacking Problem 55
55. An individual heterozygous for four genes, A/a B/b C/c D/d, is testcrossed with a/a b/b c/c d/d, and 1000 progeny are classified by the gametic contribution of the heterozygous parent as follows:

a B C D   42
A b c d   43
A B C d   140
a b c D   145
a B c D   6
A b C d   9
A B c d   305
a b C D   310

a. Which genes are linked? b. If two pure-breeding lines had been crossed to produce the heterozygous individual, what would their genotypes have been? c. Draw a linkage map of the linked genes, showing the order and the distances in map units. d. Calculate an interference value, if appropriate.
56. An autosomal allele N in humans causes abnormalities in nails and patellae (kneecaps) called the nail–patella syndrome. Consider marriages in which one partner has the nail–patella syndrome and blood type A and the other partner has normal nails and patellae and blood type O. These marriages produce some children who have both the nail–patella syndrome and blood type A. Assume that unrelated children from this phenotypic group mature, intermarry, and have children. Four phenotypes are observed in the following percentages in this second generation:

nail–patella syndrome, blood type A        66%
normal nails and patellae, blood type O    16%
normal nails and patellae, blood type A    9%
nail–patella syndrome, blood type O        9%
Fully analyze these data, explaining the relative frequencies of the four phenotypes. (See pages 219–220 for the genetic basis of these blood types.)
57. Assume that three pairs of alleles are found in Drosophila: x+ and x, y+ and y, and z+ and z. As shown by the symbols, each non-wild-type allele is recessive to its wild-type allele. A cross between females heterozygous at these three loci and wild-type males yields progeny having the following genotypes: 1010 x+ y+ z+ females, 430 x y+ z males, 441 x+ y z+ males, 39 x y z males, 32 x+ y+ z males, 30 x+ y+ z+ males, 27 x y z+ males, 1 x+ y z male, and 0 x y+ z+ males. a. On what chromosome of Drosophila are the genes carried? b. Draw the relevant chromosomes in the heterozygous female parent, showing the arrangement of the alleles. c. Calculate the map distances between the genes and the coefficient of coincidence.
58. The five sets of data given in the following table represent the results of testcrosses using parents with the same alleles but in different combinations. Determine the order of genes by inspection, that is, without calculating recombination values. Recessive phenotypes are symbolized by lowercase letters and dominant phenotypes by pluses.

Phenotypes observed              Data sets
in 3-point testcross     1      2      3      4      5
+ + +                    317    1      30     40     305
+ + c                    58     4      6      232    0
+ b +                    10     31     339    84     28
+ b c                    2      77     137    201    107
a + +                    0      77     142    194    124
a + c                    21     31     291    77     30
a b +                    72     4      3      235    1
a b c                    203    1      34     46     265

59. From the phenotype data given in the following table for two 3-point testcrosses for (1) a, b, and c and (2) b, c, and d, determine the sequence of the four genes a, b, c, and d and the three map distances between them. Recessive phenotypes are symbolized by lowercase letters and dominant phenotypes by pluses.

Cross 1            Cross 2
+ + +   669        b c d   8
a b +   139        b + +   441
a + +   3          b + d   90
+ + c   121        + c d   376
+ b c   2          + + +   14
a + c   2280       + + d   153
a b c   653        + c +   65
+ b +   2215       b c +   141

60. Vulcans have pointed ears (determined by allele P), absent adrenals (determined by A), and a right-sided heart (determined by R). All these alleles are dominant to normal Earth alleles: rounded ears (p), present adrenals (a), and a left-sided heart (r). The three loci are autosomal and linked as shown in this linkage map:
[Linkage map: loci A, P, and R; intervals 10 m.u. and 25 m.u.]
Mr. Spock, first officer of the starship Enterprise, has a Vulcan father and an Earthling mother. If Mr. Spock marries an Earth woman and there is no (genetic) interference, what proportion of their children will have a. Vulcan phenotypes for all three characters? b. Earth phenotypes for all three characters? c. Vulcan ears and heart but Earth adrenals? d. Vulcan ears but Earth heart and adrenals?
61. In a certain diploid plant, the three loci A, B, and C are linked as follows:
[Linkage map: loci A, B, and C; intervals 20 m.u. and 30 m.u.]
One plant is available to you (call it the parental plant). It has the constitution A b c/a B C. a. With the assumption of no interference, if the plant is selfed, what proportion of the progeny will be of the genotype a b c/a b c? b. Again, with the assumption of no interference, if the parental plant is crossed with the a b c/a b c plant, what genotypic classes will be found in the progeny? What will be their frequencies if there are 1000 progeny? c. Repeat part b, this time assuming 20 percent interference between the regions.
62. The following pedigree shows a family with two rare abnormal phenotypes: blue sclerotic (a brittle-bone defect), represented by a black-bordered symbol, and hemophilia, represented by a black center in a symbol. Members represented by completely black symbols have both disorders. The numbers in some symbols are the numbers of individuals with those types.
[Pedigree diagram]
a. What pattern of inheritance is shown by each condition in this pedigree? b. Provide the genotypes of as many family members as possible. c. Is there evidence of linkage? d. Is there evidence of independent assortment? e. Can any of the members be judged as recombinants (that is, formed from at least one recombinant gamete)?
63. The human genes for color blindness and for hemophilia are both on the X chromosome, and they show a recombinant frequency of about 10 percent. The linkage of a pathological gene to a relatively harmless one can be used for genetic prognosis. Shown here is part of a bigger pedigree. Blackened symbols indicate that the subjects had hemophilia, and crosses indicate color blindness. What information could be given to women III-4 and III-5 about the likelihood of their having sons with hemophilia?
[Pedigree diagram: generations I to III]
(Problem 63 is adapted from J. F. Crow, Genetics Notes: An Introduction to Genetics. Burgess, 1983.)
64. A geneticist mapping the genes A, B, C, D, and E makes two 3-point testcrosses. The first cross of pure lines is
A/A ⋅ B/B ⋅ C/C ⋅ D/D ⋅ E/E × a/a ⋅ b/b ⋅ C/C ⋅ d/d ⋅ E/E
The geneticist crosses the F1 with a recessive tester and classifies the progeny by the gametic contribution of the F1:

A⋅B⋅C⋅D⋅E   316
a⋅b⋅C⋅d⋅E   314
A⋅B⋅C⋅d⋅E   31
a⋅b⋅C⋅D⋅E   39
A⋅b⋅C⋅d⋅E   130
a⋅B⋅C⋅D⋅E   140
A⋅b⋅C⋅D⋅E   17
a⋅B⋅C⋅d⋅E   13
Total       1000

The second cross of pure lines is A/A B/B C/C D/D E/E × a/a B/B c/c D/D e/e. The geneticist crosses the F1 from this cross with a recessive tester and obtains

A⋅B⋅C⋅D⋅E   243
a⋅B⋅c⋅D⋅e   237
A⋅B⋅c⋅D⋅e   62
a⋅B⋅C⋅D⋅E   58
A⋅B⋅C⋅D⋅e   155
a⋅B⋅c⋅D⋅E   165
a⋅B⋅C⋅D⋅e   46
A⋅B⋅c⋅D⋅E   34
Total       1000

The geneticist also knows that genes D and E assort independently. a. Draw a map of these genes, showing distances in map units wherever possible. b. Is there any evidence of interference?
65. In the plant Arabidopsis, the loci for pod length (L, long; l, short) and fruit hairs (H, hairy; h, smooth) are linked 16 m.u. apart on the same chromosome. The following crosses were made:
(i) L H/L H × l h/l h → F1
(ii) L h/L h × l H/l H → F1
If the F1’s from cross i and cross ii are crossed, a. what proportion of the progeny are expected to be l h/l h? b. what proportion of the progeny are expected to be L h/l h?
66. In corn (Zea mays), the genetic map of part of chromosome 4 is as follows, where w, s, and e represent recessive mutant alleles affecting the color and shape of the pollen:
[Linkage map: loci w, s, and e; intervals 8 m.u. and 14 m.u.]
If the following cross is made
+ + +/+ + + × w s e/w s e
and the F1 is testcrossed with w s e/w s e, and if it is assumed that there is no interference on this region of the chromosome, what proportion of progeny will be of the following genotypes?
a. + + +    b. w s e    c. + s e    d. w + +
e. + + e    f. w s +    g. w + e    h. + s +
67. Every Friday night, genetics student Jean Allele, exhausted by her studies, goes to the student union’s bowling lane to relax. But, even there, she is haunted by her genetic studies. The rather modest bowling lane has only four bowling balls: two red and two blue. They are bowled at the pins and are then collected and returned down the
chute in random order, coming to rest at the end stop. As the evening passes, Jean notices familiar patterns of the four balls as they come to rest at the stop. Compulsively, she counts the different patterns. What patterns did she see, what were their frequencies, and what is the relevance of this matter to genetics?
68. In a tetrad analysis, the linkage arrangement of the p and q loci is as follows:
[Linkage map showing regions (i) and (ii) and the loci p and q]
Assume that
• in region i, there is no crossover in 88 percent of meioses and there is a single crossover in 12 percent of meioses;
• in region ii, there is no crossover in 80 percent of meioses and there is a single crossover in 20 percent of meioses; and
• there is no interference (in other words, the situation in one region does not affect what is going on in the other region).
What proportions of tetrads will be of the following types? (a) MI MI, PD; (b) MI MI, NPD; (c) MI MII, T; (d) MII MI, T; (e) MII MII, PD; (f) MII MII, NPD; (g) MII MII, T. (Note: Here the M pattern written first is the one that pertains to the p locus.) Hint: The easiest way to do this problem is to start by calculating the frequencies of asci with crossovers in both regions, region i, region ii, and neither region. Then determine what MI and MII patterns result.
69. For an experiment with haploid yeast, you have two different cultures. Each will grow on minimal medium to which arginine has been added, but neither will grow on minimal medium alone. (Minimal medium is inorganic salts plus sugar.) Using appropriate methods, you induce the two cultures to mate. The diploid cells then divide meiotically and form unordered tetrads. Some of the ascospores will grow on minimal medium. You classify a large number of these tetrads for the phenotypes ARG− (arginine requiring) and ARG+ (arginine independent) and record the following data:

Segregation of ARG− : ARG+    Frequency (%)
4 : 0                         40
3 : 1                         20
2 : 2                         40
a. Using symbols of your own choosing, assign genotypes to the two parental cultures. For each of the three kinds of segregation, assign genotypes to the segregants. b. If there is more than one locus governing arginine requirement, are these loci linked?
70. An RFLP analysis of two pure lines A/A B/B and a/a b/b showed that the former was homozygous for a long RFLP allele (l) and the latter for a short allele (s). The two were crossed to form an F1, which was then backcrossed to the second pure line. A thousand progeny were scored as follows:

Aa Bb ss   9
Aa bb ss   43
Aa Bb ls   362
Aa bb ls   93
aa bb ls   11
aa Bb ls   37
aa bb ss   358
aa Bb ss   87
a. What do these results tell us about linkage? b. Draw a map if appropriate. c. Incorporate the RFLP fragments into your map.
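Several problems in this set ask whether testcross counts give evidence of linkage. One common way to frame that question is a χ² test of independence on the 2 × 2 table of progeny classes, which does not require the two alleles of each gene to be equally frequent. The sketch below is a minimal illustration under that assumption; the function name `chi_square_linkage`, the argument names, and the example counts are hypothetical, and 3.841 is the standard critical value for 1 degree of freedom at p = 0.05.

```python
CRITICAL_VALUE_DF1_P05 = 3.841  # chi-square critical value, 1 df, p = 0.05


def chi_square_linkage(n_AB, n_Ab, n_aB, n_ab):
    """Chi-square test of independence for two genes in a testcross.

    The four arguments are the progeny counts for the gametic classes
    A B, A b, a B, and a b contributed by the dihybrid parent.
    """
    observed = [[n_AB, n_Ab], [n_aB, n_ab]]
    total = n_AB + n_Ab + n_aB + n_ab
    row_totals = [n_AB + n_Ab, n_aB + n_ab]   # A versus a
    col_totals = [n_AB + n_aB, n_Ab + n_ab]   # B versus b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the two genes assort independently.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical testcross counts: A B, A b, a B, a b.
chi2 = chi_square_linkage(45, 5, 5, 45)
print(chi2, chi2 > CRITICAL_VALUE_DF1_P05)
```

A χ² value above the critical value means that independent assortment is rejected, which is taken as evidence of linkage. With very small samples, such as the seven progeny in Problem 53, the expected counts are too low for the test to be reliable.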
Chapter 5
The Genetics of Bacteria and Their Viruses
Learning Outcomes
After completing this chapter, you will be able to
• Distinguish between the experimental procedures and analyses in the three main ways by which bacteria exchange genes.
• Map bacterial genomes using interrupted conjugation.
• Map bacterial genomes using recombinant frequency.
• Assess the outcome of double transformation experiments in terms of linkage.
• Predict the outcomes of transduction experiments using phages capable of generalized or restricted transduction.
• Map phage genomes by recombination in double infections of bacteria.
• Design experiments to map a mutation caused by transposon mutagenesis.
• Predict the inheritance of genes and functions borne on plasmids in bacterial crosses.

Dividing bacterial cells. [Custom Medical Stock Photo RM/Getty Images.]

Outline
5.1 Working with microorganisms
5.2 Bacterial conjugation
5.3 Bacterial transformation
5.4 Bacteriophage genetics
5.5 Transduction
5.6 Physical maps and linkage maps compared
The fruits of DNA technology, made possible by bacterial genetics
Figure 5-1 The dramatic results of modern DNA technology, such as sequencing the human genome, were possible only because bacterial genetics led to the invention of efficient DNA manipulation vectors. [Science 291, 2001, pp. 1145–1434. Image by Ann E. Cutting. Reprinted with permission from AAAS.]
DNA technology is responsible for the rapid advances being made in the genetics of all model organisms. It is also a topic of considerable interest in the public domain. Examples are the highly publicized announcement of the full genome sequences of humans and chimpanzees in recent years and the popularity of DNA-based forensic analysis in television shows and movies (Figure 5-1). Indeed, improvements in technology have led to the sequencing of the genomes of many hundreds of species. Such dramatic results, whether in humans, fish, insects, plants, or fungi, are all based on the use of methods that permit small pieces of DNA to be isolated, carried from cell to cell, and amplified into large pure samples. The sophisticated systems that permit these manipulations of the DNA of any organism are almost all derived from bacteria and their viruses. Hence, the advance of modern genetics to its present state of understanding was entirely dependent on the development of bacterial genetics, the topic of this chapter.
However, the goal of bacterial genetics has never been to facilitate eukaryotic molecular genetics. Bacteria are biologically important in their own right. They are the most numerous organisms on our planet. They contribute to the recycling of nutrients such as nitrogen, sulfur, and carbon in ecosystems. Some are agents of human, animal, and plant disease. Others live symbiotically inside our mouths and intestines. In addition, many types of bacteria are useful for the industrial synthesis of a wide range of organic products. Hence, the impetus for the genetic dissection of bacteria has been the same as that for multicellular organisms: to understand their biological function.
Bacteria belong to a class of organisms known as prokaryotes, which also includes the blue-green algae (classified as cyanobacteria). A key defining feature of prokaryotes is that their DNA is not enclosed in a membrane-bounded nucleus.
Like higher organisms, bacteria have genes composed of DNA arranged in a long series on a “chromosome.” However, the organization of their genetic material is unique in several respects. The genome of most bacteria is a single molecule of double-stranded DNA in the form of a closed circle. In addition, bacteria in nature often contain extra DNA elements called plasmids. Most plasmids also are DNA circles but are much smaller than the main bacterial genome. Bacteria can be parasitized by specific viruses called bacteriophages or, simply, phages. Phages and other viruses are very different from the organisms that we have been studying so far. Viruses have some properties in common with organisms; for example, their genetic material can be DNA or RNA, constituting a short “chromosome.” However, most biologists regard viruses as nonliving because they are not cells and they have no metabolism of their own. Hence, for the study of their genetics, viruses must be propagated in the cells of their host organisms. When scientists began studying bacteria and phages, they were naturally curious about their hereditary systems. Clearly, bacteria and phages must have hereditary systems because they show a constant appearance and function from one generation to the next (they are true to type). But how do these hereditary systems work? Bacteria, like unicellular eukaryotic organisms, reproduce asexually by cell growth and division, one cell becoming two. This asexual reproduction is quite easy to demonstrate experimentally. However, is there ever a union of different types for the purpose of sexual reproduction? Furthermore, how do the much smaller phages reproduce? Do they ever unite for a sex-like cycle? These questions are pursued in this chapter. We will see that there is a variety of hereditary processes in bacteria and phages. 
These processes are interesting because of the basic biology of these forms, but they also act as models—as sources of insight into genetic processes at work in all organisms. For a geneticist, the attraction of these forms is that they can be cultured in very large numbers because they are so small. Consequently, it is possible to detect and study very rare genetic events that are difficult or impossible to study in eukaryotes.
The Genetics of Bacteria and Their Viruses 175
What hereditary processes are observed in prokaryotes? They can undergo both asexual and sexual reproduction. Mutation occurs in asexual cells in much the same way as it does in eukaryotes, and mutant alleles can be followed through both these processes in an approach analogous to that used in eukaryotes. We shall follow alleles in this way in the chapter ahead. When bacterial cells reproduce asexually, their genomic DNA replicates and is partitioned into daughter cells, but the partitioning method is quite different from mitosis. In sexual reproduction, two DNA molecules from different sources are brought together. However, an important difference from eukaryotes is that, in bacteria, rarely are two complete chromosomes brought together; usually, the union is of one complete chromosome plus a fragment of another. The possibilities are outlined in Figure 5-2. The first process of gene exchange to be examined will be conjugation, which is the contact and fusion of two different bacterial cells. After fusion, one cell, called a donor, sometimes transfers genomic DNA to the other cell. This transferred DNA may be part or (rarely) all of the bacterial genome. In some cases, one or more autonomous extragenomic DNA elements called plasmids, if present, are transferred. Such plasmids are capable of carrying genomic DNA into the recipient cell. Any genomic fragment transferred by whatever route may recombine with the recipient’s chromosome after entry. A bacterial cell can also take up a piece of DNA from the external environment and incorporate this DNA into its own chromosome, a process called transformation. In addition, certain phages can pick up a piece of DNA from one bacterial cell and inject it into another, where it can be incorporated into the chromosome, in a process known as transduction. DNA transfer on a plasmid, by transformation or by transduction, constitutes a process known as horizontal transmission, a type of gene transmission without
Figure 5-2 Bacteria exchange DNA by several processes. Bacterial DNA can be transferred from cell to cell in four ways: conjugation with plasmid transfer, conjugation with partial genome transfer, transformation (partial genome transfer by DNA uptake), and transduction (transfer as part of a viral genome).
[Figure 5-3 panel labels: suspension of bacterial cells; suspension spread on petri plate with agar gel; incubate from 1 to 2 days; bacterial colonies, each derived from a single cell.]
the need for cell division. This term distinguishes this type of DNA transfer from that during vertical transmission, the passage of DNA down through the bacterial generations. Horizontal transmission can spread DNA rapidly through a bacterial population by contact in much the same way that a disease spreads. For bacteria, horizontal transmission provides a powerful method by which they can adapt rapidly to changing environmental conditions. Phages themselves can undergo recombination when two different genotypes both infect the same bacterial cell (phage recombination, not shown in Figure 5-2). Before we analyze these modes of genetic exchange, let’s consider the practical ways of handling bacteria, which are quite different from those used in handling multicellular organisms.
5.1 Working with Microorganisms
Figure 5-3 Bacterial phenotypes can be assessed in their colonies. A stock of bacterial cells can be grown in a liquid medium containing nutrients, and then a small number of bacteria from the liquid suspension can be spread on solid agar medium. Each cell will give rise to a colony. All cells in a colony have the same genotype and phenotype. [Panel labels: petri plate with agar gel; single cells (not visible to naked eye).]
Bacteria are fast-dividing and take up little space, so they are very convenient to use as genetic model organisms. They can be cultured in a liquid medium or on a solid surface such as an agar gel, as long as basic nutrients are supplied. Each bacterial cell divides asexually from 1 → 2 → 4 → 8 → 16 cells, and so on, until the nutrients are exhausted or until toxic waste products accumulate to levels that halt the population growth. A small amount of a liquid culture can be pipetted onto a petri plate containing solid agar medium and spread evenly on the surface with a sterile spreader, in a process called plating (Figure 5-3). The cells divide, but, because they cannot travel far on the surface of the gel, all the cells remain together in a clump. When this mass reaches more than 10⁷ cells, it becomes visible to the naked eye as a colony. Each distinct colony on the plate has been derived from a single original cell. Members of a colony that have a single genetic ancestor are known as cell clones.
Bacterial mutants are quite easy to obtain. Nutritional mutants are a good example. Wild-type bacteria are prototrophic, which means that they can grow and divide on minimal medium, a substrate containing only inorganic salts, a carbon source for energy, and water. From a prototrophic culture, auxotrophic mutants can be obtained: these mutants are cells that will not grow unless the medium contains one or more specific cellular building blocks such as adenine, threonine, or biotin. Another type of useful mutant differs from wild type in the ability to use a specific energy source; for example, the wild type (lac+) can use lactose and grow, whereas a mutant (lac−) cannot. Figure 5-4 shows another way of distinguishing lac+ and lac− colonies by using a dye. In another mutant category, whereas wild types are susceptible to an inhibitor, such as the antibiotic streptomycin, resistant mutants can divide and form colonies in the presence of the inhibitor. All these types of mutants allow the geneticist to distinguish different individual strains, thereby providing genetic markers (marker alleles) to keep track of genomes and cells in experiments. Table 5-1 summarizes some mutant bacterial phenotypes and their genetic symbols.
The following sections document the discovery of the various processes by which bacterial genomes recombine. The historical methods are interesting in themselves but also serve to introduce the diverse processes of recombination, as well as analytical techniques that are still applicable today.
Table 5-1 Some Genotypic Symbols Used in Bacterial Genetics
Symbol    Character or phenotype associated with symbol
bio−      Requires biotin added as a supplement to minimal medium
arg−      Requires arginine added as a supplement to minimal medium
met−      Requires methionine added as a supplement to minimal medium
lac−      Cannot utilize lactose as a carbon source
gal−      Cannot utilize galactose as a carbon source
strʳ      Resistant to the antibiotic streptomycin
strˢ      Sensitive to the antibiotic streptomycin
Note: Minimal medium is the basic synthetic medium for bacterial growth without nutrient supplements.
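The doubling series 1 → 2 → 4 → 8 → 16 and the threshold of roughly 10⁷ cells for a visible colony can be turned into a small calculation. The sketch below is only an illustration: the function names are hypothetical, growth is assumed to be perfectly synchronous with no nutrient limitation, and the 30-minute doubling time in the usage example is an assumed value, not one given in the text.

```python
def doublings_to_reach(target_cells: int, start_cells: int = 1) -> int:
    """Smallest number of synchronous doublings that brings start_cells up to target_cells."""
    doublings = 0
    cells = start_cells
    while cells < target_cells:
        cells *= 2          # every cell divides once: 1 -> 2 -> 4 -> 8 ...
        doublings += 1
    return doublings


def hours_to_visible_colony(doubling_minutes: float, target_cells: int = 10**7) -> float:
    """Time for one plated cell to form a visible colony, under the idealized assumptions above."""
    return doublings_to_reach(target_cells) * doubling_minutes / 60


if __name__ == "__main__":
    print(doublings_to_reach(10**7))      # 24, because 2**23 < 10**7 <= 2**24
    print(hours_to_visible_colony(30))    # 12.0 hours for an assumed 30-minute doubling time
```

Only about two dozen doublings separate a single invisible cell from a visible colony, which is why colonies can appear after an incubation of 1 to 2 days.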
Distinguishing lac+ and lac– by using a red dye
5.2 Bacterial Conjugation
The earliest studies in bacterial genetics revealed the unexpected process of cell conjugation.
Discovery of conjugation
Do bacteria possess any processes similar to sexual reproduction and recombination? The question was answered by the elegantly simple experimental work of Joshua Lederberg and Edward Tatum, who in 1946 discovered a sex-like process in what became the main model for bacterial genetics, Escherichia coli (see the Model Organism box on page 180). They were studying two strains of E. coli with different sets of auxotrophic mutations. Strain A− would grow only if the medium were supplemented with methionine and biotin; strain B− would grow only if it were supplemented with threonine, leucine, and thiamine. Thus, we can designate the strains as
strain A−: met− bio− thr+ leu+ thi+
strain B−: met+ bio+ thr− leu− thi−
Figure 5-5a displays in simplified form the design of their experiment. Strains A− and B− were mixed together, incubated for a while, and then plated on minimal medium, on which neither auxotroph could grow. A small minority of the cells (1 in 10⁷) was found to grow as prototrophs and, hence, must have been wild type, having regained the ability to grow without added nutrients. Some of the dishes were plated only with strain A− bacteria and some only with strain B− bacteria to act as controls, but no prototrophs arose from these platings. Figure 5-5b illustrates the experiment in more detail. These results suggested that some form of recombination of genes had taken place between the genomes of the two strains to produce the prototrophs. It could be argued that the cells of the two strains do not really exchange genes but instead leak substances that the other cells can absorb and use for growing. This possibility of “cross-feeding” was ruled out by Bernard Davis in the following way. He constructed a U-shaped tube in which the two arms were separated by a fine filter. The pores of the filter were too small to allow bacteria to pass through but large enough to allow easy passage of any dissolved substances (Figure 5-6).
Strain A− was put in one arm, strain B− in the other. After the strains had been incubated for a while, Davis tested the contents of each arm to see if there were any prototrophic cells, but none were found. In other words, physical contact between the two strains was needed for wild-type cells to form. It looked as though some kind of genome union had taken place, and genuine recombinants had been produced. The physical union of bacterial cells can be confirmed under an electron microscope and is now called conjugation (Figure 5-7).
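The logic of the Lederberg and Tatum experiment can be written out as a small model. This is a minimal sketch under stated assumptions: genotypes are represented as dictionaries mapping each marker to True (wild type, +) or False (mutant, −), the helper names `grows` and `recombinant` are hypothetical, and the choice of which strain supplies the replacement alleles is purely illustrative.

```python
MARKERS = ("met", "bio", "thr", "leu", "thi")

# True = wild-type allele (+), False = auxotrophic mutant allele (-)
STRAIN_A = {"met": False, "bio": False, "thr": True, "leu": True, "thi": True}
STRAIN_B = {"met": True, "bio": True, "thr": False, "leu": False, "thi": False}


def grows(genotype, supplements=frozenset()):
    """A cell grows if each requirement is met by a wild-type allele or by the medium."""
    return all(genotype[m] or m in supplements for m in MARKERS)


def recombinant(recipient, donor, donor_markers):
    """Copy of the recipient genotype with the listed markers replaced by donor alleles."""
    result = dict(recipient)
    for marker in donor_markers:
        result[marker] = donor[marker]
    return result


if __name__ == "__main__":
    minimal = frozenset()                      # minimal medium: no supplements
    print(grows(STRAIN_A, minimal))            # False: control plate, no colonies
    print(grows(STRAIN_B, minimal))            # False: control plate, no colonies
    print(grows(STRAIN_A, {"met", "bio"}))     # True: supplemented medium
    wild_type = recombinant(STRAIN_A, STRAIN_B, ["met", "bio"])
    print(grows(wild_type, minimal))           # True: a prototrophic recombinant
```

Neither parent can form colonies on minimal medium, so any colony on the mixed plate must carry wild-type alleles from both strains. That is the inference the experiment rests on.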
Figure 5-4 Wild-type bacteria able to use lactose as an energy source (lac+) stain red in the presence of this indicator dye. The unstained cells are mutants unable to use lactose (lac−). [Jeffrey H. Miller.]
Figure 5-5 Mixing bacterial genotypes produces rare recombinants. With the use of this method, Lederberg and Tatum demonstrated that genetic recombination between bacterial genotypes is possible. (a) The basic concept: two auxotrophic cultures (A− and B−) are mixed, yielding prototrophic wild types (WT). (b) Cells of type A− or type B− cannot grow on an unsupplemented (minimal) medium (MM) because A− and B− each carry mutations that cause the inability to synthesize constituents needed for cell growth. When A− and B− are mixed for a few hours and then plated, however, a few colonies appear on the agar plate. These colonies derive from single cells in which genetic material has been exchanged; they are therefore capable of synthesizing all the required constituents of metabolism.
[Panel labels: (a) A− and B− are mixed; some progeny are WT. (b) A− (met− bio− thr+ leu+ thi+), B− (met+ bio+ thr− leu− thi−), and their mixture are each washed and plated (~10⁸ cells) on MM; the A− and B− plates show no colonies, and the mixture plate shows prototrophic colonies (met+ bio+ thr+ leu+ thi+).]
Discovery of the fertility factor (F)
In 1953, William Hayes discovered that, in the types of “crosses” just described here, the conjugating parents acted unequally (later, we will see ways to demonstrate this unequal participation). One parent (and only that parent) seemed to transfer some or all of its genome into another cell. Hence, one cell acts as a donor, and the other cell acts as a recipient. This “cross” is quite different from eukaryotic crosses in which parents contribute nuclear genomes equally to a progeny individual.
Key Concept The transfer of genetic material in E. coli conjugation is not reciprocal. One cell, the donor, transfers part of its genome to the other cell, which acts as the recipient.
By accident, Hayes discovered a variant of his original donor strain that would not produce recombinants on crossing with the recipient strain. Apparently, the donor-type strain had lost the ability to transfer genetic material and had changed into a recipient-type strain. In working with this “sterile” donor variant, Hayes found that it could regain the ability to act as a donor by association with other donor strains. Indeed, the donor ability was transmitted rapidly and effectively between strains during conjugation. A kind of “infectious transfer” of some factor seemed to be taking place. He suggested that donor ability is itself a hereditary state, imposed by a fertility factor (F). Strains that carry F can donate and are designated F+. Strains that lack F cannot donate and are recipients, designated F−.
We now know much more about F. It is an example of a small, nonessential circular DNA molecule called a plasmid that can replicate in the cytoplasm independent of the host chromosome. Figure 5-8 shows how bacteria can transfer plasmids such as F. The F plasmid directs the synthesis of pili (sing., pilus), projections that initiate contact with a recipient (see Figures 5-7 and 5-8) and draw it closer. The F DNA in the donor cell makes a single-stranded version of itself in a peculiar mechanism called rolling circle replication. The circular plasmid “rolls,” and as it turns, it reels out a single-stranded “fishing line.” This single strand passes through a pore into the recipient cell, where the other strand is synthesized, forming a double helix. Hence, a copy of F remains in the donor and another appears in the recipient, as shown in Figure 5-8.
Note that the E. coli genome is depicted as a single circular chromosome in Figure 5-8. (We will examine the evidence for it later.) Most bacterial genomes are circular, a feature quite different from eukaryotic nuclear chromosomes. We will see that this feature leads to the many idiosyncrasies of bacterial genetics.
Figure 5-6 No recombinants are produced without cell contact. Auxotrophic bacterial strains A− and B− are grown on either side of a U-shaped tube. Liquid may be passed between the arms by applying pressure or suction, but the bacterial cells cannot pass through the filter. After incubation and plating, no recombinant colonies grow on minimal medium. [Labels: porous cotton plug; pressure or suction; strain A−; strain B−; fine filter.]
Hfr strains
An important breakthrough came when Luca Cavalli-Sforza discovered a derivative of an F+ strain with two unusual properties:
1. On crossing with F− strains, this new strain produced 1000 times as many recombinants as a normal F+ strain. Cavalli-Sforza designated this derivative an Hfr strain to symbolize its ability to promote a high frequency of recombination.
2. In Hfr × F− crosses, virtually none of the F− parents were converted into F+ or into Hfr. This result is in contrast with F+ × F− crosses, in which, as we have seen, infectious transfer of F results in a large proportion of the F− parents being converted into F+.
It became apparent that an Hfr strain results from the integration of the F factor into the chromosome, as pictured in Figure 5-9. We can now explain the first unusual property of Hfr strains. During conjugation, the F factor inserted in the chromosome efficiently drives part or all of that chromosome into the F− cell. The chromosomal fragment can then engage in recombination with the recipient chromosome. The rare recombinants observed by Lederberg and Tatum in F+ × F− crosses were due to the spontaneous, but rare, formation of Hfr cells in the
Figure 5-7 Bacteria conjugate by using pili. A donor cell extends one or more projections, or pili, that attach to a recipient cell and pull the two bacteria together. [Dr. L. Caro/Science Source.]
Figure 5-8 F plasmids transfer during conjugation. (a) During conjugation, the pilus pulls two bacteria together. (b) Next, a pilus forms between the two cells. A single-stranded copy of plasmid DNA is produced in the donor cell and then passes into the recipient bacterium, where the single strand, serving as a template, is converted into the double-stranded helix. [Labels: donor F+; recipient F−; bacterial chromosome; pilus; F plasmid.]
F+ culture. Cavalli-Sforza isolated examples of these rare cells from F+ cultures and found that, indeed, they now acted as true Hfr’s. Does an Hfr cell die after donating its chromosomal material to an F− cell? The answer is no. Just like the F plasmid, the Hfr chromosome replicates and transfers a single strand to the F− cell during conjugation. That the transferred DNA is a single
Model Organism
Escherichia coli
The seventeenth-century microscopist Antony van Leeuwenhoek was probably the first to see bacterial cells and to recognize their small size: “There are more living in the scum on the teeth in a man’s mouth than there are men in the whole kingdom.” However, bacteriology did not begin in earnest until the nineteenth century. In the 1940s, Joshua Lederberg and Edward Tatum made the discovery that launched bacteriology into the burgeoning field of genetics: they discovered that, in a certain bacterium, there was a type of sexual cycle including a crossing-over-like process. The organism that they chose for this experiment has become the model not only for prokaryote genetics but, in a sense, for all of genetics. The organism was Escherichia coli, a bacterium named after its discoverer, the nineteenth-century German bacteriologist Theodor Escherich. The choice of E. coli was fortunate because it has proved to have many features suitable for genetic research, not the least of which is that it is easily obtained, given that it lives in the gut of humans and other animals. In the gut, it is a benign symbiont, but it occasionally causes urinary tract infections and diarrhea. E. coli has a single circular chromosome 4.6 Mb in length. Of its 4000 intron-free genes, about 35 percent are of unknown function. The sexual cycle is made possible by the action of an extragenomic plasmid called F, which confers a type of “maleness.” Other plasmids carry genes whose functions equip the cell for life in specific environments, such as drug-resistance genes. These plasmids have been adapted as gene vectors, which are gene carriers that
form the basis of the gene transfers at the center of modern genetic engineering. E. coli is unicellular and grows by simple cell division. Because of its small size (~1 µm in length), E. coli can be grown in large numbers and subjected to intensive selection and screening for rare genetic events. E. coli research represents the beginning of “black box” reasoning in genetics: through the selection and analysis of mutants, the workings of the genetic machinery could be deduced even though it was too small to be seen. Phenotypes such as colony size, drug resistance, carbon-source utilization, and colored-dye production took the place of the visible phenotypes of eukaryotic genetics.
An electron micrograph of an E. coli cell showing long flagella, used for locomotion, and fimbriae, proteinaceous hairs that are important in anchoring the cells to animal tissues. (Sex pili are not shown in this micrograph.) [ Biophoto Associates/Science Photo Library.]
strand can be demonstrated visually with the use of special strains and antibodies, as shown in Figure 5-10. The replication of the chromosome ensures a complete chromosome for the donor cell after mating. The transferred strand is converted into a double helix in the recipient cell, and donor genes may become incorporated in the recipient’s chromosome through crossovers, creating a recombinant cell (Figure 5-11). If there is no recombination, the transferred fragments of DNA are simply lost in the course of cell division.
Linear transmission of the Hfr genes from a fixed point
A clearer view of the behavior of Hfr strains was obtained in 1957, when Elie Wollman and François Jacob investigated the pattern of transmission of Hfr genes to F− cells during a cross. They crossed
Hfr aziʳ tonʳ lac+ gal+ strˢ × F− aziˢ tonˢ lac− gal− strʳ
(Superscripts “r” and “s” stand for resistant and sensitive, respectively.) At specific times after mixing, they removed samples, which were each put in a kitchen blender for a few seconds to separate the mating cell pairs. This procedure is called interrupted mating. The sample was then plated onto a medium containing streptomycin to kill the Hfr donor cells, which bore the sensitivity allele strˢ. The surviving strʳ cells then were tested for the presence of alleles from the donor Hfr genome. Any strʳ cell bearing a donor allele must have taken part in conjugation; such cells are called exconjugants. The results are plotted in Figure 5-12a, showing a time course of entry of each donor allele aziʳ, tonʳ, lac+, and gal+. Figure 5-12b portrays the transfer of Hfr alleles. The key elements in these results are
1. Each donor allele first appears in the F− recipients at a specific time after mating began.
2. The donor alleles appear in a specific sequence.
3. Later donor alleles are present in fewer recipient cells.
Figure 5-9 Integration of the F plasmid creates an Hfr strain. In an F+ strain the free F plasmid occasionally integrates into the E. coli chromosome, creating an Hfr strain. [Labels: F+; F; Hfr; integrated F.]
Figure 5-10 Donor DNA is transferred as a single strand. The photographs show a visualization of single-stranded DNA transfer in conjugating E. coli cells, with the use of special fluorescent antibodies. Parental Hfr strains (A) are black with red DNA. The red is from the binding of an antibody to a protein normally attached to DNA. The recipient F− cells (B) are green due to the presence of the gene for a jellyfish protein that fluoresces green, and, because they are mutant for a certain gene, their DNA protein does not bind to antibody. When Hfr donor single-stranded DNA enters the recipient, it promotes atypical binding of this protein, which fluoresces yellow in this background. Part C shows Hfr’s (unchanged) and exconjugants (cells that have undergone conjugation) with yellow transferred DNA. A few unmated F− cells are visible. [From M. Kohiyama, S. Hiraga, I. Matic, and M. Radman, “Bacterial Sex: Playing Voyeurs 50 Years Later,” Science 301, 2003, p. 803, Fig. 1. Reprinted with permission from AAAS.]
Figure 5-11 Crossovers integrate parts of the transferred donor fragment. After conjugation, crossovers are needed to integrate genes from the donor fragment into the recipient’s chromosome and, hence, become a stable part of its genome. [Panel sequence: transfer of a single-stranded DNA copy from the Hfr (a+ b+ c+) to the F− (a− b− c−); the transferred fragment (exogenote) is converted into a double helix in the exconjugant, alongside the recipient chromosome (endogenote); a double crossover inserts donor DNA, giving a recombinant, and the remaining fragment is lost.]
Putting all these observations together, Wollman and Jacob deduced that, in the conjugating Hfr, single-stranded DNA transfer begins from a fixed point on the donor chromosome, termed the origin (O), and continues in a linear fashion. The point O is now known to be the site at which the F plasmid is inserted. The farther a gene is from O, the later it is transferred to the F−. The transfer process will generally stop before the farthermost genes are transferred, and, as a result, these genes are included in fewer exconjugants. Note that a type of chromosome map can be produced in units of minutes, based on time of entry of marked genes. In the example in Figure 5-12, the map would be:
Marker                  O      aziʳ    tonʳ    lac+    gal+
Time of entry (min)     0      10      12      17      25
Interval (min)             10      2       5       8
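The arithmetic behind this time-of-entry map is simple enough to sketch in a few lines. The entry times of 10, 12, 17, and 25 minutes are read from the map above; the function name and the data layout are hypothetical and chosen only for illustration.

```python
# Time (minutes after mixing) at which each donor marker first appears in exconjugants
ENTRY_TIMES = {"azi": 10, "ton": 12, "lac": 17, "gal": 25}


def time_of_entry_map(entry_minutes):
    """Order markers by time of entry and return (left, right, interval) triples from the origin O."""
    ordered = sorted(entry_minutes.items(), key=lambda item: item[1])
    intervals = []
    previous_name, previous_time = "O", 0
    for name, minute in ordered:
        intervals.append((previous_name, name, minute - previous_time))
        previous_name, previous_time = name, minute
    return intervals


if __name__ == "__main__":
    for left, right, minutes in time_of_entry_map(ENTRY_TIMES):
        print(f"{left} to {right}: {minutes} min")
```

The intervals sum back to the entry time of the last marker (10 + 2 + 5 + 8 = 25), which is a quick consistency check on any map built this way.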
How can we explain the second unusual property of Hfr crosses, that F− exconjugants are rarely converted into Hfr or F+? When Wollman and Jacob allowed Hfr × F− crosses to continue for as long as 2 hours before disruption, they found that in fact a few of the exconjugants were converted into Hfr. In other words, the part of F that confers donor ability was eventually transmitted but at a very low frequency. The rareness of Hfr exconjugants suggested that the inserted F was transmitted as the last element of the linear chromosome. We can summarize the order of transmission with the following general type of map, in which the arrow indicates the direction of transfer, beginning with O:
O   a   b   c   F   [arrow: O is transferred first, F last]
Thus, almost none of the F− recipients are converted, because the fertility factor is the last element transmitted and usually the transmission process will have stopped before getting that far.
Key Concept The Hfr chromosome, originally circular, unwinds a copy of itself that is transferred to the F− cell in a linear fashion, with the F factor entering last.
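The observation that later markers reach fewer recipients, and that F almost never arrives, can be illustrated with a toy model. Everything quantitative here is an assumption made for illustration and is not taken from the text: mating pairs are assumed to break apart at a constant rate per minute (so the fraction still joined decays exponentially), the break rate of 0.05 per minute is arbitrary, and the 100-minute entry time for F is a placeholder standing for "the very end of the chromosome."

```python
import math


def fraction_receiving(entry_minute, break_rate_per_minute):
    """Fraction of mating pairs still joined when a marker enters.

    Assumes pairs separate at a constant rate, giving exponential decay.
    This is a modeling assumption, not a measured property.
    """
    return math.exp(-break_rate_per_minute * entry_minute)


# Entry times in minutes; the value for F is a hypothetical placeholder for "last to enter"
ENTRY_MINUTES = {"azi": 10, "ton": 12, "lac": 17, "gal": 25, "F": 100}
BREAK_RATE = 0.05  # hypothetical spontaneous separation rate per minute

GRADIENT = {
    marker: fraction_receiving(minute, BREAK_RATE)
    for marker, minute in ENTRY_MINUTES.items()
}

if __name__ == "__main__":
    for marker, fraction in GRADIENT.items():
        print(f"{marker}: {fraction:.4f}")
```

Under these assumptions the fraction of recipients falls steadily with distance from O, and the value for F is far smaller than for any chromosomal marker, which mirrors the rarity of Hfr exconjugants.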
Figure 5-12 Tracking time of marker entry generates a chromosome map. In this interrupted-mating conjugation experiment, F− streptomycin-resistant cells with mutations in azi, ton, lac, and gal are incubated for varying times with Hfr cells that are sensitive to streptomycin and carry wild-type alleles for these genes. (a) A plot of the frequency of donor alleles in exconjugants as a function of time after mating. (b) A schematic view of the transfer of markers (shown in different colors) with the passage of time. [Axes in (a): frequency (%) of Hfr genetic characters among strʳ exconjugants, 0 to 100, versus time (minutes), 0 to 60, with curves for aziʳ, tonʳ, lac+, and gal+. Labels in (b): Hfr strˢ; F− strʳ; origin; F factor; 10 min, 17 min, 25 min.] [(a) Data from E. L. Wollman, F. Jacob, and W. Hayes, Cold Spring Harbor Symp. Quant. Biol. 21, 1956, 141.]
Inferring integration sites of F and chromosome circularity
Wollman and Jacob went on to shed more light on how and where the F plasmid integrates to form an Hfr cell and, in doing so, deduced that the chromosome is circular. They performed interrupted-mating experiments with different, separately derived Hfr strains. Significantly, the order of transmission of the alleles differed from strain to strain, as in the following examples:
Hfr strain
H           O thr pro lac pur gal his gly thi F
1           O thr thi gly his gal pur lac pro F
2           O pro thr thi gly his gal pur lac F
3           O pur lac pro thr thi gly his gal F
AB 312      O thi thr pro lac pur gal his gly F
Each line can be considered a map showing the order of alleles on the chromosome. At first glance, there seems to be a random shuffling of genes. However, when some of the Hfr maps are inverted, the relation of the sequences becomes clear.
H (written backward)         F thi gly his gal pur lac pro thr O
1                            O thr thi gly his gal pur lac pro F
2                            O pro thr thi gly his gal pur lac F
3                            O pur lac pro thr thi gly his gal F
AB 312 (written backward)    F gly his gal pur lac pro thr thi O
The relation of the sequences to one another is explained if each map is the segment of a circle. It was the first indication that bacterial chromosomes are circular. Furthermore, Allan Campbell proposed a startling hypothesis that accounted for the different Hfr maps. He proposed that, if F is a ring, then insertion might be by a simple crossover between F and the bacterial chromosome (Figure 5-13). That being the case, any of the linear Hfr chromosomes could be generated simply by the insertion of F into the ring in the appropriate place and orientation (Figure 5-14). Several hypotheses, later supported, followed from Campbell’s proposal.
1. One end of the integrated F factor would be the origin, where transfer of the Hfr chromosome begins. The terminus would be at the other end of F.
2. The orientation in which F is inserted would determine the order of entry of donor alleles. If the circle contains genes A, B, C, and D, then insertion between
Figure 5-13 A single crossover inserts F at a specific locus, which then determines the order of gene transfer. The insertion of F creates an Hfr cell. Hypothetical markers 1 and 2 are shown on F to depict the direction of insertion. The origin (O) is the mobilization point where insertion into the E. coli chromosome occurs; the pairing region is homologous with a region on the E. coli chromosome; a through d are representative genes in the E. coli chromosome. Pairing regions (hatched) are identical in plasmid and chromosome. They are derived from mobile elements called insertion sequences (see Chapter 15). In this example, the Hfr cell created by the insertion of F would transfer its genes in the order a, d, c, b. [Labels: F; O; homologous regions where pairing can take place; E. coli chromosome; Hfr; direction of transfer; transferred first; transferred last.]
A and D would give the order ABCD or DCBA, depending on orientation. Check the different orientations of the insertions in Figure 5-14.
How is it possible for F to integrate at different sites? If F DNA had a region homologous to any of several regions on the bacterial chromosome, any one of them could act as a pairing region at which pairing could be followed by a crossover. These regions of homology are now known to be mainly segments of transposable elements called insertion sequences. For a full explanation of insertion sequences, see Chapter 15.
The fertility factor thus exists in two states:
1. The plasmid state: As a free cytoplasmic element, F is easily transferred to F− recipients.
2. The integrated state: As a contiguous part of a circular chromosome, F is transmitted only very late in conjugation.
The E. coli conjugation cycle is summarized in Figure 5-15.
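The claim that the five Hfr transfer orders are all segments of one circle, read from different starting points and in different directions, can be checked mechanically. The following is a minimal sketch: the gene orders are those listed for strains H, 1, 2, 3, and AB 312, while the function names and the `same_direction` flag are hypothetical conveniences, with "direction" defined arbitrarily relative to the order in which the circle is written.

```python
HFR_ORDERS = {
    "H": "thr pro lac pur gal his gly thi".split(),
    "1": "thr thi gly his gal pur lac pro".split(),
    "2": "pro thr thi gly his gal pur lac".split(),
    "3": "pur lac pro thr thi gly his gal".split(),
    "AB 312": "thi thr pro lac pur gal his gly".split(),
}


def canonical_circle(genes):
    """Canonical form of a circular gene order, ignoring start point and reading direction."""
    genes = list(genes)
    candidates = []
    for sequence in (genes, genes[::-1]):
        for i in range(len(sequence)):
            candidates.append(tuple(sequence[i:] + sequence[:i]))
    return min(candidates)


def transfer_order(circle, first_gene, same_direction=True):
    """Linear order of entry for an Hfr that transfers first_gene first.

    same_direction=False models F inserted in the opposite orientation.
    """
    sequence = list(circle) if same_direction else list(circle)[::-1]
    start = sequence.index(first_gene)
    return sequence[start:] + sequence[:start]


if __name__ == "__main__":
    circles = {canonical_circle(order) for order in HFR_ORDERS.values()}
    print(len(circles))                                      # 1: every strain fits one circle
    print(transfer_order(list("ABCD"), "A"))                 # ['A', 'B', 'C', 'D']
    print(transfer_order(list("ABCD"), "D", False))          # ['D', 'C', 'B', 'A']
```

The last two lines reproduce the ABCD versus DCBA example: the same insertion site between A and D gives opposite orders of entry depending on the orientation of F.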
Mapping of bacterial chromosomes
Broad-scale chromosome mapping by using time of entry
Wollman and Jacob realized that the construction of linkage maps from the interrupted-mating results would be easy by using as a measure of “distance” the times at which the donor alleles first appear after mating. The units of map distance in this case are minutes. Thus, if b+ begins to enter the F− cell 10 minutes after a+ begins to enter, then a+ and b+ are 10 units apart (see the time-of-entry map earlier in this section). Like eukaryotic maps based on crossovers, these linkage maps were originally purely genetic constructions. At the time they were originally devised, there was no way of testing their physical basis.
Fine-scale chromosome mapping by using recombinant frequency
For an exconjugant to acquire donor genes as a permanent feature of its genome, the
Figure 5-14 The F integration site determines the order of gene transfer in Hfrs. The five E. coli Hfr strains shown each have different F plasmid insertion points and orientations. All strains have the same order of genes on the E. coli chromosome. The orientation of the F factor determines which gene enters the recipient cell first. The gene closest to the terminus enters last. [Labels: strains H, 1, 2, 3, and 312, each drawn on the same circular gene order thr, pro, lac, pur, gal, his, gly, thi; fertility factor; origin (first to enter); terminus (last to enter).]
Figure 5-15 Two types of DNA transfer can take place during conjugation. Conjugation can take place by partial transfer of a chromosome containing the F factor or by transfer of an F plasmid that remains a separate entity. [Panels: plasmid transfer (F+ a+ × F− a−: conjugation and transfer of F factor, giving F+ a+ and F+ a−); chromosome transfer (insertion of F factor converts F+ a+ into Hfr a+; Hfr a+ × F− a−: conjugation and chromosome transfer gives an F− a+/a− cell, which yields F− a+ with recombination or F− a− with no recombination).]
donor fragment must recombine with the recipient chromosome. However, note that time-of-entry mapping is not based on recombinant frequency. Indeed, the units are minutes, not RF. Nevertheless, recombinant frequency can be used for a more fine-scale type of mapping in bacteria, a method to which we now turn. First, we need to understand some special features of the recombination event in bacteria. Recall that recombination does not take place between two whole genomes, as it does in eukaryotes. In contrast, it takes place between one complete genome, from the F− recipient cell, called the endogenote, and an incomplete one, derived from the Hfr donor cell and called the exogenote. The cell at this stage has two copies of one segment of DNA: one copy is part of the endogenote and the other copy is part of the exogenote. Thus, at this stage, the cell is a partial diploid, called a merozygote. Bacterial genetics is merozygote genetics. A single crossover in a merozygote would break the ring and thus not produce viable recombinants, as shown in Figure 5-16. To keep the circle intact, there must be an even number of crossovers. An even number of crossovers produces a circular, intact chromosome and a fragment. Although such recombination events are represented in a shorthand way as double crossovers, the actual molecular mechanism is somewhat different, more like an invasion of the endogenote by an internal section of the exogenote. The other product of the “double crossover,” the fragment, is generally lost in subsequent cell growth. Hence, only one of the reciprocal products of recombination survives. Therefore, another unique feature of bacterial recombination is that we must forget about reciprocal exchange products in most cases.
[Figure 5-16 diagram: A single crossover cannot produce a viable recombinant. An exogenote and endogenote carrying a+ and a− alleles undergo a single crossover; the linear product is labeled nonviable.]
Figure 5-16 A single crossover between exogenote and endogenote in a merozygote would lead to a linear, partly diploid chromosome that would not survive.
Key Concept Recombination during conjugation results from a double-crossover-like event, which gives rise to reciprocal recombinants of which only one survives.
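Before turning to recombination mapping, the time-of-entry method described at the start of this section can be summarized in a short sketch: markers are ordered by the minute at which each donor allele first appears, and the difference in minutes is taken directly as the map distance. The entry times and the function name below are hypothetical; only the 10-minute a+ to b+ interval mirrors the example in the text.

```python
def time_of_entry_map(first_entry_min):
    """Order markers by time of entry and return distances in minutes.

    first_entry_min maps each donor allele to the minute at which it
    first appears among exconjugants (hypothetical input format).
    """
    ordered = sorted(first_entry_min.items(), key=lambda item: item[1])
    return [
        (left, right, t_right - t_left)
        for (left, t_left), (right, t_right) in zip(ordered, ordered[1:])
    ]


# Hypothetical interrupted-mating result: b+ appears 10 minutes after a+.
entries = {"b+": 18, "a+": 8}
intervals = time_of_entry_map(entries)
for left, right, minutes in intervals:
    print(f"{left} to {right}: {minutes} map units (minutes)")
```

Note that no recombinant frequency enters this calculation; the only measurement is time.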
With this understanding, we can examine recombination mapping. Suppose that we want to calculate map distances separating three close loci: met, arg, and leu. To examine the recombination of these genes, we need “trihybrids,” exconjugants that have received all three donor markers. Assume that an interrupted-mating experiment has shown that the order is met, arg, leu, with met transferred first and leu last. To obtain a trihybrid, we need the merozygote diagrammed here:
[Diagram: the transferred fragment of the Hfr chromosome, carrying the donor alleles of leu, arg, and met, aligned with the corresponding leu, arg, and met region of the F− chromosome.]
To obtain this merozygote, we must first select stable exconjugants bearing the last donor allele, which, in this case, is leu+. Why? In leu+ exconjugants, we know all three markers were transferred into the recipient because leu is the last donor allele. We also know that at least the leu+ marker was integrated into the endogenote. We want to know how often the other two markers were also integrated so that we can determine the number of recombination events in which arg+ or met+ was omitted due to double crossover. The goal now is to count the frequencies of crossovers at different locations. Note that we now have a different situation from the analysis of interrupted conjugation. In mapping by interrupted conjugation, we measure the time of entry of
individual loci; to be stably inherited, each marker has to recombine into the recipient chromosome by a double crossover spanning it. However, in the recombinant frequency analysis, we have specifically selected trihybrids as a starting point, and now we have to consider the various possible combinations of the three donor alleles that can be inserted by double crossing over in the various intervals. We know that leu+ must have entered and inserted because we selected it, but the leu+ recombinants that we select may or may not have incorporated the other donor markers, depending on where the double crossover took place. Hence, the procedure is to first select leu+ exconjugants and then isolate and test a large sample of them to see which of the other markers were integrated. Let’s look at an example. In the cross Hfr met+ arg+ leu+ strˢ × F− met− arg− leu− strʳ, we would select leu+ recombinants and then examine them for the arg+ and met+ alleles, called the unselected markers. Figure 5-17 depicts the types of double-crossover events expected. One crossover must be on the left side of the leu marker and the other must be on the right side. Let’s assume that the leu+ exconjugants are of the following types and frequencies:
leu+ arg− met−    4%
leu+ arg+ met−    9%
leu+ arg+ met+   87%
[Figure 5-17 diagram: The generation of various recombinants by crossing over in different regions. Each panel shows the Hfr fragment aligned with the F− chromosome: (a) Insertion of late marker only. (b) Insertion of late marker and one early marker. (c) Insertion of all markers. (d) Insertion of late and early markers, but not of marker in between.]
Figure 5-17 The diagram shows how genes can be mapped by recombination in E. coli. In exconjugants, selection is made for merozygotes bearing the leu+ marker, which is donated late. The early markers (arg+ and met+) may or may not be inserted, depending on the site where recombination between the Hfr fragment and the F− chromosome takes place. The frequencies of events diagrammed in parts a and b are used to obtain the relative sizes of the leu–arg and arg–met regions. Note that, in each case, only the DNA inserted into the F− chromosome survives; the other fragment is lost.
The double crossovers needed to produce these genotypes are shown in Figure 5-17. The first two types are the key because they require a crossover between leu and arg in the first case and between arg and met in the second. Hence, the relative frequencies of these types correspond to the sizes of these two regions between the genes. We would conclude that the leu–arg region is 4 m.u. and that the arg–met region is 9 m.u. In a cross such as the one just described, one potential recombinant type, of genotype leu+ arg− met+, requires four crossovers instead of two (see the bottom of Figure 5-17). These recombinants are rarely recovered because their frequency is very low compared with that of the other types of recombinants.
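The arithmetic behind these map distances can be sketched as follows. The colony counts are hypothetical, chosen only so that they reproduce the 4%, 9%, and 87% classes assumed above; the function name and input format are likewise illustrative assumptions.

```python
def map_units_from_unselected(counts):
    """Convert counts of leu+ exconjugant classes into map units.

    counts maps (arg allele, met allele) to the number of selected
    leu+ colonies that carry that combination of unselected markers.
    """
    total = sum(counts.values())
    percent = {cls: 100 * n / total for cls, n in counts.items()}
    return {
        # leu+ arg- met-: second crossover fell between leu and arg
        "leu-arg": percent[("arg-", "met-")],
        # leu+ arg+ met-: second crossover fell between arg and met
        "arg-met": percent[("arg+", "met-")],
    }


# Hypothetical sample of 1000 leu+ exconjugants.
colony_counts = {
    ("arg-", "met-"): 40,
    ("arg+", "met-"): 90,
    ("arg+", "met+"): 870,
}
distances = map_units_from_unselected(colony_counts)
print(distances)
```

The large leu+ arg+ met+ class does not contribute to either distance; it reflects double crossovers that flank all three markers.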
F plasmids that carry genomic fragments The F factor in Hfr strains is generally quite stable in its inserted position. However, occasionally an F factor cleanly exits from the chromosome by a reversal of the recombination process that inserted it in the first place. The two homologous pairing regions on either side re-pair, and a crossover takes place to liberate the F plasmid. However, sometimes the exit is not clean, and the plasmid carries with it a part of the bacterial chromosome. An F plasmid carrying bacterial genomic DNA is called an F′ (F prime) plasmid. The first evidence of this process came from experiments in 1959 by Edward Adelberg and François Jacob. One of their key observations was of an Hfr in which the F factor was integrated near the lac+ locus. Starting with this Hfr lac+ strain, Jacob and Adelberg found an F+ derivative that, in crosses, transferred lac+ to F− lac− recipients at a very high frequency. (These transferrants could be detected by plating on medium containing lactose, on which only lac+ cells can grow.) The transferred lac+ is not incorporated into the recipient’s main chromosome, which we know retains the allele lac− because these F+ lac+ exconjugants occasionally gave rise to F− lac− daughter cells, at a frequency of 1 × 10⁻³. Thus, the genotype of these recipients appeared to be F′ lac+/F− lac−. In other words, the lac+ exconjugants seemed to carry an F′ plasmid with a piece of the donor chromosome incorporated. The origin of this F′ plasmid is shown in Figure 5-18. Note that the faulty excision occurs because there is another homologous region nearby that pairs with the original. The F′ in our example is called F′ lac because the piece of host chromosome that it picked up has the lac gene on it. F′ factors have been found carrying many different chromosomal genes and have been named accordingly. For example, F′ factors carrying gal or trp are called F′ gal and F′ trp, respectively.
Because F′ lac+/F− lac− cells are lac+ in phenotype, we know that lac+ is dominant over lac−. Partial diploids made with the use of F′ strains are useful for some aspects of routine bacterial genetics, such as the study of dominance or of allele interaction. Some F′ strains can carry very large parts (as much as one-quarter) of the bacterial chromosome.
Key Concept The DNA of an F′ plasmid is part F factor and part bacterial genome. Like F plasmids, F′ plasmids transfer rapidly. They can be used to establish partial diploids for studies of bacterial dominance and allele interaction.
R plasmids An alarming property of pathogenic bacteria first came to light through studies in Japanese hospitals in the 1950s. Bacterial dysentery is caused by bacteria of the genus Shigella. This bacterium was initially sensitive to a wide array of antibiotics that were used to control the disease. In the Japanese hospitals, however, Shigella
[Figure 5-18 diagram: Faulty outlooping produces F′, an F plasmid that contains chromosomal DNA. Panels: (a) Insertion of F at IS1, near ton and lac, with IS2 beyond lac; (b) Hfr chromosome with the integrated F factor; (c) Excision; (d) F′ lac; (e) F′ lac/lac partial diploid.]
Figure 5-18 An F factor can pick up chromosomal DNA as it exits a chromosome. (a) F is inserted in an Hfr strain at a repetitive element identified as IS1 (insertion sequence 1) between the ton and lac+ alleles. (b) The inserted F factor. (c) Abnormal “outlooping” by crossing over with a different element, IS2, to include the lac locus. (d) The resulting F′ lac+ particle. (e) F′ lac+/F− lac− partial diploid produced by the transfer of the F′ lac+ particle to an F− lac− recipient. [Data from G. S. Stent and R. Calendar, Molecular Genetics, 2nd ed.]
isolated from patients with dysentery proved to be simultaneously resistant to many of these drugs, including penicillin, tetracycline, sulfanilamide, streptomycin, and chloramphenicol. This resistance to multiple drugs was inherited as a single genetic package, and it could be transmitted in an infectious manner—not only to other sensitive Shigella strains, but also to other related species of bacteria. This talent, which resembles the mobility of the E. coli F plasmid, is an extraordinarily useful one for the pathogenic bacterium because resistance can rapidly spread throughout a population. However, its implications for medical science are dire because the bacterial disease suddenly becomes resistant to treatment by a large range of drugs. From the point of view of the geneticist, however, the mechanism has proved interesting and is useful in genetic engineering. The vectors carrying these multiple resistances proved to be another group of plasmids called R plasmids. They are transferred rapidly on cell conjugation, much like the F plasmid in E. coli. In fact, the R plasmids in Shigella proved to be just the first of many similar genetic elements to be discovered. All exist in the plasmid state in the cytoplasm.
Table 5-2 Genetic Determinants Borne by Plasmids
Characteristic              Plasmid examples
Fertility                   F, R1, Col
Bacteriocin production      Col E1
Heavy-metal resistance      R6
Enterotoxin production      Ent
Metabolism of camphor       Cam
Tumorigenicity in plants    Ti (in Agrobacterium tumefaciens)
These elements have been found to carry many different kinds of genes in bacteria. Table 5-2 shows some of the characteristics that can be borne by plasmids. Figure 5-19 shows an example of a well-traveled plasmid isolated from the dairy industry. Engineered derivatives of R plasmids, such as pBR322 and pUC (see Chapter 10), have become the preferred vectors for the molecular cloning of the DNA of all organisms. The genes on an R plasmid that confer resistance can be used as markers to keep track of the movement of the vectors between cells. On R plasmids, the alleles for antibiotic resistance are often contained within a unit called a transposon (Figure 5-20). Transposons are unique segments of DNA
[Figure 5-19 diagram: A plasmid with segments from many former bacterial hosts. Circular map of plasmid pK214 with 29 numbered segments, each labeled with its source organism: Lactococcus lactis, Enterococcus faecium, Enterococcus faecalis, Listeria monocytogenes, Mycoplasma, Staphylococcus aureus, Lactobacillus plantarum, Streptococcus agalactiae, Streptococcus pyogenes, and Escherichia coli.]
Figure 5-19 The diagram shows the origins of genes of the Lactococcus lactis plasmid pK214. The genes are from many different bacteria. [Data from Table 1 in V. Perreten, F. Schwarz, L. Cresta, M. Boeglin, G. Dasen, and M. Teuber, Nature 389, 1997, 801–802.]
that can move around to different sites in the genome, a process called transposition. (The mechanisms for transposition, which occurs in most species studied, will be detailed in Chapter 15.) When a transposon in the genome moves to a new location, it can occasionally embrace between its ends various types of genes, including alleles for drug resistance, and carry them along to their new locations as passengers. Sometimes, a transposon carries a drug-resistance allele to a plasmid, creating an R plasmid. Like F plasmids, many R plasmids are conjugative; in other words, they are effectively transmitted to a recipient cell during conjugation. Even R plasmids that are not conjugative and never leave their own cells can donate their R alleles to a conjugative plasmid by transposition. Hence, through plasmids, antibiotic-resistance alleles can spread rapidly throughout a population of bacteria. Although the spread of R plasmids is an effective strategy for the survival of bacteria, it presents a major problem for medical practice, as mentioned earlier, because bacterial populations rapidly become resistant to any new antibiotic drug that is invented and applied to humans.
5.3 Bacterial Transformation Some bacteria can take up fragments of DNA from the external medium, and such uptake constitutes another way in which bacteria can exchange their genes. The source of the DNA can be other cells of the same species or cells of other species. In some cases, the DNA has been released from dead cells; in other cases, the DNA has been secreted from live bacterial cells. The DNA taken up integrates into the recipient’s chromosome. If this DNA is of a different genotype from that of the recipient, the genotype of the recipient can become permanently changed, a process aptly termed transformation.
The nature of transformation Transformation was discovered in the bacterium Streptococcus pneumoniae in 1928 by Frederick Griffith. Later, in 1944, Oswald T. Avery, Colin M. MacLeod, and Maclyn McCarty demonstrated that the “transforming principle” was DNA. Both results are milestones in the elucidation of the molecular nature of genes. We consider this work in more detail in Chapter 7. The transforming DNA is incorporated into the bacterial chromosome by a process analogous to the double-recombination events observed in Hfr × F− crosses. Note, however, that, in conjugation, DNA is transferred from one living cell to another through close contact, whereas in transformation, isolated pieces of external DNA are taken up by a cell through the cell wall and plasma membrane. Figure 5-21 shows one way in which this process can take place. Transformation has been a handy tool in several areas of bacterial research because the genotype of a strain can be deliberately changed in a very specific way by transforming with an appropriate DNA fragment. For example, transformation is used widely in genetic engineering. It has been found that even eukaryotic cells can be transformed, by using quite similar procedures, and this technique has been invaluable for modifying eukaryotic cells (see Chapter 10).
Chromosome mapping using transformation Transformation can be used to measure how closely two genes are linked on a bacterial chromosome. When DNA (the bacterial chromosome) is extracted for transformation experiments, some breakage into smaller pieces is inevitable. If two donor genes are located close together on the chromosome, there is a good chance that sometimes they will be carried on the same piece of transforming DNA. Hence, both will be taken up, causing a double transformation. Conversely, if genes are
[Figure 5-20 diagram: An R plasmid with resistance genes carried in a transposon. A conjugative plasmid carries transposon Tn5, in which the kanR and neoR genes are flanked by two copies of IS50.]
Figure 5-20 A transposon such as Tn5 can acquire several drug-resistance genes (in this case, those for resistance to the drugs kanamycin and neomycin) and transmit them rapidly on a plasmid, leading to the infectious transfer of resistance genes as a package. Insertion sequence 50 (IS50) forms the flanks of Tn5.
[Figure 5-21 diagram: Mechanism of DNA uptake by bacteria. Panel (a): free DNA from a dead bacterium next to a recipient cell and its chromosome; the inset labels free DNA, DNA-binding complex, cell wall, cytoplasmic membrane, DNA-degrading enzyme, and nucleotide. Panel (b): transformed bacterium carrying the transferred DNA.]
Figure 5-21 A bacterium undergoing transformation (a) picks up free DNA released from a dead bacterial cell. As DNA-binding complexes on the bacterial surface take up the DNA (inset), enzymes break down one strand into nucleotides; a derivative of the other strand may integrate into the bacterium’s chromosome (b).
widely separated on the chromosome, they will most likely be carried on separate transforming segments. A cell could possibly take up both segments independently, creating a double transformant, but that outcome is not likely. Hence, for widely separated genes, the frequency of double transformants will equal the product of the single-transformant frequencies. Therefore, testing for close linkage by testing for a departure from the product rule should be possible. In other words, if genes are linked, then the proportion of double transformants will be greater than the product of single-transformant frequencies. Unfortunately, the situation is made more complex by several factors, the most important being that not all cells in a population of bacteria are competent to be transformed. Nevertheless, at the end of this chapter, you can sharpen your skills in transformation analysis in one of the problems, which assumes that 100 percent of the recipient cells are competent.
Key Concept Bacteria can take up DNA fragments from the surrounding medium. Inside the cell, these fragments can integrate into the chromosome.
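The product-rule test for linkage by transformation can be sketched in a few lines. The frequencies below are hypothetical, and the sketch assumes, as the end-of-chapter problem does, that 100 percent of the recipient cells are competent; the function name is an illustrative choice.

```python
def product_rule_check(freq_a, freq_b, freq_ab):
    """Compare observed double transformants with the product-rule expectation.

    freq_a, freq_b : single-transformant frequencies for genes a and b
    freq_ab        : observed double-transformant frequency
    Returns (expected double frequency if unlinked, observed/expected ratio).
    Assumes every recipient cell is competent.
    """
    expected = freq_a * freq_b
    return expected, freq_ab / expected


# Hypothetical frequencies per recipient cell.
expected, ratio = product_rule_check(freq_a=0.01, freq_b=0.02, freq_ab=0.005)
print(f"expected if unlinked: {expected:.6f}")
print(f"observed / expected:  {ratio:.1f}")
```

A ratio near 1 is what independent uptake of two separate fragments predicts; a ratio well above 1, as in this made-up example, suggests that the two genes often travel on the same piece of transforming DNA.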
5.4 Bacteriophage Genetics The word bacteriophage, which is a name for bacterial viruses, means “eater of bacteria.” These viruses parasitize and kill bacteria. Pioneering work on the genetics of bacteriophages in the middle of the twentieth century formed the foundation of more recent research on tumor-causing viruses and other kinds of animal and plant viruses. In this way, bacterial viruses have provided an important model system. These viruses can be used in two different types of genetic analysis. First, two distinct phage genotypes can be crossed to measure recombination and hence map the viral genome. Mapping of the viral genome by this method is the topic of this section. Second, bacteriophages can be used as a way of bringing bacterial genes together for linkage and other genetic studies. We will study the use of phages in bacterial studies in Section 5.5. In addition, as we will see in Chapter 10, phages are used in DNA technology as carriers, or vectors, of foreign DNA. Before we can understand phage genetics, we must first examine the infection cycle of phages.
Infection of bacteria by phages Most bacteria are susceptible to attack by bacteriophages. A phage consists of a nucleic acid “chromosome” (DNA or RNA) surrounded by a coat of protein molecules. Phage types are identified not by species names but by symbols—for example, phage T4, phage λ, and so forth. Figures 5-22 and 5-23 show the structure of phage T4. During infection, a phage attaches to a bacterium and injects its genetic material into the bacterial cytoplasm, as diagrammed in Figure 5-22. An electron micrograph of the process is shown in Figure 5-24. The phage genetic information then takes over the machinery of the bacterial cell by turning off the synthesis of bacterial components and redirecting the bacterial synthetic machinery to make phage components. Newly made phage heads are individually stuffed with replicates of the phage chromosome. Ultimately, many phage descendants are made
and are released when the bacterial cell wall breaks open. This breaking-open process is called lysis. The population of phage progeny is called the phage lysate. How can we study inheritance in phages when they are so small that they are visible only under the electron microscope? In this case, we cannot produce a visible colony by plating, but we can produce a visible manifestation of a phage by taking advantage of several phage characters. Let’s look at the consequences of a phage infecting a single bacterial cell. Figure 5-25 shows the sequence of events in the infectious cycle that leads to the release of progeny phages from the lysed cell. After lysis, the progeny phages infect neighboring bacteria. This cycle is repeated through progressive rounds of infection, and, as these cycles repeat, the number of lysed cells increases exponentially. Within 15 hours after one single phage particle infects a single bacterial cell, the effects are visible to the naked eye as a clear area, or plaque, in the opaque lawn of bacteria covering the surface of a plate of solid medium (Figure 5-26). Such plaques can be large or small, fuzzy or sharp, and so forth, depending on the phage genotype. Thus, plaque morphology is a phage character that can be analyzed at the genetic level. Another phage phenotype that we can analyze genetically is host range, because phages may differ in the spectra of bacterial strains that they can infect and lyse. For example, a specific strain of bacteria might be immune to phage 1 but susceptible to phage 2.
[Figure 5-22 diagram: Structure and function of phage T4. Left: a free phage and an infecting phage with its injected DNA passing through the cell wall. Right: T4 phage components (DNA, head, neck and collar, core, sheath, end plate, fibers).]
Figure 5-22 An infecting phage injects DNA through its core structure into the cell. (Left) Bacteriophage T4 is shown as a free phage and then in the process of infecting an E. coli cell. (Right) The major structural components of T4.
[Figure 5-23 image: Electron micrograph of phage T4.]
Figure 5-23 Enlargement of the E. coli phage T4 reveals details of head, tail, and tail fibers. [Science Source.]
[Figure 5-24 image: Electron micrograph of phage infection.]
Figure 5-24 Bacteriophages are shown in several stages of the infection process, which includes attachment and DNA injection. [© Eye of Science/Science Source.]
[Figure 5-25 diagram: Cycle of phage that lyses the host cells. The lytic cycle runs from an uninfected cell through adsorption of phage to the host cell; entry of phage nucleic acid; synthesis of phage proteins and replication of genetic material, after which the host chromosome is degraded; assembly of phages within the host cell; and lysis of the host cell, releasing free phages.]
Figure 5-25 Infection by a single phage redirects the cell’s machinery into making progeny phages, which are released at lysis.
[Figure 5-26 image: A plaque is a clear area in which all bacteria have been lysed by phages. Label: clear areas, or plaques.]
Figure 5-26 Through repeated infection and production of progeny phage, a single phage produces a clear area, or plaque, on the opaque lawn of bacterial cells. [D. Sue Katz, Rogers State University, Claremore, OK.]
Mapping phage chromosomes by using phage crosses
Two phage genotypes can be crossed in much the same way that we cross organisms. A phage cross can be illustrated by a cross of T2 phages originally studied by Alfred Hershey. The genotypes of the two parental strains in Hershey’s cross were h− r+ × h+ r−. The alleles correspond to the following phenotypes:
h− : can infect two different E. coli strains (which we can call strains 1 and 2)
h+ : can infect only strain 1
r− : rapidly lyses cells, thereby producing large plaques
r+ : slowly lyses cells, producing small plaques
To make the cross, E. coli strain 1 is infected with both parental T2 phage genotypes. This kind of infection is called a mixed infection or a double infection (Figure 5-27). After an appropriate incubation period, the phage lysate (containing the progeny phages) is analyzed by spreading it onto a bacterial lawn composed of a mixture of E. coli strains 1 and 2. Four plaque types are then distinguishable (Figure 5-28). Large plaques indicate rapid lysis (r−), and small plaques indicate slow lysis (r+). Phages with the allele h− will infect both hosts, forming a clear plaque, whereas phages with the allele h+ will infect only one host, forming a cloudy plaque. Thus, the four genotypes can be easily classified as parental (h− r+ and h+ r−) and recombinant (h+ r+ and h− r−), and a recombinant frequency can be calculated as follows:
$$\mathrm{RF} = \frac{(h^{+}\,r^{+}) + (h^{-}\,r^{-})}{\text{total plaques}}$$
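As a worked illustration of this formula, the sketch below computes RF from plaque counts for the four genotypes. The counts are invented for the example and are not Hershey’s data; the function name and dictionary keys are assumptions made for illustration.

```python
def recombinant_frequency(plaque_counts):
    """RF = (h+ r+ plus h- r- plaques) divided by total plaques.

    plaque_counts maps each of the four genotypes from the cross
    h- r+ x h+ r- to its number of plaques.
    """
    recombinants = plaque_counts["h+ r+"] + plaque_counts["h- r-"]
    total = sum(plaque_counts.values())
    return recombinants / total


# Hypothetical plaque counts from one plate of the lysate.
counts = {
    "h- r+": 420,  # parental: clear, small
    "h+ r-": 380,  # parental: cloudy, large
    "h+ r+": 110,  # recombinant: cloudy, small
    "h- r-": 90,   # recombinant: clear, large
}
rf = recombinant_frequency(counts)
print(f"RF = {rf:.2f} ({100 * rf:.0f} percent)")
```

Both recombinant classes are counted because, for linear phage chromosomes, a single crossover yields both reciprocal products.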
[Figure 5-27 diagram: A phage cross made by doubly infecting the host cell with parental phages. Parental phages h− r+ and h+ r− infect a cell of E. coli strain 1.]
[Figure 5-28 image: Plaques from recombinant and parental phage progeny.]
Figure 5-28 These plaque phenotypes were produced by progeny of the cross h− r+ × h+ r−. Four plaque phenotypes can be differentiated, representing two parental types and two recombinants. [From G. S. Stent, Molecular Biology of Bacterial Viruses. Copyright 1963 by W. H. Freeman and Company.]
If we assume that the recombining phage chromosomes are linear, then single crossovers produce viable reciprocal products. However, phage crosses are subject to some analytical complications. First, several rounds of exchange can take place within the host: a recombinant produced shortly after infection may undergo further recombination in the same cell or in later infection cycles. Second, recombination can take place between genetically similar phages as well as between different types. Thus, if we let P1 and P2 refer to general parental genotypes, crosses of P1 × P1 and P2 × P2 take place in addition to P1 × P2. For both these reasons, recombinants from phage crosses are a consequence of a population of events rather than defined, single-step exchange events. Nevertheless, all other things being equal, the RF calculation does represent a valid index of map distance in phages. Because astronomically large numbers of phages can be used in phage-recombination analyses, very rare crossover events can be detected. In the 1950s, Seymour Benzer made use of such rare crossover events to map the mutant sites within the rII gene of phage T4, a gene that controls lysis. For different rII mutant alleles arising spontaneously, the mutant site is usually at different positions within the gene. Therefore, when two different rII mutants are crossed, a few rare crossovers may take place between the mutant sites, producing wild-type recombinants, as shown here:
[Diagram: the rII gene in parent 1 and parent 2, each with a mutant site at a different position; a crossover between the two sites yields a wild-type chromosome and a double-mutant chromosome.]
As distance between two mutant sites increases, such a crossover event is more likely. Thus, the frequency of rII+ recombinants is a measure of that distance within the gene. (The reciprocal product is a double mutant and indistinguishable from the parentals.) Benzer used a clever approach to detect the very rare rII+ recombinants. He made use of the fact that rII mutants will not infect a strain of E. coli called K. Therefore, he made the rII × rII cross on another strain and then plated the phage lysate on a lawn of strain K. Only rII+ recombinants will form plaques on this lawn. This way of finding a rare genetic event (in this case, a recombinant) is a selective system: only the desired rare event can produce a certain visible outcome. In contrast, a screen is a system in which large numbers of individuals are visually scanned to seek the rare “needle in the haystack.” This same approach can be used to map mutant sites within genes for any organism from which large numbers of cells can be obtained and for which wild-type and mutant phenotypes can be distinguished. However, this sort of intragenic mapping has been largely superseded by the advent of inexpensive chemical methods for DNA sequencing, which identify the positions of mutant sites directly.
Key Concept Recombination between phage chromosomes can be studied by bringing the parental chromosomes together in one host cell through mixed infection. Progeny phages can be examined for both parental and recombinant genotypes.
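The selective system used for the rII crosses can be turned into a rough calculation. The sketch below rests on two assumptions that go beyond the text: total progeny are counted by plating a dilution of the lysate on a permissive strain, and the rII+ frequency is doubled to allow for the undetected double-mutant reciprocal product. All plaque counts, dilutions, and names are hypothetical.

```python
def titer(plaques, dilution, volume_ml):
    """Plaque-forming units per ml of undiluted lysate."""
    return plaques / (dilution * volume_ml)


def rII_recombinant_frequency(k_plaques, k_dilution,
                              permissive_plaques, permissive_dilution,
                              volume_ml=0.1, count_reciprocal=True):
    """Estimate the recombinant frequency between two rII mutant sites.

    Plaques on strain K count only rII+ recombinants; plaques on a
    permissive strain count all progeny (assumed plating scheme).
    With count_reciprocal=True the frequency is doubled, on the
    assumption that an equal number of double mutants went undetected.
    """
    wild_type = titer(k_plaques, k_dilution, volume_ml)
    total = titer(permissive_plaques, permissive_dilution, volume_ml)
    frequency = wild_type / total
    return 2 * frequency if count_reciprocal else frequency


# Hypothetical platings of 0.1 ml each from the same lysate.
rf_rII = rII_recombinant_frequency(
    k_plaques=50, k_dilution=1e-2,
    permissive_plaques=200, permissive_dilution=1e-6,
)
print(f"estimated RF between the two sites: {rf_rII:.1e}")
```

Because strain K suppresses every non-recombinant phage, a frequency this small can be measured from a single plate, which is what makes a selective system so much more powerful than a screen for rare events.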
5.5 Transduction Some phages are able to pick up bacterial genes and carry them from one bacterial cell to another, a process known as transduction. Thus, transduction joins the battery of modes of transfer of genomic material between bacteria—along with Hfr chromosome transfer, F ′ plasmid transfer, and transformation.
Discovery of transduction In 1951, Joshua Lederberg and Norton Zinder were testing for recombination in the bacterium Salmonella typhimurium by using the techniques that had been successful with E. coli. The researchers used two different strains: one was phe− trp− tyr−, and the other was met− his−. We won’t worry about the nature of these alleles except to note that all are auxotrophic. When either strain was plated on a minimal medium, no wild-type cells were observed. However, after the two strains were mixed, wild-type prototrophs appeared at a frequency of about 1 in 10⁵. Thus far, the situation seems similar to that for recombination in E. coli. However, in this case, the researchers also recovered recombinants from a U-tube experiment, in which conjugation was prevented by a filter separating the two arms (recall Figure 5-6). They hypothesized that some agent was carrying genes from one bacterium to another. By varying the size of the pores in the filter, they found that the agent responsible for gene transfer was the same size as a known phage of Salmonella, called phage P22. Furthermore, the filterable agent and P22 were identical in sensitivity to antiserum and in immunity to hydrolytic enzymes. Thus, Lederberg and Zinder had discovered a new type of gene transfer, mediated by a virus. They were the first to call this process transduction. As a rarity in the lytic cycle, virus particles sometimes pick up bacterial genes and transfer them when they infect another host. Transduction has subsequently been demonstrated in many bacteria. To understand the process of transduction, we need to distinguish two types of phage cycle. Virulent phages are those that immediately lyse and kill the host. Temperate phages can remain within the host cell for a period without killing it. Their DNA either integrates into the host chromosome to replicate with it or replicates separately in the cytoplasm, as does a plasmid.
A phage integrated into the bacterial genome is called a prophage. A bacterium harboring a quiescent phage is described as lysogenic and is itself called a lysogen. Occasionally, the quiescent phage in a lysogenic bacterium becomes active, replicates itself, and causes the spontaneous lysis of its host cell. A resident temperate phage confers resistance to infection by other phages of that type. There are two kinds of transduction: generalized and specialized. Generalized transducing phages can carry any part of the bacterial chromosome, whereas specialized transducing phages carry only certain specific parts.
Key Concept Virulent phages cannot become prophages; they replicate and lyse a cell immediately. Temperate phages can exist within the bacterial cell as prophages, allowing their hosts to survive as lysogenic bacteria; they are also capable of occasional bacterial lysis.
Generalized transduction By what mechanisms can a phage carry out generalized transduction? In 1965, H. Ikeda and J. Tomizawa threw light on this question in some experiments on the E. coli phage P1. They found that, when a donor cell is lysed by P1, the bacterial chromosome is broken up into small pieces. Occasionally, the newly forming phage particles mistakenly incorporate a piece of the bacterial DNA into a phage head in place of phage DNA. This event is the origin of the transducing phage. A phage carrying bacterial DNA can infect another cell. That bacterial DNA can then be incorporated into the recipient cell’s chromosome by recombination (Figure 5-29). Because genes on any of the cut-up parts of the host genome can be transduced, this type of transduction is by necessity of the generalized type. Phages P1 and P22 both belong to a phage group that shows generalized transduction. P22 DNA inserts into the host chromosome, whereas P1 DNA remains free, like a large plasmid. However, both transduce by faulty head stuffing.
Figure 5-29 Generalized transduction by random incorporation of bacterial DNA into phage heads. A newly forming phage may pick up DNA from its host cell’s chromosome (top) and then inject it into a new cell (bottom right). The injected DNA may insert into the new host’s chromosome by recombination (bottom left). In reality, only a very small minority of phage progeny (1 in 10,000) carry donor genes. Diagram labels: donor bacterium, phages carrying donor genes, recipient bacterium, transduced bacterium.
Generalized transduction can be used to obtain bacterial linkage information when genes are close enough that the phage can pick them up and transduce them in a single piece of DNA. For example, suppose that we wanted to find the linkage distance between met and arg in E. coli. We could grow phage P1 on a donor met+ arg+ strain and then allow P1 phages from lysis of this strain to infect a met− arg− strain. First, one donor allele is selected, say, met+. Then the percentage of met+ colonies that are also arg+ is measured. Strains transduced to both
Figure 5-30 From high cotransduction frequencies, close linkage is inferred. The diagram shows a genetic map of the purB-to-cysB region of E. coli determined by P1 cotransduction. The numbers given are the averages in percent for cotransduction frequencies obtained in several experiments. The values in parentheses are considered unreliable. Markers labeled on the map: purB, hemA, narC, supF,C, galU, attφ80, tonB, trp, cysB. [Data from J. R. Guest, Mol. Gen. Genet. 105, 1969, p. 285.]
met+ and arg+ are called cotransductants. The greater the cotransduction frequency, the closer two genetic markers must be (the opposite of most mapping measurements). Linkage values are usually expressed as cotransduction frequencies (Figure 5-30). By using an extension of this approach, we can estimate the size of the piece of host chromosome that a phage can pick up, as in the following type of experiment, which uses P1 phage:
donor leu+ thr+ aziʳ → recipient leu− thr− aziˢ
In this experiment, P1 phage grown on the leu+ thr+ aziʳ donor strain infect the leu− thr− aziˢ recipient strain. The strategy is to select one or more donor alleles in the recipient and then test these transductants for the presence of the unselected alleles. Results are outlined in Table 5-3. Experiment 1 in Table 5-3 tells us that leu is relatively close to azi and distant from thr, leaving us with two possibilities:
thr - leu - azi    or    thr - azi - leu
Experiment 2 tells us that leu is closer to thr than azi is, and so the map must be
thr - leu - azi
By selecting for thr+ and leu+ together in the transducing phages in experiment 3, we see that the transduced piece of genetic material never includes the azi locus because the phage head cannot carry a fragment of DNA that big. P1 can only cotransduce genes less than approximately 1.5 minutes apart on the E. coli chromosome map.
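The ordering argument drawn from Table 5-3 can be sketched in a few lines of Python. This is only an illustration: the function names (`pair_frequency`, `infer_order`) are hypothetical, and the rule used, that the pair of markers cotransduced least often lies at the two ends of the map, is a simplification that holds for three markers on one short segment.

```python
from itertools import combinations

# Cotransduction data from Table 5-3:
# (selected marker, unselected marker) -> percent of transductants carrying both
cotransduction = {
    ("leu", "azi"): 50.0,  # experiment 1
    ("leu", "thr"): 2.0,   # experiment 1
    ("thr", "leu"): 3.0,   # experiment 2
    ("thr", "azi"): 0.0,   # experiment 2
}

def pair_frequency(data, a, b):
    """Average the cotransduction frequency of a pair over both selection directions."""
    values = [v for (s, u), v in data.items() if {s, u} == {a, b}]
    return sum(values) / len(values)

def infer_order(data, genes):
    """Put the least-often cotransduced pair at the ends; the third gene lies between them."""
    pairs = list(combinations(genes, 2))
    outer = min(pairs, key=lambda p: pair_frequency(data, *p))
    middle = next(g for g in genes if g not in outer)
    return [outer[0], middle, outer[1]]

order = infer_order(cotransduction, ["thr", "leu", "azi"])
print(order)
```

With the values in Table 5-3, thr and azi are never cotransduced, so they are placed at the ends and leu falls between them, in agreement with the map deduced in the text.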
Table 5-3 Accompanying Markers in Specific P1 Transductions
Experiment | Selected marker | Unselected markers
1 | leu+ | 50% are aziʳ; 2% are thr+
2 | thr+ | 3% are leu+; 0% are aziʳ
3 | leu+ and thr+ | 0% are aziʳ
Specialized transduction A generalized transducer, such as phage P22, picks up fragments of broken host DNA at random. How are other phages, which act as specialized transducers, able to carry only certain host genes to recipient cells? The short answer is that a specialized transducer inserts into the bacterial chromosome at one position only. When it exits, a faulty outlooping occurs (similar to the type that produces F ′ plasmids). Hence, it can pick up and transduce only genes that are close by. The prototype of specialized transduction was provided by studies undertaken by Joshua and Esther Lederberg on a temperate E. coli phage called lambda (λ). Phage λ has become the most intensively studied and best-characterized phage.
Behavior of the prophage Phage λ has unusual effects when cells lysogenic for it are used in crosses. In the cross of an uninfected Hfr with a lysogenic F− recipient [Hfr × F−(λ)], lysogenic F− exconjugants with Hfr genes are readily recovered. However, in the reciprocal cross Hfr(λ) × F−, the early genes from the Hfr chromosome are recovered among the exconjugants, but recombinants for late genes are not recovered. Furthermore, lysogenic exconjugants are almost never recovered from this reciprocal cross. What is the explanation? The observations make sense if the λ prophage is behaving as a bacterial gene locus behaves (that is, as part of the bacterial chromosome). Thus, in the Hfr(λ) × F− cross, the prophage would enter the F− cell at a specific time corresponding to its position in the chromosome. Earlier genes are recovered because they enter before the prophage. Later genes are not recovered because lysis destroys the recipient cell. In interrupted-mating experiments, the λ prophage does in fact always enter the F− cell at a specific time, closely linked to the gal locus. In an Hfr(λ) × F− cross, the entry of the λ prophage into the cell immediately triggers the prophage into a lytic cycle; this process is called zygotic induction (Figure 5-31). However, in the cross of two lysogenic cells Hfr(λ) × F−(λ), there is no zygotic induction.
The presence of any prophage prevents another infecting virus from causing lysis. This is because the prophage produces a cytoplasmic factor that represses the multiplication of the virus. (The phage-directed cytoplasmic repressor nicely explains the immunity of the lysogenic bacteria, because a phage would immediately encounter a repressor and be inactivated.) λ insertion The interrupted-mating experiments heretofore described showed that the λ prophage is part of the lysogenic bacterium’s chromosome. How is the λ prophage inserted into the bacterial genome? In 1962, Allan Campbell proposed that it inserts by a single crossover between a circular λ phage chromosome and the circular E. coli chromosome, as shown in Figure 5-32. The crossover point would be between a specific site in λ, the λ attachment site, and an attachment site in the bacterial chromosome located between the genes gal and bio, because λ integrates at that position in the E. coli chromosome. An attraction of Campbell’s proposal is that from it follow predictions that geneticists can test. For example, integration of the prophage into the E. coli
Figure 5-31 Transfer of λ prophage during conjugation can trigger lysis. A λ prophage can be transferred to a recipient during conjugation, but the prophage triggers lysis, a process called zygotic induction, only if the recipient has no prophage already; that is, in the case shown in part a but not in part b. (a) Hfr(λ) × F−: the recipient is nonimmune, and lysis (zygotic induction) follows. (b) Hfr(λ) × F−(λ): the recipient is immune, and there is no lysis.
Figure 5-32 λ phage inserts by a crossover at a specific site. Reciprocal recombination takes place between a specific attachment site on the circular λ DNA and a specific region called the attachment site on the E. coli chromosome between the gal and bio genes. Diagram labels: λ phage, attachment site, E. coli chromosome (gal, bio), integration enzymes, λ integrated into E. coli chromosome.
chromosome should increase the genetic distance between flanking bacterial genes, as can be seen in Figure 5-32 for gal and bio. In fact, studies show that lysogeny does increase time-of-entry or recombination distances between the bacterial genes. This unique location of λ accounts for its specialized transduction.
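Campbell's prediction that integration lengthens the interval between flanking genes can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not a description of the enzymology: chromosomes are plain Python lists, the phage genes are called "1", "2", and "3" after the labels in Figure 5-33, and the site names `attP`, `attB`, `attL`, and `attR` are labels introduced here for the phage site, the bacterial site, and the two hybrid sites.

```python
def integrate(host, phage_circle, host_site="attB", phage_site="attP"):
    """Model a single reciprocal crossover between a circular phage genome and the host."""
    i = host.index(host_site)
    j = phage_circle.index(phage_site)
    # Opening the circle at its attachment site linearizes the phage genome.
    linear_phage = phage_circle[j + 1:] + phage_circle[:j]
    # The crossover leaves a hybrid attachment site on each side of the prophage.
    return host[:i] + ["attL"] + linear_phage + ["attR"] + host[i + 1:]

def distance(chromosome, a, b):
    """Count the map positions separating two markers in this toy model."""
    return abs(chromosome.index(a) - chromosome.index(b))

host = ["gal", "attB", "bio"]        # segment of the E. coli chromosome
phage = ["attP", "1", "2", "3"]      # circular λ genome, written starting at its att site
lysogen = integrate(host, phage)

print(lysogen)
print(distance(host, "gal", "bio"), distance(lysogen, "gal", "bio"))
```

In the model, gal and bio are farther apart in the lysogen than in the nonlysogenic host, which mirrors the observed increase in time-of-entry and recombination distances.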
Mechanism of specialized transduction As a prophage, λ always inserts between the gal region and the bio region of the host chromosome (Figure 5-33), and, in transduction experiments, as expected, λ can transduce only the gal and bio genes. How does λ carry away neighboring genes? The explanation lies, again, in an imperfect reversal of the Campbell insertion mechanism, like that for F ′ formation. The recombination event between specific regions of λ and the bacterial chromosome is catalyzed by a specialized phage-encoded enzyme system that uses the λ attachment site as a substrate. The enzyme system dictates that λ integrates only at a specific point between gal and bio in the chromosome (see Figure 5-33a). Furthermore, during lysis, the λ prophage normally excises at precisely the correct point to produce a normal circular λ chromosome, as seen in Figure 5-33b(i). Very rarely, excision is abnormal owing to faulty outlooping. In this case, the outlooping phage DNA can pick up a nearby gene and leave behind some phage genes, as seen in Figure 5-33b(ii). The resulting phage genome is defective because of the genes left behind, but it has also gained a bacterial gene, gal or bio. The abnormal DNA carrying nearby genes can be packaged into phage heads to produce phage particles that can infect other bacteria. These phages are referred to as λdgal (λ-defective gal) or λdbio. In the presence of a second, normal phage particle in a double infection, the λdgal can integrate into the chromosome at the λ attachment site (Figure 5-33c). In this manner, the gal genes in this case are transduced into the second host.
Key Concept Transduction occurs when newly forming phages acquire host genes and transfer them to other bacterial cells. Generalized transduction can transfer any host gene. It occurs when phage packaging accidentally incorporates bacterial DNA instead of phage DNA. Specialized transduction is due to faulty outlooping of the prophage from the bacterial chromosome, and so the new phage includes both phage and bacterial genes. The transducing phage can transfer only specific host genes.
Figure 5-33 Faulty outlooping produces λ phage containing bacterial DNA. The diagram shows how specialized transduction operates in phage λ. (a) A crossover at the specialized attachment site produces a lysogenic bacterium. (b) The lysogenic bacterium can produce a normal λ (i) or, rarely, λdgal (ii), a transducing particle containing the gal gene. (c) gal+ transductants can be produced by either (i) the coincorporation of λdgal and λ (acting as a helper) or (ii) crossovers flanking the gal gene, a rare event. The blue double boxes are the bacterial attachment site, the purple double boxes are the λ attachment site, and the pairs of blue and purple boxes are hybrid integration sites, derived partly from E. coli and partly from λ. Panel titles: (a) Production of lysogen; (b) Production of initial lysate: (i) normal outlooping, (ii) rare abnormal outlooping; (c) Transduction by initial lysate: (i) lysogenic transductants, (ii) transductants produced by recombination.
5.6 Physical Maps and Linkage Maps Compared Some very detailed chromosomal maps for bacteria have been obtained by combining the mapping techniques of interrupted mating, recombination mapping, transformation, and transduction. Today, new genetic markers are typically mapped first into a segment of about 10 to 15 map minutes by using interrupted mating. Then additional, closely linked markers can be mapped in a more fine-scale analysis with the use of P1 cotransduction or recombination.
Figure 5-34 A map of the E. coli genome obtained genetically. The 1963 genetic map of E. coli genes with mutant phenotypes. Units are minutes, based on interrupted-mating and recombination experiments. Asterisks refer to map positions that are not as precise as the other positions. [Data from G. S. Stent, Molecular Biology of Bacterial Viruses.]
Figure 5-35 Part of the physical map of the E. coli genome, obtained by sequencing. A linear scale drawing of a sequenced 5-minute section of the 100-minute 1990 E. coli linkage map. The parentheses and asterisks indicate markers for which the exact location was unknown at the time of publication. Arrows above genes and groups of genes indicate the direction of transcription. [Data from B. J. Bachmann, “Linkage Map of Escherichia coli K-12, Edition 8,” Microbiol. Rev. 54, 1990, 130–197.]
Figure 5-36 Physical map of the E. coli genome. This map was obtained from sequencing DNA and plotting gene positions. Key to components from the outside in:
• The DNA replication origin and terminus are marked.
• The two scales are in DNA base pairs and in minutes.
• The orange and yellow histograms show the distribution of genes on the two different DNA strands.
• The arrows represent genes for rRNA (red) and tRNA (green).
• The central “starburst” is a histogram of each gene with lines of length that reflect predicted level of transcription.
Labels on the map: origin, terminus, replichore 1, replichore 2.
[F. R. Blattner et al., “The Complete Genome Sequence of Escherichia coli K-12,” Science 277, 1997, 1453–1462. DOI: 10.1126/science.277.5331.1453. Reprinted with permission from AAAS. Image courtesy of Dr. Guy Plunkett III.]
By 1963, the E. coli map (Figure 5-34) already detailed the positions of approximately 100 genes. After 27 years of further refinement, the 1990 map depicted the positions of more than 1400 genes. Figure 5-35 shows a 5-minute section of the 1990 map (which is adjusted to a scale of 100 minutes). The complexity of these maps illustrates the power and sophistication of genetic analysis. How well do these maps correspond to physical reality? In 1997, the DNA sequence of the entire E. coli genome of 4,639,221 base pairs was completed, allowing us to compare the exact position of genes on the genetic map with the position of the corresponding coding sequence on the linear DNA sequence (the physical map). The full map is represented in Figure 5-36. Figure 5-37 makes a comparison for a segment of both maps. Clearly, the genetic map is a close match to the physical map. Chapter 4 considered some ways in which the physical map (usually the full genome sequence) can be useful in mapping new mutations. In bacteria, the technique of insertional mutagenesis is another way to zero in rapidly on a mutation’s position on a known physical map. The technique causes mutations through the random insertion of “foreign” DNA fragments. The inserts inactivate any gene in which they land by interrupting the transcriptional unit. Transposons are particularly useful inserts for this purpose in several model organisms, including bacteria. To map a new mutation, the procedure is as follows. The DNA of a transposon
Figure 5-37 Proportions of the genetic and physical maps are similar but not identical. An alignment of the genetic and physical maps. (a) Markers on the 1990 genetic map in the region near 60 and 61 minutes. (b) The exact positions of every gene, based on the complete sequence of the E. coli genome. (Not every gene is named in this map, for simplicity.) The elongated boxes are genes and putative genes. Each color represents a different type of function. For example, red denotes regulatory functions, and dark blue denotes functions in DNA replication, recombination, and repair. Lines between the maps in parts a and b connect the same gene in each map. Markers labeled in part a: cysC, cysH, eno, relA, argA, recC, ptr, mutH, thyA. [Data from F. R. Blattner et al., “The Complete Genome Sequence of Escherichia coli K-12,” Science 277, 1997, 1453–1462.]
Figure 5-38 Transposon mutagenesis can be used to map a mutation in the genome sequence. The insertion of a transposon inserts a mutation into a gene of unknown position and function. The segment next to the transposon is replicated, sequenced, and matched to a segment in the complete genome sequence. Diagram steps: wild-type cell; transposon; mutant phenotype induced by transposon insertion; primed synthesis; whole gene identified from genome sequence.
carrying a resistance allele or other selectable marker is introduced by transformation into bacterial recipients that have no active transposons. The transposons insert more or less randomly, and any that land in the middle of a gene cause a mutation. A subset of all mutants obtained will have phenotypes relevant to the bacterial process under study, and these phenotypes become the focus of the analysis. The beauty of inserting transposons is that, because their sequence is known, the mutant gene can be located and sequenced. DNA replication primers are created that match the known sequence of the transposon (see Chapter 10). These primers are used to initiate a sequencing analysis that proceeds outward from the transposon into the surrounding gene. The short sequence obtained can then be fed into a computer and compared with the complete genome sequence. From this analysis, the position of the gene and its full sequence are obtained. The function of a homolog of this gene might already have been deduced in other organisms. Hence, you can see that this approach (like that introduced in Chapter 4) is another way of uniting mutant phenotype with map position and potential function. Figure 5-38 summarizes the approach. As an aside in closing, it is interesting that many of the historical experiments revealing the circularity of bacterial and plasmid genomes coincided with the publication and popularization of J. R. R. Tolkien’s The Lord of the Rings. Consequently, a review of bacterial genetics at that time led off with the following quotation from the trilogy: One Ring to rule them all, One Ring to find them, One Ring to bring them all and in the darkness bind them.
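The final step of the transposon-mapping procedure described above, feeding the short flanking sequence into a computer and comparing it with the complete genome sequence, can be sketched as follows. Everything concrete here is invented for illustration: the toy genome string, the gene names and coordinates, and the function names are hypothetical, and real analyses use alignment tools that tolerate sequencing errors rather than exact string matching.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))

def locate_flank(genome, flank):
    """Return (position, strand) of the first exact match of a flanking read, or None."""
    pos = genome.find(flank)
    if pos != -1:
        return pos, "+"
    # The read may have been sequenced from the opposite strand.
    pos = genome.find(reverse_complement(flank))
    if pos != -1:
        return pos, "-"
    return None

def gene_at(position, genes):
    """Return the name of the annotated gene containing a position, or None."""
    for name, start, end in genes:
        if start <= position < end:
            return name
    return None

# Toy data (hypothetical): a short "genome" and two annotated genes.
genome = "ATGCGTACCGTTAGCTAGGCTTAACGGATCCGTA"
genes = [("geneA", 0, 12), ("geneB", 15, 30)]
flank = "GGCTTAACG"  # sequence read outward from the transposon

hit = locate_flank(genome, flank)
print(hit, gene_at(hit[0], genes))
```

Checking both strands matters because the primer can point out of either end of the transposon, so the read may correspond to either strand of the genome sequence.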
Summary Advances in bacterial and phage genetics within the past 50 years have provided the foundation for molecular biology and cloning (discussed in later chapters). Early in this period, gene transfer and recombination were found to take place between different strains of bacteria. In bacteria, however, genetic material is passed in only one direction; for example, in Escherichia coli, from a donor cell (F+ or Hfr) to a recipient cell (F−). Donor ability is determined by the presence in the cell of a fertility factor (F), a type of plasmid. On occasion, the F factor present in the free state in F+ cells can integrate into the E. coli chromosome and form an Hfr cell. When this occurs, a fragment of donor chromosome can transfer into a recipient cell and subsequently recombine with the recipient chromosome. Because the F factor can insert at different places on the host chromosome, early investigators were able to piece the transferred fragments together to show that the E. coli chromosome is a single circle, or ring. Interruption of the transfer at different times has provided geneticists with an unconventional method (interrupted mating) for constructing a linkage map of the single chromosome of E. coli and other similar bacteria, in
which the map unit is a unit of time (minutes). In an extension of this technique, the frequency of recombinants between markers known to have entered the recipient can provide a finer-scale map distance. Several types of plasmids other than F can be found. R plasmids carry antibiotic-resistance alleles, often within a mobile element called a transposon. Rapid plasmid spread causes population-wide resistance to medically important drugs. Derivatives of such natural plasmids have become important cloning vectors, useful for gene isolation and study in all organisms. Genetic traits can also be transferred from one bacterial cell to another in the form of pieces of DNA taken into the cell from the extracellular environment. This process of transformation in bacterial cells was the first demonstration that DNA is the genetic material. For transformation to occur, DNA must be taken into a recipient cell, and recombination must then take place between a recipient chromosome and the incorporated DNA. Bacteria can be infected by viruses called bacteriophages. In one method of infection, the phage chromosome may enter
the bacterial cell and, by using the bacterial metabolic machinery, produce progeny phages that burst the host bacterium. The new phages can then infect other cells. If two phages of different genotypes infect the same host, recombination between their chromosomes can take place. In another mode of infection, lysogeny, the injected phage lies dormant in the bacterial cell. In many cases, this dormant phage (the prophage) incorporates into the host chromosome and replicates with it. Either spontaneously or under appropriate stimulation, the prophage can leave its dormant state and lyse the bacterial host cell. A phage can carry bacterial genes from a donor to a recipient. In generalized transduction, random host DNA is incorporated
alone into the phage head during lysis. In specialized transduction, faulty excision of the prophage from a unique chromosomal locus results in the inclusion of specific host genes as well as phage DNA in the phage head. Today, a physical map in the form of the complete genome sequence is available for many bacterial species. With the use of this physical genome map, the map position of a mutation of interest can be precisely located. First, appropriate mutations are produced by the insertion of transposons (insertional mutagenesis). Then, the DNA sequence surrounding the inserted transposon is obtained and matched to a sequence in the physical map. This technique provides the locus, the sequence, and possibly the function of the gene of interest.
Key Terms auxotroph; bacteriophage (phage); cell clone; colony; conjugation; cotransductant; λ attachment site; donor; double (mixed) infection; double transformation; endogenote; exconjugant; exogenote; F+ (donor); F− (recipient); F ′ plasmid; fertility factor (F); generalized transduction; genetic marker; Hfr (high frequency of recombination); horizontal transmission; insertional mutagenesis; interrupted mating; lysate; lysis; lysogen (lysogenic bacterium); merozygote; minimal medium; mixed (double) infection; origin (O); phage (bacteriophage); phage recombination; plaque; plasmid; plating; prokaryote; prophage; prototroph; R plasmid; recipient; resistant mutant; rolling circle replication; screen; selective system; specialized transduction; temperate phage; terminus; transduction; transformation; unselected marker; vertical transmission; virulent phage; virus; zygotic induction
Solved Problems
SOLVED PROBLEM 1. Suppose that a cell were unable to carry out generalized recombination (rec−). How would this cell behave as a recipient in generalized and in specialized transduction? First, compare each type of transduction, and then determine the effect of the rec− mutation on the inheritance of genes by each process. Solution Generalized transduction entails the incorporation of chromosomal fragments into phage heads, which then infect recipient strains. Fragments of the chromosome are incorporated randomly into phage heads, and so any marker on the bacterial host chromosome can be transduced to another strain by generalized transduction. In contrast, specialized transduction entails the integration of the phage at a specific point on the chromosome and the rare incorporation of chromosomal markers near the integration site into the phage genome. Therefore, only those markers that are near the specific integration site of the phage on the host chromosome can be transduced. Markers are inherited by different routes in generalized and specialized transduction. A generalized transducing phage injects a fragment of the donor chromosome into the recipient. This fragment must be incorporated into the recipient’s chromosome by recombination, with the use of the recipient’s recombination system. Therefore, a rec− recipient will not be able to incorporate fragments of DNA and cannot inherit markers by generalized transduction. On the other
hand, the major route for the inheritance of markers by specialized transduction is by integration of the specialized transducing particle into the host chromosome at the specific phage integration site. This integration, which sometimes requires an additional wild-type (helper) phage, is mediated by a phage-specific enzyme system that is independent of the normal recombination enzymes. Therefore, a rec− recipient can still inherit genetic markers by specialized transduction. SOLVED PROBLEM 2. In E. coli, four Hfr strains donate the following genetic markers, shown in the order donated:
Strain 1: Q W D M T
Strain 2: A X P T M
Strain 3: B N C A X
Strain 4: B Q W D M
All these Hfr strains are derived from the same F+ strain. What is the order of these markers on the circular chromosome of the original F+? Solution A two-step approach works well: (1) determine the underlying principle and (2) draw a diagram. Here the principle is clearly that each Hfr strain donates genetic markers from a fixed point on the circular chromosome and that the earliest markers are donated with the highest frequency. Because not all markers are donated by each Hfr, only the early markers must be donated for each Hfr. Each strain allows us to draw the following circles:
[Diagram: four circles, one per strain, each showing that strain's markers along an arc in order of transfer. Strain 1: Q, W, D, M, T. Strain 2: A, X, P, T, M. Strain 3: B, N, C, A, X. Strain 4: B, Q, W, D, M.]
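The consolidation of the four transfer orders into a single circle can also be sketched in code. The approach below is one possible way to do it, not a method given in the text: it records which markers are adjacent in any strain's transfer order (ignoring direction, because F can be inserted in either orientation) and then walks around the resulting ring. The function name `circular_map` and its arguments are hypothetical.

```python
def circular_map(strains, start, second):
    """Walk the ring of marker adjacencies, beginning start -> second."""
    neighbors = {}
    for order in strains:
        # Markers transferred consecutively are neighbors on the chromosome.
        for a, b in zip(order, order[1:]):
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    path = [start, second]
    while True:
        previous, current = path[-2], path[-1]
        candidates = [m for m in neighbors[current] if m != previous]
        if len(candidates) != 1:
            raise ValueError("adjacencies do not define a single circle")
        if candidates[0] == start:
            return path  # the circle has closed
        path.append(candidates[0])

strains = [
    ["Q", "W", "D", "M", "T"],  # strain 1
    ["A", "X", "P", "T", "M"],  # strain 2
    ["B", "N", "C", "A", "X"],  # strain 3
    ["B", "Q", "W", "D", "M"],  # strain 4
]

ring = circular_map(strains, "Q", "W")
print(" ".join(ring))
```

The walk only succeeds when every marker has exactly two neighbors, which is the computational counterpart of the observation that the overlapping fragments fit together into one closed ring.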
From this information, we can consolidate each circle into one circular linkage map of the order Q, W, D, M, T, P, X, A, C, N, B, Q. SOLVED PROBLEM 3. In an Hfr × F − cross, leu+ enters as the
first marker, but the order of the other markers is unknown. If the Hfr is wild type and the F − is auxotrophic for each marker in question, what is the order of the markers in a cross where leu+ recombinants are selected if 27 percent are ile+, 13 percent are mal+, 82 percent are thr +, and 1 percent are trp+? Solution Recall that spontaneous breakage creates a natural gradient of transfer, which makes it less and less likely for a recipient to receive later and later markers. Because we have selected for the earliest marker in this cross, the frequency of recombinants is a function of the order of entry for each marker. Therefore, we can immediately determine the order of the genetic markers simply by looking at the percentage of
recombinants for any marker among the leu+ recombinants. Because the inheritance of thr + is the highest, thr + must be the first marker to enter after leu. The complete order is leu, thr, ile, mal, trp. SOLVED PROBLEM 4. A cross is made between an Hfr that is
met+ thi+ pur+ and an F− that is met− thi− pur−. Interrupted-mating studies show that met+ enters the recipient last, and so met+ recombinants are selected on a medium containing supplements that satisfy only the pur and thi requirements. These recombinants are tested for the presence of the thi+ and pur+ alleles. The following numbers of individuals are found for each genotype:
met+ thi+ pur+ 280
met+ thi+ pur− 0
met+ thi− pur+ 6
met+ thi− pur− 52
a. Why was methionine (Met) left out of the selection medium? b. What is the gene order? c. What are the map distances in recombination units? Solution a. Methionine was left out of the medium to allow selection for met+ recombinants because met+ is the last marker to enter the recipient. The selection for met+ ensures that all the loci that we are considering in the cross will have already entered each recombinant that we analyze. b. Here, a diagram of the possible gene orders is helpful. Because we know that met enters the recipient last, there are only two possible gene orders if the first marker enters on the right: met, thi, pur or met, pur, thi. How can we distinguish between these two orders? Fortunately, one of the four possible classes of recombinants requires two additional crossovers. Each possible order predicts a different class that arises by four crossovers rather than two. For instance, if the order were met, thi, pur, then met+ thi− pur+ recombinants would be very rare. On the other hand, if the order were met, pur, thi, then the four-crossover class would be met+ pur− thi+. From the information given in the table, the met+ pur− thi+ class (0 individuals) is clearly the four-crossover class and therefore the gene order met, pur, thi is correct. c. Refer to the following diagram:
[Diagram: the Hfr fragment (met+ pur+ thi+) aligned with the F− chromosome (met− pur− thi−); met and pur are separated by 15.4 m.u., and pur and thi by 1.8 m.u.]
To compute the distance between met and pur, we compute the percentage of met+ pur− thi− among all 338 met+ recombinants (280 + 0 + 6 + 52), which is 52/338 = 15.4 m.u. Similarly, the distance between pur and thi is 6/338 = 1.8 m.u. SOLVED PROBLEM 5. Compare the mechanism of transfer
and inheritance of the lac+ genes in crosses with Hfr, F+, and F′ lac+ strains. How would an F− cell that cannot undergo normal homologous recombination (rec−) behave in crosses with each of these three strains? Would the cell be able to inherit the lac+ gene?
Solution
Each of these three strains donates genes by conjugation. In the Hfr and F+ strains, the lac+ genes on the host chromosome are donated. In the Hfr strain, the F factor is integrated into the chromosome in every cell, and so chromosomal markers can be efficiently donated, particularly if a marker is near the integration site of F and is donated early. The F+ cell population contains a small percentage of Hfr cells, in which F is integrated into the chromosome. These cells are responsible for the gene transfer displayed by cultures of F+ cells. In Hfr- and F+-mediated gene transfer, inheritance requires the incorporation of a transferred fragment by recombination (recall that two crossovers are needed) into the F− chromosome. Therefore, an F− strain that cannot undergo recombination cannot inherit donor chromosomal markers even though they are transferred by Hfr strains or Hfr cells in F+ strains. The fragment cannot be incorporated into the chromosome by recombination. Because these fragments do not possess the ability to replicate within the F− cell, they are rapidly diluted out during cell division.
Unlike Hfr cells, F′ cells transfer genes carried on the F′ factor, a process that does not require chromosome transfer. In this case, the lac+ genes are linked to the F′ factor and are transferred with it at a high efficiency. In the F− cell, no recombination is required because the F′ lac+ factor can replicate and be maintained in the dividing F− cell population. Therefore, the lac+ genes are inherited even in a rec− strain.
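The arithmetic in Solved Problem 4 can be checked with a short script. The Python sketch below is illustrative only: the variable names and the helper map_distance are hypothetical, and it assumes, as the solution does, that every scored colony is met+ and that the rarest class is the four-crossover class with a single recipient allele.

```python
# Recombinant counts from Solved Problem 4 (all colonies are met+ by selection).
counts = {
    ("thi+", "pur+"): 280,
    ("thi+", "pur-"): 0,
    ("thi-", "pur+"): 6,
    ("thi-", "pur-"): 52,
}

total = sum(counts.values())  # number of met+ recombinants scored

# The rarest class needs four crossovers: donor alleles on both sides of one
# recipient (-) allele. Assumption: that single "-" gene is the middle gene.
rarest = min(counts, key=counts.get)
middle = next(allele[:-1] for allele in rarest if allele.endswith("-"))


def map_distance(class_count, total_count):
    """Percentage of the selected recombinants that fall in one class."""
    return 100 * class_count / total_count


# met+ pur- thi-: the exchange fell between met and pur.
met_pur = map_distance(counts[("thi-", "pur-")], total)
# met+ pur+ thi-: the exchange fell between pur and thi.
pur_thi = map_distance(counts[("thi-", "pur+")], total)

print(f"total scored: {total}")
print(f"middle gene: {middle}")
print(f"met-pur: {met_pur:.1f} m.u.")
print(f"pur-thi: {pur_thi:.1f} m.u.")
```

The same pattern applies to any cross in which the last marker to enter is selected: the denominator is the total number of selected recombinants, not the number of cells plated.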
PROBLEMS
Most of the problems are also available for review/grading through http://www.whfreeman.com/launchpad/iga11e.
Working with the Figures
1. In Figure 5-2, in which of the four processes shown can a complete bacterial genome be transferred from one cell to another?
2. In Figure 5-3, if the concentration of bacterial cells in the original suspension is 200/ml and 0.2 ml is plated onto each of 100 petri dishes, what is the expected average number of colonies per plate?
3. In Figure 5-5,
a. Why do A− and B− cells, by themselves, not form colonies on the plating medium?
b. What genetic event do the purple colonies in the middle plate represent?
4. In Figure 5-10c, what do the yellow dots represent?
5. In Figure 5-11, which donor alleles become part of the recombinant genome produced?
6. In Figure 5-12,
a. Which Hfr gene enters the recipient last? (Which diagram shows it actually entering?)
b. What is the maximum percentage of cases of transfer of this gene?
c. Which genes have entered at 25 minutes? Could they all become part of a stable exconjugant genome?
7. In Figure 5-14, which is the last gene to be transferred into the F− from each of the five Hfr strains?
8. In Figure 5-15, how are each of the following genotypes produced?
a. F+ a−
b. F− a−
c. F− a+
d. F+ a+
9. In Figure 5-17, how many crossovers are required to produce a completely prototrophic exconjugant?
10. In Figure 5-18c, why is the crossover shown occurring in the orange segments of DNA?
11. In Figure 5-19, how many different bacterial species are shown as having contributed DNA to the plasmid pk214?
12. In Figure 5-25, can you point to any phage progeny that could transduce?
13. In Figure 5-28, what are the physical features of the plaques of recombinant phages?
14. In Figure 5-29, do you think that b+ could be transduced instead of a+? As well as a+?
15. In Figure 5-30, which genes show the highest frequencies of cotransduction?
16. In Figure 5-32, what do the half-red, half-blue segments represent?
17. In Figure 5-33, which is the rarest λ genotype produced in the initial lysate?
18. In Figure 5-38, precisely which gene is eventually identified from the genome sequence?
Basic Problems
19. Describe the state of the F factor in an Hfr, F+, and F− strain.
20. How does a culture of F+ cells transfer markers from the host chromosome to a recipient?
21. With respect to gene transfer and the integration of the transferred gene into the recipient genome, compare
a. Hfr crosses by conjugation and generalized transduction.
b. F′ derivatives such as F′ lac and specialized transduction.
22. Why is generalized transduction able to transfer any gene, but specialized transduction is restricted to only a small set?
23. A microbial geneticist isolates a new mutation in E. coli and wishes to map its chromosomal location. She uses interrupted-mating experiments with Hfr strains and generalized-transduction experiments with phage P1. Explain why each technique, by itself, is insufficient for accurate mapping.
24. In E. coli, four Hfr strains donate the following markers, shown in the order donated:
Strain 1:   M  Z  X  W  C
Strain 2:   L  A  N  C  W
Strain 3:   A  L  B  R  U
Strain 4:   Z  M  U  R  B
All these Hfr strains are derived from the same F+ strain. What is the order of these markers on the circular chromosome of the original F+?
25. You are given two strains of E. coli. The Hfr strain is arg+ ala+ glu+ pro+ leu+ T s; the F− strain is arg− ala− glu− pro− leu− T r. All the markers are nutritional except T, which determines sensitivity or resistance to phage T1. The order of entry is as given, with arg+ entering the recipient first and T s last. You find that the F− strain dies when exposed to penicillin (pens), but the Hfr strain does not (penr). How would you locate the locus for pen on the bacterial chromosome with respect to arg, ala, glu, pro, and leu? Formulate your answer in logical, well-explained steps, and draw explicit diagrams where possible.
26. A cross is made between two E. coli strains: Hfr arg+ bio+ leu+ × F− arg− bio− leu−. Interrupted-mating studies show that arg+ enters the recipient last, and so arg+ recombinants are selected on a medium containing bio and leu only. These recombinants are tested for the presence of bio+ and leu+. The following numbers of individuals are found for each genotype:
arg+ bio+ leu+    320
arg+ bio− leu+      0
arg+ bio+ leu−      8
arg+ bio− leu−     48
a. What is the gene order?
b. What are the map distances in recombination percentages?
27. Linkage maps in an Hfr bacterial strain are calculated in units of minutes (the number of minutes between genes indicates the length of time that it takes for the second gene to follow the first in conjugation). In making such maps, microbial geneticists assume that the bacterial chromosome is transferred from Hfr to F− at a constant rate. Thus, two genes separated by 10 minutes near the origin end are assumed to be the same physical distance apart as two genes separated by 10 minutes near the F− attachment end. Suggest a critical experiment to test the validity of this assumption.
28. A particular Hfr strain normally transmits the pro+ marker as the last one in conjugation. In a cross of this strain with an F− strain, some pro+ recombinants are recovered early in the mating process. When these pro+ cells are mixed with F− cells, the majority of the F− cells are converted into pro+ cells that also carry the F factor. Explain these results.
29. F′ strains in E. coli are derived from Hfr strains. In some cases, these F′ strains show a high rate of integration back into the bacterial chromosome of a second strain. Furthermore, the site of integration is often the site occupied by the sex factor in the original Hfr strain (before production of the F′ strains). Explain these results.
30. You have two E. coli strains, F− str s ala− and Hfr str s ala+, in which the F factor is inserted close to ala+. Devise a screening test to detect strains carrying F′ ala+.
31. Five Hfr strains A through E are derived from a single F+ strain of E. coli. The following chart shows the entry times of the first five markers into an F− strain when each is used in an interrupted-conjugation experiment:
A            B            C            D            E
mal+ (1)     ade+ (13)    pro+ (3)     pro+ (10)    his+ (7)
strs (11)    his+ (28)    met+ (29)    gal+ (16)    gal+ (17)
ser+ (16)    gal+ (38)    xyl+ (32)    his+ (26)    pro+ (23)
ade+ (36)    pro+ (44)    mal+ (37)    ade+ (41)    met+ (49)
his+ (51)    met+ (70)    strs (47)    ser+ (61)    xyl+ (52)
a. Draw a map of the F+ strain, indicating the positions of all genes and their distances apart in minutes.
b. Show the insertion point and orientation of the F plasmid in each Hfr strain.
c. In the use of each of these Hfr strains, state which allele you would select to obtain the highest proportion of Hfr exconjugants.
32. Streptococcus pneumoniae cells of genotype str s mtl− are transformed by donor DNA of genotype str r mtl+ and (in a separate experiment) by a mixture of two DNAs with genotypes str r mtl− and str s mtl+. The accompanying table shows the results.
                           Percentage of cells transformed into
Transforming DNA           strr mtl−     strs mtl+     strr mtl+
strr mtl+                  4.3           0.40          0.17
strr mtl− + strs mtl+      2.8           0.85          0.0066
a. What does the first row of the table tell you? Why?
b. What does the second row of the table tell you? Why?
33. Recall that, in Chapter 4, we considered the possibility that a crossover event may affect the likelihood of another crossover. In the bacteriophage T4, gene a is 1.0 m.u. from gene b, which is 0.2 m.u. from gene c. The gene order is a, b, c. In a recombination experiment, you recover five double crossovers between a and c from 100,000 progeny viruses. Is it correct to conclude that interference is negative? Explain your answer.
34. E. coli cells were infected with two strains of T4 virus. One strain is minute (m), rapid lysis (r), and turbid (t); the other is wild type for all three markers. The lytic products of this infection were plated and classified. The resulting 10,342 plaques were distributed among eight genotypes as follows:
m r t    3469        m + +    521
+ + +    3727        + r t    475
m r +     854        + r +    171
m + t     163        + + t    963
a. What are the linkage distances between m and r, between r and t, and between m and t?
b. Determine the linkage order for the three genes.
c. What is the coefficient of coincidence (see Chapter 4) in this cross? What does it signify?
35. With the use of P22 as a generalized transducing phage grown on a pur+ pro+ his+ bacterial donor, a recipient strain of genotype pur− pro− his− was infected and incubated. Afterward, transductants for pur+, pro+, and his+ were selected individually in experiments I, II, and III, respectively.
a. What medium is used in each of these selection experiments?
b. The transductants were examined for the presence of unselected donor markers, with the following results:
I                   II                  III
pro− his−  86%      pur− his−  44%      pur− pro−  20%
pro+ his−   0%      pur+ his−   0%      pur+ pro−  14%
pro− his+  10%      pur− his+  54%      pur− pro+  61%
pro+ his+   4%      pur+ his+   2%      pur+ pro+   5%
What is the order of the bacterial genes?
c. Which two genes are closest together?
d. Based on your answer to part c, explain the relative proportions of genotypes observed in experiment II.
36. Although most λ-mediated gal+ transductants are inducible lysogens, a small percentage of these transductants in fact are not lysogens (that is, they contain no integrated λ). Control experiments show that these transductants are not produced by mutation. What is the likely origin of these types?
37. An ade+ arg+ cys+ his+ leu+ pro+ bacterial strain is known to be lysogenic for a newly discovered phage, but the site of the prophage is not known. The bacterial map is
[Figure: circular bacterial map bearing the markers arg, ade, pro, his, cys, and leu; marker positions are shown in the figure.]
The lysogenic strain is used as a source of the phage, and the phages are added to a bacterial strain of genotype ade− arg− cys− his− leu− pro−. After a short incubation, samples of these bacteria are plated on six different media, with the supplementations indicated in the following table. The table also shows whether colonies were observed on the various media.
          Nutrient supplementation in medium       Presence
Medium    Ade   Arg   Cys   His   Leu   Pro        of colonies
1         −     +     +     +     +     +          N
2         +     −     +     +     +     +          N
3         +     +     −     +     +     +          C
4         +     +     +     −     +     +          N
5         +     +     +     +     −     +          C
6         +     +     +     +     +     −          N
(In this table, a plus sign indicates the presence of a nutrient supplement, a minus sign indicates that a supplement is not present, N indicates no colonies, and C indicates colonies present.)
a. What genetic process is at work here?
b. What is the approximate locus of the prophage?
38. In a generalized-transduction system using P1 phage, the donor is pur+ nad+ pdx− and the recipient is pur− nad− pdx+. The donor allele pur+ is initially selected after transduction, and 50 pur+ transductants are then scored for the other alleles present. Here are the results:
Genotype       Number of colonies
nad+ pdx+       3
nad+ pdx−      10
nad− pdx+      24
nad− pdx−      13
Total          50
a. What is the cotransduction frequency for pur and nad?
b. What is the cotransduction frequency for pur and pdx?
c. Which of the unselected loci is closest to pur?
d. Are nad and pdx on the same side or on opposite sides of pur? Explain. (Draw the exchanges needed to produce the various transductant classes under either order to see which requires the minimum number to produce the results obtained.)
39. In a generalized-transduction experiment, phages are collected from an E. coli donor strain of genotype cys+ leu+ thr+ and used to transduce a recipient of genotype cys− leu− thr−. Initially, the treated recipient population is plated on a minimal medium supplemented with leucine and threonine. Many colonies are obtained.
a. What are the possible genotypes of these colonies?
b. These colonies are then replica plated onto three different media: (1) minimal plus threonine only, (2) minimal plus leucine only, and (3) minimal. What genotypes could, in theory, grow on these three media?
c. Of the original colonies, 56 percent are observed to grow on medium 1, 5 percent on medium 2, and no colonies on medium 3. What are the actual genotypes of the colonies on media 1, 2, and 3?
d. Draw a map showing the order of the three genes and which of the two outer genes is closer to the middle gene.
40. Deduce the genotypes of the following E. coli strains 1 through 4:
[Figure: growth of strains 1 through 4 on four plates: minimal, minimal plus arginine, minimal plus methionine, and minimal plus arginine and methionine.]
41. In an interrupted-conjugation experiment in E. coli, the pro gene enters after the thi gene. A pro+ thi+ Hfr is crossed with a pro− thi− F− strain, and exconjugants are plated on medium containing thiamine but no proline. A total of 360 colonies are observed, and they are isolated and cultured on fully supplemented medium. These cultures are then tested for their ability to grow on medium containing no proline or thiamine (minimal medium), and 320 of the cultures are found to be able to grow but the remainder cannot.
a. Deduce the genotypes of the two types of cultures.
b. Draw the crossover events required to produce these genotypes.
c. Calculate the distance between the pro and thi genes in recombination units.
Unpacking Problem 41
1. What type of organism is E. coli?
2. What does a culture of E. coli look like?
3. On what sort of substrates does E. coli generally grow in its natural habitat?
4. What are the minimal requirements for E. coli cells to divide?
5. Define the terms prototroph and auxotroph.
6. Which cultures in this experiment are prototrophic, and which are auxotrophic?
7. Given some strains of unknown genotype regarding thiamine and proline, how would you test their genotypes? Give precise experimental details, including equipment.
8. What kinds of chemicals are proline and thiamine? Does it matter in this experiment?
9. Draw a diagram showing the full set of manipulations performed in the experiment.
10. Why do you think the experiment was done?
11. How was it established that pro enters after thi? Give precise experimental steps.
12. In what way does an interrupted-mating experiment differ from the experiment described in this problem?
13. What is an exconjugant? How do you think that exconjugants were obtained? (It might include genes not described in this problem.)
14. When the pro gene is said to enter after thi, does it mean the pro allele, the pro+ allele, either, or both?
15. What is “fully supplemented medium” in the context of this question?
16. Some exconjugants did not grow on minimal medium. On what medium would they grow?
17. State the types of crossovers that take part in Hfr × F− recombination. How do these crossovers differ from crossovers in eukaryotes?
18. What is a recombination unit in the context of the present analysis? How does it differ from the map units used in eukaryote genetics?
42. A generalized transduction experiment uses a metE+ pyrD+ strain as donor and metE− pyrD− as recipient. metE+ transductants are selected and then tested for the pyrD+ allele. The following numbers were obtained:
metE+ pyrD−    857
metE+ pyrD+      1
Do these results suggest that these loci are closely linked? What other explanations are there for the lone “double”?
43. An argC− strain was infected with transducing phage, and the lysate was used to transduce metF− recipients on medium containing arginine but no methionine. The metF+ transductants were then tested for arginine requirement: most were argC+ but a small percentage were found to be argC−. Draw diagrams to show the likely origin of the argC+ and argC− strains.
Challenging Problems
44. Four E. coli strains of genotype a+ b− are labeled 1, 2, 3, and 4. Four strains of genotype a− b+ are labeled 5, 6, 7, and 8. The two genotypes are mixed in all possible combinations and (after incubation) are plated to determine the frequency of a+ b+ recombinants. The following results are obtained, where M = many recombinants, L = low numbers of recombinants, and 0 = no recombinants:
      1    2    3    4
5     0    M    M    0
6     0    M    M    0
7     L    0    0    M
8     0    L    L    0
On the basis of these results, assign a sex type (either Hfr, F+, or F−) to each strain.
45. An Hfr strain of genotype a+ b+ c+ d− str s is mated with a female strain of genotype a− b− c− d+ str r. At various times, mating pairs are separated by vigorously shaking the culture. The cells are then plated on three types of agar, as shown below, where nutrient A allows the growth of a− cells; nutrient B, of b− cells; nutrient C, of c− cells; and nutrient D, of d− cells. (A plus indicates the presence of streptomycin or a nutrient, and a minus indicates its absence.)
Agar type    Str    A    B    C    D
1            +      +    +    −    +
2            +      −    +    +    +
3            +      +    −    +    +
a. What donor genes are being selected on each type of agar?
b. The following table shows the number of colonies on each type of agar for samples taken at various times after the strains are mixed. Use this information to determine the order of genes a, b, and c.
Time of sampling     Number of colonies on agar of type
(minutes)            1        2        3
5                    0        0        0
7.5                  102      0        0
10                   202      0        0
12.5                 301      0        74
15                   400      0        151
17.5                 404      49       225
20                   401      101      253
25                   398      103      252
c. From each of the 25-minute plates, 100 colonies are picked and transferred to a petri dish containing agar with all the nutrients except D. The numbers of colonies that grow on this medium are 90 for the sample from agar type 1, 52 for the sample from agar type 2, and 9 for the sample from agar type 3. Using these data, fit gene d into the sequence of a, b, and c.
d. At what sampling time would you expect colonies to first appear on agar containing C and streptomycin but no A or B?
46. In the cross Hfr aro+ arg+ ery r str s × F− aro− arg− ery s str r, the markers are transferred in the order given (with aro+ entering first), but the first three genes are very close together. Exconjugants are plated on a medium containing Str (streptomycin, to kill Hfr cells), Ery (erythromycin), Arg (arginine), and Aro (aromatic amino acids). The following results are obtained for 300 colonies isolated from these plates and tested for growth on various media: on Ery only, 263 strains grow; on Ery + Arg, 264 strains grow; on Ery + Aro, 290 strains grow; on Ery + Arg + Aro, 300 strains grow.
a. Draw up a list of genotypes, and indicate the number of individuals in each genotype.
b. Calculate the recombination frequencies.
c. Calculate the ratio of the size of the arg-to-aro region to the size of the ery-to-arg region.
47. A bacterial transformation is performed with a donor strain that is resistant to four drugs, A, B, C, and D, and a recipient strain that is sensitive to all four drugs. The resulting recipient cell population is divided and plated on media containing various combinations of the drugs. The following table shows the results.
Drugs added    Number of colonies
None           10,000
A              1155
B              1147
C              1162
D              1140
AB             47
AC             641
AD             941
BC             50
BD             48
CD             785
ABC            31
ABD            43
ACD            631
BCD            35
ABCD           29
a. One of the genes is distant from the other three, which appear to be closely linked. Which is the distant gene?
b. What is the likely order of the three closely linked genes?
48. You have two strains of λ that can lysogenize E. coli; their linkage maps are as follows:
[Figure: circular linkage maps of λ strains X and Y, each carrying the markers a, b, c, and d, with the segment 1–2–3 at the bottom.]
The segment shown at the bottom of the chromosome, designated 1–2–3, is the region responsible for pairing and crossing over with the E. coli chromosome. (Keep the markers on all your drawings.)
a. Diagram the way in which λ strain X is inserted into the E. coli chromosome (so that the E. coli is lysogenized).
b. The bacteria that are lysogenic for strain X can be superinfected by using strain Y. A certain percentage of these superinfected bacteria become “doubly” lysogenic (that is, lysogenic for both strains). Diagram how it will take place. (Don’t worry about how double lysogens are detected.)
c. Diagram how the two λ prophages can pair.
d. Crossover products between the two prophages can be recovered. Diagram a crossover event and the consequences.
49. You have three strains of E. coli. Strain A is F′ cys+ trp1/cys+ trp1 (that is, both the F′ and the chromosome carry cys+ and trp1, an allele for tryptophan requirement). Strain B is F− cys− trp2 Z (this strain requires cysteine for growth and carries trp2, another allele causing a tryptophan requirement; strain B is lysogenic for the generalized transducing phage Z). Strain C is F− cys+ trp1 (it is an F− derivative of strain A that has lost the F′). How would you determine whether trp1 and trp2 are alleles of the same locus? (Describe the crosses and the results expected.)
50. A generalized transducing phage is used to transduce an a− b− c− d− e− recipient strain of E. coli with an a+ b+ c+ d+ e+ donor. The recipient culture is plated on various media with the results shown in the following table. (Note that a− indicates a requirement for A as a nutrient, and so forth.) What can you conclude about the linkage and order of the genes?
Compounds added to minimal medium
Presence (+) or absence (-) of colonies
CDE BDE BCE BCD ADE ACE ACD ABE ABD ABC
+ + + -
51. In 1965, Jon Beckwith and Ethan Signer devised a method of obtaining specialized transducing phages carrying the lac region. They knew that the integration site, designated att80, for the temperate phage φ80 (a relative of phage λ) was located near tonB, a gene that confers resistance to the virulent phage T1:
[Diagram: chromosome segment showing tonB next to att80.]
They used an F′ lac+ plasmid that could not replicate at high temperatures in a strain carrying a deletion of the lac genes. By forcing the cell to remain lac+ at high temperatures, the researchers could select strains in which the plasmid had integrated into the chromosome, thereby allowing the F′ lac to be maintained at high temperatures. By combining this selection with a simultaneous selection for resistance to T1 phage infection, they found that the only survivors were cells in which the F′ lac had integrated into the tonB locus, as shown here:
[Diagram: chromosome segment showing tonB, the integrated F′ lac, and att80.]
This result placed the lac region near the integration site for phage φ80. Describe the subsequent steps that the researchers must have followed to isolate the specialized transducing particles of phage φ80 that carried the lac region.
52. Wild-type E. coli takes up and concentrates a certain red food dye, making the colonies blood red. Transposon mutagenesis was used, and the cells were plated on food dye. Most colonies were red, but some colonies did not take up dye and appeared white. In one white colony, the DNA surrounding the transposon insert was sequenced, with the use of a DNA replication primer identical with part of the end of the transposon sequence, and the sequence adjacent to the transposon was found to correspond to a gene of unknown function called atoE, spanning positions 2.322 through 2.324 Mb on the map (numbered from an arbitrary position zero). Propose a function for atoE. What biological process could be investigated in this way, and what other types of white colonies might be expected?
CHAPTER 6
Gene Interaction
Learning Outcomes
After completing this chapter, you will be able to
• Design experiments to test two or more mutations for allelism, using progeny ratios or using complementation tests.
• Infer various types of dominance based on the phenotypes of heterozygotes.
• Recognize the diagnostics for the presence of a lethal allele.
• Infer interaction of different genes, based on modified Mendelian ratios.
• Formulate reasonable molecular hypotheses to explain various types of gene interaction.
• Recognize the diagnostics for variations in penetrance and expressivity of genotypes.
• Predict progeny of crosses in which genes show one or more of the above types of interaction.
The colors of peppers are determined by the interaction of several genes. An allele Y promotes the early elimination of chlorophyll (a green pigment), whereas y does not. Allele R determines red and r determines yellow carotenoid pigments. Alleles c1 and c2 of two different genes down-regulate the amounts of carotenoids, causing the lighter shades. Orange is down-regulated red. Brown is green plus red. Pale yellow is down-regulated yellow. [Anthony Griffiths.]
OUTLINE
6.1 Interactions between the alleles of a single gene: variations on dominance
6.2 Interaction of genes in pathways
6.3 Inferring gene interactions
6.4 Penetrance and expressivity
The thrust of our presentation in the book so far has been to show how geneticists identify a gene that affects some biological property of interest. We have seen how the approaches of forward genetics can be used to identify individual genes. The researcher begins with a set of mutants, and then crosses each mutant with the wild type to see if the mutant shows single-gene inheritance. The cumulative data from such a research program would reveal a set of genes that all have roles in the development of the property under investigation. In some cases, the researcher may be able to identify specific biochemical functions for many of the genes by comparing gene sequences with those of other organisms. The next step, which is a greater challenge, is to deduce how the genes in a set interact to influence phenotype.
How are the gene interactions underlying a property deduced? One molecular approach is to analyze protein interactions directly in vitro by using one protein as “bait” and observing which other cellular proteins attach to it. Proteins that are found to bind to the bait are candidates for interaction in the living cell. Another molecular approach is to analyze mRNA transcripts. The genes that collaborate in some specific developmental process can be defined by the set of RNA transcripts present when that process is going on, a type of analysis now carried out with the use of genome chips (see Chapter 14). Finally, gene interactions and their significance in shaping phenotype can be deduced by genetic analysis, which is the focus of this chapter.
Gene interactions can be classified broadly into two categories. The first category consists of interactions between alleles of one locus, broadly speaking variations on dominance. In earlier chapters, we dealt with full dominance and full recessiveness, but as we shall see in this chapter, there are other types of dominance, each with its own underlying cell biology. Although this information does not address the range of genes affecting a function, a great deal can be learned of a gene’s role by considering allelic interactions. The second category consists of interactions between two or more loci. These interactions reveal the number and types of genes in the overall program underlying a particular biological function.
6.1 Interactions Between the Alleles of a Single Gene: Variations on Dominance
There are thousands of different ways to alter the sequence of a gene, each producing a mutant allele, although only some of these mutant alleles will appear in a real population. The known mutant alleles of a gene and its wild-type allele are referred to as multiple alleles or an allelic series. One of the tests routinely performed on a new mutant allele is to see if it is dominant or recessive. Basic information about dominance and recessiveness is useful in working with the new mutation and can be a source of insight into the way the gene functions, as we will see in the examples. Dominance is a manifestation of how the alleles of a single gene interact in a heterozygote. In any experiment the interacting alleles may be wild type and mutant alleles (+/m) or two different mutant alleles (m1/m2). Several types of dominance have been discovered, each representing a different type of interaction between alleles.
Complete dominance and recessiveness The simplest type of dominance is full, or complete, dominance, which we examined in Chapter 2. A fully dominant allele will be expressed in the phenotype when only one copy is present, as in a heterozygote, whereas the alternative allele will be fully recessive. In full dominance, the homozygous dominant cannot
6.1 Interactions Between the Alleles of a Single Gene: Variations on Dominance 217
Mutations of haplosufficient genes are recessive
Protein
Homozygous wild type
Heterozygote
Homozygous recessive mutant
+/+
+/m
m /m
Functional
Functional
Nonfunctional
+
+
m
+
m
m
Functional
Nonfunctional
Nonfunctional
mRNA
Chromosome
Chromosome
mRNA
Protein
be distinguished from the heterozygote; that is, at the phenotypic level, A/A = A/a. As mentioned earlier, phenylketonuria (PKU) and many other single-gene human diseases are fully recessive, whereas their wild-type alleles are dominant. Other single-gene diseases such as achondroplasia are fully dominant, whereas, in those cases, the wild-type allele is recessive. How can these dominance relations be interpreted at the cellular level? The disease PKU is a good general model for recessive mutations. Recall that PKU is caused by a defective allele of the gene encoding the enzyme phenylalanine hydroxylase (PAH). In the absence of normal PAH, the phenylalanine entering the body in food is not broken down and hence accumulates. Under such conditions, phenylalanine is converted into phenylpyruvic acid, which is transported to the brain through the bloodstream and there impedes normal development, leading to mental retardation. The reason that the defective allele is recessive is that one “dose” of the wild-type allele P produces enough PAH to break down the phenylalanine entering the body. Thus, the PAH wild-type allele is said to be haplosufficient. Haplo means a haploid dose (one) and sufficient refers to the ability of that single dose to produce the wild-type phenotype. Hence, both P/P (two doses) and P/p (one dose) have enough PAH activity to result in the normal cellular chemistry. People with p/p have zero doses of PAH activity. Figure 6-1 illustrates this general notion. How can we explain fully dominant mutations? There are several molecular mechanisms for dominance. A regularly encountered mechanism is that the wildtype allele of a gene is haploinsufficient. In haploinsufficiency, one wild-type dose is not enough to achieve normal levels of function. Assume that 16 units of a gene’s product are needed for normal chemistry and that each wild-type allele
Figure 6-1 In the heterozygote, even though the mutated copy of the gene produces nonfunctional protein, the wild-type copy generates enough functional protein to produce the wild-type phenotype.
218 CHA P TER 6 Gene Interaction
can make 10 units. Two wild-type alleles will produce 20 units of product, well over the minimum. But consider what happens if one of the alleles carries a null mutation, which produces a nonfunctional protein. A null allele in combination with a single wild-type allele would produce 10 + 0 = 10 units, well below the minimum. Hence, the heterozygote (wild type/null) is mutant, and the mutation is, by definition, dominant.

In mice, the gene Tbx1 is haploinsufficient. This gene encodes a transcription-regulating protein (a transcription factor) that acts on genes responsible for the development of the pharynx. A knockout of one wild-type allele results in an inadequate concentration of the regulatory protein, which results in defects in the development of the pharyngeal arteries. The same haploinsufficiency is thought to be responsible for DiGeorge syndrome in humans, a condition with cardiovascular and craniofacial abnormalities.

Another important type of dominant mutation is called a dominant negative. Polypeptides with this type of mutation act as “spoilers” or “rogues.” In some cases, the gene product is a unit of a homodimeric protein, a protein composed of two units of the same type. In the heterozygote (+/M), the mutant polypeptide binds to the wild-type polypeptide and acts as a spoiler by distorting it or otherwise interfering with its function. The same type of spoiling can also hinder the functioning of a heterodimer composed of polypeptides from different genes. In other cases, the gene product is a monomer, and, in these situations, the mutant binds the substrate and acts as a spoiler by hindering the ability of the wild-type protein to bind to the substrate.

An example of mutations that can act as dominant negatives is found in the gene for collagen protein. Some mutations in this gene give rise to the human phenotype osteogenesis imperfecta (brittle-bone disease). Collagen is a connective-tissue protein formed of three monomers intertwined (a trimer). In the mutant heterozygote, the abnormal protein wraps around one or two normal ones and distorts the trimer, leading to malfunction. In this way, the defective collagen acts as a spoiler.

The difference between haploinsufficiency and the action of a dominant negative as causes of dominance of a mutation is illustrated in Figure 6-2.

[Figure 6-2 art: Two models for dominance of a mutation. Model 1, haploinsufficiency: +/+ has 2 “doses” of product (wild type); M/M has 0 “dose” (mutant); +/M has 1 “dose”, inadequate (mutant). Model 2, dominant negative: the product acts as a dimer; +/+ is wild type, M/M and +/M are mutant.]

Figure 6-2 A mutation may be dominant because (left) a single wild-type gene does not produce enough protein product for proper function or (right) the mutant allele acts as a dominant negative that produces a “spoiler” protein product.
Key Concept For most genes, a single wild-type copy is adequate for full expression (such genes are haplosufficient), and their null mutations are fully recessive. Harmful mutations of haploinsufficient genes are often dominant. Mutations in genes that encode units in homo- or heterodimers can behave as dominant negatives, acting through “spoiler” proteins.
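The dosage arithmetic behind haplosufficiency and haploinsufficiency can be sketched in a few lines of Python. This is a toy model, not a description of any real gene: the 16-unit requirement and 10 units per wild-type allele come from the example in the text, whereas the function name `phenotype` and the 8-unit requirement used for the haplosufficient comparison are assumptions made for illustration.

```python
# Toy dosage model: a genotype is wild type if its total product meets the need.
# '+' is a wild-type allele; any other symbol is treated as a null allele (0 units).

def phenotype(genotype, units_per_wt_allele, units_needed):
    """Return 'wild type' or 'mutant' for a genotype given as a tuple of alleles."""
    total = sum(units_per_wt_allele for allele in genotype if allele == "+")
    return "wild type" if total >= units_needed else "mutant"

genotypes = [("+", "+"), ("+", "m"), ("m", "m")]

# Haploinsufficient gene, numbers from the text: one dose (10 units) is below 16.
haploinsufficient = {g: phenotype(g, 10, 16) for g in genotypes}

# Hypothetical haplosufficient gene (assumed need of 8 units): one dose is enough.
haplosufficient = {g: phenotype(g, 10, 8) for g in genotypes}

for g in genotypes:
    print("/".join(g), haploinsufficient[g], haplosufficient[g], sep="\t")
```

In this sketch the heterozygote is mutant only when a single dose falls below the threshold, which is the sense in which a null mutation of a haploinsufficient gene behaves as a dominant.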
Incomplete dominance

Four-o’clocks are plants native to tropical America. Their name comes from the fact that their flowers open in the late afternoon. When a pure-breeding wild-type four-o’clock line having red petals is crossed with a pure line having white petals, the F1 has pink petals. If an F2 is produced by selfing the F1, the result is

1/4 of the plants have red petals
1/2 of the plants have pink petals
1/4 of the plants have white petals
6.1 Interactions Between the Alleles of a Single Gene: Variations on Dominance 219
Figure 6-3 shows these phenotypes. From this 1 : 2 : 1 ratio in the F2, we can deduce that the inheritance pattern is based on two alleles of a single gene. However, the heterozygotes (the F1 and half the F2) are intermediate in phenotype. By inventing allele symbols, we can list the genotypes of the four-o’clocks in this experiment as c+/c+ (red), c/c (white), and c+/c (pink). The occurrence of the intermediate phenotype suggests an incomplete dominance, the term used to describe the general case in which the phenotype of a heterozygote is intermediate between those of the two homozygotes, on some quantitative scale of measurement.

How do we explain incomplete dominance at the molecular level? In incomplete dominance, each wild-type allele generally produces a set dose of its protein product. The number of doses of a wild-type allele determines the concentration of a chemical made by the protein, such as pigment. In the four-o’clock plant, two doses produce the most copies of transcript, thus producing the greatest amount of protein and, hence, the greatest amount of pigment, enough to make the flower petals red. One dose produces less pigment, and so the petals are pink. A zero dose produces no pigment.

[Figure 6-3 art: Incomplete dominance.]

Figure 6-3 In snapdragons, a heterozygote is pink, intermediate between the two homozygotes red and white. The pink heterozygote demonstrates incomplete dominance. [John Kaprielian/Science Source.]
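The 1 : 2 : 1 ratio can be checked by enumerating the gametes of a selfed c+/c heterozygote. The following is a minimal sketch under the assumptions stated in the text (equal segregation, petal color set by the number of c+ doses); the names `COLOR_BY_DOSE` and `self_cross` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Petal color by number of wild-type (c+) doses, as described for four-o'clocks.
COLOR_BY_DOSE = {2: "red", 1: "pink", 0: "white"}

def self_cross(parent):
    """Return phenotype proportions among the progeny of a selfed parent.

    `parent` is a pair of allele symbols; each gamete carries one of the two
    alleles with equal probability (equal segregation).
    """
    counts = Counter()
    for egg, pollen in product(parent, repeat=2):
        doses = (egg == "c+") + (pollen == "c+")
        counts[COLOR_BY_DOSE[doses]] += 1
    total = sum(counts.values())
    return {color: Fraction(n, total) for color, n in counts.items()}

f2 = self_cross(("c+", "c"))
for color, fraction in f2.items():
    print(color, fraction)
```

Because the heterozygote has its own phenotype, the genotypic 1 : 2 : 1 ratio is visible directly as a phenotypic ratio, rather than collapsing to 3 : 1 as it would under full dominance.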
Codominance

Another variation on the theme of dominance is codominance, the expression of both alleles of a heterozygote. A clear example is seen in the human ABO blood groups, where there is codominance of antigen alleles. The ABO blood groups are determined by three alleles of one gene. These three alleles interact in several ways to produce the four blood types of the ABO system. The three major alleles are i, IA, and IB, but a person can have only two of the three alleles or two copies of one of them. The combinations result in six different genotypes: the three homozygotes and three different types of heterozygotes, as follows.

Genotype        Blood type
IA/IA, IA/i     A
IB/IB, IB/i     B
IA/IB           AB
i/i             O

In this allelic series, the alleles determine the presence and form of a complex sugar molecule present on the surface of red blood cells. This sugar molecule is an antigen, a cell-surface molecule that can be recognized by the immune system. The alleles IA and IB determine two different forms of this cell-surface molecule. However, the allele i results in no cell-surface molecule of this type (it is a null allele). In the genotypes IA/i and IB/i, the alleles IA and IB are fully dominant over i. However, in the genotype IA/IB, each of the alleles produces its own form of the cell-surface molecule, and so the A and B alleles are codominant.

The human disease sickle-cell anemia illustrates the somewhat arbitrary ways in which we classify dominance. The gene concerned encodes the molecule hemoglobin, which is responsible for transporting oxygen in blood vessels and is the
major constituent of red blood cells. There are two main alleles HbA and HbS, and the three possible genotypes have different phenotypes, as follows:

HbA/HbA: normal; red blood cells never sickle
HbS/HbS: severe, often fatal anemia; abnormal hemoglobin causes red blood cells to have sickle shape
HbA/HbS: no anemia; red blood cells sickle only under low oxygen concentrations

Figure 6-4 shows an electron micrograph of blood cells including some sickled cells. In regard to the presence or absence of anemia, the HbA allele is dominant. In the heterozygote, a single HbA allele produces enough functioning hemoglobin to prevent anemia. In regard to blood-cell shape, however, there is incomplete dominance, as shown by the fact that, in the heterozygote, many of the cells have a slight sickle shape. Finally, in regard to hemoglobin itself, there is codominance. The alleles HbA and HbS encode two different forms of hemoglobin that differ by a single amino acid, and both forms are synthesized in the heterozygote. The A and S forms of hemoglobin can be separated by electrophoresis because it happens that they have different charges (Figure 6-5). We see that homozygous HbA/HbA people have one type of hemoglobin (A), and anemics have another (type S), which moves more slowly in the electric field. The heterozygotes have both types, A and S. In other words, there is codominance at the molecular level. The fascinating population genetics of the HbA and HbS alleles will be considered in Chapter 20.

Sickle-cell anemia illustrates the arbitrariness of the terms dominance, incomplete dominance, and codominance. The type of dominance inferred depends on the phenotypic level at which the assay is made: organismal, cellular, or molecular. Indeed, caution should be applied to many of the categories that scientists use to classify structures and processes; these categories are devised by humans for the convenience of analysis.

[Figure 6-4 art: Sickled and normal red blood cells.]

Figure 6-4 The sickle-shaped cell is caused by a single mutation in the gene for hemoglobin. [Eye of Science/Science Source.]
Key Concept In general, three main types of dominance can be distinguished: full dominance, incomplete dominance, and codominance. The type of dominance is determined by the molecular functions of the alleles of a gene and by the investigative level of analysis.
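The ABO relations described earlier in this section combine full dominance (IA or IB over i) with codominance (IA with IB). A minimal sketch of that mapping follows; it assumes only the three alleles named in the text, and the function name `blood_type` is an illustrative choice.

```python
from itertools import combinations_with_replacement

ALLELES = ["IA", "IB", "i"]

def blood_type(allele1, allele2):
    """Map an ABO genotype to its blood type.

    IA and IB each contribute their own antigen (codominance);
    i contributes none (null allele), so it is masked by IA or IB.
    """
    antigens = set()
    for allele in (allele1, allele2):
        if allele == "IA":
            antigens.add("A")
        elif allele == "IB":
            antigens.add("B")
    return "".join(sorted(antigens)) or "O"

# Enumerate the six unordered genotypes and their blood types.
table = {pair: blood_type(*pair) for pair in combinations_with_replacement(ALLELES, 2)}
for (a, b), group in table.items():
    print(f"{a}/{b}\t{group}")
```

Modeling each allele by the antigen it adds makes the dominance relations fall out of the molecular function, which is the point of the Key Concept above.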
[Figure 6-5 art: Heterozygotes can express the protein product of both alleles. Hemoglobins migrate from the origin; the hemoglobin types present are listed by genotype.]

Phenotype            Genotype    Hemoglobin types present
Normal               HbA/HbA     A
Sickle-cell anemia   HbS/HbS     S
Sickle-cell trait    HbS/HbA     S and A

Figure 6-5 The electrophoresis of normal and mutant hemoglobins. Shown are results produced by hemoglobin from a person with sickle-cell trait (a heterozygote), a person with sickle-cell anemia, and a normal person. The smudges show the positions to which the hemoglobins migrate on the starch gel.
Introduction to Genetic Analysis, 11e
The leaves of clover plants show several variations on the dominance theme. Clover is the common name for plants of the genus Trifolium. There are many species. Some are native to North America, whereas others grow there as introduced weeds. Much genetic research has been done with white clover, which shows considerable variation among individual plants in the curious V, or chevron, pattern on the leaves. The different chevron forms (and the absence of chevrons) are determined by a series of seven alleles, as seen in Figure 6-6, which shows the many different types of interactions possible for even one allele. In most practical cases many alleles of a gene can be found together in a population, constituting an allelic series. The phenotypes shown by the allelic combinations are many and varied, reflecting the relative nature of dominance: an allele can show dominance with one partner but not with another. Hence, the complexity illustrated by the ABO blood type system is small compared with that in a case such as clover chevrons.
Recessive lethal alleles

An allele that is capable of causing the death of an organism is called a lethal allele. In the characterization of a set of
newly discovered mutant alleles, a recessive mutation is sometimes found to be lethal. This information is potentially useful in that it shows that the newly discovered gene (of yet unknown function) is essential to the organism’s operation. Essential genes are those without which an organism dies. (An example of an essential gene might be a ribosomal gene without which no protein would be made.) Indeed, with the use of modern DNA technology, a null mutant allele of a gene of interest can now be made intentionally and made homozygous to see if it is lethal and under which environmental conditions. Lethal alleles are also useful in determining the developmental stage at which the gene normally acts. In this case, geneticists look for whether death from a lethal mutant allele occurs early or late in the development of a zygote. The phenotype associated with death can also be informative in regard to gene function; for example, if a certain organ appears to be abnormal, the gene is likely to be expressed in that organ. What is the diagnostic test for lethality? The test is well illustrated by one of the prototypic examples of a lethal allele, a coat-color allele in mice (see the Model Organism box on page 222). Normal wild-type mice have coats with a rather dark overall pigmentation. A mutation called yellow (a lighter coat color) shows a curious inheritance pattern. If any yellow mouse is mated with a homozygous wild-type mouse, a 1 : 1 ratio of yellow to wild-type mice is always observed in the progeny. This result suggests that a yellow mouse is always heterozygous for the yellow allele and that the yellow allele is dominant over wild type. However, if any two yellow mice are crossed with each other, the result is always as follows: yellow × yellow →
2/3 yellow, 1/3 wild type
Figure 6-7 shows a typical litter from a cross between yellow mice. How can the 2 : 1 ratio be explained? The results make sense if the yellow allele is assumed to be lethal when homozygous. The yellow allele is known to be of a coat-color gene called A. Let’s call it AY. Hence, the results of crossing two yellow mice are
AY/A × AY/A

Progeny:
1/4 AY/AY   lethal
1/2 AY/A    yellow
1/4 A/A     wild type

[Figure 6-6 art: Seven alleles and their interactions in leaf patterning of clover. Genotypes shown: vv, VlVl, VhVh, VlVh, VfVf, VlVf, VbaVba, VlVba, VhVba, VfVba, VbVb, VlVb, VhVb, VfVb, VbyVby, VlVby, VhVf, VhVby, VfVby, VbaVb, VbaVby, VbVby.]

Figure 6-6 Multiple alleles determine the chevron pattern on the leaves of white clover. The genotype of each plant is shown below it. There is a variety of dominance interactions. [Research by W. Ellis Davies.]

[Figure 6-7 art: A recessive lethal allele, yellow coat.]
The expected monohybrid ratio of 1 : 2 : 1 would be found among the zygotes, but it is altered to a 2 : 1 ratio in the progeny actually seen at birth because zygotes with a lethal AY/AY genotype do not survive to be counted. This hypothesis is supported by the removal of uteri from pregnant females of the yellow × yellow cross; one-fourth of the embryos are found to be dead. The AY allele produces effects on two characters: coat color and survival. It is entirely possible, however, that both effects of the AY allele result from the same basic cause, which promotes yellowness of coat in a single dose and death in a
Figure 6-7 A litter from a cross between two mice heterozygous for the dominant yellow coat-color allele. The allele is lethal in a double dose. Not all progeny are visible. [Anthony Griffiths.]
Model Organism
Mouse
The laboratory mouse is descended from the house mouse Mus musculus. The pure lines used today as standards are derived from mice bred in past centuries by mouse “fanciers.” Among model organisms, it is the one whose genome most closely resembles the human genome. Its diploid chromosome number is 40 (compared with 46 in humans), and the genome is slightly smaller than that of humans (the human genome being 3000 Mb) and contains approximately the same number of genes (current estimate 25,000). Furthermore, all mouse genes seem to have counterparts in humans. A large proportion of genes are arranged in blocks in exactly the same positions as those of humans. Research on the Mendelian genetics of mice began early in the twentieth century. One of the most important early contributions was the elucidation of the genes that control coat color and pattern. Genetic control of the mouse coat has provided a model for all mammals, including cats, dogs, horses, and cattle. A great deal of work was also done on mutations induced by radiation and chemicals. Mouse genetics has been of great significance in medicine. A large proportion of human genetic diseases have mouse counterparts useful for experimental study (they are called “mouse models”). The mouse has played a particularly important role in the development of our current understanding of the genes underlying cancer. The mouse genome can be modified by the insertion of specific fragments of DNA into a fertilized egg or into somatic cells. The mice in the photograph have received a jellyfish gene for green fluorescent protein (GFP) that makes them glow green under special lights. Gene knockouts and replacements also are possible.
A major limitation of mouse genetics is its cost. Whereas working with a million individuals of E. coli or S. cerevisiae is a trivial matter, working with a million mice requires a factory-size building. Furthermore, although mice do breed rapidly (compared with humans), they cannot compete with microorganisms for speedy life cycle. Hence, the large-scale selections and screens necessary to detect rare genetic events are not possible.
Green-glowing genetically modified mice. The jellyfish gene for green fluorescent protein has been inserted into the chromosomes of the glowing mice. The other mice are normal. [ Eye of Science/Science Source.]
double dose. In general, the term pleiotropic is used for any allele that affects several properties of an organism.

The tailless Manx phenotype in cats (Figure 6-8) also is produced by an allele that is lethal in the homozygous state. A single dose of the Manx allele, ML, severely interferes with normal spinal development, resulting in the absence of a tail in the ML/M heterozygote. But in the ML/ML homozygote, the double dose of the gene produces such an extreme abnormality in spinal development that the embryo does not survive.

The yellow and ML alleles have their own phenotypes in a heterozygote, but most recessive lethals are silent in the heterozygote. In such a situation, recessive lethality is diagnosed by observing the death of 25 percent of the progeny at some stage of development.

Whether an allele is lethal or not often depends on the environment in which the organism develops. Whereas certain alleles are lethal in virtually any environment, others are viable in one environment but lethal in another. Human hereditary diseases provide some examples. Cystic fibrosis and sickle-cell anemia are diseases that would be lethal without treatment. Furthermore, many of the alleles favored and selected by animal and plant breeders would almost certainly be eliminated in nature as a result of competition with the members of the natural
population. The dwarf mutant varieties of grain, which are very high yielding, provide good examples; only careful nurturing by farmers has maintained such alleles for our benefit. Geneticists commonly encounter situations in which expected phenotypic ratios are consistently skewed in one direction because a mutant allele reduces viability. For example, in the cross A/a × a/a, we predict a progeny ratio of 50 percent A/a and 50 percent a/a, but we might consistently observe a ratio such as 55 percent : 45 percent or 60 percent : 40 percent. In such a case, the recessive allele is said to be sublethal because the lethality is expressed in only some but not all of the homozygous individuals. Thus, lethality may range from 0 to 100 percent, depending on the gene itself, the rest of the genome, and the environment. We have seen that lethal alleles are useful in diagnosing the time at which a gene acts and the nature of the phenotypic defect that kills. However, maintaining stocks bearing lethal alleles for laboratory use is a challenge. In diploids, recessive lethal alleles can be maintained as heterozygotes. In haploids, heat-sensitive lethal alleles are useful. They are members of a general class of temperature-sensitive (ts) mutations. Their phenotype is wild type at the permissive temperature (often room temperature) but mutant at some higher restrictive temperature. Temperature-sensitive alleles are thought to be caused by mutations that make the protein prone to twist or bend its shape to an inactive conformation at the restrictive temperature. Research stocks can be maintained easily under permissive conditions, and the mutant phenotype can be assayed in a subset of individuals by a switch to the restrictive conditions. Temperature-sensitive dominant lethal mutations also are useful. This type of mutation is expressed even when present in a single dose but only when the experimenter switches the organism to the restrictive temperature. 
Null alleles for genes identified through genomic sequencing can be made by using a variety of “reverse genetic” procedures that specifically knock out the function of that gene. These will be described in Chapter 14.
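The ratio shifts discussed in this section (2 : 1 among the progeny of yellow × yellow, and skewed testcross ratios for sublethal alleles) both follow from renormalizing the zygote frequencies over the survivors. The sketch below illustrates that bookkeeping; the function name `surviving_ratio` is hypothetical, and the 2/3 survival assigned to a/a is an assumed value chosen only because it reproduces the 60 percent : 40 percent example in the text.

```python
from fractions import Fraction

def surviving_ratio(zygote_freqs, viability):
    """Renormalize zygote genotype frequencies after genotype-specific survival.

    zygote_freqs: genotype -> expected fraction among zygotes
    viability:    genotype -> fraction of those zygotes that survive (default 1)
    """
    survivors = {g: f * viability.get(g, Fraction(1)) for g, f in zygote_freqs.items()}
    total = sum(survivors.values())
    return {g: s / total for g, s in survivors.items() if s > 0}

# Yellow x yellow (AY/A x AY/A): 1 : 2 : 1 zygotes, with AY/AY fully lethal.
yellow_cross = surviving_ratio(
    {"AY/AY": Fraction(1, 4), "AY/A": Fraction(1, 2), "A/A": Fraction(1, 4)},
    {"AY/AY": Fraction(0)},
)

# Testcross A/a x a/a with a sublethal allele; 2/3 survival of a/a is assumed.
sublethal = surviving_ratio(
    {"A/a": Fraction(1, 2), "a/a": Fraction(1, 2)},
    {"a/a": Fraction(2, 3)},
)

print(yellow_cross)  # fractions of AY/A and A/A among live progeny
print(sublethal)     # fractions of A/a and a/a among live progeny
```

Treating lethality as a viability between 0 and 1 covers full lethals and sublethals with the same calculation, matching the statement that lethality may range from 0 to 100 percent.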
[Figure 6-8 art: Tailless, a recessive lethal allele in cats.]

Figure 6-8 A Manx cat. A dominant allele causing taillessness is lethal in the homozygous state. The phenotype of two eye colors is unrelated to taillessness. [Gerard Lacz/NHPA/Photoshot.]
Discoverer of inborn errors of metabolism
Key Concept To see if a gene is essential, a null allele is tested for lethality.
We now turn to the approaches that can be used to detect the interaction between two or more loci.
6.2 Interaction of Genes in Pathways

Genes act by controlling cellular chemistry. Early in the twentieth century, Archibald Garrod, an English physician (Figure 6-9), made the first observation supporting this insight. Garrod noted that several recessive human diseases show defects in what is called metabolism, the general set of chemical reactions taking place in an organism. This observation led to the notion that such genetic diseases are “inborn errors of metabolism.” Garrod worked on a disease called alkaptonuria (AKU), or black urine disease. He discovered that the substance
Figure 6-9 British physician Archibald Garrod (1857–1936). [Science Photo Library/Science Source.]
responsible for black urine was homogentisic acid, which is present in high amounts and secreted into the urine in AKU patients. He knew that, in unaffected people, homogentisic acid is converted into maleylacetoacetic acid; so he proposed that, in AKU, there is a defect in this conversion. Consequently, homogentisic acid builds up and is excreted. Garrod’s observations raised the possibility that the cell’s chemical pathways were under the control of a large set of interacting genes. However, the direct demonstration of this control was provided by the later work of Beadle and Tatum on the fungus Neurospora.
Biosynthetic pathways in Neurospora

The landmark study by George Beadle and Edward Tatum in the 1940s not only clarified the role of genes, but also demonstrated the interaction of genes in biochemical pathways. They later received a Nobel Prize for their study, which marks the beginning of all molecular biology. Beadle and Tatum did their work on the haploid fungus Neurospora, which we have met in earlier chapters. Their plan was to investigate the genetic control of cellular chemistry. In what has become the standard forward genetic approach, they first irradiated Neurospora cells to produce mutations and then tested cultures grown from ascospores for interesting mutant phenotypes relevant to biochemical function. They found numerous mutants that had defective nutrition. Specifically, these mutants were auxotrophic mutants, of the type described for bacteria in Chapter 5. Whereas wild-type Neurospora can use its cellular biochemistry to synthesize virtually all its cellular components from the inorganic nutrients and a carbon source in the medium, auxotrophic mutants cannot. In order to grow, such mutants require a nutrient to be supplied (a nutrient that a wild-type fungus is able to synthesize for itself), suggesting that the mutant is defective for some normal synthetic step.

As their first step, Beadle and Tatum confirmed that each mutation that generated a nutrient requirement was inherited as a single-gene mutation because each gave a 1 : 1 ratio when crossed with a wild type. Letting aux represent an auxotrophic mutation,
+ × aux
↓
progeny: 1/2 + and 1/2 aux

[Figure 6-10 art: Arginine and its analogs.
Ornithine: NH2-(CH2)3-CHNH2-COOH
Citrulline: NH2-C(=O)-NH-(CH2)3-CHNH2-COOH
Arginine: NH2-C(=NH)-NH-(CH2)3-CHNH2-COOH]

Figure 6-10 The chemical structures of arginine and the structurally related compounds citrulline and ornithine.
Their second step was to classify the specific nutritional requirement of each auxotroph. Some would grow only if proline was supplied, others methionine, others pyridoxine, others arginine, and so on. Beadle and Tatum decided to focus on arginine auxotrophs. They found that the genes that mutated to give arginine auxotrophs mapped to three different loci on three separate chromosomes. Let’s call the genes at the three loci the arg-1, arg-2, and arg-3 genes. A key breakthrough was Beadle and Tatum’s discovery that the auxotrophs for each of the three loci differed in their response to the structurally related compounds ornithine and citrulline (Figure 6-10). The arg-1 mutants grew when supplied with any one of the chemicals ornithine, citrulline, or arginine. The arg-2 mutants grew when given arginine or citrulline but not ornithine. The arg-3 mutants grew only when arginine was supplied. These results are summarized in Table 6-1. Cellular enzymes were already known to interconvert such related compounds. On the basis of the properties of the arg mutants, Beadle and Tatum and their colleagues proposed a biochemical pathway for such conversions in Neurospora: precursor
--enzyme X--> ornithine --enzyme Y--> citrulline --enzyme Z--> arginine
Table 6-1 Growth of arg Mutants in Response to Supplements

             Supplement
Mutant   Ornithine   Citrulline   Arginine
arg-1        +           +           +
arg-2        -           +           +
arg-3        -           -           +

Note: A plus sign means growth; a minus sign means no growth.
This pathway nicely explains the three classes of mutants shown in Table 6-1. Under the model, the arg-1 mutants have a defective enzyme X, and so they are unable to convert the precursor into ornithine as the first step in producing arginine. However, they have normal enzymes Y and Z, and so the arg-1 mutants are able to produce arginine if supplied with either ornithine or citrulline. Similarly, the arg-2 mutants lack enzyme Y, and the arg-3 mutants lack enzyme Z. Thus, a mutation at a particular gene is assumed to interfere with the production of a single enzyme. The defective enzyme creates a block in some biosynthetic pathway. The block can be circumvented by supplying to the cells any compound that normally comes after the block in the pathway. We can now diagram a more complete biochemical model:

              arg-1                   arg-2                    arg-3
                ↓                       ↓                        ↓
precursor --enzyme X--> ornithine --enzyme Y--> citrulline --enzyme Z--> arginine
This brilliant model, which was initially known as the one-gene–one-enzyme hypothesis, was the source of the first exciting insight into the functions of genes: genes somehow were responsible for the function of enzymes, and each gene apparently controlled one specific enzyme in a series of interconnected steps in a biochemical pathway. Other researchers obtained similar results for other biosynthetic pathways, and the hypothesis soon achieved general acceptance. All proteins, whether or not they are enzymes, also were found to be encoded by genes, and so the phrase was refined to become the one-gene–one-polypeptide hypothesis. (Recall that a polypeptide is the simplest type of protein, a single chain of amino acids.) It soon became clear that a gene encodes the physical structure of a protein, which in turn dictates its function. Beadle and Tatum’s hypothesis became one of the great unifying concepts in biology because it provided a bridge that brought together the two major research areas of genetics and biochemistry.

We must add parenthetically that, although the great majority of genes encode proteins, some are now known to encode RNAs that have special functions. All genes are transcribed to make RNA. Protein-encoding genes are transcribed to messenger RNA (mRNA), which is then translated into protein. However, the RNA encoded by a minority of genes is never translated into protein because the RNA itself has a unique function. These are called functional RNAs. Some examples are transfer RNAs, ribosomal RNAs, and small cytoplasmic RNAs; more about them in later chapters.

Key Concept Chemical synthesis in cells is by pathways of sequential steps catalyzed by enzymes. The genes encoding the enzymes of a specific pathway constitute a functionally interacting subset of the genome.
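The logic that links the pathway order to the growth results in Table 6-1 can be written as a short rule: a mutant grows on a supplement only if that compound enters the pathway after the blocked step. The following Python sketch applies that rule; the names `PATHWAY`, `STEP_BLOCKED`, and `grows` are illustrative, and the one-gene-one-step assignment is the model's assumption rather than an independent measurement.

```python
# Linear pathway from the text: each step converts one compound into the next.
PATHWAY = ["precursor", "ornithine", "citrulline", "arginine"]

# Assumed under the model: each gene encodes the enzyme for one step,
# where step i converts PATHWAY[i] into PATHWAY[i + 1].
STEP_BLOCKED = {"arg-1": 0, "arg-2": 1, "arg-3": 2}

def grows(mutant, supplement):
    """A mutant grows if the supplement lies after its blocked step."""
    return PATHWAY.index(supplement) > STEP_BLOCKED[mutant]

supplements = ["ornithine", "citrulline", "arginine"]
table = {m: {s: grows(m, s) for s in supplements} for m in STEP_BLOCKED}

print("mutant", *supplements, sep="\t")
for mutant, row in table.items():
    print(mutant, *("+" if row[s] else "-" for s in supplements), sep="\t")
```

Run in reverse, the same rule is how the order of steps is inferred: the mutant rescued by the fewest supplements is blocked latest in the pathway.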
Gene interaction in other types of pathways

The notion that genes interact through pathways is a powerful one that finds application in all organisms. The Neurospora arginine pathway is an example of a synthetic pathway, a chain of enzymatic conversions that synthesizes essential molecules. We can extend the idea again to a human case already introduced, the disease phenylketonuria (PKU), which is caused by an autosomal recessive allele. This disease results from an inability to convert phenylalanine into tyrosine. As a result of the block, phenylalanine accumulates and is spontaneously converted into a toxic compound, phenylpyruvic acid. The PKU gene is part of a metabolic pathway like the Neurospora arginine pathway, and part of it is shown in Figure 6-11. The illustration includes several other diseases caused by blockages in steps in this pathway (including alkaptonuria, the disease investigated by Garrod).

[Figure 6-11 art: A synthetic pathway and associated diseases.
Dietary protein → Phenylalanine (Phe)
Phenylalanine → Phenylpyruvic acid (if [Phe] high)
Phenylalanine --Phe hydroxylase (block: PKU)--> Tyrosine (Tyr)
Tyrosine → Melanin (block: albinism)
Tyrosine → Thyroxine (block: cretinism)
Tyrosine --Tyr aminotransferase--> Hydroxyphenylpyruvic acid (HPA)
HPA --HPA oxidase (block: tyrosinosis)--> Homogentisic acid (HA)
HA --HA oxidase (block: alkaptonuria)--> Maleylacetoacetic acid → CO2 + H2O]

Figure 6-11 A section of the phenylalanine metabolic pathway in humans, including diseases associated with enzyme blockages. The disease PKU is produced when the enzyme phenylalanine hydroxylase malfunctions. Accumulation of phenylalanine results in an increase in phenylpyruvic acid, which interferes with the development of the nervous system.

Another type of pathway is a signal-transduction pathway. This type of pathway is a chain of complex signals from the environment to the genome and from one gene to another. These pathways are crucial to the proper function of an organism. One of the best understood signal-transduction pathways was worked out from a genetic analysis of the mating response in baker’s yeast. Two mating types, determined by the alleles MATa and MATα, are necessary for yeast mating to occur. When a cell is in the presence of another cell of opposite mating type, it undergoes a series of changes in shape and behavior preparatory to mating. The mating response is triggered by a signal-transduction pathway requiring the sequential action of a set of genes. This set of genes was discovered through a standard interaction analysis of mutants with aberrant mating response (most were sterile). The steps were pieced together by using the approaches in the next section. The signal that sets things in motion is a mating pheromone (hormone) released by the opposite mating type; the pheromone binds to a membrane receptor, which is coupled to a G protein inside the membrane and activates the protein. The G protein, in turn, sets in motion a series of sequential protein phosphorylations called a kinase cascade. Ultimately, the cascade activates the transcription of a set of mating-specific genes that enable the cell to mate. A mutation at any one of these steps may disrupt the mating process.

Developmental pathways comprise the steps by which a zygote becomes an adult organism. This process involves many genetically controlled steps, including establishment of the anterior-posterior and dorsal-ventral axes, laying down the basic body plan of organs, and tissue differentiation and movement. These steps can require gene regulation and signal transduction. Developmental pathways will be taken up in detail in Chapter 13, but the interaction of genes in these pathways is analyzed in the same way, as we will see next.
6.3 Inferring Gene Interactions

The genetic approach that reveals the interacting genes for a particular biological property is briefly as follows:

Step 1. Obtain many single-gene mutants and test for dominance.
Step 2. Test the mutants for allelism: are they at one or several loci?
Step 3. Combine the mutants in pairs to form double mutants to see if the genes interact.

Gene interaction is inferred from the phenotype of the double mutant: if the genes interact, then the phenotype differs from the simple combination of both single-gene mutant phenotypes. If mutant alleles from different genes interact, then we infer that the wild-type genes interact normally as well. In cases in which the two mutants interact, a modified 9 : 3 : 3 : 1 Mendelian ratio will often result.

A procedure that must be carried out before testing interactions is to determine whether each mutation is at a different locus (step 2 above). The mutant screen could have unintentionally favored certain genes. Thus, the set of gene loci needs to be defined, as shown in the next section.
Sorting mutants using the complementation test

How is it possible to decide whether two mutations belong to the same gene? There are several ways. First, each mutant allele could be mapped. Then, if two mutations map to two different chromosomal loci, they are likely to be in different genes. However, this approach is time consuming for a large set of mutations. A quicker approach often used is the complementation test.

In a diploid, the complementation test is performed by intercrossing two individuals that are homozygous for different recessive mutations. The next step is to observe whether the progeny have the wild-type phenotype. If the progeny are wild type, the two recessive mutations must be in different genes because the respective wild-type alleles provide wild-type function. In this case, the two mutations are said to have complemented. Here, we will name the genes a1 and a2, after their mutant alleles. We can represent the heterozygotes as follows, depending on whether the genes are on the same chromosome or are on different chromosomes:

Different chromosomes: a1/+ ; +/a2
Same chromosome (shown in the trans configuration): a1 +/+ a2
However, if the progeny are not wild type, then the recessive mutations must be alleles of the same gene. Because both alleles of the gene are mutant, there is no wild-type allele to provide normal function. We can call the two alleles a′ and a″ to distinguish between two different mutant alleles of a gene whose wild-type allele is a+. These alleles could have different mutant sites within the same gene, but they would both be nonfunctional. The heterozygote a′/a″ would be as follows:

[Diagram: the two homologous copies of gene a in the a′/a″ heterozygote, each carrying a mutation at a different site within the gene.]
At the operational level, complementation is defined as the production of a wild-type phenotype when two haploid genomes bearing different recessive mutations are united in the same cell.

Let’s illustrate the complementation test with an example from harebell plants (genus Campanula). The wild-type flower color of this plant is blue. Let’s assume that, from a mutant hunt, we have obtained three white-petaled mutants and that they are available as homozygous pure-breeding strains. They all look the same, and so we do not know a priori whether they are genetically identical. We will call the mutant strains $, £, and ¥ to avoid any symbolism using letters, which might imply dominance. When crossed with wild type, each mutant gives the same results in the F1 and F2 as follows:

white $ × blue → F1, all blue → F2, 3/4 blue, 1/4 white
white £ × blue → F1, all blue → F2, 3/4 blue, 1/4 white
white ¥ × blue → F1, all blue → F2, 3/4 blue, 1/4 white
In each case, the results show that the mutant condition is determined by the recessive allele of a single gene. However, are they three alleles of one gene, of two genes, or of three genes? Because the mutants are recessive, the question can be answered by the complementation test, which asks if the mutants complement one another. Let us intercross the mutants to test for complementation. Assume that the results of intercrossing mutants $, £, and ¥ are as follows:
white $ × white £ → F1, all white
white $ × white ¥ → F1, all blue
white £ × white ¥ → F1, all blue
From this set of results, we can conclude that mutants $ and £ must be caused by alleles of one gene (say, w1) because they do not complement, but ¥ must be caused by a mutant allele of another gene (w2) because ¥ complements both $ and £.

Key Concept When two independently derived recessive mutant alleles producing similar recessive phenotypes fail to complement, they must be alleles of the same gene.
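The sorting logic of the complementation test can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `complementation_groups` and the dictionary layout are assumptions made here, not part of any standard tool. Mutants that fail to complement are placed in the same group (the same gene), and the harebell intercross results are used as input.

```python
from itertools import combinations

def complementation_groups(mutants, complements):
    """Group recessive mutants; mutants that fail to complement share a group.

    `complements` maps a frozenset of two mutant names to True (F1 is wild
    type) or False (F1 is mutant). Both names are hypothetical conventions.
    """
    group_of = {m: {m} for m in mutants}      # each mutant starts alone
    for m1, m2 in combinations(mutants, 2):
        if not complements[frozenset((m1, m2))]:
            merged = group_of[m1] | group_of[m2]
            for m in merged:                  # all members now share one group
                group_of[m] = merged
    unique = {frozenset(g) for g in group_of.values()}
    return sorted(sorted(g) for g in unique)

# Intercross results from the harebell example
results = {
    frozenset(("$", "£")): False,   # F1 all white: no complementation
    frozenset(("$", "¥")): True,    # F1 all blue: complementation
    frozenset(("£", "¥")): True,    # F1 all blue: complementation
}
groups = complementation_groups(["$", "£", "¥"], results)
print(groups)
```

Each group returned corresponds to one inferred gene, so the three white mutants resolve into two genes, matching the w1 and w2 assignment in the text.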
Flowers of the harebell plant (Campanula species). [ Gregory G. Dimijian/Science Source.]
How does complementation work at the molecular level? The normal blue color of the harebell flower is caused by a blue pigment called anthocyanin. Pigments are chemicals that absorb certain colors of light; in regard to the harebell, the anthocyanin absorbs all wavelengths except blue, which is reflected into the eye of the observer. However, this anthocyanin is made from chemical precursors that are not pigments; that is, they do not absorb light of any specific wavelength and simply reflect back the white light of the sun to the observer, giving a white appearance. The blue pigment is the end product of a series of biochemical conversions of nonpigments. Each step is catalyzed by a specific enzyme encoded by a specific gene. We can explain the results with a pathway as follows:
precursor 1 --(enzyme 1, encoded by gene w1)--> precursor 2 --(enzyme 2, encoded by gene w2)--> blue anthocyanin
A homozygous mutation in either of the genes will lead to the accumulation of a precursor that will simply make the plant white. Now the mutant designations could be written as follows:

$   w1$/w1$ • w2+/w2+
£   w1£/w1£ • w2+/w2+
¥   w1+/w1+ • w2¥/w2¥

However, in practice, the subscript symbols would be dropped and the genotypes would be written as follows:

$   w1/w1 • w2+/w2+
£   w1/w1 • w2+/w2+
¥   w1+/w1+ • w2/w2

Hence, an F1 from $ × £ will be

w1/w1 • w2+/w2+

These F1 plants will have two defective alleles for w1 and will therefore be blocked at step 1. Even though enzyme 2 is fully functional, it has no substrate on which to act; so no blue pigment will be produced and the phenotype will be white.

The F1 plants from the other crosses, however, will have the wild-type alleles for both of the enzymes needed to take the intermediates to the final blue product. Their genotypes will be

w1+/w1 • w2+/w2

Hence, we see that complementation is actually a result of the cooperative interaction of the wild-type alleles of the two genes. Figure 6-12 summarizes the interaction of the complementing and noncomplementing white mutants at the genetic and cellular levels.

In a haploid organism, the complementation test cannot be performed by intercrossing. In fungi, an alternative method brings mutant alleles together to test complementation: fusion resulting in a heterokaryon (Figure 6-13). Fungal cells fuse readily. When two different strains fuse, the haploid nuclei from the different strains occupy one cell, which is the heterokaryon (Greek; different kernels). The nuclei in a heterokaryon do not generally fuse. In one sense, this condition is a “mimic” diploid.

Assume that, in different strains, there are mutations in two different genes conferring the same mutant phenotype, for example, an arginine requirement. We will call these genes arg-1 and arg-2. The genotypes of the two strains can be represented as arg-1 • arg-2+ and arg-1+ • arg-2. These two strains can be fused to form a heterokaryon with the two nuclei in a shared cytoplasm:

Nucleus 1 is arg-1 • arg-2+
Nucleus 2 is arg-1+ • arg-2
Figure 6-12 The molecular basis of genetic complementation. Three phenotypically identical white harebell mutants ($, £, and ¥) are intercrossed. Mutations in the same gene (such as $ and £) cannot complement because the F1 has one gene with two mutant alleles. The pathway is blocked and the flowers are white. When the mutations are in different genes (such as £ and ¥), there is complementation by the wild-type alleles of each gene in the F1 heterozygote. Pigment is synthesized and the flowers are blue. (What would you predict to be the result of crossing $ and ¥?)
[Diagram summary: In the cross white $ × white £, both parents carry mutations in the w1 gene. The F1 shows no complementation: enzyme 1 is absent (block), colorless precursor 1 accumulates, enzyme 2 has no substrate, no precursor 2 is made, and the flower is white (mutation in the same gene). In the cross white £ × white ¥, the parents carry mutations in w1 and w2, respectively. The F1 shows complementation: enzyme 1 converts colorless precursor 1 into colorless precursor 2, enzyme 2 converts precursor 2 into pigment, and the flower is blue (mutation in different genes).]
Because gene products are made in a common cytoplasm, the two wild-type alleles can exert their dominant effect and cooperate to produce a heterokaryon of wild-type phenotype. In other words, the two mutations complement, just as they would in a diploid. If the mutations had been alleles of the same gene, there would have been no complementation.
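The molecular logic of complementation, in a diploid or in a heterokaryon, can be sketched as a small model. This is a deliberately simplified sketch under stated assumptions: each haploid genome is represented as a dictionary (a layout chosen here for illustration) recording whether it carries a functional allele of each gene, and a pathway step works if at least one of the two genomes in the shared cell supplies a functional allele. The harebell genes w1 and w2 are used as the example.

```python
PATHWAY = ("w1", "w2")   # genes in pathway order: step 1, then step 2

def petal_color(genome_a, genome_b):
    """Phenotype of a cell holding two haploid genomes.

    Each genome maps gene name -> True if its allele is functional (wild type).
    """
    for gene in PATHWAY:
        if not (genome_a[gene] or genome_b[gene]):
            return "white"   # no functional enzyme for this step: pathway blocked
    return "blue"            # every step has at least one functional allele

# Haploid genomes contributed by each homozygous white mutant strain
DOLLAR = {"w1": False, "w2": True}   # $ is mutant in w1
POUND = {"w1": False, "w2": True}    # £ is mutant in w1
YEN = {"w1": True, "w2": False}      # ¥ is mutant in w2

print(petal_color(DOLLAR, POUND))   # same gene: no complementation
print(petal_color(POUND, YEN))      # different genes: complementation
```

The same function describes the arginine heterokaryon if the gene names are replaced by arg-1 and arg-2 and the phenotype is read as growth without arginine.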
Figure 6-13 Testing complementation by using a heterokaryon. A heterokaryon of Neurospora and similar fungi mimics a diploid state. When vegetative cells fuse, haploid nuclei share the same cytoplasm in a heterokaryon. In this example, haploid nuclei with mutations in different genes in the arginine synthetic pathway complement to produce a Neurospora that no longer requires arginine.
[Diagram: arg-1 cells (defective for one specific enzyme in the arginine synthetic pathway) fuse with arg-2 cells (defective for a different enzyme in the same pathway); the resulting heterokaryon grows without arginine.]
Analyzing double mutants of random mutations

Recall that, to learn whether two genes interact, we need to assess the phenotype of the double mutant to see if it is different from the combination of both single mutations. The double mutant is obtained by intercrossing. The F1 is obtained as part of the complementation test; so with the assumption that complementation has been observed, suggesting different genes, the F1 is selfed or intercrossed to obtain an F2 homozygous for both mutations. This double mutant may then be identified by looking for Mendelian ratios. For example, if a standard 9 : 3 : 3 : 1 Mendelian ratio is obtained, the phenotype present in only 1/16 of the progeny represents the double mutant (the “1” in 9 : 3 : 3 : 1). In cases of gene interaction, however, the phenotype of the double mutant may not be distinct but will match that of one of the single mutants. In this case, a modified Mendelian ratio will result, such as 9 : 3 : 4 or 9 : 7.

The standard 9 : 3 : 3 : 1 Mendelian ratio is the simplest case, expected if there is no gene interaction and if the two mutations under test are on different chromosomes. This 9 : 3 : 3 : 1 ratio is the null hypothesis: any modified Mendelian ratio representing a departure from this null hypothesis would be informative, as the following examples will show.

The 9 : 3 : 3 : 1 ratio: no gene interaction

As a baseline, let’s start with the case in which two mutated genes do not interact, a situation where we expect the 9 : 3 : 3 : 1 ratio. Let’s look at the inheritance of skin coloration in corn snakes. The snake’s natural color is a repeating black-and-orange camouflage pattern, as shown in Figure 6-14a. The phenotype is produced by two separate pigments, both of which are under genetic control. One gene determines the orange pigment, and the alleles that we will consider are o+ (presence of orange pigment) and o (absence of orange pigment). Another gene determines the black pigment, and its alleles are b+ (presence of black pigment) and b (absence of black pigment). These two genes are unlinked. The natural pattern is produced by the genotype o+/− ; b+/−. (The dash represents the presence of either allele.) A snake that is o/o ; b+/− is black because it lacks the orange pigment (Figure 6-14b), and a snake that is o+/− ; b/b is orange because it lacks the black pigment (Figure 6-14c). The double homozygous recessive o/o ; b/b is albino (Figure 6-14d). Notice, however, that the faint pink color of the albino is from yet another pigment, the hemoglobin of the blood that is visible through this snake’s skin when the other pigments are absent. The albino snake also clearly shows that there is another element to the skin-pigmentation pattern in addition to pigment: the repeating motif in and around which pigment is deposited.

If a homozygous orange and a homozygous black snake are crossed, the F1 is wild type (camouflaged), demonstrating complementation:

♀ o+/o+ ; b/b (orange) × ♂ o/o ; b+/b+ (black)
↓
F1 o+/o ; b+/b (camouflaged)

Here, however, an F2 shows a standard 9 : 3 : 3 : 1 ratio:

♀ o+/o ; b+/b (camouflaged) × ♂ o+/o ; b+/b (camouflaged)
↓
F2
9 o+/− ; b+/− (camouflaged)
3 o+/− ; b/b (orange)
3 o/o ; b+/− (black)
1 o/o ; b/b (albino)

Figure 6-14 Independently synthesized and inherited pigments. In corn snakes, combinations of orange and black pigments determine the four phenotypes shown. (a) A wild-type black-and-orange camouflaged snake synthesizes both black and orange pigments. (b) A black snake does not synthesize orange pigment. (c) An orange snake does not synthesize black pigment. (d) An albino snake synthesizes neither black nor orange pigment. [ Anthony Griffiths.]
The 9 : 3 : 3 : 1 ratio is produced because the two pigment genes act independently at the cellular level.

precursor --(b+)--> black pigment
precursor --(o+)--> orange pigment
(black pigment and orange pigment together give the camouflaged pattern)
If a mutation makes one pathway fail, the other pathway is still active, producing the other pigment color. Only when both mutations are present do both pathways fail, and no pigment of any color is produced.

The 9 : 7 ratio: genes in the same pathway

The F2 from the harebell dihybrid cross shows both blue and white plants in a ratio of 9 : 7. How can such results be explained? The 9 : 7 ratio is clearly a modification of the dihybrid 9 : 3 : 3 : 1 ratio with the 3 : 3 : 1 combined to make 7; hence, some kind of interaction is inferred. The cross of the two white lines and subsequent generations can be represented as follows:

w1/w1 ; w2+/w2+ (white) × w1+/w1+ ; w2/w2 (white)
↓
F1 w1+/w1 ; w2+/w2 (blue)
w1+/w1 ; w2+/w2 × w1+/w1 ; w2+/w2
↓
F2
9 w1+/− ; w2+/− (blue)     9 blue
3 w1+/− ; w2/w2 (white)
3 w1/w1 ; w2+/− (white)    7 white (3 + 3 + 1)
1 w1/w1 ; w2/w2 (white)
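The two F2 ratios met so far, 9 : 3 : 3 : 1 for corn snakes and 9 : 7 for harebells, can be checked by enumerating the 16 equally likely allele combinations from a dihybrid self. The following is a minimal sketch that assumes two unlinked genes with fully recessive mutant alleles; the function names `f2_counts`, `corn_snake`, and `harebell` are hypothetical names chosen for this illustration.

```python
from collections import Counter
from itertools import product

def f2_counts(phenotype):
    """Tally F2 phenotypes (out of 16) from selfing a dihybrid.

    `phenotype(a_wt, b_wt)` receives the number of wild-type alleles (0-2)
    carried at each of the two unlinked genes.
    """
    counts = Counter()
    # One allele per gene from each parent; 1 = wild type, 0 = mutant
    for a1, a2, b1, b2 in product((0, 1), repeat=4):
        counts[phenotype(a1 + a2, b1 + b2)] += 1
    return dict(counts)

def corn_snake(o_wt, b_wt):
    # Independent pathways: each pigment needs only its own gene
    if o_wt and b_wt:
        return "camouflaged"
    if o_wt:
        return "orange"
    if b_wt:
        return "black"
    return "albino"

def harebell(w1_wt, w2_wt):
    # Same pathway: both steps are needed for the blue end product
    return "blue" if (w1_wt and w2_wt) else "white"

print(f2_counts(corn_snake))
print(f2_counts(harebell))
```

Only the phenotype rule differs between the two cases; the underlying 9 : 3 : 3 : 1 genotypic classes are the same, which is why the modified ratio is informative about how the genes interact.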
Figure 6-15 Interaction between a regulatory protein and its target. The r+ gene encodes a regulatory protein, and the a+ gene encodes a structural protein. Both must be normal for a functional (“active”) structural protein to be synthesized.
[Diagram panels: (a) Normal (r+, a+): wild-type protein A produced. (b) Mutation in the gene that encodes the regulatory protein (r, a+): nonfunctional regulatory protein; no protein A produced. (c) Mutation in the gene that encodes the structural protein (r+, a): mutant protein A produced. (d) Mutation in both genes (r, a): no protein A produced.]
Clearly, in this case, the only way in which a 9 : 7 ratio is possible is if the double mutant has the same phenotype as the two single mutants. Hence, the modified ratio constitutes a way of identifying the double mutant’s phenotype. Furthermore, the identical phenotypes of the single and double mutants suggest that each mutant allele controls a different step in the same pathway. The results show that a plant will have white petals if it is homozygous for the recessive mutant allele of either gene or both genes. To have the blue phenotype, a plant must have at least one copy of the dominant allele of both genes because both are needed to complete the sequential steps in the pathway. No matter which is absent, the same pathway fails, producing the same phenotype. Thus, three of the genotypic classes will produce the same phenotype, and so, overall, only two phenotypes result.

The example in harebells entailed different steps in a synthetic pathway. Similar results can come from gene regulation. A regulatory gene often functions by producing a protein that binds to a regulatory site upstream of a target gene, facilitating the transcription of the gene (Figure 6-15). In the absence of the regulatory protein, the target gene would be transcribed at very low levels, inadequate for cellular needs. Let’s cross a pure line r/r defective for the regulatory protein to a pure line a/a defective for the target protein. The cross is r/r ; a+/a+ × r+/r+ ; a/a. The r+/r ; a+/a dihybrid will show complementation between the mutant genotypes because both r+ and a+ are present, permitting normal transcription of the wild-type allele. When selfed, the F1 dihybrid will also result in a 9 : 7 phenotypic ratio in the F2:

Proportion   Genotype          Functional a+ protein   Ratio
9/16         r+/− ; a+/−       Yes                     9
3/16         r+/− ; a/a        No
3/16         r/r ; a+/−        No                      7 (3 + 3 + 1)
1/16         r/r ; a/a         No

Key Concept A 9 : 7 F2 ratio suggests interacting genes in the same pathway; absence of either gene function leads to absence of the end product of the pathway.
The 9 : 3 : 4 ratio: recessive epistasis

A 9 : 3 : 4 ratio in the F2 suggests a type of gene interaction called epistasis. This word means “stand upon,” referring to the situation in which a double mutant shows the phenotype of one mutation but not the other. The overriding mutation is epistatic, whereas the overridden one is hypostatic. Epistasis also results from genes being in the same pathway. In a simple synthetic pathway, the epistatic mutation is carried by a gene that is farther upstream (earlier in the pathway) than the gene of the overridden mutation (Figure 6-16). The mutant phenotype of the upstream gene takes precedence, no matter what is taking place later in the pathway.

Let’s look at an example concerning petal-pigment synthesis in the plant blue-eyed Mary (Collinsia parviflora). From the blue wild type, we’ll start with two pure mutant lines, one with white (w/w) and the other with magenta petals (m/m). The w and m genes are not linked. The F1 and F2 are as follows:

w/w ; m+/m+ (white) × w+/w+ ; m/m (magenta)
↓
F1 w+/w ; m+/m (blue)
w+/w ; m+/m × w+/w ; m+/m
↓
F2
9 w+/− ; m+/− (blue)       9 blue
3 w+/− ; m/m (magenta)     3 magenta
3 w/w ; m+/− (white)
1 w/w ; m/m (white)        4 white (3 + 1)

Figure 6-16 A model for recessive epistasis. Wild-type alleles of two genes (w+ and m+) encode enzymes catalyzing successive steps in the synthesis of a blue petal pigment. Homozygous m/m plants produce magenta flowers, and homozygous w/w plants produce white flowers. The double mutant w/w ; m/m also produces white flowers, indicating that white is epistatic to magenta.
[Diagram summary: the dihybrid w+/w ; m+/m is selfed. 9/16 w+/− ; m+/−: both enzymes active. 3/16 w+/− ; m/m: blocked at second enzyme. 3/16 w/w ; m+/−: blocked at first enzyme, so enzyme 2 has no substrate. 1/16 w/w ; m/m: blocked at first enzyme. Overall ratio 9 : 3 : 4.]
In the F2, the 9 : 3 : 4 phenotypic ratio is diagnostic of recessive epistasis. As in the preceding case, we see, again, that the ratio tells us what the phenotype of the double mutant must be, because the 4/16 component of the ratio must be a grouping of one single mutant class (3/16) plus the double mutant class (1/16). Hence, the double mutant expresses only one of the two mutant phenotypes; so, by definition, white must be epistatic to magenta. (To find the double mutant within the group, white F2 plants would have to be individually testcrossed.)
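The arithmetic behind the 9 : 3 : 4 ratio can be made explicit with exact fractions. The sketch below assumes, as in the Collinsia example, two unlinked genes acting in sequence (colorless to magenta, then magenta to blue) with recessive loss-of-function alleles, so that 3/4 of F2 plants have a working copy of each gene. The names `P_FUNCTIONAL`, `P_NULL`, and `petal_color` are hypothetical names used only for this illustration.

```python
from fractions import Fraction

P_FUNCTIONAL = Fraction(3, 4)   # F2 plants with at least one wild-type allele
P_NULL = Fraction(1, 4)         # F2 plants homozygous for the mutant allele

def petal_color(w_ok, m_ok):
    """Phenotype from a two-step pathway: colorless -> magenta -> blue."""
    if not w_ok:
        return "white"      # blocked at step 1; step 2 status is irrelevant
    if not m_ok:
        return "magenta"    # step 1 works, blocked at step 2
    return "blue"

ratio = {}
for w_ok, p_w in ((True, P_FUNCTIONAL), (False, P_NULL)):
    for m_ok, p_m in ((True, P_FUNCTIONAL), (False, P_NULL)):
        color = petal_color(w_ok, m_ok)
        # Unlinked genes: multiply the single-gene probabilities
        ratio[color] = ratio.get(color, 0) + p_w * p_m

for color, p in ratio.items():
    print(color, p * 16, "/ 16")
```

The white class collects both the w/w single mutants (3/16) and the double mutants (1/16), which is the grouping described in the text.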
This interaction is called recessive epistasis because a recessive phenotype (white) overrides the other phenotype. Dominant epistasis will be considered in the next section. At the cellular level, we can account for the recessive epistasis in Collinsia by the following type of pathway (see also Figure 6-16).

colorless --(gene w)--> magenta --(gene m)--> blue

Notice that the epistatic mutation occurs in a step in the pathway leading to blue pigment; this step is upstream of the step that is blocked by the masked mutation.

Another informative case of recessive epistasis is the yellow coat color of some Labrador retriever dogs. Two alleles, B and b, stand for black and brown coats, respectively. The two alleles produce black and brown melanin. The allele e of another gene is epistatic to these alleles, giving a yellow coat (Figure 6-17). Therefore, the genotypes B/− ; e/e and b/b ; e/e both produce a yellow phenotype, whereas B/− ; E/− and b/b ; E/− are black and brown, respectively. This case of epistasis is not caused by an upstream block in a pathway leading to dark pigment. Yellow dogs can make black or brown pigment, as can be seen in their noses and lips. The action of the allele e is to prevent the deposition of the pigment in hairs. In this case, the epistatic gene is developmentally downstream; it represents a kind of developmental target that must be of E genotype before pigment can be deposited.

Figure 6-17 Recessive epistasis due to the yellow coat mutation. Three different coat colors in Labrador retrievers. Two alleles B and b of a pigment gene determine (a) black and (b) brown, respectively. At a separate gene, E allows color deposition in the coat, and e/e prevents deposition, resulting in (c) the gold phenotype. Part c illustrates recessive epistasis. [ Anthony Griffiths.]

Key Concept Epistasis is inferred when a mutant allele of one gene masks the expression of a mutant allele of another gene and expresses its own phenotype instead.

In fungi, tetrad analysis is useful in identifying a double mutant. For example, an ascus containing half its products as wild type must contain double mutants. Consider the cross

a • b+ × a+ • b

In some proportion of progeny, the alleles a and b will segregate together (a nonparental ditype ascus). Such a tetrad will show the following phenotypes:

wild type        a+ • b+
wild type        a+ • b+
double mutant    a • b
double mutant    a • b
Hence, the double mutant must be the non-wild-type genotype and can be assessed accordingly. If the phenotype is the a phenotype, then b is being overridden; if the phenotype is the b phenotype, then a is being overridden. If both phenotypes are present, then there is no epistasis.

Figure 6-18 Dominant epistasis due to a white mutation. In foxgloves, D and d cause dark and light pigments, respectively, whereas the epistatic W restricts pigment to the throat spots. [ Anthony Griffiths.]
The 12 : 3 : 1 ratio: dominant epistasis

In foxgloves (Digitalis purpurea), two genes interact in the pathway that determines petal coloration. The two genes are unlinked. One gene affects the intensity of the red pigment in the petal; allele d results in the light red color seen in natural populations of foxgloves, whereas D is a mutant allele that produces dark red color (Figure 6-18). The other gene determines in which cells the pigment is synthesized: allele w allows synthesis of the pigment throughout the petals as in the wild type, but the mutant allele W confines pigment synthesis to the small throat spots. If we self a dihybrid D/d ; W/w, then the F2 ratio is as follows:

9 D/− ; W/− (white with spots)
3 d/d ; W/− (white with spots)    12 white with spots (9 + 3)
3 D/− ; w/w (dark red)            3 dark red
1 d/d ; w/w (light red)           1 light red
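The F2 classes listed above can be tallied from a Punnett square of the four gamete types. This is a small sketch that assumes the two foxglove genes are unlinked and that W is fully dominant and epistatic, as the text describes; the string encoding of gametes (for example `"Dw"`) is simply a convenience chosen for this illustration.

```python
from collections import Counter
from itertools import product

# Gametes of the dihybrid D/d ; W/w: DW, Dw, dW, dw (equally frequent)
gametes = [d + w for d, w in product("Dd", "Ww")]

def phenotype(egg, pollen):
    alleles = egg + pollen              # the four alleles of the zygote
    if "W" in alleles:                  # dominant W masks the D/d gene
        return "white with spots"
    return "dark red" if "D" in alleles else "light red"

counts = Counter(phenotype(e, p) for e, p in product(gametes, repeat=2))
print(dict(counts))
```

Because any zygote carrying W is classed as white with spots regardless of its D/d genotype, the 9 and one of the 3 classes merge into 12.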
The ratio tells us that the dominant allele W is epistatic, producing the 12 : 3 : 1 ratio. The 12/16 component of the ratio must include the double mutant class (9/16), which is clearly white in phenotype, establishing the epistasis of the dominant allele W. The two genes act in a common developmental pathway: W prevents the synthesis of red pigment but only in a special class of cells constituting the main area of the petal; synthesis is allowed in the throat spots. When synthesis is allowed, the pigment can be produced in either high or low concentrations.

Suppressors

It is not easy to specifically select or screen for epistatic interactions, and cases of epistasis have to be built up by the laborious combination of candidate mutations two at a time. However, for our next type of gene interaction, the experimenter can readily select interesting mutant alleles. A suppressor is a mutant allele of a gene that reverses the effect of a mutation of another gene, resulting in a wild-type or near-wild-type phenotype. Suppression implies that the target gene and the suppressor gene normally interact at some functional level in their wild-type states. For example, assume that an allele a+ produces the normal phenotype, whereas a recessive mutant allele a results in abnormality. A recessive mutant allele s at another gene suppresses the effect of a, and so the genotype a/a • s/s will have the wild-type (a+-like) phenotype. Suppressor alleles sometimes have no effect in the absence of the other mutation; in such a case, the phenotype of a+/a+ • s/s would be wild type. In other cases, the suppressor allele produces its own abnormal phenotype.

Screening for suppressors is quite straightforward. Start with a mutant in some process of interest, expose this mutant to mutation-causing agents such as high-energy radiation, and screen the descendants for wild types. In haploids such as fungi, screening is accomplished by simply plating mutagenized cells and looking for colonies with wild-type phenotypes. Most wild types arising in this way are merely reversals of the original mutational event and are called revertants. However, some will be “pseudorevertants,” double mutants in which one of the mutations is a suppressor.
Revertant and suppressed states can be distinguished by appropriate crossing. For example, in yeast, the two results would be distinguished as follows:

true revertant a+ × standard wild-type a+
↓
Progeny all a+

suppressed mutant a • s × standard wild-type a+ • s+
↓
Progeny
a+ • s+    wild type
a+ • s     wild type
a • s+     original mutant
a • s      wild type (suppressed)

The appearance of the original mutant phenotype identifies the parent as a suppressed mutant.

In diploids, suppressors produce various modified F2 ratios, which are useful in confirming suppression. Let’s look at a real-life example from Drosophila. The recessive allele pd results in purple eye color when unsuppressed. A recessive allele su has no detectable phenotype itself but suppresses the unlinked recessive allele pd. Hence, pd/pd ; su/su is wild type in appearance and has red eyes. The following analysis illustrates the inheritance pattern. A homozygous purple-eyed fly is crossed with a homozygous red-eyed stock carrying the suppressor.

pd/pd ; su+/su+ (purple) × pd+/pd+ ; su/su (red)
↓
F1 all pd+/pd ; su+/su (red)
Self pd+/pd ; su+/su (red) × pd+/pd ; su+/su (red)
↓
F2
9 pd+/− ; su+/− red
3 pd+/− ; su/su red
1 pd/pd ; su/su red         13 red (9 + 3 + 1)
3 pd/pd ; su+/− purple      3 purple
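Both suppressor crosses above can be tallied with a short enumeration. The sketch assumes unlinked genes and a recessive suppressor with no phenotype of its own, as stated for pd and su; the names `eye_color`, `f2`, and `haploid` are hypothetical names introduced for this illustration.

```python
from collections import Counter
from itertools import product

def eye_color(pd_wt, su_wt):
    """pd_wt, su_wt: number of wild-type alleles (0-2) at each gene."""
    # Purple needs pd/pd AND at least one su+ allele (no suppression)
    purple = pd_wt == 0 and su_wt > 0
    return "purple" if purple else "red"

# Diploid F2 from selfing pd+/pd ; su+/su (1 = wild-type allele)
f2 = Counter()
for pd1, pd2, su1, su2 in product((0, 1), repeat=4):
    f2[eye_color(pd1 + pd2, su1 + su2)] += 1

# Haploid analogue: progeny of suppressed mutant a . s crossed to a+ . s+
haploid = Counter()
for a_wt, s_wt in product((True, False), repeat=2):
    unsuppressed = (not a_wt) and s_wt      # genotype a . s+
    haploid["original mutant" if unsuppressed else "wild type"] += 1

print(dict(f2))
print(dict(haploid))
```

In the haploid cross one progeny class in four shows the original mutant phenotype, which is the signal that the parent was a suppressed mutant rather than a true revertant.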
The overall ratio in the F2 is 13 red : 3 purple. The 13/16 component must include the double mutant, which is clearly wild type in phenotype. This ratio is expected from a recessive suppressor that itself has no detectable phenotype.

Suppression is sometimes confused with epistasis. However, the key difference is that a suppressor cancels the expression of a mutant allele and restores the corresponding wild-type phenotype. Furthermore, often only two phenotypes segregate (as in the preceding examples) rather than three, as in epistasis.

How do suppressors work at the molecular level? There are many possible mechanisms. A particularly useful type of suppression is based on the physical binding of gene products in the cell, for example, protein-protein binding. Assume that two proteins normally fit together to provide some type of cellular function. When a mutation causes a shape change in one protein, it no longer fits together with the other; hence, the function is lost (Figure 6-19). However, a suppressor mutation that causes a compensatory shape change in the second protein can restore fit and hence normal function. In this figure, if the genotypes were diploids representing an F2 from a dihybrid, then a 14 : 2 ratio would result because the only mutant genotypes would be m/m • s+/s+ (1/16) and m+/m+ • s/s (1/16), totaling 2/16. If this were a haploid dihybrid cross (such as m+ s+ × m s), a 1 : 1 ratio would result. From suppressor ratios generally, interacting proteins often can be deduced.

Alternatively, in situations in which a mutation causes a block in a metabolic pathway, the suppressor finds some way of bypassing the block, for example, by rerouting into the blocked pathway intermediates similar to those beyond the block. In the following example, the suppressor provides an intermediate B to circumvent the block.

No suppressor:    A --(blocked by m)--> B --> product    (no B, so no product)
With suppressor:  A --(blocked by m)--> B --> product    (s supplies B from another source, so product is made)

In several organisms, nonsense suppressors have been found: mutations in tRNA genes resulting in an anticodon that will bind to a premature stop codon within a mutant coding sequence. Hence, the suppressor allows translation to proceed past the former block and make a complete protein rather than a truncated one. Such suppressor mutations often have little effect on the phenotype other than in suppression.

Key Concept Mutant alleles called suppressors cancel the effect of a mutant allele of another gene, resulting in wild-type phenotype.

Figure 6-19 A molecular mechanism for suppression. A first mutation alters the binding site of one protein so that it can no longer bind to a partner. A suppressor mutation in the partner alters the binding site so that both proteins are able to bind once again.
[Diagram panels: wild type (m+, s+): active protein complex. First mutation (m, s+): inactive. Second mutation acting as suppressor (m, s): active protein complex. Suppressor mutation alone (m+, s): inactive.]
Modifiers

As the name suggests, a modifier mutation at a second locus changes the degree of expression of a mutated gene at the first locus. Regulatory genes provide a simple illustration. As in an earlier example, regulatory proteins bind to the sequence of the DNA upstream of the start site for transcription. These proteins regulate the level of transcription. In the discussion of complementation, we considered a null mutation of a regulatory gene that almost completely prevented transcription. However, some regulatory mutations change the level of transcription of the target gene so that either more or less protein is produced. In other words, a mutation in a regulatory protein can down-regulate or up-regulate the transcribed gene. Let’s look at an example using a down-regulating regulatory mutation b, affecting a gene A in a fungus such as yeast. We look at the effect of b on a leaky mutation of gene A. A leaky mutation is one with some low level of gene function. We cross a leaky mutation a with the regulatory mutation b:

leaky mutant a • b+ × inefficient regulator a+ • b

Progeny     Phenotype
a+ • b+     wild type
a+ • b      defective (low transcription)
a • b+      defective (defective protein A)
a • b       extremely defective (low transcription of defective protein)

Hence, the action of the modifier is seen in the appearance of two grades of mutant phenotypes within the a progeny.
Synthetic lethals

In some cases, when two viable single mutants are intercrossed, the resulting double mutants are lethal. In a diploid F2, this result would be manifested as a 9 : 3 : 3 ratio because the double mutant (which would be the “1” component of the ratio) would be absent. These synthetic lethals can be considered a special category of gene interaction. They can point to specific types of interactions of gene products. For instance, genome analysis has revealed that evolution has produced many duplicate systems within the cell. One advantage of these duplicates might be to provide “backups.” If there are null mutations in genes in both duplicate systems, then a faulty system will have no backup, and the individual will lack essential function and die. In another instance, a leaky mutation in one step of a pathway may cause the pathway to slow down, but leave enough function for life. However, if double mutants combine, each with a leaky mutation in a different step, the whole pathway grinds to a halt. One version of the latter interaction is two mutations in a protein machine, as shown in Figure 6-20.

In the earlier discussions of modified Mendelian ratios, all the crosses were dihybrid selfs. As an exercise, you might want to calculate the ratios that would be produced in the same systems if testcrosses were made instead of selfs.
A model for synthetic lethality (figure panels)
A+ B+   Wild type: full binding; fully functional
A– B+   Mutant A: partial binding; functional
A+ B–   Mutant B: partial binding; functional
A– B–   Double mutant: binding impossible; nonfunctional
(The substrate shown in each panel is DNA.)
Key Concept A range of modified 9 : 3 : 3 : 1 F2 ratios can reveal specific types of gene interaction.
A summary of some of the ratios that reveal gene interaction is shown in Table 6-2.

Table 6-2 Some Modified F2 Ratios
Ratio            Interaction
9 : 3 : 3 : 1    No interaction
9 : 7            Genes in same pathway
9 : 3 : 4        Recessive epistasis
12 : 3 : 1       Dominant epistasis
13 : 3           Suppressor has no phenotype
14 : 2           Suppressor is like mutant
Note: Some of these ratios can be produced with other mechanisms of interaction.
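Several of the ratios in Table 6-2 can be derived by pooling the four classes of a dihybrid cross according to a rule for the phenotype. The sketch below is one way to do this, assuming two independently assorting genes with complete dominance; the function names and phenotype labels are illustrative only. Passing the `TESTCROSS` weights instead of the `SELF` weights gives the corresponding testcross ratios.

```python
from collections import Counter
from itertools import product

# Relative frequency of the dominant (A/-) and recessive (a/a) class at one locus.
SELF = {"dom": 3, "rec": 1}       # A/a x A/a
TESTCROSS = {"dom": 1, "rec": 1}  # A/a x a/a

def no_interaction(a, b):
    return f"{a} ; {b}"

def same_pathway(a, b):
    # Both gene functions are needed for the wild-type phenotype.
    return "wild type" if a == b == "dom" else "mutant"

def recessive_epistasis(a, b):
    # a/a masks whatever genotype is present at the second gene.
    if a == "rec":
        return "epistatic mutant"
    return "wild type" if b == "dom" else "second mutant"

def dominant_epistasis(a, b):
    # A/- masks whatever genotype is present at the second gene.
    if a == "dom":
        return "epistatic phenotype"
    return "second phenotype" if b == "dom" else "double recessive"

def silent_suppressor(a, s):
    # Recessive suppressor (s/s) that has no phenotype of its own.
    return "mutant" if (a == "rec" and s == "dom") else "wild type"

def ratio(model, weights):
    """Pool the four dihybrid classes by phenotype and return their weights."""
    counts = Counter()
    for (a, wa), (b, wb) in product(weights.items(), repeat=2):
        counts[model(a, b)] += wa * wb
    return dict(counts)

for model in (no_interaction, same_pathway, recessive_epistasis,
              dominant_epistasis, silent_suppressor):
    f2 = " : ".join(str(n) for n in ratio(model, SELF).values())
    tc = " : ".join(str(n) for n in ratio(model, TESTCROSS).values())
    print(f"{model.__name__:20s} F2 {f2:14s} testcross {tc}")
```

Each rule is only one possible mechanism for its ratio; as the note to Table 6-2 says, other mechanisms of interaction can produce the same numbers.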
Figure 6-20 Two interacting proteins perform some essential function on some substrate such as DNA but must first bind to it. Reduced binding of either protein allows some functions to remain, but reduced binding of both is lethal.

6.4 Penetrance and Expressivity
In the analysis of single-gene inheritance, there is a natural tendency to choose mutants that produce clear Mendelian ratios. In such cases, we can use the phenotype to distinguish mutant and wild-type genotypes with almost 100 percent certainty. In these cases, we say that the mutation is 100 percent penetrant into the phenotype. However, many mutations show incomplete penetrance: that is, not every individual with the genotype expresses the corresponding phenotype. Thus, penetrance is defined as the percentage of individuals with a given allele who exhibit the phenotype associated with that allele. Why would an organism have a particular genotype and yet not express the corresponding phenotype? There are several possible reasons:

1. The influence of the environment. Individuals with the same genotype may show a range of phenotypes, depending on the environment. The range of phenotypes for mutant and wild-type individuals may overlap: the phenotype of a mutant individual raised in one set of circumstances may match the phenotype of a wild-type individual raised in a different set of circumstances. Should this matching happen, the mutant cannot be distinguished from the wild type.
2. The influence of other interacting genes. Uncharacterized modifiers, epistatic genes, or suppressors in the rest of the genome may act to prevent the expression of the typical phenotype.
3. The subtlety of the mutant phenotype. The subtle effects brought about by the absence of a gene function may be difficult to measure in a laboratory situation.

A typical encounter with incomplete penetrance is shown in Figure 6-21. In this human pedigree, we see a normally dominantly inherited phenotype disappearing in the second generation only to reappear in the next.

Figure 6-21 Inferring incomplete penetrance. In this human pedigree of a dominant allele that is not fully penetrant, person Q does not display the phenotype but passed the dominant allele to at least two progeny. Because the allele is not fully penetrant, the other progeny (for example, R) may or may not have inherited the dominant allele.

Another measure for describing the range of phenotypic expression is called expressivity. Expressivity measures the degree to which a given allele is expressed at the phenotypic level; that is, expressivity measures the intensity of the phenotype. For example, “brown” animals (genotype b/b) from different stocks might show very different intensities of brown pigment from light to dark. As for penetrance, variable expressivity may be due to variation in the allelic constitution of the rest of the genome or to environmental factors. Figure 6-22 illustrates the distinction between penetrance and expressivity. An example of variable expressivity in dogs is found in Figure 6-23. The phenomena of incomplete penetrance and variable expressivity can make any kind of genetic analysis substantially more difficult, including human pedigree analysis and predictions in genetic counseling.
For example, it is often the case that a disease-causing allele is not fully penetrant. Thus, someone could have the allele but not show any signs of the disease. If that is the case, it is difficult to give a clean genetic bill of health to any person in a disease pedigree (for example, person R in Figure 6-21). On the other hand, pedigree analysis can sometimes identify persons who do not express but almost certainly do have a disease genotype (for example, individual Q in Figure 6-21). Similarly, variable expressivity can complicate counseling because persons with low expressivity might be misdiagnosed. Even though penetrance and expressivity can be quantified, they nevertheless represent “fuzzy” situations because rarely is it possible to identify the specific factors causing variation without substantial extra research.

Figure 6-22 Penetrance and expressivity contrasted. Phenotypic expression is shown in three panels (each oval represents an individual): variable penetrance, variable expressivity, and variable penetrance and expressivity. Assume that all the individuals shown have the same pigment allele (P) and possess the same potential to produce pigment. Effects from the rest of the genome and the environment may suppress or modify pigment production in any one individual. The color indicates the level of expression.

Figure 6-23 Variable expressivity. Ten grades of piebald spotting in beagles (numbered 1 to 10). Each of these dogs has the allele SP, the allele responsible for piebald spots in dogs. The variation is caused by variation at other loci.

Key Concept The terms penetrance and expressivity quantify the modification of a gene’s effect by varying environment and genetic background; they measure, respectively, the percentage of cases in which the phenotype is observed and its severity.
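The two measures can be made concrete with a small calculation. The sketch below uses invented pigment scores (0 means no visible pigment) for ten hypothetical individuals that all carry the same allele; the data, the threshold, and the function names are assumptions for illustration, not measurements from the text.

```python
def penetrance(scores, threshold=0):
    """Percentage of carriers whose phenotype score exceeds the threshold."""
    expressing = [s for s in scores if s > threshold]
    return 100.0 * len(expressing) / len(scores)

def expressivity(scores, threshold=0):
    """Range and mean of phenotype intensity among carriers that express it."""
    expressing = [s for s in scores if s > threshold]
    return min(expressing), max(expressing), sum(expressing) / len(expressing)

# Hypothetical pigment scores for ten individuals carrying allele P.
pigment_scores = [0, 0, 3, 7, 10, 0, 5, 8, 0, 2]

print(f"penetrance: {penetrance(pigment_scores):.0f} percent")
low, high, mean = expressivity(pigment_scores)
print(f"expressivity: scores range from {low} to {high}, mean {mean:.1f}")
```

In this invented sample, four of the ten carriers show no pigment at all (incomplete penetrance), and the remaining six differ in how strongly they express it (variable expressivity).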
Summary
A gene does not act alone; rather, it acts in concert with many other genes in the genome. In forward genetic analysis, deducing these complex interactions is an important stage of the research. Individual mutations are first tested for their dominance relations, a type of allelic interaction. Recessive mutations are often a result of haplosufficiency of the wild-type allele, whereas dominant mutations are often the result either of haploinsufficiency of the wild type or of the mutant acting as a dominant negative (a rogue polypeptide). Some mutations cause severe effects or even death (lethal mutations). Lethality of a homozygous recessive mutation is a way to assess if a gene is essential in the genome. The interaction of different genes is a result of their participation in the same or connecting pathways of various kinds: synthetic, signal transduction, or developmental.
Genetic dissection of gene interactions begins by the experimenter amassing mutants affecting a character of interest. The complementation test determines whether two distinct recessive mutations are of one gene or of two different genes. The mutant genotypes are brought together in an F1 individual, and if the phenotype is mutant, then no complementation has occurred and the two alleles must be of the same gene. If the phenotype is wild type, then complementation has occurred, and the alleles must be of different genes. The interaction of different genes can be detected by testing double mutants because allele interaction implies interaction of gene products at the functional level. Some key types of interaction are epistasis, suppression, and synthetic lethality. Epistasis is the replacement of a mutant phenotype produced by one mutation with a mutant phenotype produced by mutation of another gene. The observation of epistasis suggests a common developmental or chemical pathway. A suppressor is a mutation of one gene that can restore wild-type phenotype to a mutation at another gene. Suppressors often reveal physically interacting proteins or nucleic acids. Some combinations of viable mutants are lethal, a result known as synthetic lethality. Synthetic lethals can reveal a variety of interactions, depending on the nature of the mutations.

The different types of gene interactions produce F2 dihybrid ratios that are modifications of the standard 9 : 3 : 3 : 1. For example, recessive epistasis results in a 9 : 3 : 4 ratio. In more general terms, gene interaction and gene-environment interaction are revealed by variable penetrance (the ability of a genotype to express itself in the phenotype) and expressivity (the quantitative degree of phenotypic manifestation of a genotype).
Key Terms
allelic series (multiple alleles)
codominance
complementation
complementation test
dominant negative mutation
double mutants
epistasis
essential gene
expressivity
full (complete) dominance
functional RNA
heterokaryon
incomplete dominance
lethal allele
modifier
multiple alleles
null mutation
one-gene–one-polypeptide hypothesis
penetrance
permissive temperature
pleiotropic allele
restrictive temperature
revertant
suppressor
synthetic lethal
temperature-sensitive (ts) mutations
Solved Problems
SOLVED PROBLEM 1. Most pedigrees show polydactyly (see Figure 2-25) inherited as a rare autosomal dominant, but the pedigrees of some families do not fully conform to the patterns expected for such inheritance. Such a pedigree is shown here. (The unshaded diamonds stand for the specified number of unaffected persons of unknown sex.)
a. What irregularity does this pedigree show? b. What genetic phenomenon does this pedigree illustrate? c. Suggest a specific gene-interaction mechanism that could produce such a pedigree, showing genotypes of pertinent family members.
[Pedigree figure spanning generations I to IV; individuals are identified by generation and number, for example I-1 and II-6.]
Solution
a. The normal expectation for an autosomal dominant is for each affected individual to have an affected parent, but this expectation is not seen in this pedigree, which constitutes the irregularity. What are some possible explanations?
Could some cases of polydactyly be caused by a different gene, one that is an X-linked dominant gene? This suggestion is not useful, because we still have to explain the absence of the condition in persons II-6 and II-10. Furthermore, postulating recessive inheritance, whether autosomal or sex-linked, requires many people in the pedigree to be heterozygotes, which is inappropriate because polydactyly is a rare condition.

b. Thus, we are left with the conclusion that polydactyly must sometimes be incompletely penetrant. As described in this chapter, some individuals who have the genotype for a particular phenotype do not express it. In this pedigree, II-6 and II-10 seem to belong in this category; they must carry the polydactyly gene inherited from I-1 because they transmit it to their progeny.

c. As discussed in this chapter, environmental suppression of gene expression can cause incomplete penetrance, as can suppression by another gene. To give the requested genetic explanation, we must come up with a genetic hypothesis. What do we need to explain? The key is that I-1 passes the mutation on to two types of progeny, represented by II-1, who expresses the mutant phenotype, and by II-6 and II-10, who do not. (From the pedigree, we cannot tell whether the other children of I-1 have the mutant allele.) Is genetic suppression at work? I-1 does not have a suppressor allele because he expresses polydactyly. So the only person from whom a suppressor could come is I-2. Furthermore, I-2 must be heterozygous for the suppressor allele because at least one of her children does express polydactyly. Therefore, the suppressor allele must be dominant. We have thus formulated the hypothesis that the mating in generation I must have been

(I-1) P/p • s/s × (I-2) p/p • S/s

where S is the suppressor and P is the allele responsible for polydactyly. From this hypothesis, we predict that the progeny will comprise the following four types if the genes assort:

Genotype      Phenotype              Example
P/p • S/s     normal (suppressed)    II-6, II-10
P/p • s/s     polydactylous          II-1
p/p • S/s     normal
p/p • s/s     normal
If S is rare, the progeny of II-6 and II-10 are:

Progeny genotype    Example
P/p • S/s           III-13
P/p • s/s           III-8
p/p • S/s
p/p • s/s
We cannot rule out the possibilities that II-2 and II-4 have the genotype P/p • S/s and that by chance none of their descendants are affected.

SOLVED PROBLEM 2. Beetles of a certain species may have green, blue, or turquoise wing covers. Virgin beetles were selected from a polymorphic laboratory population and mated to determine the inheritance of wing-cover color. The crosses and results were as given in the following table:

Cross   Parents                  Progeny
1       blue × green             all blue
2       blue × blue              3/4 blue : 1/4 turquoise
3       green × green            3/4 green : 1/4 turquoise
4       blue × turquoise         1/2 blue : 1/2 turquoise
5       blue × blue              3/4 blue : 1/4 green
6       blue × green             1/2 blue : 1/2 green
7       blue × green             1/2 blue : 1/4 green : 1/4 turquoise
8       turquoise × turquoise    all turquoise
a. Deduce the genetic basis of wing-cover color in this species.
b. Write the genotypes of all parents and progeny as completely as possible.

Solution
a. These data seem complex at first, but the inheritance pattern becomes clear if we consider the crosses one at a time. A general principle of solving such problems, as we have seen, is to begin by looking over all the crosses and by grouping the data to bring out the patterns. One clue that emerges from an overview of the data is that all the ratios are one-gene ratios: there is no evidence of two separate genes taking part at all. How can such variation be explained with a single gene? The answer is that there is variation for the single gene itself, that is, multiple allelism. Perhaps there are three alleles of one gene; let’s call the gene w (for wing-cover color) and represent the alleles as w^g, w^b, and w^t. Now we have an additional problem, which is to determine the dominance of these alleles. Cross 1 tells us something about dominance because all of the progeny of a blue × green cross are blue; hence, blue appears to be dominant over green. This conclusion is supported by cross 5, because the green determinant must have been present in the parental stock to appear in the progeny. Cross 3 informs us about the turquoise determinants, which must have been present, although unexpressed, in the parental stock because there are turquoise wing covers in the progeny. So green must be dominant over turquoise. Hence, we have formed a model in which the dominance is w^b > w^g > w^t. Indeed, the inferred position of the w^t allele at the bottom of the dominance series is supported by the results of cross 7, where turquoise shows up in the progeny of a blue × green cross.

b. Now it is just a matter of deducing the specific genotypes. Notice that the question states that the parents were taken from a polymorphic population, which means that they could be either homozygous or heterozygous. A parent with blue wing covers, for example, might be homozygous (w^b/w^b) or heterozygous (w^b/w^g or w^b/w^t). Here, a little trial and error and common sense are called for, but, by this stage, the question has essentially been answered, and all that remains is to “cross the t’s and dot the i’s.” The following genotypes explain the results. A dash indicates that the genotype may be either homozygous or heterozygous in having a second allele farther down the allelic series.

Cross   Parents                Progeny
1       w^b/w^b × w^g/-        w^b/w^g or w^b/-
2       w^b/w^t × w^b/w^t      3/4 w^b/- : 1/4 w^t/w^t
3       w^g/w^t × w^g/w^t      3/4 w^g/- : 1/4 w^t/w^t
4       w^b/w^t × w^t/w^t      1/2 w^b/w^t : 1/2 w^t/w^t
5       w^b/w^g × w^b/w^g      3/4 w^b/- : 1/4 w^g/w^g
6       w^b/w^g × w^g/w^g      1/2 w^b/w^g : 1/2 w^g/w^g
7       w^b/w^t × w^g/w^t      1/2 w^b/- : 1/4 w^g/w^t : 1/4 w^t/w^t
8       w^t/w^t × w^t/w^t      all w^t/w^t
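The proposed model can be checked against every cross by enumerating the gamete combinations. The sketch below assumes the dominance series blue > green > turquoise inferred above and encodes each allele by a single letter; the function names are illustrative, and the cross 1 green parent is written as homozygous only as one of the possibilities the dash allows.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

DOMINANCE = ["b", "g", "t"]  # alleles in order of dominance: blue > green > turquoise
COLOR = {"b": "blue", "g": "green", "t": "turquoise"}

def phenotype(genotype):
    """Color shown by a diploid genotype: that of its most dominant allele."""
    return COLOR[min(genotype, key=DOMINANCE.index)]

def cross(parent_1, parent_2):
    """Expected phenotypic proportions; each parent is a two-letter genotype."""
    counts = Counter(phenotype(pair) for pair in product(parent_1, parent_2))
    total = sum(counts.values())
    return {color: Fraction(n, total) for color, n in counts.items()}

# Parental genotypes proposed in the solution table.
CROSSES = {
    1: ("bb", "gg"), 2: ("bt", "bt"), 3: ("gt", "gt"), 4: ("bt", "tt"),
    5: ("bg", "bg"), 6: ("bg", "gg"), 7: ("bt", "gt"), 8: ("tt", "tt"),
}

for number, (p1, p2) in CROSSES.items():
    result = " : ".join(f"{frac} {color}" for color, frac in cross(p1, p2).items())
    print(f"cross {number}: {phenotype(p1)} × {phenotype(p2)} -> {result}")
```

Each predicted proportion can be compared with the observed progeny in the table of crosses; a mismatch in any row would be a reason to revise the dominance order or the parental genotypes.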
SOLVED PROBLEM 3. The leaves of pineapples can be classified into three types: spiny (S), spiny tip (ST), and piping (nonspiny; P). In crosses between pure strains followed by intercrosses of the F1, the following results appeared:

                    Phenotypes
Cross   Parental    F1    F2
1       ST × S      ST    99 ST : 34 S
2       P × ST      P     120 P : 39 ST
3       P × S       P     95 P : 25 ST : 8 S

a. Assign gene symbols. Explain these results in regard to the genotypes produced and their ratios.
b. Using the model from part a, give the phenotypic ratios that you would expect if you crossed (1) the F1 progeny from piping × spiny with the spiny parental stock and (2) the F1 progeny of piping × spiny with the F1 progeny of spiny × spiny tip.

Solution
a. First, let’s look at the F2 ratios. We have clear 3 : 1 ratios in crosses 1 and 2, indicating single-gene segregations. Cross 3, however, shows a ratio that is almost certainly a 12 : 3 : 1 ratio. How do we know this ratio? Well, there are simply not that many complex ratios in genetics, and trial and error brings us to the 12 : 3 : 1 quite quickly. In the 128 progeny total, the numbers of 96 : 24 : 8 are expected, and the actual numbers fit these expectations remarkably well.

One of the principles of this chapter is that modified Mendelian ratios reveal gene interactions. Cross 3 gives F2 numbers appropriate for a modified dihybrid Mendelian ratio, and so it looks as if we are dealing with a two-gene interaction. It seems the most promising place to start; we can return to crosses 1 and 2 and try to fit them in later. Any dihybrid ratio is based on the phenotypic proportions 9 : 3 : 3 : 1. Our observed modification groups them as follows:

9 A/- ; B/-  and  3 A/- ; b/b    12 piping
3 a/a ; B/-                      3 spiny tip
1 a/a ; b/b                      1 spiny

So, without worrying about the name of the type of gene interaction (we are not asked to supply this anyway), we can already define our three pineapple-leaf phenotypes in relation to the proposed allelic pairs A/a and B/b:

piping = A/- (B/b irrelevant)
spiny tip = a/a ; B/-
spiny = a/a ; b/b

What about the parents of cross 3? The spiny parent must be a/a ; b/b, and, because the B gene is needed to produce F2 spiny-tip leaves, the piping parent must be A/A ; B/B. (Note that we are told that all parents are pure, or homozygous.) The F1 must therefore be A/a ; B/b. Without further thought, we can write out cross 1 as follows:

Parents   a/a ; B/B × a/a ; b/b
F1        a/a ; B/b
F2        3/4 a/a ; B/- : 1/4 a/a ; b/b

Cross 2 can be partly written out without further thought by using our arbitrary gene symbols:

Parents   A/A ; -/- × a/a ; B/B
F1        A/a ; B/-
F2        3/4 A/- ; -/- : 1/4 a/a ; B/-

We know that the F2 of cross 2 shows single-gene segregation, and it seems certain now that the A/a allelic pair has a role. But the B allele is needed to produce the spiny-tip phenotype, and so all plants must be homozygous B/B:

Parents   A/A ; B/B × a/a ; B/B
F1        A/a ; B/B
F2        3/4 A/- ; B/B : 1/4 a/a ; B/B

Notice that the two single-gene segregations in crosses 1 and 2 do not show that the genes are not interacting. What is shown is that the two-gene interaction is not revealed by these crosses, only by cross 3, in which the F1 is heterozygous for both genes.

b. Now it is simply a matter of using Mendel’s laws to predict cross outcomes:

(1) A/a ; B/b × a/a ; b/b (independent assortment in a standard testcross)

1/4 A/a ; B/b    piping
1/4 A/a ; b/b    piping
1/4 a/a ; B/b    spiny tip
1/4 a/a ; b/b    spiny

(2) A/a ; B/b × a/a ; B/b

1/2 A/a × 3/4 B/-  =  3/8    piping
1/2 A/a × 1/4 b/b  =  1/8    piping     (1/2 piping in total)
1/2 a/a × 3/4 B/-  =  3/8    spiny tip
1/2 a/a × 1/4 b/b  =  1/8    spiny
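The branch-diagram predictions for the pineapple crosses, and the closeness of the cross 3 counts to 12 : 3 : 1, can be checked numerically. The sketch below assumes the two-gene model of part a with independent assortment; the chi-square goodness-of-fit statistic is a standard way to express the fit and is not part of the solution itself, and all function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def leaf_type(a_pair, b_pair):
    """Phenotype under the model: A/- is piping; a/a is spiny tip with B/-, else spiny."""
    if "A" in a_pair:
        return "piping"
    return "spiny tip" if "B" in b_pair else "spiny"

def gametes(parent):
    """Equally frequent gametes of a parent written as ("Aa", "Bb")."""
    return list(product(*parent))

def cross(parent_1, parent_2):
    """Expected phenotypic proportions, assuming independent assortment."""
    counts = Counter()
    for g1, g2 in product(gametes(parent_1), gametes(parent_2)):
        counts[leaf_type(g1[0] + g2[0], g1[1] + g2[1])] += 1
    total = sum(counts.values())
    return {leaf: Fraction(n, total) for leaf, n in counts.items()}

def chi_square(observed, proportions):
    """Chi-square statistic of observed counts against expected proportions."""
    total = sum(observed.values())
    return sum((observed[k] - p * total) ** 2 / (p * total)
               for k, p in proportions.items())

f1 = ("Aa", "Bb")                      # F1 of piping × spiny
f2 = cross(f1, f1)                     # cross 3 F2
part_b1 = cross(f1, ("aa", "bb"))      # F1 × spiny parental stock
part_b2 = cross(f1, ("aa", "Bb"))      # F1 × F1 of spiny × spiny tip

observed_cross_3 = {"piping": 95, "spiny tip": 25, "spiny": 8}
chi2 = chi_square(observed_cross_3, f2)

print("F2 of cross 3:", f2)
print("part b (1):", part_b1)
print("part b (2):", part_b2)
print("chi-square for cross 3:", float(chi2))
```

With three phenotypic classes there are 2 degrees of freedom, for which the conventional 5 percent critical value is about 5.99; a statistic far below that value gives no reason to reject the 12 : 3 : 1 hypothesis.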
Problems
Most of the problems are also available for review/grading through http://www.whfreeman.com/launchpad/iga11e.

Working with the Figures

1. In Figure 6-1,
a. what do the yellow stars represent?
b. explain in your own words why the heterozygote is functionally wild type.
2. In Figure 6-2, explain how the mutant polypeptide acts as a spoiler and what its net effect on phenotype is.
3. In Figure 6-6, assess the allele V^f with respect to the V^by allele: is it dominant? recessive? codominant? incompletely dominant?
4. In Figure 6-11,
a. in view of the position of HPA oxidase earlier in the pathway compared to that of HA oxidase, would you expect people with tyrosinosis to show symptoms of alkaptonuria?
b. if a double mutant could be found, would you expect tyrosinosis to be epistatic to alkaptonuria?
5. In Figure 6-12,
a. what do the dollar, pound, and yen symbols represent?
b. why can’t the left-hand F1 heterozygote synthesize blue pigment?
6. In Figure 6-13, explain at the protein level why this heterokaryon can grow on minimal medium.
7. In Figure 6-14, write possible genotypes for each of the four snakes illustrated.
8. In Figure 6-15,
a. which panel represents the double mutant?
b. state the function of the regulatory gene.
c. in the situation in panel b, would protein from the active protein gene be made?
9. In Figure 6-16, if you selfed 10 different F2 pink plants, would you expect to find any white-flowered plants among the offspring? Any blue-flowered plants?
10. In Figure 6-19,
a. what do the square/triangular pegs and holes represent?
b. is the suppressor mutation alone wild type in phenotype?
11. In Figure 6-21, propose a specific genetic explanation for individual Q (give a possible genotype, defining the alleles).

Basic Problems
12. In humans, the disease galactosemia causes mental retardation at an early age. Lactose (milk sugar) is broken down to galactose plus glucose. Normally, galactose is broken down further by the enzyme galactose-1-phosphate uridyltransferase (GALT). However, in patients with galactosemia, GALT is inactive, leading to a buildup of high levels of galactose, which, in the brain, causes mental retardation. How would you provide a secondary cure for galactosemia? Would you expect this disease phenotype to be dominant or recessive? 13. In humans, PKU (phenylketonuria) is a recessive disease caused by an enzyme inefficiency at step A in the following simplified reaction sequence, and AKU (alkaptonuria) is another recessive disease due to an enzyme inefficiency in one of the steps summarized as step B here:
phenylalanine --(step A)--> tyrosine --(step B)--> CO2 + H2O
A person with PKU marries a person with AKU. What phenotypes do you expect for their children? All normal, all having PKU only, all having AKU only, all having both PKU and AKU, or some having AKU and some having PKU?
14. In Drosophila, the autosomal recessive bw causes a dark brown eye, and the unlinked autosomal recessive st causes a bright scarlet eye. A homozygote for both genes has a white eye. Thus, we have the following correspondences between genotypes and phenotypes:

st+/st+ ; bw+/bw+ = red eye (wild type)
st+/st+ ; bw/bw = brown eye
st/st ; bw+/bw+ = scarlet eye
st/st ; bw/bw = white eye

Construct a hypothetical biosynthetic pathway showing how the gene products interact and why the different mutant combinations have different phenotypes.
15. Several mutants are isolated, all of which require compound G for growth. The compounds (A to E) in the biosynthetic pathway to G are known, but their order in the pathway is not known. Each compound is tested for its ability to support the growth of each mutant (1 to 5). In the following table, a plus sign indicates growth and a minus sign indicates no growth.

          Compound tested
Mutant    A    B    C    D    E    G
1         -    -    -    +    -    +
2         -    +    -    +    -    +
3         -    -    -    -    -    +
4         -    +    +    +    -    +
5         +    +    +    +    -    +
a. What is the order of compounds A to E in the pathway?
b. At which point in the pathway is each mutant blocked?
c. Would a heterokaryon composed of double mutants 1,3 and 2,4 grow on a minimal medium? Would 1,3 and 3,4? Would 1,2 and 2,4 and 1,4?
16. In a certain plant, the flower petals are normally purple. Two recessive mutations arise in separate plants and are found to be on different chromosomes. Mutation 1 (m1) gives blue petals when homozygous (m1/m1). Mutation 2 (m2) gives red petals when homozygous (m2/m2). Biochemists working on the synthesis of flower pigments in this species have already described the following pathway:

colorless (white) compound --(enzyme A)--> blue pigment
colorless (white) compound --(enzyme B)--> red pigment
a. Which mutant would you expect to be deficient in enzyme A activity?
b. A plant has the genotype M1/m1 ; M2/m2. What would you expect its phenotype to be?
c. If the plant in part b is selfed, what colors of progeny would you expect and in what proportions?
d. Why are these mutants recessive?
17. In sweet peas, the synthesis of purple anthocyanin pigment in the petals is controlled by two genes, B and D. The pathway is

white intermediate --(gene B enzyme)--> blue intermediate --(gene D enzyme)--> anthocyanin (purple)
a. What color petals would you expect in a pure-breeding plant unable to catalyze the first reaction?
b. What color petals would you expect in a pure-breeding plant unable to catalyze the second reaction?
c. If the plants in parts a and b are crossed, what color petals will the F1 plants have?
d. What ratio of purple : blue : white plants would you expect in the F2?
18. If a man of blood-group AB marries a woman of blood-group A whose father was of blood-group O, to what different blood groups can this man and woman expect their children to belong?
19. Most of the feathers of erminette fowl are light colored, with an occasional black one, giving a flecked appearance. A cross of two erminettes produced a total of 48 progeny, consisting of 22 erminettes, 14 blacks, and 12 pure whites. What genetic basis of the erminette pattern is suggested? How would you test your hypotheses?
20. Radishes may be long, round, or oval, and they may be red, white, or purple. You cross a long, white variety with a round, red one and obtain an oval, purple F1. The F2 shows nine phenotypic classes as follows: 9 long, red; 15 long, purple; 19 oval, red; 32 oval, purple; 8 long, white; 16 round, purple; 8 round, white; 16 oval, white; and 9 round, red.
a. Provide a genetic explanation of these results. Be sure to define the genotypes and show the constitution of the parents, the F1, and the F2.
b. Predict the genotypic and phenotypic proportions in the progeny of a cross between a long, purple radish and an oval, purple one.
21. In the multiple-allele series that determines coat color in rabbits, c^+ encodes agouti, c^ch encodes chinchilla (a beige coat color), and c^h encodes Himalayan. Dominance is in the order c^+ > c^ch > c^h. In a cross of c^+/c^ch × c^ch/c^h, what proportion of progeny will be chinchilla?
22. Black, sepia, cream, and albino are coat colors of guinea pigs. Individual animals (not necessarily from pure lines) showing these colors were intercrossed; the results are tabulated as follows, where the abbreviations A (albino), B (black), C (cream), and S (sepia) represent the phenotypes:

         Parental       Phenotypes of progeny
Cross    phenotypes     B     S     C     A
1        B × B          22    0     0     7
2        B × A          10    9     0     0
3        C × C          0     0     34    11
4        S × C          0     24    11    12
5        B × A          13    0     12    0
6        B × C          19    20    0     0
7        B × S          18    20    0     0
8        B × S          14    8     6     0
9        S × S          0     26    9     0
10       C × A          0     0     15    17
a. Deduce the inheritance of these coat colors, and use gene symbols of your own choosing. Show all parent and progeny genotypes.
b. If the black animals in crosses 7 and 8 are crossed, what progeny proportions can you predict by using your model?
23. In a maternity ward, four babies become accidentally mixed up. The ABO types of the four babies are known to be O, A, B, and AB. The ABO types of the four sets of parents are determined. Indicate which baby belongs to each set of parents: (a) AB × O, (b) A × O, (c) A × AB, (d) O × O.
24. Consider two blood polymorphisms that humans have in addition to the ABO system. Two alleles L^M and L^N determine the M, N, and MN blood groups. The dominant allele R of a different gene causes a person to have the Rh+ (rhesus positive) phenotype, whereas the homozygote for r is Rh− (rhesus negative). Two men took a paternity dispute to court, each claiming three children to be his own. The blood groups of the men, the children, and their mother were as follows:

Person          Blood group
husband         O     M     Rh+
wife’s lover    AB    MN    Rh−
wife            A     N     Rh+
child 1         O     MN    Rh+
child 2         A     N     Rh+
child 3         A     MN    Rh−
From this evidence, can the paternity of the children be established? 25. On a fox ranch in Wisconsin, a mutation arose that gave a “platinum” coat color. The platinum color proved very
popular with buyers of fox coats, but the breeders could not develop a pure-breeding platinum strain. Every time two platinums were crossed, some normal foxes appeared in the progeny. For example, the repeated matings of the same pair of platinums produced 82 platinum and 38 normal progeny. All other such matings gave similar progeny ratios. State a concise genetic hypothesis that accounts for these results. 26. For several years, Hans Nachtsheim investigated an inherited anomaly of the white blood cells of rabbits. This anomaly, termed the Pelger anomaly, is the arrest of the segmentation of the nuclei of certain white cells. This anomaly does not appear to seriously burden the rabbits. a. When rabbits showing the Pelger anomaly were mated with rabbits from a true-breeding normal stock, Nachtsheim counted 217 offspring showing the Pelger anomaly and 237 normal progeny. What is the genetic basis of the Pelger anomaly? b. When rabbits with the Pelger anomaly were mated with each other, Nachtsheim found 223 normal progeny, 439 with the Pelger anomaly, and 39 extremely abnormal progeny. These very abnormal progeny not only had defective white blood cells, but also showed severe deformities of the skeletal system; almost all of them died soon after birth. In genetic terms, what do you suppose these extremely defective rabbits represented? Why were there only 39 of them? c. What additional experimental evidence might you collect to test your hypothesis in part b? d. In Berlin, about 1 human in 1000 shows a Pelger anomaly of white blood cells very similar to that described for rabbits. The anomaly is inherited as a simple dominant, but the homozygous type has not been observed in humans. Based on the condition in rabbits, why do you suppose the human homozygous has not been observed? e. Again by analogy with rabbits, what phenotypes and genotypes would you expect among the children of a man and woman who both show the Pelger anomaly? (Data from A. M. Srb, R. D. 
Owen, and R. S. Edgar, General Genetics, 2nd ed. W. H. Freeman and Company, 1965.) 27. Two normal-looking fruit flies were crossed, and, in the progeny, there were 202 females and 98 males. a. What is unusual about this result? b. Provide a genetic explanation for this anomaly. c. Provide a test of your hypothesis. 28. You have been given a virgin Drosophila female. You notice that the bristles on her thorax are much shorter than normal. You mate her with a normal male (with long bristles) and obtain the following F1 progeny:
248 CHA P TER 6 Gene Interaction
short-bristled females, 13 long-bristled females, and long-bristled males. A cross of the F1 long-bristled females with their brothers gives only long-bristled F2. A cross of short-bristled females with their brothers gives 1 1 3 short-bristled females, 3 long-bristled females, and 1 3 long-bristled males. Provide a genetic hypothesis to account for all these results, showing genotypes in every cross.
29. A dominant allele H reduces the number of body bristles that Drosophila flies have, giving rise to a “hairless” phenotype. In the homozygous condition, H is lethal. An independently assorting dominant allele S has no effect on bristle number except in the presence of H, in which case a single dose of S suppresses the hairless phenotype, thus restoring the hairy phenotype. However, S also is lethal in the homozygous (S/S) condition. a. What ratio of hairy to hairless flies would you find in the live progeny of a cross between two hairy flies both carrying H in the suppressed condition? b. When the hairless progeny are backcrossed with a parental hairy fly, what phenotypic ratio would you expect to find among their live progeny?
30. After irradiating wild-type cells of Neurospora (a haploid fungus), a geneticist finds two leucine-requiring auxotrophic mutants. He combines the two mutants in a heterokaryon and discovers that the heterokaryon is prototrophic. a. Were the mutations in the two auxotrophs in the same gene in the pathway for synthesizing leucine or in two different genes in that pathway? Explain. b. Write the genotype of the two strains according to your model. c. What progeny and in what proportions would you predict from crossing the two auxotrophic mutants? (Assume independent assortment.)
31. A yeast geneticist irradiates haploid cells of a strain that is an adenine-requiring auxotrophic mutant, caused by mutation of the gene ade1. Millions of the irradiated cells are plated on minimal medium, and a small number of cells divide and produce prototrophic colonies. These colonies are crossed individually with a wild-type strain. Two types of results are obtained:
(1) prototroph × wild type : progeny all prototrophic
(2) prototroph × wild type : progeny 75% prototrophic, 25% adenine-requiring auxotrophs
a. Explain the difference between these two types of results. b. Write the genotypes of the prototrophs in each case. c. What progeny phenotypes and ratios do you predict from crossing a prototroph of type 2 by the original ade1 auxotroph?
32. In roses, the synthesis of red pigment is by two steps in a pathway, as follows:
colorless intermediate --(gene P)--> magenta intermediate --(gene Q)--> red pigment
a. What would the phenotype be of a plant homozygous for a null mutation of gene P?
b. What would the phenotype be of a plant homozygous for a null mutation of gene Q?
c. What would the phenotype be of a plant homozygous for null mutations of genes P and Q? d. Write the genotypes of the three strains in parts a, b, and c. e. What F2 ratio is expected from crossing plants from parts a and b? (Assume independent assortment.) 33. Because snapdragons (Antirrhinum) possess the pigment anthocyanin, they have reddish purple petals. Two pure anthocyaninless lines of Antirrhinum were developed, one in California and one in Holland. They looked identical in having no red pigment at all, manifested as white (albino) flowers. However, when petals from the two lines were ground up together in buffer in the same test tube, the solution, which appeared colorless at first, gradually turned red. a. What control experiments should an investigator conduct before proceeding with further analysis? b. What could account for the production of the red color in the test tube? c. According to your explanation for part b, what would be the genotypes of the two lines? d. If the two white lines were crossed, what would you predict the phenotypes of the F1 and F2 to be? 34. The frizzle fowl is much admired by poultry fanciers. It gets its name from the unusual way that its feathers curl up, giving the impression that it has been (in the memorable words of animal geneticist F. B. Hutt) “pulled backwards through a knothole.” Unfortunately, frizzle fowl do not breed true: when two frizzles are intercrossed, they always produce 50 percent frizzles, 25 percent normal, and 25 percent with peculiar woolly feathers that soon fall out, leaving the birds naked. a. Give a genetic explanation for these results, showing genotypes of all phenotypes, and provide a statement of how your explanation works. b. If you wanted to mass-produce frizzle fowl for sale, which types would be best to use as a breeding pair? 35. The petals of the plant Collinsia parviflora are normally blue, giving the species its common name, blue-eyed Mary. 
Two pure-breeding lines were obtained from color variants found in nature; the first line had pink petals, and the second line had white petals. The following crosses were made between pure lines, with the results shown:
Parents         F1     F2
blue × white    blue   101 blue, 33 white
blue × pink     blue   192 blue, 63 pink
pink × white    blue   272 blue, 121 white, 89 pink
a. Explain these results genetically. Define the allele symbols that you use, and show the genetic constitution of the parents, the F1, and the F2 in each cross. b. A cross between a certain blue F2 plant and a certain white F2 plant gave progeny of which 83 were blue, 81 were pink, and 21 were white. What must the genotypes of these two F2 plants have been?
Unpacking Problem 35
1. What is the character being studied? 2. What is the wild-type phenotype? 3. What is a variant? 4. What are the variants in this problem? 5. What does “in nature” mean? 6. In what way would the variants have been found in nature? (Describe the scene.) 7. At which stages in the experiments would seeds be used? 8. Would the way of writing a cross “blue × white,” for example, mean the same as “white × blue”? Would you expect similar results? Why or why not? 9. In what way do the first two rows in the table differ from the third row? 10. Which phenotypes are dominant? 11. What is complementation? 12. Where does the blueness come from in the progeny of the pink × white cross? 13. What genetic phenomenon does the production of a blue F1 from pink and white parents represent? 14. List any ratios that you can see. 15. Are there any monohybrid ratios? 16. Are there any dihybrid ratios? 17. What does observing monohybrid and dihybrid ratios tell you? 18. List four modified Mendelian ratios that you can think of. 19. Are there any modified Mendelian ratios in the problem? 20. What do modified Mendelian ratios indicate generally? 21. What is indicated by the specific modified ratio or ratios in this problem?
22. Draw chromosomes representing the meioses in the parents in the cross blue × white and representing meiosis in the F1. 23. Repeat step 22 for the cross blue × pink. 36. A woman who owned a purebred albino poodle (an autosomal recessive phenotype) wanted white puppies; so she took the dog to a breeder, who said he would mate the female with an albino stud male, also from a pure stock. When six puppies were born, all of them were black; so the woman sued the breeder, claiming that he replaced the stud male with a black dog, giving her six unwanted puppies. You are called in as an expert witness, and the defense asks you if it is possible to produce black offspring from two pure-breeding recessive albino parents. What testimony do you give? 37. A snapdragon plant that bred true for white petals was crossed with a plant that bred true for purple petals, and all the F1 had white petals. The F1 was selfed. Among the F2, three phenotypes were observed in the following numbers: white 240 solid purple 61 spotted purple 19 Total 320 a. Propose an explanation for these results, showing genotypes of all generations (make up and explain your symbols). b. A white F2 plant was crossed with a solid purple F2 plant, and the progeny were
white 50%, solid purple 25%, spotted purple 25%
What were the genotypes of the F2 plants crossed? 38. Most flour beetles are black, but several color variants are known. Crosses of pure-breeding parents produced the following results (see table) in the F1 generation, and intercrossing the F1 from each cross gave the ratios shown for the F2 generation. The phenotypes are abbreviated Bl, black; Br, brown; Y, yellow; and W, white.
Cross   Parents    F1    F2
1       Br × Y     Br    3 Br : 1 Y
2       Bl × Br    Bl    3 Bl : 1 Br
3       Bl × Y     Bl    3 Bl : 1 Y
4       W × Y      Bl    9 Bl : 3 Y : 4 W
5       W × Br     Bl    9 Bl : 3 Br : 4 W
6       Bl × W     Bl    9 Bl : 3 Y : 4 W
a. From these results, deduce and explain the inheritance of these colors. b. Write the genotypes of each of the parents, the F1, and the F2 in all crosses. 39. Two albinos marry and have four normal children. How is this possible? 40. Consider the production of flower color in the Japanese morning glory (Pharbitis nil). Dominant alleles of either of two separate genes (A/− • b/b or a/a • B/−) produce purple petals. A/− • B/− produces blue petals, and a/a • b/b produces scarlet petals. Deduce the genotypes of parents and progeny in the following crosses:
Cross   Parents            Progeny
1       blue × scarlet     1/4 blue : 1/2 purple : 1/4 scarlet
2       purple × purple    1/4 blue : 1/2 purple : 1/4 scarlet
3       blue × blue        3/4 blue : 1/4 purple
4       blue × purple      3/8 blue : 4/8 purple : 1/8 scarlet
5       purple × scarlet   1/2 purple : 1/2 scarlet
41. Corn breeders obtained pure lines whose kernels turn sun red, pink, scarlet, or orange when exposed to sunlight (normal kernels remain yellow in sunlight). Some crosses between these lines produced the following results. The phenotypes are abbreviated O, orange; P, pink; Sc, scarlet; and SR, sun red.
                   Phenotypes
Cross   Parents    F1        F2
1       SR × P     all SR    66 SR : 20 P
2       O × SR     all SR    998 SR : 314 O
3       O × P      all O     1300 O : 429 P
4       O × Sc     all Y     182 Y : 80 O : 58 Sc
Analyze the results of each cross, and provide a unifying hypothesis to account for all the results. (Explain all symbols that you use.)
42. Many kinds of wild animals have the agouti coloring pattern, in which each hair has a yellow band around it. a. Black mice and other black animals do not have the yellow band; each of their hairs is all black. This absence of wild agouti pattern is called nonagouti. When mice of a true-breeding agouti line are crossed with nonagoutis, the F1 is all agouti and the F2 has a 3 : 1 ratio of agoutis to nonagoutis. Diagram this cross, letting A represent the allele responsible for the agouti phenotype and a, nonagouti. Show the phenotypes and genotypes of the parents, their gametes, the F1, their gametes, and the F2. b. Another inherited color deviation in mice substitutes brown for the black color in the wild-type hair. Such brown-agouti mice are called cinnamons. When wild-type mice are crossed with cinnamons, all of the F1 are wild type and the F2 has a 3 : 1 ratio of wild type to cinnamon. Diagram this cross as in part a, letting B stand for the wild-type black allele and b stand for the cinnamon brown allele. c. When mice of a true-breeding cinnamon line are crossed with mice of a true-breeding nonagouti (black) line, all of the F1 are wild type. Use a genetic diagram to explain this result. d. In the F2 of the cross in part c, a fourth color called chocolate appears in addition to the parental cinnamon and nonagouti and the wild type of the F1. Chocolate mice have a solid, rich brown color. What is the genetic constitution of the chocolates? e. Assuming that the A/a and B/b allelic pairs assort independently of each other, what do you expect to be the relative frequencies of the four color types in the F2 described in part d? Diagram the cross of parts c and d, showing phenotypes and genotypes (including gametes). f. What phenotypes would be observed in what proportions in the progeny of a backcross of F1 mice from part c with the cinnamon parental stock? With the nonagouti (black) parental stock? Diagram these backcrosses. g. Diagram a testcross for the F1 of part c. What colors would result and in what proportions? h. Albino (pink-eyed white) mice are homozygous for the recessive member of an allelic pair C/c, which assorts independently of the A/a and B/b pairs. Suppose that you have four different highly inbred (and therefore presumably homozygous) albino lines. You cross each of these lines with a true-breeding wild-type line, and you raise a large F2 progeny from each cross. What genotypes for the albino lines can you deduce from the following F2 phenotypes?
              Phenotypes of progeny
F2 of line    Wild type   Black   Cinnamon   Chocolate   Albino
1             87          0       32         0           39
2             62          0       0          0           18
3             96          30      0          0           41
4             287         86      92         29          164
(Adapted from A. M. Srb, R. D. Owen, and R. S. Edgar, General Genetics, 2nd ed. W. H. Freeman and Company, 1965.) 43. An allele A that is not lethal when homozygous causes rats to have yellow coats. The allele R of a separate gene that assorts independently produces a black coat. Together, A and R produce a grayish coat, whereas a and r produce a white coat. A gray male is crossed with a yellow female, and the F1 is 3/8 yellow, 3/8 gray, 1/8 black, and 1/8 white. Determine the genotypes of the parents.
44. The genotype r/r ; p/p gives fowl a single comb, R/− ; P/− gives a walnut comb, r/r ; P/− gives a pea comb, and R/− ; p/p gives a rose comb (see the illustrations). Assume independent assortment.
[Illustrations: single, walnut, pea, and rose combs.]
a. What comb types will appear in the F1 and in the F2 and in what proportions if single-combed birds are crossed with birds of a true-breeding walnut strain? b. What are the genotypes of the parents in a walnut × rose mating from which the progeny are 3/8 rose, 3/8 walnut, 1/8 pea, and 1/8 single? c. What are the genotypes of the parents in a walnut × rose mating from which all the progeny are walnut? d. How many genotypes produce a walnut phenotype? Write them out.
45. The production of eye-color pigment in Drosophila requires the dominant allele A. The dominant allele P of a second independent gene turns the pigment to purple, but its recessive allele leaves it red. A fly producing no pigment has white eyes. Two pure lines were crossed with the following results:
P     red-eyed female × white-eyed male
F1    purple-eyed females, red-eyed males
      F1 × F1
F2    both males and females: 3/8 purple eyed, 3/8 red eyed, 2/8 white eyed
Explain this mode of inheritance, and show the genotypes of the parents, the F1, and the F2.
46. When true-breeding brown dogs are mated with certain true-breeding white dogs, all the F1 pups are white. The F2 progeny from some F1 × F1 crosses were 118 white, 32 black, and 10 brown pups. What is the genetic basis for these results?
47. Wild-type strains of the haploid fungus Neurospora can make their own tryptophan. An abnormal allele td renders the fungus incapable of making its own tryptophan. An individual of genotype td grows only when its medium supplies tryptophan. The allele su assorts independently of td; its only known effect is to suppress the td phenotype. Therefore, strains carrying both td and su do not require tryptophan for growth. a. If a td ; su strain is crossed with a genotypically wild-type strain, what genotypes are expected in the progeny and in what proportions? b. What will be the ratio of tryptophan-dependent to tryptophan-independent progeny in the cross of part a?
48. Mice of the genotypes A/A ; B/B ; C/C ; D/D ; S/S and a/a ; b/b ; c/c ; d/d ; s/s are crossed. The progeny are intercrossed. What phenotypes will be produced in the F2 and in what proportions? [The allele symbols stand for the following: A = agouti, a = solid (nonagouti); B = black pigment, b = brown; C = pigmented, c = albino; D = nondilution, d = dilution (milky color); S = unspotted, s = pigmented spots on white background.]
49. Consider the genotypes of two lines of chickens: the pure-line mottled Honduran is i/i ; D/D ; M/M ; W/W, and the pure-line leghorn is I/I ; d/d ; m/m ; w/w, where
I = white feathers, i = colored feathers
D = duplex comb, d = simplex comb
M = bearded, m = beardless
W = white skin, w = yellow skin
These four genes assort independently. Starting with these two pure lines, what is the fastest and most convenient way of generating a pure line that has colored feathers, has a simplex comb, is beardless, and has yellow skin? Make sure that you show a. the breeding pedigree. b. the genotype of each animal represented. c. how many eggs to hatch in each cross, and why this number. d. why your scheme is the fastest and the most convenient.
50. The following pedigree is for a dominant phenotype governed by an autosomal allele. What does this pedigree suggest about the phenotype, and what can you deduce about the genotype of individual A?
[Pedigree not reproduced; individual A is marked in the figure.]
51. Petal coloration in foxgloves is determined by three genes. M encodes an enzyme that synthesizes anthocyanin, the purple pigment seen in these petals; m/m produces no pigment, resulting in the phenotype albino with yellowish spots. D is an enhancer of anthocyanin, resulting in a darker pigment; d/d does not enhance. At the third locus, w/w allows pigment deposition in petals, but W prevents pigment deposition except in the spots and so results in the white, spotted phenotype. Consider the following two crosses:
Cross   Parents                                      Progeny
1       dark purple × white with yellowish spots     1/2 dark purple : 1/2 light purple
2       white with yellowish spots × light purple    1/2 white with purple spots : 1/4 dark purple : 1/4 light purple
In each case, give the genotypes of parents and progeny with respect to the three genes.
52. In one species of Drosophila, the wings are normally round in shape, but you have obtained two pure lines, one of which has oval wings and the other sickle-shaped wings. Crosses between pure lines reveal the following results:
        Parents             F1
Cross   Female    Male      Female    Male
1       sickle    round     sickle    sickle
2       round     sickle    sickle    round
3       sickle    oval      oval      sickle
a. Provide a genetic explanation of these results, defining all allele symbols. b. If the F1 oval females from cross 3 are crossed with the F1 round males from cross 2, what phenotypic proportions are expected for each sex in the progeny?
53. Mice normally have one yellow band on each hair, but variants with two or three bands are known. A female mouse having one band was crossed with a male having three bands. (Neither animal was from a pure line.) The progeny were
Females   1/2 one band, 1/2 three bands
Males     1/2 one band, 1/2 two bands
a. Provide a clear explanation of the inheritance of these phenotypes. b. In accord with your model, what would be the outcome of a cross between a three-banded daughter and a one-banded son?
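Several problems in this set report F2 counts that have to be matched against 9 : 3 : 3 : 1 or a modified version of it. As a minimal sketch, assuming two independently assorting genes with complete dominance, the Python below enumerates the 16 gamete combinations of a selfed dihybrid and tallies phenotypes under a rule that you supply. The function names, the gene symbols A and B, the phenotype labels, and both example rules are illustrative assumptions and are not taken from any problem here.

```python
from collections import Counter
from itertools import product


def f2_phenotype_counts(phenotype_of):
    """Tally F2 phenotypes (out of 16) for a selfed A/a ; B/b dihybrid.

    phenotype_of(has_A, has_B) is a hypothetical rule mapping the presence
    of a dominant allele at each gene to a phenotype label.
    """
    gametes = [a + b for a, b in product("Aa", "Bb")]  # AB, Ab, aB, ab
    counts = Counter()
    for g1, g2 in product(gametes, repeat=2):  # 4 x 4 Punnett square
        has_A = "A" in (g1[0], g2[0])
        has_B = "B" in (g1[1], g2[1])
        counts[phenotype_of(has_A, has_B)] += 1
    return dict(counts)


def recessive_epistasis(has_A, has_B):
    # Assumed rule: a/a masks whatever the B gene would otherwise show.
    if not has_A:
        return "white"
    return "colored-1" if has_B else "colored-2"


def complementary(has_A, has_B):
    # Assumed rule: both dominant alleles are needed for pigment.
    return "pigmented" if has_A and has_B else "white"


if __name__ == "__main__":
    print(f2_phenotype_counts(recessive_epistasis))
    print(f2_phenotype_counts(complementary))
```

Swapping in a different rule function is a quick way to check which interaction model reproduces an observed ratio before writing out genotypes by hand.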
54. In minks, wild types have an almost black coat. Breeders have developed many pure lines of color variants for the mink-coat industry. Two such pure lines are platinum (blue gray) and aleutian (steel gray). These lines were used in crosses, with the following results:
Cross   Parents                F1      F2
1       wild × platinum        wild    18 wild, 5 platinum
2       wild × aleutian        wild    27 wild, 10 aleutian
3       platinum × aleutian    wild    133 wild, 41 platinum, 46 aleutian, 17 sapphire (new)
a. Devise a genetic explanation of these three crosses. Show genotypes for the parents, the F1, and the F2 in the three crosses, and make sure that you show the alleles of each gene that you hypothesize for every mink. b. Predict the F1 and F2 phenotypic ratios from crossing sapphire with platinum and with aleutian pure lines. 55. In Drosophila, an autosomal gene determines the shape of the hair, with B giving straight and b giving bent hairs. On another autosome, there is a gene of which a dominant allele I inhibits hair formation so that the fly is hairless (i has no known phenotypic effect). a. If a straight-haired fly from a pure line is crossed with a fly from a pure-breeding hairless line known to be an inhibited bent genotype, what will the genotypes and phenotypes of the F1 and the F2 be? b. What cross would give the ratio 4 hairless : 3 straight : 1 bent? 56. The following pedigree concerns eye phenotypes in Tribolium beetles. The solid symbols represent black eyes, the open symbols represent brown eyes, and the cross symbols (X) represent the “eyeless” phenotype, in which eyes are totally absent.
[Pedigree not reproduced: four generations (I to IV).]
a. From these data, deduce the mode of inheritance of these three phenotypes. b. Using defined gene symbols, show the genotype of beetle II-3.
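Problems 57 to 59 ask whether observed progeny counts are compatible with an expected ratio, which is usually judged with a chi-square goodness-of-fit test. The sketch below is one way to set up that arithmetic in Python. The function names and the example counts are hypothetical, and the lookup table holds only the standard 5 percent critical values for 1 to 3 degrees of freedom, so it is an illustration under those assumptions rather than a general statistics routine.

```python
# Standard 5% critical values of the chi-square distribution, keyed by
# degrees of freedom (number of phenotypic classes minus one).
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815}


def chi_square(observed, ratio):
    """Return the chi-square statistic for observed counts vs. an expected ratio."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def consistent_at_5_percent(observed, ratio):
    """True if the deviation from the ratio is not significant at the 5% level."""
    df = len(observed) - 1
    return chi_square(observed, ratio) <= CRITICAL_5_PERCENT[df]


if __name__ == "__main__":
    # Hypothetical counts, not taken from any problem in this set.
    print(chi_square([78, 22], [3, 1]))               # deviation from 3 : 1
    print(consistent_at_5_percent([78, 22], [3, 1]))
    print(consistent_at_5_percent([60, 40], [3, 1]))
```

A value at or below the critical value means the hypothesis is not rejected; it does not prove the hypothesis, and several candidate ratios may all survive the test for small samples.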
57. A plant believed to be heterozygous for a pair of alleles B/b (where B encodes yellow and b encodes bronze) was selfed, and, in the progeny, there were 280 yellow and 120 bronze plants. Do these results support the hypothesis that the plant is B/b?
58. A plant thought to be heterozygous for two independently assorting genes (P/p ; Q/q) was selfed, and the progeny were
88 P/− ; Q/−
25 p/p ; Q/−
32 P/− ; q/q
14 p/p ; q/q
Do these results support the hypothesis that the original plant was P/p ; Q/q?
59. A plant of phenotype 1 was selfed, and, in the progeny, there were 100 plants of phenotype 1 and 60 plants of an alternative phenotype 2. Are these numbers compatible with expected ratios of 9 : 7, 13 : 3, and 3 : 1? Formulate a genetic hypothesis on the basis of your calculations.
60. Four homozygous recessive mutant lines of Drosophila melanogaster (labeled 1 through 4) showed abnormal leg coordination, which made their walking highly erratic. These lines were intercrossed; the phenotypes of the F1 flies are shown in the following grid, in which “+” represents wild-type walking and “−” represents abnormal walking:
     1   2   3   4
1    −   +   +   +
2    +   −   −   +
3    +   −   −   +
4    +   +   +   −
a. What type of test does this analysis represent? b. How many different genes were mutated in creating these four lines? c. Invent wild-type and mutant symbols, and write out full genotypes for all four lines and for the F1 flies. d. Do these data tell us which genes are linked? If not, how could linkage be tested? e. Do these data tell us the total number of genes taking part in leg coordination in this animal?
61. Three independently isolated tryptophan-requiring mutants of haploid yeast are called trpB, trpD, and trpE. Cell suspensions of each are streaked on a plate of nutritional medium supplemented with just enough tryptophan to permit weak growth for a trp strain. The streaks are arranged in a triangular pattern so that they do not touch one another. Luxuriant growth is noted at both ends of the trpE streak and at one end of the trpD streak (see the figure below). a. Do you think complementation has a role? b. Briefly explain the pattern of luxuriant growth. c. Draw the enzymatic steps that are defective in mutants trpB, trpD, and trpE in order in the tryptophan-synthesizing pathway. d. Why was it necessary to add a small amount of tryptophan to the medium to demonstrate such a growth pattern?
Challenging Problems
62. A pure-breeding strain of squash that produced disk-shaped fruits (see the accompanying illustration) was crossed with a pure-breeding strain having long fruits. The F1 had disk fruits, but the F2 showed a new phenotype, sphere, and was composed of the following proportions:
[Illustration: long, sphere, and disk fruit shapes.]
long 32, sphere 178, disk 270
Propose an explanation for these results, and show the genotypes of the P, F1, and F2 generations.
63. Marfan’s syndrome is a disorder of the fibrous connective tissue, characterized by many symptoms, including long, thin digits; eye defects; heart disease; and long limbs. (Flo Hyman, the American volleyball star, suffered from Marfan’s syndrome. She died from a ruptured aorta.)
[Pedigree not reproduced: three generations (I to III). Symbol key: unknown, presumed normal; examined, normal; questionably affected; eye lens displacement; long fingers and toes; very long, thin fingers and toes; congenital heart disease.]
a. Use the pedigree above to propose a mode of inheritance for Marfan’s syndrome. b. What genetic phenomenon is shown by this pedigree? c. Speculate on a reason for such a phenomenon. (Data from J. V. Neel and W. J. Schull, Human Heredity. University of Chicago Press, 1954.) 64. In corn, three dominant alleles, called A, C, and R, must be present to produce colored seeds. Genotype A/− ; C/− ; R/− is colored; all others are colorless. A colored plant is crossed with three tester plants of known genotype. With tester a/a ; c/c ; R/R, the colored plant produces 50 percent colored seeds; with a/a ; C/C ; r/r, it produces 25 percent colored; and with A/A ; c/c ; r/r, it produces 50 percent colored. What is the genotype of the colored plant? 65. The production of pigment in the outer layer of seeds of corn requires each of the three independently assorting genes A, C, and R to be represented by at least one dominant allele, as specified in Problem 64. The dominant allele Pr of a fourth independently assorting gene is required to convert the biochemical precursor into a purple pigment, and its recessive allele pr makes the pigment red. Plants that do not produce pigment have yellow seeds. Consider a cross of a strain of genotype A/A ; C/C ; R/R ; pr/pr with a strain of genotype a/a ; c/c ; r/r ; Pr/Pr. a. What are the phenotypes of the parents? b. What will be the phenotype of the F1? c. What phenotypes, and in what proportions, will appear in the progeny of a selfed F1? d. What progeny proportions do you predict from the testcross of an F1? 66. The allele B gives mice a black coat, and b gives a brown one. The genotype e/e of another, independently assorting gene prevents the expression of B and b, making the coat color beige, whereas E/− permits the expression of B
and b. Both genes are autosomal. In the following pedigree, black symbols indicate a black coat, pink symbols indicate brown, and white symbols indicate beige.
[Pedigree not reproduced: three generations (I to III).]
a. What is the name given to the type of gene interaction in this example? b. What are the genotypes of the individual mice in the pedigree? (If there are alternative possibilities, state them.)
67. A researcher crosses two white-flowered lines of Antirrhinum plants as follows and obtains the following results:
pure line 1 × pure line 2
↓
F1 all white
F1 × F1
↓
F2 131 white, 29 red
a. Deduce the inheritance of these phenotypes; use clearly defined gene symbols. Give the genotypes of the parents, F1, and F2. b. Predict the outcome of crosses of the F1 with each parental line.
68. Assume that two pigments, red and blue, mix to give the normal purple color of petunia petals. Separate biochemical pathways synthesize the two pigments, as shown in the top two rows of the accompanying diagram. “White” refers to compounds that are not pigments. (Total lack of pigment results in a white petal.) Red pigment forms from a yellow intermediate that is normally at a concentration too low to color petals.
pathway I:     white1 --(E)--> blue
pathway II:    white2 --(A)--> yellow --(B)--> red
pathway III:   white3 --(D)--> white4
               white3 --(C)--> yellow (the intermediate of pathway II)
A third pathway, whose compounds do not contribute pigment to petals, normally does not affect the blue and red pathways, but, if one of its intermediates (white3) should build up in concentration, it can be converted into the yellow intermediate of the red pathway. In the diagram, the letters A through E represent enzymes; their corresponding genes, all of which are unlinked, may be symbolized by the same letters. Assume that wild-type alleles are dominant and encode enzyme function and that recessive alleles result in a lack of enzyme function. Deduce which combinations of true-breeding parental genotypes could be crossed to produce F2 progeny in the following ratios:
a. 9 purple : 3 green : 4 blue
b. 9 purple : 3 red : 3 blue : 1 white
c. 13 purple : 3 blue
d. 9 purple : 3 red : 3 green : 1 yellow
(Note: Blue mixed with yellow makes green; assume that no mutations are lethal.)
69. The flowers of nasturtiums (Tropaeolum majus) may be single (S), double (D), or superdouble (Sd). Superdoubles are female sterile; they originated from a double-flowered variety. Crosses between varieties gave the progeny listed in the following table, in which pure means “pure breeding.”
Cross   Parents                         Progeny
1       pure S × pure D                 All S
2       cross 1 F1 × cross 1 F1         78 S : 27 D
3       pure D × Sd                     112 Sd : 108 D
4       pure S × Sd                     8 Sd : 7 S
5       pure D × cross 4 Sd progeny     18 Sd : 19 S
6       pure D × cross 4 S progeny      14 D : 16 S
Using your own genetic symbols, propose an explanation for these results, showing a. all the genotypes in each of the six rows. b. the proposed origin of the superdouble.
70. In a certain species of fly, the normal eye color is red (R). Four abnormal phenotypes for eye color were found: two were yellow (Y1 and Y2), one was brown (B), and one was orange (O). A pure line was established for each phenotype, and all possible combinations of the pure lines were crossed. Flies of each F1 were intercrossed to produce an F2. The F1 and the F2 flies are shown within the following square; the pure lines are given at the top and at the left-hand side.
           Y1       Y2            B                  O
Y1   F1    all Y    all R         all R              all R
     F2    all Y    9 R : 7 Y     9 R : 4 Y : 3 B    9 R : 4 O : 3 Y
Y2   F1             all Y         all R              all R
     F2             all Y         9 R : 4 Y : 3 B    9 R : 4 Y : 3 O
B    F1                           all B              all R
     F2                           all B              9 R : 4 O : 3 B
O    F1                                              all O
     F2                                              all O
a. Define your own symbols, and list the genotypes of all four pure lines. b. Show how the F1 phenotypes and the F2 ratios are produced. c. Show a biochemical pathway that explains the genetic results, indicating which gene controls which enzyme.
71. In common wheat, Triticum aestivum, kernel color is determined by multiply duplicated genes, each with an R and an r allele. Any number of R alleles will give red, and a complete lack of R alleles will give the white phenotype. In one cross between a red pure line and a white pure line, the F2 was 63/64 red and 1/64 white. a. How many R genes are segregating in this system? b. Show the genotypes of the parents, the F1, and the F2. c. Different F2 plants are backcrossed with the white parent. Give examples of genotypes that would give the following progeny ratios in such backcrosses: (1) 1 red : 1 white, (2) 3 red : 1 white, (3) 7 red : 1 white. d. What is the formula that generally relates the number of segregating genes to the proportion of red individuals in the F2 in such systems?
72. The following pedigree shows the inheritance of deaf-mutism.
[Pedigree not reproduced: three generations (I to III).]
a. Provide an explanation for the inheritance of this rare condition in the two families in generations I and II, showing the genotypes of as many persons as possible; use symbols of your own choosing. b. Provide an explanation for the production of only normal persons in generation III, making sure that your explanation is compatible with the answer to part a. 73. The pedigree below is for blue sclera (bluish thin outer wall of the eye) and brittle bones.
[Pedigree not reproduced: four generations (I to IV). Symbol key: blue sclera; brittle bones.]
a. Are these two abnormalities caused by the same gene or by separate genes? State your reasons clearly. b. Is the gene (or genes) autosomal or sex-linked? c. Does the pedigree show any evidence of incomplete penetrance or expressivity? If so, make the best calculations that you can of these measures. 74. Workers of the honeybee line known as Brown (nothing to do with color) show what is called “hygienic behavior”; that is, they uncap hive compartments containing dead pupae and then remove the dead pupae. This behavior prevents the spread of infectious bacteria through the colony. Workers of the Van Scoy line, however, do not perform these actions, and therefore this line is said to be “nonhygienic.” When a queen from the Brown line was mated with Van Scoy drones, all the F1 were nonhygienic. When drones from this F1 inseminated a queen from the Brown line, the progeny behaviors were as follows:
1/4 hygienic
1/4 uncapping but no removing of pupae
1/2 nonhygienic
However, when the compartment of dead pupae was uncapped by the beekeeper and the nonhygienic honeybees were examined further, about half the bees were found to remove the dead pupae, but the other half did not. a. Propose a genetic hypothesis to explain these behavioral patterns. b. Discuss the data in relation to epistasis, dominance, and environmental interaction. (Note: Workers are sterile, and all bees from one line carry the same alleles.) 75. The normal color of snapdragons is red. Some pure lines showing variations of flower color have been found. When these pure lines were crossed, they gave the following results (see the table):
Cross   Parents            F1        F2
1       orange × yellow    orange    3 orange : 1 yellow
2       red × orange       red       3 red : 1 orange
3       red × yellow       red       3 red : 1 yellow
4       red × white        red       3 red : 1 white
5       yellow × white     red       9 red : 3 yellow : 4 white
6       orange × white     red       9 red : 3 orange : 4 white
7       red × white        red       9 red : 3 yellow : 4 white
a. Explain the inheritance of these colors. b. Write the genotypes of the parents, the F1, and the F2. 76. Consider the following F1 individuals in different species and the F2 ratios produced by selfing:
F1               Phenotypic ratio in the F2
1  cream         12/16 cream : 3/16 black : 1/16 gray
2  orange        9/16 orange : 7/16 yellow
3  black         13/16 black : 3/16 white
4  solid red     9/16 solid red : 3/16 mottled red : 4/16 small red dots
If each F1 were testcrossed, what phenotypic ratios would result in the progeny of the testcross? 77. To understand the genetic basis of locomotion in the diploid nematode Caenorhabditis elegans, recessive mutations were obtained, all making the worm “wiggle” ineffectually instead of moving with its usual smooth gliding motion. These mutations presumably affect the nervous or muscle systems. Twelve homozygous mutants were intercrossed, and the F1 hybrids were examined to see if they wiggled. The results were as follows, where a plus sign means that the F1 hybrid was wild type (gliding) and “w” means that the hybrid wiggled.
      1   2   3   4   5   6   7   8   9   10  11  12
1     w   +   +   +   w   +   +   +   +   +   +   +
2         w   +   +   +   w   +   w   +   w   +   +
3             w   w   +   +   +   +   +   +   +   +
4                 w   +   +   +   +   +   +   +   +
5                     w   +   +   +   +   +   +   +
6                         w   +   w   +   w   +   +
7                             w   +   +   +   w   w
8                                 w   +   w   +   +
9                                     w   +   +   +
10                                        w   +   +
11                                            w   w
12                                                w

a. Explain what this experiment was designed to test. b. Use this reasoning to assign genotypes to all 12 mutants. c. Explain why the phenotype of the F1 hybrids between mutants 1 and 2 differed from that of the hybrids between mutants 1 and 5.
78. A geneticist working on a haploid fungus makes a cross between two slow-growing mutants called mossy and spider (referring to the abnormal appearance of the colonies). Tetrads from the cross are of three types (A, B, C), but two of them contain spores that do not germinate.

Spore  A               B               C
1      wild type       wild type       spider
2      wild type       spider          spider
3      no germination  mossy           mossy
4      no germination  no germination  mossy

Devise a model to explain these genetic results, and propose a molecular basis for your model.
79. In the nematode C. elegans, some worms have blistered cuticles due to a recessive mutation in one of the bli genes. Someone studying a suppressor mutation that suppressed bli-3 mutations wanted to know if it would also suppress mutations in bli-4. They had a strain that was homozygous for this recessive suppressor mutation, and its phenotype was wild type. a. How would they determine whether this recessive suppressor mutation would suppress mutations in bli-4? In other words, what is the genotype of the worms required to answer the question? b. What cross(es) would they do to make these worms? c. What results would they expect in the F2 if (1) it did act as a suppressor of bli-4? (2) it did not act as a suppressor of bli-4?
Chapter 7
DNA: Structure and Replication
Learning Outcomes
After completing this chapter, you will be able to
• Assess the types of evidence (historical and modern) that can be used to show that DNA is the genetic material.
• Evaluate the data used to build the double-helix model of DNA.
• Explain why the double-helical structure suggests a particular mechanism for DNA replication.
• Illustrate the features of DNA replication that contribute to its speed and accuracy.
• Explain why chromosome ends require special replication.
• Predict the possible consequences to human health if end replication is defective.
Computer model of DNA. [ Kenneth Eward/Science Source/Getty Images.]
Outline
7.1 DNA: the genetic material
7.2 DNA structure
7.3 Semiconservative replication
7.4 Overview of DNA replication
7.5 The replisome: a remarkable replication machine
7.6 Replication in eukaryotic organisms
7.7 Telomeres and telomerase: replication termination
Figure 7-1 A sculpture of DNA. [Neil Grant/Alamy.]
James Watson (an American microbial geneticist) and Francis Crick (an English physicist) solved the structure of DNA in 1953. Their model of the structure of DNA was revolutionary. It proposed a definition for the gene in chemical terms and, in doing so, paved the way for an understanding of gene action and heredity at the molecular level. A measure of the importance of their discovery is that the double-helical structure has become a cultural icon that is seen more and more frequently in paintings, in sculptures, and even in playgrounds (Figure 7-1).

The story begins in the first half of the twentieth century, when the results of several experiments led scientists to conclude that DNA is the genetic material, not some other biological molecule such as a carbohydrate, protein, or lipid. DNA is a simple molecule made up of only four different building blocks (the four nucleotide bases). It was thus necessary to understand how this very simple molecule could be the blueprint for the incredible diversity of organisms on Earth.

The model of the double helix proposed by Watson and Crick was built upon the results of scientists before them. They relied on earlier discoveries of the chemical composition of DNA and the ratios of its bases. In addition, X-ray diffraction pictures of DNA revealed to the trained eye that DNA is a helix of precise dimensions. Watson and Crick concluded that DNA is a double helix composed of two strands of linked nucleotide bases that wind around each other.

The proposed structure of the hereditary material immediately suggested how it could serve as a blueprint and how this blueprint could be passed down through the generations. First, the information for making an organism is encoded in the sequence of the nucleotide bases composing the two DNA strands of the helix. Second, because of the rules of base complementarity discovered by Watson and Crick, the sequence of one strand dictates the sequence of the other strand.
In this way, the genetic information in the DNA sequence can be passed down from one generation to the next by having each of the separated strands of DNA serve as a template for producing new copies of the molecule. In this chapter, we focus on DNA, its structure, and the production of DNA copies in a process called replication. Precisely how DNA is replicated is still an active area of research more than 50 years after the discovery of the double helix. Our current understanding of the mechanism of replication gives a central role to a protein machine, called the replisome. This complex of proteins coordinates the numerous reactions that are necessary for the rapid and accurate replication of DNA.
7.1 DNA: The Genetic Material
Before we see how Watson and Crick solved the structure of DNA, let’s review what was known about genes and DNA at the time that they began their historic collaboration:
1. Genes (the hereditary “factors” described by Mendel) were known to be associated with specific traits, but their physical nature was not understood. Similarly, mutations were known to alter gene function, but the precise chemical nature of a mutation was not understood.
2. The one-gene–one-polypeptide hypothesis (described in Chapter 6) postulated that genes determine the structure of proteins and other polypeptides.
3. Genes were known to be carried on chromosomes.
4. The chromosomes were found to consist of DNA and protein.
5. The results of a series of experiments beginning in the 1920s revealed that DNA is the genetic material. These experiments, described next, showed that
bacterial cells that express one phenotype can be transformed into cells that express a different phenotype and that the transforming agent is DNA.
Discovery of transformation
Frederick Griffith made a puzzling observation in the course of experiments performed in 1928 on the bacterium Streptococcus pneumoniae. This bacterium, which causes pneumonia in humans, is normally lethal in mice. However, some strains of this bacterial species have evolved to be less virulent (less able to cause disease or death). Griffith’s experiments are summarized in Figure 7-2. In these experiments, Griffith used two strains that are distinguishable by the appearance of their colonies when grown in laboratory cultures. One strain was a normal virulent type deadly to most laboratory animals. The cells of this strain are enclosed in a polysaccharide capsule, giving colonies a smooth appearance; hence, this strain is identified as S. Griffith’s other strain was a mutant nonvirulent type that grows in mice but is not lethal. In this strain, the polysaccharide coat is absent, giving colonies a rough appearance; this strain is called R.

Griffith killed some virulent cells by boiling them. He then injected the heat-killed cells into mice. The mice survived, showing that the carcasses of the cells do not cause death. However, mice injected with a mixture of heat-killed virulent cells and live nonvirulent cells did die. Furthermore, live cells could be recovered from the dead mice; these cells gave smooth colonies and were virulent on
Figure 7-2 Transforming R cells into S cells. The presence of heat-killed S cells transforms live R cells into live S cells. (a) Mouse dies after injection with the virulent S strain. (b) Mouse survives after injection with the R strain. (c) Mouse survives after injection with heat-killed S strain. (d) Mouse dies after injection with a mixture of heat-killed S strain and live R strain. Live S cells were isolated from the dead mouse, indicating that the heat-killed S strain somehow transforms the R strain into the virulent S strain.

Figure 7-3 DNA is the transforming agent. DNA is the agent transforming the R strain into virulence. If the DNA in an extract of heat-killed S-strain cells is destroyed, then mice survive when injected with a mixture of the heat-killed cells and the live nonvirulent R-strain cells.
subsequent injection. Somehow, the cell debris of the boiled S cells had converted the live R cells into live S cells. The process, already discussed in Chapter 5, is called transformation.

The next step was to determine which chemical component of the dead donor cells had caused this transformation. This substance had changed the genotype of the recipient strain and therefore might be a candidate for the hereditary material. This problem was solved by experiments conducted in 1944 by Oswald Avery and two colleagues, Colin MacLeod and Maclyn McCarty (Figure 7-3). Their approach to the problem was to chemically destroy all the major categories of chemicals in an extract of dead cells one at a time and find out if the extract had lost the ability to transform. The virulent cells had a smooth polysaccharide coat, whereas the nonvirulent cells did not; hence, polysaccharides were an obvious candidate for the transforming agent. However, when polysaccharides were destroyed, the mixture could still transform. Proteins, fats, and ribonucleic acids (RNAs) were all similarly shown not to be the transforming agent. The mixture lost its transforming ability only when the donor mixture was treated with the enzyme deoxyribonuclease (DNase), which breaks up DNA. These results strongly implicate DNA as the genetic material. It is now known that fragments of the transforming DNA that confer virulence enter the bacterial chromosome and replace their counterparts that confer nonvirulence.

Key Concept The demonstration that DNA is the transforming principle was the first demonstration that genes (the hereditary material) are composed of DNA.
Hershey–Chase experiment
The experiments conducted by Avery and his colleagues were definitive, but many scientists were very reluctant to accept DNA (rather than proteins) as the genetic material. After all, how could such a low-complexity molecule as DNA encode the diversity of life on this planet? Alfred Hershey and Martha Chase provided additional evidence in 1952 in an experiment that made use of phage T2, a virus that infects bacteria. They reasoned that the infecting phage must inject into the bacterium the specific information that dictates the reproduction of new viral particles. If they could find out what material the phage was injecting into the bacterial host, they would have determined the genetic material of phages.

The phage is relatively simple in molecular constitution. The T2 structure is similar to T4 shown in Figures 5-22 to 5-24. Most of its structure is protein, with DNA contained inside the protein sheath of its “head.” Hershey and Chase decided to give the DNA and protein distinct labels by using radioisotopes so that they could track the two materials during infection. Phosphorus is not found in the amino acid building blocks of proteins but is an integral part of DNA; conversely, sulfur is present in proteins but never in DNA. Hershey and Chase incorporated the radioisotope of phosphorus (32P) into phage DNA and that of sulfur (35S) into the proteins of a separate phage culture. As shown in Figure 7-4, they then infected two E. coli cultures with many virus particles per cell: one E. coli culture received phage labeled with 32P, and the other received phage labeled with 35S. After allowing sufficient time for infection to take place, they sheared the empty phage carcasses (called ghosts) off the bacterial cells by agitation in a kitchen blender. They separated the bacterial cells from the phage ghosts in a
Figure 7-4 The phage genetic material is DNA. The Hershey–Chase experiment demonstrated that the genetic material of phages is DNA, not protein. The experiment uses two sets of T2 bacteriophage. In one set, the protein coat is labeled with radioactive sulfur (35S), not found in DNA. In the other set, the DNA is labeled with radioactive phosphorus (32P), not found in amino acids. After the blender and centrifuge steps, most of the 35S radioactivity is recovered in the phage ghosts, and most of the 32P radioactivity is recovered in the bacteria. Only the 32P is recovered from the E. coli, indicating that DNA is the agent necessary for the production of new phages.
centrifuge and then measured the radioactivity in the two fractions. When the 32P-labeled phages were used to infect E. coli, most of the radioactivity ended up inside the bacterial cells, indicating that the phage DNA entered the cells. When the 35S-labeled phages were used, most of the radioactive material ended up in the phage ghosts, indicating that the phage protein never entered the bacterial cell. The conclusion is inescapable: DNA is the hereditary material. The phage proteins are mere structural packaging that is discarded after delivering the viral DNA to the bacterial cell.
7.2 DNA Structure
Even before the structure of DNA was elucidated, genetic studies indicated that the hereditary material must have three key properties:
1. Because essentially every cell in the body of an organism has the same genetic makeup, faithful replication of the genetic material at every cell division is crucial. Thus, the structural features of DNA must allow faithful replication. These structural features will be considered later in this chapter.
2. Because it must encode the constellation of proteins expressed by an organism, the genetic material must have informational content. How the information coded in DNA is deciphered to produce proteins will be the subject of Chapters 8 and 9.
3. Because hereditary changes, called mutations, provide the raw material for evolutionary selection, the genetic material must be able to change on rare occasion. Nevertheless, the structure of DNA must be stable so that organisms can rely on its encoded information. We will consider the mechanisms of mutation in Chapter 16.
DNA structure before Watson and Crick
Consider the discovery of the double-helical structure of DNA by Watson and Crick as the solution to a complicated three-dimensional puzzle. To solve this puzzle, Watson and Crick used a process called “model building” in which they assembled the results of earlier and ongoing experiments (the puzzle pieces) to form the three-dimensional puzzle (the double-helix model). To understand how they did so, we first need to know what pieces of the puzzle were available to Watson and Crick in 1953.

The building blocks of DNA
The first piece of the puzzle was knowledge of the basic building blocks of DNA. As a chemical, DNA is quite simple. It contains three types of chemical components: (1) phosphate, (2) a sugar called deoxyribose, and (3) four nitrogenous bases: adenine, guanine, cytosine, and thymine. The sugar in DNA is called “deoxyribose” because it has only a hydrogen atom (H) at the 2′-carbon atom, unlike ribose (a component of RNA), which has a hydroxyl (OH) group at that position. Two of the bases, adenine and guanine, have a double-ring structure characteristic of a type of chemical called a purine. The other two bases, cytosine and thymine, have a single-ring structure of a type called a pyrimidine. The carbon atoms in the bases are assigned numbers for ease of reference. The carbon atoms in the sugar group also are assigned numbers; in this case, the number is followed by a prime (1′, 2′, and so forth).

The chemical components of DNA are arranged into groups called nucleotides, each composed of a phosphate group, a deoxyribose sugar molecule, and any one of the four bases (Figure 7-5). It is convenient to refer to each nucleotide
Figure 7-5 Structure of the four DNA nucleotides. These nucleotides, two with purine bases and two with pyrimidine bases, are the fundamental building blocks of DNA. Purine nucleotides: deoxyadenosine 5′-monophosphate (dAMP) and deoxyguanosine 5′-monophosphate (dGMP). Pyrimidine nucleotides: deoxycytidine 5′-monophosphate (dCMP) and deoxythymidine 5′-monophosphate (dTMP). Each consists of a phosphate, a deoxyribose sugar, and a nitrogenous base. The sugar is called deoxyribose because it is a variation of a common sugar, ribose, that has one more oxygen atom (position indicated by the red arrow).
by the first letter of the name of its base: A, G, C, or T. The nucleotide with the adenine base is called deoxyadenosine 5′-monophosphate, where the 5′ refers to the position of the carbon atom in the sugar ring to which the single (mono) phosphate group is attached.

Chargaff’s rules of base composition
The second piece of the puzzle used by Watson and Crick came from work done several years earlier by Erwin Chargaff. Studying a large selection of DNAs from different organisms (Table 7-1), Chargaff established certain empirical rules about the amounts of each type of nucleotide found in DNA:
1. The total amount of pyrimidine nucleotides (T + C) always equals the total amount of purine nucleotides (A + G).
2. The amount of T always equals the amount of A, and the amount of C always equals the amount of G.
But the amount of A + T is not necessarily equal to the amount of G + C, as can be seen in the right-hand column of Table 7-1. This ratio varies among different organisms but is virtually the same in different tissues of the same organism.
Table 7-1 Molar Properties of Bases* in DNAs from Various Sources

Organism                           Tissue       Adenine  Thymine  Guanine  Cytosine  (A + T)/(G + C)
Escherichia coli (K12)             -            26.0     23.9     24.9     25.2      1.00
Diplococcus pneumoniae             -            29.8     31.6     20.5     18.0      1.59
Mycobacterium tuberculosis         -            15.1     14.6     34.9     35.4      0.42
Yeast                              -            31.3     32.9     18.7     17.1      1.79
Paracentrotus lividus (sea urchin) Sperm        32.8     32.1     17.7     18.4      1.85
Herring                            Sperm        27.8     27.5     22.2     22.6      1.23
Rat                                Bone marrow  28.6     28.4     21.4     21.5      1.33
Human                              Thymus       30.9     29.4     19.9     19.8      1.52
Human                              Liver        30.3     30.3     19.5     19.9      1.53
Human                              Sperm        30.7     31.2     19.3     18.8      1.62

*Defined as moles of nitrogenous constituents per 100 g-atoms phosphate in hydrolysate. Source: Data from E. Chargaff and J. Davidson, eds., The Nucleic Acids. Academic Press, 1955.
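Chargaff’s rules can be checked directly against the data. The short Python sketch below is an illustration only, not part of Chargaff’s own analysis: it takes four rows of Table 7-1 and computes the purine-to-pyrimidine ratio, A/T, G/C, and (A + T)/(G + C) for each. The names TABLE_7_1 and chargaff_ratios are chosen for this example.

```python
# Base composition for four rows of Table 7-1, in the order (A, T, G, C).
# The dictionary name and row labels are illustrative choices.
TABLE_7_1 = {
    "Escherichia coli (K12)": (26.0, 23.9, 24.9, 25.2),
    "Mycobacterium tuberculosis": (15.1, 14.6, 34.9, 35.4),
    "Yeast": (31.3, 32.9, 18.7, 17.1),
    "Human (liver)": (30.3, 30.3, 19.5, 19.9),
}


def chargaff_ratios(a, t, g, c):
    """Return (purine/pyrimidine, A/T, G/C, (A+T)/(G+C)) for one DNA sample."""
    return ((a + g) / (t + c), a / t, g / c, (a + t) / (g + c))


for organism, bases in TABLE_7_1.items():
    pur_pyr, a_t, g_c, at_gc = chargaff_ratios(*bases)
    print(f"{organism:28s} pur/pyr={pur_pyr:.2f}  A/T={a_t:.2f}  "
          f"G/C={g_c:.2f}  (A+T)/(G+C)={at_gc:.2f}")
```

In each row the first three ratios come out close to 1, within the scatter expected of measured values, whereas the last ratio differs widely from one organism to another.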
X-ray diffraction analysis of DNA
The third and most controversial piece of the puzzle came from X-ray diffraction data on DNA structure that were collected by Rosalind Franklin when she was in the laboratory of Maurice Wilkins (Figure 7-6). In such experiments, X rays are fired at DNA fibers, and the scatter of the rays from the fibers is observed by catching the rays on photographic film, on which the X rays produce spots. The angle of scatter represented by each spot on the film gives information about the position of an atom or certain groups of atoms in the DNA molecule. This procedure is not simple to carry out (or to explain), and the interpretation of the spot patterns requires complex mathematical treatment that is beyond the scope of this text. The available data suggested that DNA is long and skinny and that it has two similar parts that are parallel to each other and run along the length of the molecule. The X-ray data showed the molecule to be helical (spiral-like). Unknown to Rosalind Franklin, her best X-ray picture was shown to Watson and Crick by Maurice Wilkins, and it was this crucial piece of the puzzle
Figure 7-6 Rosalind Franklin’s critical experimental result. Rosalind Franklin (left) and her X-ray diffraction pattern of DNA (right). [(Left) Science Source; (right) Rosalind Franklin/Science Source.]
that allowed them to deduce the three-dimensional structure that could account for the X-ray spot patterns.
Watson and Crick’s DNA model
The double helix
A 1953 paper by Watson and Crick in the journal Nature began with two sentences that ushered in a new age of biology: “We wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest.”¹ The structure of DNA had been a subject of great debate since the experiments of Avery and co-workers in 1944. As we have seen, the general composition of DNA was known, but how the parts fit together was not known. The structure had to fulfill the main requirements for a hereditary molecule: the ability to store information, the ability to be replicated, and the ability to mutate.

The three-dimensional structure derived by Watson and Crick is composed of two side-by-side chains (“strands”) of nucleotides twisted into the shape of a double helix (Figure 7-7). The two nucleotide strands are held together by hydrogen bonds between the bases of each strand, forming a structure like a spiral staircase (Figure 7-8a). The backbone of each strand is formed of alternating phosphate and deoxyribose sugar units that are connected by phosphodiester linkages (Figure 7-8b).

We can use these linkages to describe how a nucleotide chain is organized. As already mentioned, the carbon atoms of the sugar groups are numbered 1′ through 5′. A phosphodiester linkage connects the 5′-carbon atom of one deoxyribose to the 3′-carbon atom of the adjacent deoxyribose. Thus, each sugar–phosphate backbone is said to have a 5′-to-3′ polarity, or direction, and understanding this polarity is essential in understanding how DNA fulfills its roles. In the double-stranded DNA molecule, the two backbones are in opposite, or antiparallel, orientation (see Figure 7-8b). Each base is attached to the 1′-carbon atom of a deoxyribose sugar in the backbone of each strand and faces inward toward a base on the other strand. Hydrogen bonds between pairs of bases hold the two strands of the DNA molecule together.
The hydrogen bonds are indicated by dashed lines in Figure 7-8b.

Two complementary nucleotide strands paired in an antiparallel manner automatically assume a double-helical conformation (Figure 7-9), mainly through the interaction of the base pairs. The base pairs, which are flat planar structures, stack on top of one another at the center of the double helix (see Figure 7-9a). Stacking adds to the stability of the DNA molecule by excluding water molecules from the spaces between the base pairs. The most stable form that results from base stacking is a double helix with two distinct sizes of grooves running in a spiral: the major groove and the minor groove, which can be seen in both the ribbon and the space-filling models of Figure 7-9a and 7-9b. Most DNA–protein associations are in major grooves. A single strand of nucleotides has no helical structure; the helical shape of DNA depends entirely on the pairing and stacking of the bases in the antiparallel strands. DNA is a right-handed helix; in other words, it has the same structure as that of a screw that would be screwed into place by using a clockwise turning motion.

The double helix accounted nicely for the X-ray data and successfully accounted for Chargaff’s data. By studying models that they made of the structure, Watson and Crick realized that the observed radius of the double helix

¹ J. Watson and F. Crick, Nature 171:737, 1953.
Figure 7-7 James Watson and Francis Crick with their DNA model. [A. Barrington Brown/Science Source.]

Figure 7-8 The structure of DNA. (a) A simplified model showing the helical structure of DNA. The sticks represent base pairs, and the ribbons represent the sugar–phosphate backbones of the two antiparallel chains. (b) An accurate chemical diagram of the DNA double helix, unrolled to show the sugar–phosphate backbones (blue) and base-pair rungs (purple, orange). The backbones run in opposite directions; the 5′ and 3′ ends are named for the orientation of the 5′ and 3′ carbon atoms of the sugar rings. Each base pair has one purine base, adenine (A) or guanine (G), and one pyrimidine base, thymine (T) or cytosine (C), connected by hydrogen bonds (red dashed lines).
(known from the X-ray data) would be explained if a purine base always pairs (by hydrogen bonding) with a pyrimidine base (Figure 7-10). Such pairing would account for the (A + G) = (T + C) regularity observed by Chargaff, but it would predict four possible pairings: T…A, T…G, C…A, and C…G. Chargaff’s data, however, indicate that T pairs only with A, and C pairs only with G. Watson and Crick concluded that each base pair consists of one purine base and one pyrimidine base, paired according to the following rule: G pairs with C, and A pairs with T. Note that the G–C pair has three hydrogen bonds, whereas the A–T pair has only two (see Figure 7-8b). We would predict that DNA containing many G–C pairs
Figure 7-9 Two representations of the DNA double helix. The ribbon diagram (a) highlights the stacking of the base pairs, whereas the space-filling model (b) shows the major and minor grooves.
would be more stable than DNA containing many A–T pairs. In fact, this prediction is confirmed. Heat causes the two strands of the DNA double helix to separate (a process called DNA melting or DNA denaturation); DNAs with higher G + C content can be shown to require higher temperatures to melt because of the greater attraction of the G–C pairing.

Key Concept DNA is a double helix composed of two nucleotide chains held together by complementary pairing of A with T and G with C.

Figure 7-10 Base pairing in DNA. The pairing of purines with pyrimidines accounts exactly for the diameter of the DNA double helix determined from X-ray data. That diameter is indicated by the vertical dashed lines. Pyrimidine + pyrimidine: DNA too thin. Purine + purine: DNA too thick. Purine + pyrimidine: thickness compatible with X-ray data.

Figure 7-11 Semiconservative DNA replication. The semiconservative model of DNA replication proposed by Watson and Crick is based on the hydrogen-bonded specificity of the base pairs. Parental strands, shown in blue, serve as templates for polymerization. The newly polymerized strands, shown in gold, have base sequences that are complementary to their respective templates. The two strands of the parental double helix unwind, and each specifies a new daughter strand by base-pairing rules.

Watson and Crick’s discovery of the structure of DNA is considered by some to be the most important biological discovery of the twentieth century and led to their being awarded the Nobel Prize with Maurice Wilkins in 1962 (Rosalind Franklin died of cancer in 1958 and the prize is not awarded posthumously). The reason that this discovery is considered so important is that the double helix model, in addition to being consistent with earlier data about DNA structure, fulfilled the three requirements for a hereditary substance:
1. The double-helical structure suggested how the genetic material might determine the structure of proteins. Perhaps the sequence of nucleotide pairs in DNA dictates the sequence of amino acids in the protein specified by that gene. In other words, some sort of genetic code may write information in DNA as a sequence of nucleotides and then translate it into a different language of amino acid sequences in protein. Just how it is done is the subject of Chapter 9.
2. If the base sequence of DNA specifies the amino acid sequence, then mutation is possible by the substitution of one type of base for another at one or more positions. Mutations will be discussed in Chapter 16.
3. As Watson and Crick stated in the concluding words of their 1953 Nature paper that reported the double-helical structure of DNA: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.”² To geneticists at the time, the meaning of this statement was clear, as we see in the next section.
7.3 Semiconservative Replication
The copying mechanism to which Watson and Crick referred is called semiconservative replication and is diagrammed in Figure 7-11. The sugar–phosphate backbones are represented by thick ribbons, and the sequence of base pairs is random. Let’s imagine that the double helix is analogous to a zipper that unzips, starting at one end. We can see that, if this zipper analogy is valid, the unwinding of the two strands will expose single bases on each strand. Each exposed base has the potential to pair with free nucleotides in solution. Because the DNA structure imposes strict pairing requirements, each exposed base will pair only with its complementary base, A with T and G with C. Thus, each of the two single strands will act as a template, or mold, to direct the assembly of complementary bases to re-form a double helix identical with the original. The newly added nucleotides are assumed to come from a pool of free nucleotides that must be present in the cell.

If this model is correct, then each daughter molecule should contain one parental nucleotide chain and one newly synthesized nucleotide chain. However, a little thought shows that there are at least three different ways in which a parental DNA molecule might be related to the daughter molecules. These hypothetical modes of replication are called semiconservative (the Watson–Crick model), conservative, and dispersive (Figure 7-12). In semiconservative replication, the double helix of each daughter DNA molecule contains one strand from the original DNA molecule and one newly synthesized strand. In conservative replication, the parent DNA molecule is conserved, and a single daughter double helix is produced consisting of two newly synthesized strands. In dispersive replication, daughter molecules consist of strands each containing segments of both parental DNA and newly synthesized DNA.

² J. Watson and F. Crick, Nature 171:737, 1953.
[Figure 7-12: Three alternative models for DNA replication. Panels: Semiconservative replication; Conservative replication; Dispersive replication.]
Figure 7-12 Of three alternative models for DNA replication, the Watson–Crick model of DNA structure would produce the first (semiconservative) model. Gold lines represent the newly synthesized strands.
Meselson–Stahl experiment
The first problem in understanding DNA replication was to figure out whether the mechanism of replication was semiconservative, conservative, or dispersive. In 1958, two young scientists, Matthew Meselson and Franklin Stahl, set out to discover which of these possibilities correctly described DNA replication. Their idea was to allow parental DNA molecules containing nucleotides of one density to replicate in medium containing nucleotides of different density. If DNA replicated semiconservatively, the daughter molecules should be half old and half new and therefore of intermediate density. To carry out their experiment, Meselson and Stahl grew E. coli cells in a medium containing the heavy isotope of nitrogen (15N) rather than the normal light (14N) form. This isotope was inserted into the nitrogen bases, which then were incorporated into newly synthesized DNA strands. After many cell divisions in 15N, the DNA of the cells was well labeled with the heavy isotope. The cells were then removed from the 15N medium and put into a 14N medium; after one and two cell divisions, samples were taken and the DNA was isolated from each sample. Meselson and Stahl were able to distinguish DNA of different densities because the molecules can be separated from one another by a procedure called cesium chloride gradient centrifugation. If cesium chloride (CsCl) is spun in a centrifuge at tremendously high speeds (50,000 rpm) for many hours, the cesium and chloride ions tend to be pushed by centrifugal force toward the bottom of the tube. Ultimately, a gradient of ions is established in the tube, with the highest ion concentration, or density, at the bottom.
DNA centrifuged with the cesium chloride forms a band at the position in the gradient that matches its own density (Figure 7-13). DNA of different densities will form bands at different places. Cells initially grown in the heavy isotope 15N showed DNA of high density. This DNA is shown in blue at the left-hand side of Figure 7-13a. After growing these cells in the light isotope 14N for one generation, the researchers found that the DNA was of intermediate density, shown half blue (15N) and half gold (14N) in the middle of Figure 7-13a. Note that Meselson and Stahl continued the experiment through two E. coli generations so that they could distinguish semiconservative replication from dispersive. After two generations, both intermediate- and low-density DNA were observed (right-hand side of Figure 7-13a), precisely as predicted by Watson and Crick’s semiconservative replication model.
Key Concept DNA is replicated by the unwinding of the two strands of the double helix and the building up of a new complementary strand on each of the separated strands of the original double helix.
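The band patterns that each model predicts can be worked out mechanically. The following Python sketch is an illustration only and is not part of the original experiment: it records each strand as a fraction of heavy (15N) nitrogen and applies one simple rule per model after the shift to 14N medium. The names `replicate` and `band_pattern`, and the treatment of dispersive replication as an even split of old material between the daughters, are assumptions made for this example.

```python
from collections import Counter
from fractions import Fraction

LIGHT = Fraction(0)  # strand built entirely from 14N nucleotides
HEAVY = Fraction(1)  # strand built entirely from 15N nucleotides


def replicate(duplex, model):
    """Return the two daughter duplexes made from one duplex in 14N medium."""
    a, b = duplex
    if model == "semiconservative":
        # Each old strand is kept intact and paired with a new light strand.
        return [(a, LIGHT), (b, LIGHT)]
    if model == "conservative":
        # The parental duplex survives; the daughter duplex is entirely new.
        return [(a, b), (LIGHT, LIGHT)]
    if model == "dispersive":
        # Assumed rule: old material is shared evenly between the daughters.
        return [(a / 2, b / 2), (a / 2, b / 2)]
    raise ValueError(f"unknown model: {model}")


def band_pattern(model, generations):
    """Map duplex density (fraction 15N) to the share of molecules in that band."""
    population = [(HEAVY, HEAVY)]  # fully labeled parental DNA
    for _ in range(generations):
        population = [d for duplex in population for d in replicate(duplex, model)]
    counts = Counter((a + b) / 2 for a, b in population)
    return {density: Fraction(n, len(population))
            for density, n in sorted(counts.items())}


if __name__ == "__main__":
    for model in ("semiconservative", "conservative", "dispersive"):
        for generation in (1, 2):
            print(model, generation, band_pattern(model, generation))
```

Under these assumptions, the semiconservative rule gives a single intermediate band after one generation and equal intermediate and light bands after two. The conservative rule never produces an intermediate band, and the dispersive rule gives a single band that grows lighter each generation, which is why the second generation is needed to tell the semiconservative and dispersive models apart.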
[Figure 7-13: DNA is copied by semiconservative replication. Panels: (a) Predictions of semiconservative model; (b) Predictions of conservative model; (c) Predictions of dispersive model. Each panel shows the parental, 1st-generation, and 2nd-generation DNA, with bands labeled 14N/14N (light) DNA, 14N/15N (hybrid) DNA, and 15N/15N (heavy) DNA.]
Figure 7-13 The Meselson–Stahl experiment demonstrates that DNA is copied by semiconservative replication. DNA centrifuged in a cesium chloride (CsCl) gradient will form bands according to its density. (a) When the cells grown in 15N are transferred to a 14N medium, the first generation produces a single intermediate DNA band and the second generation produces two bands: one intermediate and one light. This result matches the predictions of the semiconservative model of DNA replication. (b and c) The results predicted for conservative and dispersive replication, shown here, were not found.
The replication fork
Another prediction of the Watson–Crick model of DNA replication is that a replication zipper, or fork, will be found in the DNA molecule during replication. This fork is the location at which the double helix is unwound to produce the two single strands that serve as templates for copying. In 1963, John Cairns tested this prediction by allowing replicating DNA in bacterial cells to incorporate tritiated thymidine ([3H]thymidine), the thymine nucleotide labeled with a radioactive hydrogen isotope called tritium. Theoretically, each newly synthesized daughter molecule should then contain one radioactive (“hot”) strand (with 3H) and another nonradioactive (“cold”) strand. After varying intervals and varying numbers of replication cycles in a “hot” medium, Cairns carefully lysed the bacteria and allowed the cell contents to settle onto grids designed for electron microscopy. Finally, Cairns covered the grid with photographic emulsion and exposed it in the dark for 2 months. This procedure, called autoradiography, allowed Cairns to develop a picture of the location of 3H in the cell material. As 3H decays, it emits a beta particle (an energetic electron). The photographic emulsion detects a chemical reaction that takes place wherever a beta particle strikes the emulsion. The emulsion can then be developed like a photographic print so that the emission track of the beta particle appears as a black spot or grain. After one replication cycle in [3H]thymidine, a ring of dots appeared in the autoradiograph. Cairns interpreted this ring as a newly formed radioactive strand in a circular daughter DNA molecule, as shown in Figure 7-14a. It is thus apparent that the bacterial chromosome is circular, a fact that also emerged from genetic analysis described earlier (see Chapter 5). In the second replication cycle, the forks predicted by the model were indeed seen. Furthermore, the density of grains in the three segments was such that the interpretation shown in Figure 7-14b could be made: the thick curve of dots cutting through the interior of the circle of DNA would be the newly synthesized daughter strand, this time consisting of two radioactive strands. Cairns saw all sizes of these moon-shaped autoradiographic patterns, corresponding to the progressive movement of the replication forks around the ring.
[Figure 7-14: A replicating bacterial chromosome. Panels: (a) Chromosome after one round of replication; (b) Chromosome during second round of replication, with the replication forks labeled. Each panel pairs an autoradiograph with its interpretation.]
Figure 7-14 A replicating bacterial chromosome has two replication forks. (a) Left: Autoradiograph of a bacterial chromosome after one replication in tritiated thymidine. According to the semiconservative model of replication, one of the two strands should be radioactive. Right: Interpretation of the autoradiograph. The gold helix represents the tritiated strand. (b) Left: Autoradiograph of a bacterial chromosome in the second round of replication in tritiated (3H) thymidine. In this molecule, the newly replicated double helix that crosses the circle could consist of two radioactive strands (if the parental strand were the radioactive one). Right: The double thickness of the radioactive tracing on the autoradiogram appears to confirm the interpretation shown here.
DNA polymerases
A problem confronted by scientists was to understand just how the bases are brought to the double-helix template. Although scientists suspected that enzymes played a role, that possibility was not proved until 1959, when Arthur Kornberg isolated DNA polymerase from E. coli and demonstrated its enzymatic activity in vitro. This enzyme adds deoxyribonucleotides to the 3′ end of a growing nucleotide chain, using for its template a single strand of DNA that has been exposed by localized unwinding of the double helix (Figure 7-15). The substrates for DNA polymerase are the triphosphate forms of the deoxyribonucleotides, dATP, dGTP, dCTP, and dTTP. The addition of each base to the growing polymer is accompanied by the removal of two of the three phosphates in the form of pyrophosphate (PPi). The energy produced by cleaving this high-energy bond and the subsequent hydrolysis of pyrophosphate to two inorganic phosphate molecules helps drive the endergonic process of building a DNA polymer. There are now known to be five DNA polymerases in E. coli. The first enzyme that Kornberg purified is now called DNA polymerase I, or pol I. This enzyme has three activities, which appear to be located in different parts of the molecule: 1. a polymerase activity, which catalyzes chain growth in the 5′-to-3′ direction; 2. a 3′-to-5′ exonuclease activity, which removes mismatched bases; and 3. a 5′-to-3′ exonuclease activity, which degrades single strands of DNA or RNA. We will return to the significance of the two exonuclease activities later in this chapter. Although pol I has a role in DNA replication (see next section), some scientists suspected that it was not responsible for the majority of DNA synthesis because it was too slow (~20 nucleotides/second) and too abundant (~400 molecules/cell) and because it dissociated from the DNA after incorporating only 20 to 50 nucleotides. In 1969, John Cairns and Paula DeLucia settled this matter when they demonstrated that an E. coli strain harboring a mutation in the gene that encodes DNA pol I was still able to grow normally and replicate its DNA. They concluded that another DNA polymerase, now called pol III, catalyzes DNA synthesis at the replication fork.
[Figure 7-15: Reaction catalyzed by DNA polymerase. Chemical diagram of a deoxyribonucleoside triphosphate being added to the 3′ end of the growing strand opposite the DNA template strand, with release of PPi.]
Figure 7-15 DNA polymerase catalyzes the chain-elongation reaction. Energy for the reaction comes from breaking the high-energy phosphate bond of the triphosphate substrate. ANIMATED ART: The nucleotide polymerization process
7.4 Overview of DNA Replication
As DNA pol III moves forward, the double helix is continuously unwinding ahead of the enzyme to expose further lengths of single DNA strands that will act as templates (Figure 7-16). DNA pol III acts at the replication fork, the zone where the double helix is unwinding. However, because DNA polymerase always adds nucleotides at the 3′ growing tip, only one of the two antiparallel strands can serve as a template for replication in the direction of the replication fork. For this strand, synthesis can take place in a smooth continuous manner in the direction of the fork; the new strand synthesized on this template is called the leading strand.
Synthesis on the other template also takes place at 3′ growing tips, but this synthesis is in the “wrong” direction, because, for this strand, the 5′-to-3′ direction of synthesis is away from the replication fork (see Figure 7-16). As we will see, the nature of the replication machinery requires that synthesis of both strands take place in the region of the replication fork. Therefore, synthesis moving away from the growing fork cannot go on for long. It must be in short segments: polymerase synthesizes a segment, then moves back to the segment’s 5′ end, where the growing fork has exposed new template, and begins the process again. These short (1000–2000 nucleotides) stretches of newly synthesized DNA are called Okazaki fragments.
[Figure 7-16: DNA replication at the growing fork. Labels: template strands; leading strand; lagging strand; direction of synthesis (shown for each new strand); fork movement.]
Figure 7-16 The replication fork moves in DNA synthesis as the double helix continuously unwinds. Synthesis of the leading strand can proceed smoothly without interruption in the direction of movement of the replication fork, but synthesis of the lagging strand must proceed in the opposite direction, away from the replication fork.
[Figure 7-17: Synthesizing the lagging strand. 1. Primase synthesizes short RNA oligonucleotides (primers) copied from DNA. 2. DNA polymerase III elongates RNA primers with new DNA. 3. DNA polymerase I removes RNA at 5′ end of neighboring fragment and fills gap. 4. DNA ligase connects adjacent fragments. Labels: RNA primer; new DNA; Okazaki fragment; ligation.]
Figure 7-17 Steps in the synthesis of the lagging strand. DNA synthesis proceeds by continuous synthesis on the leading strand and discontinuous synthesis on the lagging strand.
Another problem in DNA replication arises because DNA polymerase can extend a chain but cannot start a chain. Therefore, synthesis of both the leading strand and each Okazaki fragment must be initiated by a primer, or short chain of nucleotides, that binds with the template strand to form a segment of duplex nucleic acid. The primer in DNA replication can be seen in Figure 7-17. The primers are synthesized by a set of proteins called a primosome, of which a central component is an enzyme called primase, a type of RNA polymerase. Primase synthesizes a short (~8–12 nucleotides) stretch of RNA complementary to a specific region of the chromosome. On the leading strand, only one initial primer is needed because, after the initial priming, the growing DNA strand serves as the primer for continuous addition. However, on the lagging strand, every Okazaki fragment needs its own primer. The RNA chain composing the primer is then extended as a DNA chain by DNA pol III. A different DNA polymerase, pol I, removes the RNA primers with its 5′-to-3′ exonuclease activity and fills in the gaps with its 5′-to-3′ polymerase activity. As mentioned earlier, pol I is the enzyme originally purified by Kornberg. Another enzyme, DNA ligase, joins the 3′ end of the gap-filling DNA to the 5′ end of the downstream Okazaki fragment. The new strand thus formed is called the lagging strand. DNA ligase joins broken pieces of DNA by catalyzing the formation of a
phosphodiester bond between the 5′-phosphate end of one fragment and the adjacent 3′-OH group of another fragment. A hallmark of DNA replication is its accuracy, also called fidelity: overall, less than one error per 10^10 nucleotides is inserted. Part of the reason for the accuracy of DNA replication is that both DNA pol I and DNA pol III possess 3′-to-5′ exonuclease activity, which serves a “proofreading” function by excising erroneously inserted mismatched bases. Given the importance of proofreading, let’s take a closer look at how it works. A mismatched base pair occurs when the 5′-to-3′ polymerase activity inserts, for example, an A instead of a G next to a C. The addition of an incorrect base is often due to a process called tautomerization. Each of the bases in DNA can appear in one of several forms, called tautomers, which are isomers that differ in the positions of their atoms and in the bonds between the atoms. The forms are in equilibrium. The keto form of each base is normally present in DNA, but in rare instances a base may shift to the imino or enol form. The imino and enol forms may pair with the wrong base, forming a mispair (Figure 7-18). When a C shifts to its rare imino form, the polymerase adds an A rather than a G (Figure 7-19). Fortunately, such a mismatch is usually detected and removed by the 3′-to-5′ exonuclease activity. Once the mismatched base is removed, the polymerase has another chance to add the correct complementary G base. As you would expect, mutant strains lacking a functional 3′-to-5′ exonuclease have a higher rate of mutation. In addition, because primase lacks a proofreading function, the RNA primer is more likely than DNA to contain errors. The need to maintain the high fidelity of replication is one reason that the RNA primers at the ends of Okazaki fragments must be removed and replaced with DNA. Only after the RNA primer is gone does DNA pol I catalyze DNA synthesis to replace the primer. The subject of DNA repair will be covered in detail in Chapter 16.
[Figure 7-18: Bases may take on rare tautomeric forms prone to mismatch. (a) Normal base pairing: cytosine with guanine; thymine with adenine. (b) Mismatched bases: rare imino form of cytosine (C*) with adenine; cytosine with rare imino form of adenine (A*); rare enol form of thymine (T*) with guanine; thymine with rare enol form of guanine (G*).]
Figure 7-18 Normal base pairing compared with mismatched bases. (a) Pairing between the normal (keto) forms of the bases. (b) Rare tautomeric forms of bases result in mismatches.
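The proofreading cycle described above (insert, check, excise, retry) can be sketched in a few lines of Python. This is a simplified illustration rather than a model of the enzyme: the function name `copy_strand`, the `misinsertions` argument, and the example template are hypothetical, and the sketch assumes that the retry after excision always adds the correct base.

```python
# Watson-Crick partners for the normal (keto) forms of the bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def copy_strand(template, misinsertions, proofreading=True):
    """Copy `template` base by base and return (new_strand, bases_excised).

    `template` is listed in the order the polymerase reads it, so the new
    strand is built in its own 5'-to-3' direction. `misinsertions` maps a
    template position to the wrong base inserted on the first attempt, for
    example an A opposite a C that is briefly in its rare imino form.
    """
    new_strand = []
    excised = 0
    for position, template_base in enumerate(template):
        correct = PAIR[template_base]
        attempt = misinsertions.get(position, correct)
        new_strand.append(attempt)            # 5'-to-3' polymerase activity
        if proofreading and attempt != correct:
            new_strand.pop()                  # 3'-to-5' exonuclease removes it
            excised += 1
            new_strand.append(correct)        # assumed: the retry is correct
    return "".join(new_strand), excised


if __name__ == "__main__":
    template = "ACCTGACGG"                    # hypothetical template
    print(copy_strand(template, {1: "A"}))                      # repaired
    print(copy_strand(template, {1: "A"}, proofreading=False))  # mispair kept
```

With proofreading switched off, the A inserted opposite C stays in the new strand, which corresponds to the higher mutation rate of strains that lack a functional 3′-to-5′ exonuclease.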
[Figure 7-19, first panel: Proofreading removes mispaired bases (DNA polymerase I and III). Extension: incorrect base (A) bonded to imino form of C.]
Key Concept DNA replication takes place at the replication fork, where the double helix is unwinding and the two strands are separating. DNA replication proceeds continuously in the direction of the unwinding replication fork on the leading strand. DNA is synthesized in short segments, in the direction away from the replication fork, on the lagging strand. DNA polymerase requires a primer, or short chain of nucleotides, to be already in place to begin synthesis.
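The bookkeeping behind lagging-strand synthesis can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions: every Okazaki fragment is given the same length (1500 nucleotides, within the 1000–2000 range quoted above), every primer is 10 nucleotides long (within the ~8–12 range), and the function names `okazaki_fragments` and `finish_lagging_strand` are hypothetical.

```python
def okazaki_fragments(exposed_length, fragment_length=1500, primer_length=10):
    """Split the lagging-strand template exposed by the fork into fragments.

    Coordinates count nucleotides from the origin in the direction of fork
    movement. Each fragment is primed at its fork-proximal end, and DNA
    synthesis then runs back toward the previous fragment.
    """
    fragments = []
    start = 0
    while start < exposed_length:
        end = min(start + fragment_length, exposed_length)
        primer_start = max(end - primer_length, start)
        fragments.append({
            "primer": (primer_start, end),  # RNA laid down by primase
            "dna": (start, primer_start),   # DNA added by pol III
        })
        start = end
    return fragments


def finish_lagging_strand(fragments):
    """Count the repair steps needed to turn the fragments into one strand."""
    if not fragments:
        return {"primers_replaced": 0, "ligations": 0, "span": (0, 0)}
    return {
        "primers_replaced": len(fragments),  # pol I swaps each RNA primer for DNA
        "ligations": len(fragments) - 1,     # ligase seals each junction
        "span": (fragments[0]["dna"][0], fragments[-1]["primer"][1]),
    }


if __name__ == "__main__":
    fragments = okazaki_fragments(4500)
    print(len(fragments), "fragments")
    print(finish_lagging_strand(fragments))
```

For the same stretch of template, the leading strand needs a single primer and no ligation, whereas the lagging strand in this sketch needs one primer per fragment and one ligation per junction.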
7.5 The Replisome: A Remarkable Replication Machine
Another hallmark of DNA replication is speed. The time needed for E. coli to replicate its chromosome can be as short as 40 minutes. Therefore, its genome of about 5 million base pairs must be copied at a rate of about 2000 nucleotides per second. From the experiment of Cairns, we know that E. coli uses only two replication forks to copy its entire genome. Thus, each fork must be able to move at a rate of as many as 1000 nucleotides per second. What is remarkable about the entire process of DNA replication is that it does not sacrifice speed for accuracy. How can it maintain both speed and accuracy, given the complexity of the reactions at the replication fork? The answer is that DNA polymerase is part of a large “nucleoprotein” complex that coordinates the activities at the replication fork. This complex, called the replisome, is an example of a “molecular machine.” You will encounter other examples in later chapters. The discovery that most of the major functions of cells (replication, transcription, and translation, for example) are carried out by large multisubunit complexes has changed the way that we think about the cell. To begin to understand why, let’s look at the replisome more closely. Some of the interacting components of the replisome in E. coli are shown in Figure 7-20. At the replication fork, the catalytic core of DNA pol III is part of a much larger complex, called the pol III holoenzyme, which consists of two catalytic cores and many accessory proteins. One of the catalytic cores handles the synthesis of the leading strand while the other handles lagging-strand synthesis. Some of the accessory proteins (not visible in Figure 7-20) form a connection that bridges the two catalytic cores, thus coordinating the synthesis of the leading and lagging strands.
The lagging strand is shown looping around so that the replisome can coordinate the synthesis of both strands and move in the direction of the replication fork. An important accessory protein called the β clamp encircles the DNA like a donut and keeps pol III attached to the DNA molecule. Thus, pol III is transformed from an enzyme that can add only 10 nucleotides before falling off the template (termed a distributive enzyme) into an enzyme that stays at the moving fork and adds tens of thousands of nucleotides (a processive enzyme). In sum, through the action of accessory proteins, the synthesis of both the leading and the lagging strands is rapid and highly coordinated.
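The rates quoted in this section follow from simple arithmetic, shown here as a short Python sketch. The numbers are the round figures given in the text (a genome of about 5 million base pairs, 40 minutes, two forks, and the ~20 nucleotides per second measured for pol I); the function names `fork_rate` and `hours_to_copy` are hypothetical.

```python
def fork_rate(genome_bp, replication_minutes, forks):
    """Nucleotides each fork must copy per second to finish in time."""
    return genome_bp / (replication_minutes * 60) / forks


def hours_to_copy(genome_bp, rate_per_fork, forks):
    """Hours needed to copy the genome at a given per-fork rate."""
    return genome_bp / (rate_per_fork * forks) / 3600


if __name__ == "__main__":
    genome = 5_000_000  # approximate E. coli genome size used in the text
    print(round(fork_rate(genome, 40, 2)), "nucleotides per second per fork")
    print(round(hours_to_copy(genome, 20, 2), 1), "hours at the pol I rate")
```

With these figures, each fork must copy roughly 1000 nucleotides per second, whereas two forks running at the pol I rate would need about 35 hours, which is consistent with the earlier suspicion that pol I was too slow to be the main replicative enzyme.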
[Figure 7-19, remaining panels: Proofreading: incorrect base detected and removed. Extension: correct base G added.]
Figure 7-19 DNA polymerase backs up to remove the A-C mismatch using its 3′-to-5′ exonuclease activity.
[Figure 7-20: Proteins at work at the replication fork. Labels: topoisomerase; helicase; primase; RNA primers; Okazaki fragment; single-strand-binding proteins; β clamp; DNA polymerase III dimer; DNA polymerase I; ligase; leading strand; lagging strand; replication fork movement; the point where the next Okazaki fragment will start.]
Figure 7-20 The replisome and accessory proteins carry out a number of steps at the replication fork. Topoisomerase and helicase unwind and open the double helix in preparation for DNA replication. When the double helix has been unwound, single-strand-binding proteins prevent the double helix from re-forming. The illustration is a representation of the so-called trombone model (named for its resemblance to a trombone owing to the looping of the lagging strand) showing how the two catalytic cores of the replisome are envisioned to interact to coordinate the numerous events of leading- and lagging-strand replication. ANIMATED ART: Leading- and lagging-strand synthesis
Note that primase, the enzyme that synthesizes the RNA primer, is not touching the clamp protein. Therefore, primase acts as a distributive enzyme—it adds only a few ribonucleotides before dissociating from the template. This mode of action makes sense because the primer need be only long enough to form a suitable duplex starting point for DNA pol III.
Unwinding the double helix
When the double helix was proposed in 1953, a major objection was that the replication of such a structure would require the unwinding of the double helix at the replication fork and the breaking of the hydrogen bonds that hold the strands together. How could DNA be unwound so rapidly and, even if it could, wouldn’t that overwind the DNA behind the fork and make it hopelessly tangled? We now know that the replisome contains two classes of proteins that open the helix and prevent overwinding: they are helicases and topoisomerases, respectively. Helicases are enzymes that disrupt the hydrogen bonds that hold the two strands of the double helix together. Like the clamp protein, the helicase fits like a donut around the DNA; from this position, it rapidly unzips the double helix ahead of DNA synthesis. The unwound DNA is stabilized by single-strand-binding (SSB) proteins, which bind to single-stranded DNA and prevent the duplex from re-forming. Circular DNA can be twisted and coiled, much like the extra coils that can be introduced into a twisted rubber band. The unwinding of the replication fork by helicases causes extra twisting at other regions, and supercoils form to release the strain of the extra twisting. Both the twists and the supercoils must be removed to allow replication to continue. This supercoiling can be created or relaxed by enzymes termed topoisomerases, of which an example is DNA gyrase (Figure 7-21). Topoisomerases relax supercoiled DNA by breaking either a single DNA strand or both strands, which allows DNA to rotate into a relaxed molecule. Topoisomerases finish by rejoining the strands of the now relaxed DNA molecule.
[Figure 7-21: DNA gyrase removes extra twists. (a) Unwound parental duplex, overwound region, and replication fork. (b) 1. DNA gyrase cuts DNA strands. 2. DNA rotates to remove the coils. 3. DNA gyrase rejoins the DNA strands.]
Figure 7-21 DNA gyrase, a topoisomerase, removes extra twists during replication. (a) Extra-twisted (positively supercoiled) regions accumulate ahead of the fork as the parental strands separate for replication. (b) A topoisomerase such as DNA gyrase removes these regions, by cutting the DNA strands, allowing them to rotate, and then rejoining the strands.
Key Concept A molecular machine called the replisome carries out DNA synthesis. It includes two DNA polymerase units to handle synthesis on each strand and coordinates the activity of accessory proteins required for priming, unwinding the double helix, and stabilizing the single strands.
[Figure 7-22: Prokaryotic initiation of replication. Labels: AT-rich region; DnaA boxes; multiple DnaA proteins. Steps: origin (oriC) recognition and unwinding; helicase loading; sliding of helicase; recruitment of replisome.]
Figure 7-22 DNA synthesis is initiated at origins of replication in prokaryotes. Proteins bind to the origin (oriC), where they separate the two strands of the double helix and recruit replisome components to the two replication forks.
Assembling the replisome: replication initiation
Assembly of the replisome is an orderly process that begins at precise sites on the chromosome (called origins) and takes place only at certain times in the life of the cell. E. coli replication begins from a fixed origin (a locus called oriC) and then proceeds in both directions (with moving forks at both ends, as shown in Figure 7-14) until the forks merge. Figure 7-22 shows the process of replisome assembly. The first step is the binding of a protein called DnaA to a specific 13-basepair (bp) sequence (called a “DnaA box”) that is repeated five times in oriC. In response to the binding of DnaA, the origin is unwound at a cluster of A and T nucleotides. Recall that AT base pairs are held together with only two hydrogen bonds, whereas GC base pairs are held together with three. Thus, it is easier to separate (melt) the double helix at stretches of DNA that are enriched in A and T bases. After unwinding begins, additional DnaA proteins bind to the newly unwound single-stranded regions. With DnaA coating the origin, two helicases (the DnaB protein) now bind and slide in a 5′-to-3′ direction to begin unzipping the helix at the replication fork. Primase and DNA pol III holoenzyme are now recruited to the replication fork by protein–protein interactions, and DNA synthesis begins. You may be wondering why DnaA is not present in Figure 7-20, showing the replisome machine. The answer is that, although it is necessary for the assembly of the replisome, it is not part of the replication machinery. Rather, its job is to bring the replisome to the correct place in the circular chromosome for the initiation of replication.
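The point that AT-rich stretches are easier to melt can be illustrated by counting hydrogen bonds in a sliding window. The Python sketch below uses only the two-versus-three bond rule stated above; the function name `easiest_window_to_melt` and the example sequence are hypothetical (the sequence is not taken from oriC), and real melting behavior depends on more than the hydrogen-bond count.

```python
# Hydrogen bonds contributed by the base pair at each position:
# A-T pairs have two, G-C pairs have three.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def easiest_window_to_melt(sequence, window):
    """Return (start, hydrogen_bonds) for the window with the fewest bonds.

    `sequence` is one strand of the duplex; ties go to the leftmost window.
    """
    bonds = [H_BONDS[base] for base in sequence.upper()]
    if window <= 0 or window > len(bonds):
        raise ValueError("window must be between 1 and the sequence length")
    current = sum(bonds[:window])
    best_start, best_bonds = 0, current
    for start in range(1, len(bonds) - window + 1):
        # Slide the window one base to the right.
        current += bonds[start + window - 1] - bonds[start - 1]
        if current < best_bonds:
            best_start, best_bonds = start, current
    return best_start, best_bonds


if __name__ == "__main__":
    made_up_origin = "GCGCGGATATTTAAGCCGCG"  # hypothetical sequence
    print(easiest_window_to_melt(made_up_origin, 8))
```

In this made-up sequence, the eight-base window made entirely of A and T has 16 hydrogen bonds, compared with 24 for a window made entirely of G and C.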
7.6 Replication in Eukaryotic Organisms
DNA replication in both prokaryotes and eukaryotes uses a semiconservative mechanism and employs leading- and lagging-strand synthesis. For this reason, it should not come as a surprise that the components of the prokaryotic replisome and those of the eukaryotic replisome are very similar. However, as organisms increase in complexity, the number of replisome components also increases.
Eukaryotic origins of replication
Bacteria such as E. coli usually complete a replication–division cycle in 20 to 40 minutes but, in eukaryotes, the cycle can vary from 1.4 hours in yeast to 24 hours in cultured animal cells and may last from 100 to 200 hours in some cells. Eukaryotes have to solve the problem of coordinating the replication of more than one chromosome. To understand eukaryotic replication origins, we will first turn our attention to the simple eukaryote yeast. Many eukaryotic proteins having roles at replication origins were first identified in yeast because of the ease of genetic analysis in yeast research (see the yeast Model Organism box in Chapter 12). The origins of replication in yeast are very much like oriC in E. coli. The 100- to 200-bp origins have a conserved DNA sequence that includes an AT-rich region that melts when an initiator protein binds to adjacent binding sites. Unlike prokaryotic chromosomes, each eukaryotic chromosome has many replication origins to replicate the much larger eukaryotic genomes quickly. Approximately 400 replication origins are dispersed throughout the 16 chromosomes of yeast, and there are estimated to be thousands of growing forks in the 23 chromosomes of humans. Thus, in eukaryotes, replication proceeds in both directions from multiple points of origin (Figure 7-23). The double helices that are being produced at each origin of replication elongate and eventually join one another. When replication of the two strands is complete, two identical daughter molecules of DNA result.
[Figure 7-23: DNA replication proceeds in two directions. (a) Origin of replication, with growth in both directions. (b) Replication beginning at three origins. Labels: chromosome DNA; sister chromatids; DNA replicas (daughter molecules).]
Key Concept Where and when replication takes place are carefully controlled by the ordered assembly of the replisome at a precise site called the origin. Replication proceeds in both directions from a single origin on the circular prokaryotic chromosome. Replication proceeds in both directions from hundreds or thousands of origins on each of the linear eukaryotic chromosomes.
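The advantage of multiple origins can be estimated with a short Python sketch. It rests on strong simplifying assumptions that are not claims from the text: all origins fire at the same moment, every fork moves at the same constant rate, and the chromosome is linear. The function name `completion_time`, the chromosome length, the origin positions, and the fork rate of 50 nucleotides per second are hypothetical values chosen only for illustration.

```python
def completion_time(chromosome_length, origins, fork_rate):
    """Seconds until bidirectional forks from all origins have copied the
    whole linear chromosome.

    `origins` lists origin positions in base pairs from the left end.
    """
    if not origins:
        raise ValueError("at least one origin is required")
    sites = sorted(origins)
    # The outermost forks must run all the way to the chromosome ends.
    distances = [sites[0], chromosome_length - sites[-1]]
    # Between neighboring origins two forks converge, so each covers half the gap.
    distances += [(right - left) / 2 for left, right in zip(sites, sites[1:])]
    return max(distances) / fork_rate


if __name__ == "__main__":
    length = 1_000_000  # hypothetical chromosome length in base pairs
    rate = 50           # hypothetical fork rate in nucleotides per second
    print(completion_time(length, [500_000], rate))
    print(completion_time(length, [200_000, 500_000, 800_000], rate))
```

In this example, three evenly placed origins finish in 4000 seconds instead of the 10,000 seconds needed with a single central origin, because the longest stretch that any one fork must copy is shorter.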
Figure 7-23 DNA replication proceeds in both directions from an origin of replication. Black arrows indicate the direction of growth of daughter DNA molecules. (a) Starting at the origin, DNA polymerases move outward in both directions. Long yellow arrows represent leading strands and short joined yellow arrows represent lagging strands. (b) How replication proceeds at the chromosome level. Three origins of replication are shown in this example. ANIMATED ART: DNA replication: replication of a chromosome
DNA replication and the yeast cell cycle
DNA synthesis takes place in the S (synthesis) phase of the eukaryotic cell cycle (Figure 7-24). How is the onset of DNA synthesis limited to this single stage? In yeast, the method of control is to link replisome assembly to the cell cycle. Figure 7-25 shows the process. In yeast, three proteins are required to begin assembly of the replisome. The origin recognition complex (ORC) first binds to sequences in yeast origins, much as DnaA protein does in E. coli. The presence of ORC at the origin serves to recruit two other proteins, Cdc6 and Cdt1. Both proteins plus ORC then recruit the helicase, called the MCM complex, and the other components of the replisome. Replication is linked to the cell cycle through the availability of Cdc6 and Cdt1. In yeast, these proteins are synthesized during late mitosis and gap 1 (G1) and are destroyed by proteolysis after synthesis has begun. In this way, the replisome can be assembled only before the S phase. When replication begins, new replisomes cannot form at the origins, because Cdc6 and Cdt1 are degraded during the S phase and are no longer available.
[Figure 7-24: Stages of the cell cycle. M = mitosis; S = DNA synthesis; G1 = gap 1; G2 = gap 2. Labels: original cell; daughter cells.]
Figure 7-24 DNA is replicated during the S phase of the cell cycle.
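The logic that limits each origin to one firing per cell cycle can be written as a small state machine. The Python sketch below is a simplified illustration of the yeast scheme described above: the class name `Origin`, the phase labels, and the helper `times_fired` are hypothetical, ORC is assumed to remain at the origin throughout, and Cdc6 and Cdt1 are treated as simply present in late M and G1 and absent otherwise.

```python
# Phases in which Cdc6 and Cdt1 are available (late mitosis and G1).
LICENSING_PHASES = {"late M", "G1"}


class Origin:
    """One replication origin that can be licensed and then fired."""

    def __init__(self):
        self.licensed = False   # True once the MCM helicase has been recruited
        self.fired = 0          # number of times replication has started here

    def step(self, phase):
        cdc6_cdt1_available = phase in LICENSING_PHASES
        if cdc6_cdt1_available and not self.licensed:
            # ORC plus Cdc6 and Cdt1 recruit the helicase (MCM complex).
            self.licensed = True
        elif phase == "S" and self.licensed:
            # Replication begins; Cdc6 and Cdt1 are degraded, so the origin
            # cannot be licensed again until the next late M or G1.
            self.licensed = False
            self.fired += 1


def times_fired(phases):
    """Run one origin through a list of phases and count its firings."""
    origin = Origin()
    for phase in phases:
        origin.step(phase)
    return origin.fired


if __name__ == "__main__":
    print(times_fired(["G1", "S", "S", "G2", "M"]))
    print(times_fired(["late M", "G1", "S", "G2", "M", "late M", "G1", "S"]))
```

However many S-phase steps follow G1, the origin in this sketch fires once per cycle, because relicensing needs Cdc6 and Cdt1, which are not available again until late mitosis.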
Replication origins in higher eukaryotes
As already stated, most of the approximately 400 origins of replication in yeast are composed of similar DNA sequence motifs (100–200 bp in length) that are recognized by the ORC subunits. Interestingly, although all characterized eukaryotes have similar ORC proteins, the origins of replication in higher eukaryotes are much longer, possibly as long as tens of thousands or hundreds of thousands of nucleotides. Significantly, they have limited sequence similarity. Thus, although the yeast ORC recognizes specific DNA sequences in yeast chromosomes, what the related ORCs of higher eukaryotes recognize is not clear at this time, but the feature recognized is probably not a specific DNA sequence. What this uncertainty means in practical terms is that it is much harder to isolate origins from humans and other higher eukaryotes because scientists cannot use an isolated DNA sequence of one human origin, for example, to perform a computer search of the entire human genome sequence to find other origins.
If the ORCs of higher eukaryotes do not interact with a specific sequence scattered throughout the chromosomes, then how do they find the origins of replication? These ORCs are thought to interact indirectly with origins by associating with other protein complexes that are bound to chromosomes. Such a recognition
Figure 7-25 This example from yeast shows the initiation of DNA synthesis at an origin of replication in a eukaryote. As with prokaryotic initiation (see Figure 7-20), proteins of the origin recognition complex (ORC) bind to the origin, where they separate the two strands of the double helix and recruit replisome components at the two replication forks. Replication is linked to the cell cycle through the availability of two proteins: Cdc6 and Cdt1.
mechanism may have evolved so that higher eukaryotes can regulate the timing of DNA replication during S phase (see Chapter 12 for more about euchromatin and heterochromatin). Gene-rich regions of the chromosome (the euchromatin) have been known for some time to replicate early in S phase, whereas gene-poor regions, including the densely packed heterochromatin, replicate late in S phase. DNA replication could not be timed by region if ORCs were to bind to related sequences scattered throughout the chromosomes. Instead, ORCs may, for example, have a higher affinity for origins in open chromatin and bind to these origins first and then bind to condensed chromatin only after the gene-rich regions have been replicated.
[Figure 7-25 panel labels: Eukaryotic initiation of replication. AT-rich region; 11-bp consensus sequence; origin recognition by ORC; loading of helicase, Cdc6, and Cdt1.]
Key Concept The yeast origin of replication, like the origin in prokaryotes, contains a conserved DNA sequence that is recognized by the ORC and other proteins needed to assemble the replisome. In contrast, the origins of higher eukaryotes have been difficult to isolate and study because they are long and complex and do not contain a conserved DNA sequence.
7.7 Telomeres and Telomerase: Replication Termination
Replication of the linear DNA molecule in a eukaryotic chromosome proceeds in both directions from numerous replication origins, as shown in Figure 7-23. This process replicates most of the chromosomal DNA, but there is an inherent problem in replicating the two ends of linear DNA molecules, the regions called telomeres. Continuous synthesis on the leading strand can proceed right up to the very tip of the template. However, lagging-strand synthesis requires primers ahead of the process; so, when the last primer is removed, sequences are missing at the end of that strand. As a consequence, a single-stranded tip remains in one of the daughter DNA molecules (Figure 7-26). If the daughter chromosome with this DNA molecule were replicated again, the strand missing sequences at the end would become a shortened double-stranded molecule after replication. At each subsequent replication cycle, the telomere would continue to shorten, until eventually essential coding information would be lost.
Cells have evolved a specialized system to prevent this loss. The solution involves the addition of multiple copies of a simple noncoding sequence to the DNA at the chromosome tips. Thus, every time a chromosome is duplicated, it is shortened and only these repeating sequences, which contain no information, are lost. The lost repeats are then added back to the chromosome ends.
The discovery that the ends of chromosomes are made up of sequences repeated in tandem was made in 1978 by Elizabeth Blackburn and Joe Gall, who were studying the DNA in the unusual macronucleus of the single-celled ciliate
Figure 7-26 Top: The replication of each Okazaki fragment on the lagging strand begins with the insertion of a primer. Bottom: The fate of the bottom strand in the replication bubble. When the primer for the last Okazaki fragment of the lagging strand is removed, there is no way to fill the gap by conventional replication. A shortened chromosome would result when the chromosome containing the gap was replicated.
[Figure 7-26 panel labels: The replication problem at chromosome ends. Origin of replication; replication fork; leading and lagging strands; primer; primer degraded, leaving an internal gap and a terminal gap; all internal gaps filled, terminal gap not filled, leaving a 3′ overhang.]
Tetrahymena. Like other ciliates, Tetrahymena has a conventional micronucleus and an unusual macronucleus in which the chromosomes are fragmented into thousands of gene-size pieces with new ends added to each piece. With so many chromosome ends, Tetrahymena has about 40,000 telomeres and, as such, was the perfect choice to determine telomere composition. Blackburn and Gall were able to isolate the fragments containing the genes for ribosomal RNA (fragments called rDNA; see Chapter 9 for more on ribosomes) by using CsCl gradient centrifugation, the technique developed by Meselson and Stahl to isolate newly replicated E. coli DNA (see page 271). The ends of rDNA fragments contained tandem arrays of the sequence TTGGGG. We now know that virtually all eukaryotes have short tandem repeats at their chromosome ends; however, the sequence is not exactly the same. Human chromosomes, for example, end in about 10 to 15 kb of tandem repeats of the sequence TTAGGG.
The question of how these repeats are actually added to chromosome ends after each round of replication was addressed by Elizabeth Blackburn and Carol Greider. They hypothesized that an enzyme catalyzed the process. Working again with extracts from the Tetrahymena macronucleus, they identified an enzyme, which they called telomerase, that adds the short repeats to the 3′ ends of DNA molecules. Interestingly, the telomerase protein carries a small RNA molecule, part of which acts as a template for the synthesis of the telomeric repeat unit. In all vertebrates, including humans, the RNA sequence 3′-AAUCCC-5′ acts as the template for the 5′-TTAGGG-3′ repeat unit by a mechanism shown in Figure 7-27. Briefly, the telomerase RNA first anneals to the 3′ DNA overhang, which is then extended with the use of the telomerase’s two components: the small RNA (as template) and the protein (as polymerase activity).
After the addition of a few nucleotides to the 3′ overhang, the telomerase RNA moves along the DNA so that the 3′ end can be further extended by its polymerase activity. The 3′ end continues to be extended by repeated movement of the telomerase RNA. Primase and
Figure 7-27 Telomerase carries a short RNA molecule (red letters) that acts as a template for the addition of a complementary DNA sequence, which is added to the 3′ overhang (blue letters). To add another repeat, the telomerase translocates to the end of the repeat that it just added. The extended 3′ overhang can then serve as template for conventional DNA replication.
DNA polymerase then use the very long 3′ overhang as a template to fill in the end of the other DNA strand. Working with Elizabeth Blackburn, a third researcher, Jack Szostak, went on to show that telomeres also exist in the less unusual eukaryote yeast. For contributing to the discovery of how telomeres protect chromosomes from shortening, Blackburn, Greider, and Szostak were awarded the 2009 Nobel Prize in Physiology or Medicine.
One notable feature of this reaction is that RNA is serving as the template for the synthesis of DNA. As you saw in Chapter 1 (and will revisit in Chapter 8), DNA normally serves as the template of RNA synthesis in the process called transcription. It is for this reason that the polymerase of telomerase is said to have reverse transcriptase activity. We will revisit reverse transcriptase in Chapters 10 and 15.
In addition to preventing the erosion of genetic material after each round of replication, telomeres preserve chromosomal integrity by associating with proteins to form protective caps. These caps sequester the 3′ single-stranded overhang, which can be as much as 100 nucleotides long (Figure 7-28). Without this protective cap, the double-stranded ends of chromosomes would be mistaken for double-stranded breaks by the cell and dealt with accordingly. As you will see later, in Chapter 16, double-stranded breaks are potentially very dangerous because they can result in chromosomal instability that can lead to cancer and a variety of phenotypes associated with aging. For this reason, when a double-stranded break is detected, the cell responds in a variety of ways, depending, in part, on the cell type and the extent of the damage. For example, the double-stranded break can be fused to another break or the cell can limit the damage to the organism by stopping further cell division (called senescence) or by initiating a cell-death pathway (called apoptosis).
[Figure 7-27 panel labels: Telomere lengthening. (a) Lengthening of the 3′ overhang: telomerase anneals to the 3′ overhang; elongation; translocation; elongation. (b) Replication of complementary strand: a primer is synthesized by primase; DNA polymerase fills in the gap.]
Key Concept Telomeres are specialized structures at the ends of chromosomes that contain tandem repeats of a short DNA sequence that is added to the 3′ end by the enzyme telomerase. Telomeres stabilize chromosomes by preventing the loss of genomic information after each round of DNA replication and by associating with proteins to form a cap that “hides” the chromosome ends from the cell’s DNA-repair machinery.
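The template relationship described for telomerase can be checked with a few lines of code. This is a minimal sketch that treats sequences as plain strings and ignores the annealing and translocation steps; the function names are hypothetical. The only sequence taken from the text is the vertebrate RNA template 3′-AAUCCC-5′.

```python
# RNA template base -> DNA base added opposite it
RNA_TO_DNA = {"A": "T", "U": "A", "C": "G", "G": "C"}


def repeat_from_template(rna_3_to_5):
    """DNA repeat (written 5'->3') copied from an RNA template written 3'->5'.

    Because the two strands are antiparallel, reading the template in the
    3'->5' direction yields the new DNA directly in the 5'->3' direction.
    """
    return "".join(RNA_TO_DNA[base] for base in rna_3_to_5)


def extend_overhang(overhang_5_to_3, rna_3_to_5, n_repeats):
    """Append n_repeats telomeric repeats to the 3' end of an overhang."""
    return overhang_5_to_3 + repeat_from_template(rna_3_to_5) * n_repeats


VERTEBRATE_TEMPLATE = "AAUCCC"   # 3'-AAUCCC-5', as given in the text

print(repeat_from_template(VERTEBRATE_TEMPLATE))
print(extend_overhang("TTAGGG", VERTEBRATE_TEMPLATE, 2))
```

The first call reproduces the 5′-TTAGGG-3′ repeat unit, and the second shows an overhang growing by whole repeats at its 3′ end, which is the strand that telomerase extends.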
[Figure 7-27, final step: The primer is removed and ligase seals the gap.]
Figure 7-28 A “cap” protects the telomere at the end of a chromosome. The 3′ overhang is “hidden” when it displaces a DNA strand in a region where the telomeric repeats are double stranded. The proteins TRF1 and TRF2 bind to the telomeric repeats, and other proteins, including WRN, bind to TRF1 and TRF2, thus forming the protective telomeric cap. [Panel title: The telomeric cap structure. Labels: WRN, TRF1, TRF2.]
Surprisingly, although most germ cells have ample telomerase, somatic cells produce very little or no telomerase. For this reason, the chromosomes of proliferating somatic cells get progressively shorter with each cell division until the cell stops all divisions and enters a senescence phase. This observation led many investigators to suspect that there was a link between telomere shortening and aging. Geneticists studying human diseases that lead to a premature-aging phenotype have recently uncovered evidence that supports such a connection. People with Werner syndrome experience the early onset of many age-related events, including wrinkling of the skin, cataracts, osteoporosis, graying of the hair, and cardiovascular disease (Figure 7-29). Genetic and biochemical studies have found that afflicted people have shorter telomeres than those of normal people owing to a mutation in a gene called WRN, which encodes a protein (a helicase) that associates with proteins that comprise the telomere cap (TRF2, see Figure 7-28). This mutation is hypothesized to disrupt the normal telomere, resulting in chromosomal instability and the premature-aging phenotype. Patients with another premature-aging syndrome called dyskeratosis congenita also have shorter telomeres than those of healthy people of the same age, and they, too, harbor mutations in genes required for telomerase activity.
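The arithmetic behind progressive shortening is simple enough to sketch. In the example below, the starting length of 10,000 bp corresponds to the lower end of the 10 to 15 kb of human telomeric repeats mentioned earlier; the loss per division (100 bp) and the critical length at which division stops (4,000 bp) are assumed values chosen only for illustration, and the function name is hypothetical.

```python
def divisions_until_senescence(telomere_bp, loss_per_division_bp, critical_bp,
                               telomerase_adds_bp=0):
    """Count divisions before the telomere would drop below a critical length.

    Returns None if telomerase restores at least as much as is lost,
    in which case the telomere never shortens.
    """
    net_loss = loss_per_division_bp - telomerase_adds_bp
    if net_loss <= 0:
        return None
    divisions = 0
    while telomere_bp - net_loss >= critical_bp:
        telomere_bp -= net_loss
        divisions += 1
    return divisions


# Somatic cell: no telomerase (all numbers except the 10 kb start are assumptions).
print(divisions_until_senescence(10_000, 100, 4_000))

# Germ cell or cancer cell: telomerase replaces what is lost.
print(divisions_until_senescence(10_000, 100, 4_000, telomerase_adds_bp=100))
```

Under these assumed numbers the somatic cell stops after a finite number of divisions, whereas a cell with enough telomerase activity has no such limit, which mirrors the contrast drawn in the text between somatic cells and germ cells.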
Figure 7-29 Werner syndrome causes premature aging. A woman with Werner syndrome at ages 15 and 48. [International Registry of Werner Syndrome, www.wernersyndrome.org.]
Geneticists are also very interested in connections between telomeres and cancer. Unlike normal somatic cells, most cancerous cells have telomerase activity. The ability to maintain functional telomeres may be one reason that cancer cells, but not normal cells, can grow in cell culture for decades and are considered to be immortal. As such, many pharmaceutical companies are seeking to capitalize on this difference between cancerous and normal cells by developing drugs that selectively target cancer cells by inhibiting telomerase activity.
Summary
Experimental work on the molecular nature of hereditary material has demonstrated conclusively that DNA (not protein, lipids, or carbohydrates) is indeed the genetic material. Using data obtained by others, Watson and Crick deduced a double-helical model with two DNA strands, wound around each other, running in antiparallel fashion. The binding of the two strands together is based on the fit of adenine (A) to thymine (T) and guanine (G) to cytosine (C). The former pair is held by two hydrogen bonds; the latter, by three.
The Watson–Crick model shows how DNA can be replicated in an orderly fashion, a prime requirement for genetic material. Replication is accomplished semiconservatively in both prokaryotes and eukaryotes. One double helix is replicated to form two identical helices, each with their nucleotides in the identical linear order; each of the two new double helices is composed of one old and one newly polymerized strand of DNA.
The DNA double helix is unwound at a replication fork, and the two single strands thus produced serve as templates for the polymerization of free nucleotides. Nucleotides are polymerized by the enzyme DNA polymerase, which adds new nucleotides only to the 3′ end of a growing DNA chain. Because addition is only at 3′ ends, polymerization on one template is continuous, producing the leading strand, and, on the other, it is discontinuous in short stretches (Okazaki fragments), producing the lagging strand. Synthesis of the leading strand and of every Okazaki fragment is primed by a short RNA primer (synthesized by primase) that provides a 3′ end for deoxyribonucleotide addition.
The multiple events that have to occur accurately and rapidly at the replication fork are carried out by a biological machine called the replisome. This protein complex includes two DNA polymerase units, one to act on the leading strand and one to act on the lagging strand. In this way, the more time-consuming synthesis and joining of the Okazaki fragments into a continuous strand can be temporally coordinated with the less complicated synthesis of the leading strand.
Where and when replication takes place is carefully controlled by the ordered assembly of the replisome at certain sites on the chromosome called origins. Eukaryotic genomes may have tens of thousands of origins. The assembly of replisomes at these origins can take place only at a specific time in the cell cycle.
The ends of linear chromosomes (telomeres) present a problem for the replication system because there is always a short stretch on one strand that cannot be primed. The enzyme telomerase adds a number of short, repetitive sequences to maintain length. Telomerase carries a short RNA that acts as the template for the synthesis of the telomeric repeats. These noncoding telomeric repeats associate with proteins to form a telomeric cap. Telomeres shorten with age in somatic cells because telomerase is not made in those cells. Individuals who have defective telomeres experience premature aging.
Key Terms
antiparallel (p. 267)
base (p. 264)
complementary base (p. 270)
conservative replication (p. 271)
daughter molecule (p. 281)
deoxyribose (p. 264)
distributive enzyme (p. 277)
DNA ligase (p. 275)
double helix (p. 267)
enol (p. 276)
genetic code (p. 270)
helicase (p. 279)
imino (p. 276)
keto (p. 276)
lagging strand (p. 275)
leading strand (p. 274)
major groove (p. 267)
minor groove (p. 267)
nucleotide (p. 264)
Okazaki fragment (p. 275)
origin (p. 280)
phosphate (p. 264)
polymerase III (pol III) holoenzyme (p. 277)
primase (p. 275)
primer (p. 275)
primosome (p. 275)
processive enzyme (p. 277)
purine (p. 264)
pyrimidine (p. 264)
replication fork (p. 274)
replisome (p. 277)
semiconservative replication (p. 271)
single-strand-binding (SSB) protein (p. 279)
tautomerization (p. 276)
telomerase (p. 284)
telomere (p. 283)
template (p. 270)
topoisomerase (p. 279)
Solved Problems
SOLVED PROBLEM 1. Mitosis and meiosis were presented in Chapter 2. Considering what has been covered in this chapter concerning DNA replication, draw a graph showing DNA content against time in a cell that undergoes mitosis and then meiosis. Assume a diploid cell.
Solution
[Graph: DNA content (vertical axis, 0 to 4) against time, for a cell passing through mitosis and then meiosis.]
SOLVED PROBLEM 2. If the GC content of a DNA molecule is 56 percent, what are the percentages of the four bases (A, T, G, and C) in this molecule?
Solution If the GC content is 56 percent, then, because G = C, the content of G is 28 percent and the content of C is 28 percent. The content of AT is 100 − 56 = 44 percent. Because A = T, the content of A is 22 percent and the content of T is 22 percent.
SOLVED PROBLEM 3. Describe the expected pattern of bands in a CsCl gradient for conservative replication in the Meselson–Stahl experiment. Draw a diagram.
Solution Refer to Figure 7-13 for an additional explanation. In conservative replication, if bacteria are grown in the presence of 15N and then shifted to 14N, one DNA molecule will be all 15N after the first generation and the other molecule will be all 14N, resulting in one heavy band and one light band in the gradient. After the second generation, the 15N DNA will yield one molecule with all 15N and one molecule with all 14N, whereas the 14N DNA will yield only 14N DNA. Thus, only all 14N or all 15N DNA is generated, again yielding a light band and a heavy band:
[Diagram: Incubation of heavy cells in 14N. Gradient lanes for the controls (14N and 15N), the first generation, and the second generation.]
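The band patterns expected in the Meselson–Stahl experiment can also be worked out mechanically. The sketch below is an illustrative model, not part of the original experiment: each double helix is represented as a pair of strands labeled "H" (15N) or "L" (14N), and the function names are hypothetical.

```python
from collections import Counter


def replicate(molecules, mode):
    """Replicate every double helix once in light (14N) medium."""
    daughters = []
    for old_a, old_b in molecules:
        if mode == "semiconservative":
            # Each old strand pairs with a newly made light strand.
            daughters += [(old_a, "L"), (old_b, "L")]
        elif mode == "conservative":
            # The parental helix stays intact; the copy is entirely new.
            daughters += [(old_a, old_b), ("L", "L")]
        else:
            raise ValueError(f"unknown mode: {mode}")
    return daughters


def band(molecule):
    """Classify a double helix by the band it would form in the gradient."""
    strands = set(molecule)
    if strands == {"H"}:
        return "heavy"
    if strands == {"L"}:
        return "light"
    return "hybrid"


def bands_after(generations, mode):
    """Tally band types after growth of heavy DNA in light medium."""
    molecules = [("H", "H")]
    for _ in range(generations):
        molecules = replicate(molecules, mode)
    return dict(Counter(band(m) for m in molecules))


for mode in ("conservative", "semiconservative"):
    for generation in (1, 2):
        print(mode, generation, bands_after(generation, mode))
```

In this model the conservative mode never produces a hybrid band, whereas the semiconservative mode produces only hybrid molecules in the first generation, which is the difference the gradient reveals.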
Problems
Most of the problems are also available for review/grading through http://www.whfreeman.com/launchpad/iga11e.
Working with the Figures
1. In Table 7-1, why are there no entries for the first four tissue sources? For the last three entries, what is the most likely explanation for the slight differences in the composition of human DNA from the three tissue sources?
2. In Figure 7-7, do you recognize any of the components used to make Watson and Crick’s DNA model? Where have you seen them before?
3. Referring to Figure 7-20, answer the following questions:
a. What is the DNA polymerase I enzyme doing?
b. What other proteins are required for the DNA polymerase III on the left to continue synthesizing DNA?
c. What other proteins are required for the DNA polymerase III on the right to continue synthesizing DNA?
4. What is different about the reaction catalyzed by the green helicase in Figure 7-20 and the yellow gyrase in Figure 7-21?
5. In Figure 7-23(a), label all the leading and lagging strands.
Basic Problems
6. Describe the types of chemical bonds in the DNA double helix.
7. Explain what is meant by the terms conservative and semiconservative replication.
8. What is meant by a primer, and why are primers necessary for DNA replication?
9. What are helicases and topoisomerases?
10. Why is DNA synthesis continuous on one strand and discontinuous on the opposite strand?
11. If the four deoxynucleotides showed nonspecific base pairing (A to C, A to G, T to G, and so on), would the unique information contained in a gene be maintained through round after round of replication? Explain.
12. If the helicases were missing during replication, what would happen to the replication process?
13. What would happen if, in the course of replication, the topoisomerases were unable to reattach the DNA fragments of each strand after unwinding (relaxing) the DNA molecule?
14. Which of the following is not a key property of hereditary material?
a. It must be capable of being copied accurately.
b. It must encode the information necessary to form proteins and complex structures.
c. It must occasionally mutate.
d. It must be able to adapt itself to each of the body’s tissues.
15. It is essential that RNA primers at the ends of Okazaki fragments be removed and replaced by DNA because otherwise which of the following events would result?
a. The RNA would interfere with topoisomerase function.
b. The RNA would be more likely to contain errors because primase lacks a proofreading function.
c. The β-clamp of the DNA pol II dimer would release the DNA and replication would stop.
d. The RNA primers would be likely to hydrogen bond to each other, forming complex structures that might interfere with the proper formation of the DNA helix.
16. Polymerases usually add only about 10 nucleotides to a DNA strand before dissociating. However, during replication, DNA pol III can add tens of thousands of nucleotides at a moving fork. How is this addition accomplished?
17. At each origin of replication, DNA synthesis proceeds bidirectionally from two replication forks. Which of the following would happen if a eukaryotic mutant arose having only one functional fork per replication bubble? (See diagram.)
[Diagram: replication bubbles labeled Normal and Mutant.]
a. No change at all in replication.
b. Replication would take place only on one half of the chromosome.
c. Replication would be complete only on the leading strand.
d. Replication would take twice as long.
18. In a diploid cell in which 2n = 14, how many telomeres are there in each of the following phases of the cell cycle?
(a) G1
(b) G2
(c) prophase of mitosis
(d) telophase of mitosis
19. If thymine makes up 15 percent of the bases in a specific DNA molecule, what percentage of the bases is cytosine?
20. If the GC content of a DNA molecule is 48 percent, what are the percentages of the four bases (A, T, G, and C) in this molecule?
21. Bacteria called extremophiles are able to grow in hot springs such as Old Faithful at Yellowstone National Park in Wyoming. Do you think that the DNA of extremophiles would have a higher content of GC or AT base pairs? Justify your answer.
22. Assume that a certain bacterial chromosome has one origin of replication. Under some conditions of rapid cell division, replication could start from the origin before the preceding replication cycle is complete. How many replication forks would be present under these conditions?
23. A molecule of composition
5′-AAAAAAAAAAA-3′
3′-TTTTTTTTTTTTT-5′
is replicated in a solution containing unlabeled (not radioactive) GTP, CTP, and TTP plus adenine nucleoside triphosphate with all its phosphorus atoms in the form of the radioactive isotope 32P. Will both daughter molecules be radioactive? Explain. Then repeat the question for the molecule
5′-ATATATATATATAT-3′
3′-TATATATATATATA-5′
24. Would the Meselson and Stahl experiment have worked if diploid eukaryotic cells had been used instead?
25. Consider the following segment of DNA, which is part of a much longer molecule constituting a chromosome:
5′.…ATTCGTACGATCGACTGACTGACAGTC….3′
3′.…TAAGCATGCTAGCTGACTGACTGTCAG….5′
If the DNA polymerase starts replicating this segment from the right,
a. which will be the template for the leading strand?
b. Draw the molecule when the DNA polymerase is halfway along this segment.
c. Draw the two complete daughter molecules.
d. Is your diagram in part b compatible with bidirectional replication from a single origin, the usual mode of replication?
26. The DNA polymerases are positioned over the following DNA segment (which is part of a much larger molecule) and moving from right to left. If we assume that an Okazaki fragment is made from this segment, what will be the fragment’s sequence? Label its 5′ and 3′ ends.
5′.…CCTTAAGACTAACTACTTACTGGGATC….3′
3′.…GGAATTCTGATTGATGAATGACCCTAG….5′
27. E. coli chromosomes in which every nitrogen atom is labeled (that is, every nitrogen atom is the heavy isotope 15N instead of the normal isotope 14N) are allowed to replicate in an environment in which all the nitrogen is 14N. Using a solid line to represent a heavy polynucleotide chain and a dashed line for a light chain, sketch each of the following descriptions:
a. The heavy parental chromosome and the products of the first replication after transfer to a 14N medium, assuming that the chromosome is one DNA double helix and that replication is semiconservative.
b. Repeat part a, but now assume that replication is conservative.
c. If the daughter chromosomes from the first division in 14N are spun in a cesium chloride density gradient and a single band is obtained, which of the possibilities in parts a and b can be ruled out? Reconsider the Meselson and Stahl experiment: What does it prove?
Challenging Problems
28. If a mutation that inactivated telomerase occurred in a cell (telomerase activity in the cell = zero), what do you expect the outcome to be?
29. On the planet Rama, the DNA is of six nucleotide types: A, B, C, D, E, and F. Types A and B are called marzines, C and D are orsines, and E and F are pirines. The following rules are valid in all Raman DNAs:
Total marzines = total orsines = total pirines
A = C = E
B = D = F
a. Prepare a model for the structure of Raman DNA.
b. On Rama, mitosis produces three daughter cells. Bearing this fact in mind, propose a replication pattern for your DNA model.
c. Consider the process of meiosis on Rama. What comments or conclusions can you suggest?
30. If you extract the DNA of the coliphage φX174, you will find that its composition is 25 percent A, 33 percent T, 24 percent G, and 18 percent C. Does this composition make sense in regard to Chargaff’s rules? How would you interpret this result? How might such a phage replicate its DNA?
31. In Chapter 5 you saw that bacteria transfer DNA from one member of their species to another in a process called conjugation. Recently it has been shown that the transfer of DNA from one bacterial cell to another is not limited to members of the same species. A microbiologist studying the bacterium Diplococcus pneumoniae hypothesizes that a region of its chromosome was in fact transferred from Mycobacterium tuberculosis. Based on the data presented in Table 7-1, what distinguishing feature of the transferred DNA would provide support for this hypothesis?
32. Given what you know about the structure and function of telomerase, provide a plausible model to explain how a species could exist with a combination of two different repeats (for example, TTAGGG and TTGTGG) on each of their telomeres.
33. Do bacteria require telomerase? Explain why or why not.
34. Watson and Crick used an approach called model building to deduce the structure of the DNA double helix. How does this differ from the more conventional experimental approach that is undertaken in a research laboratory? In this regard, why was the experiment of Meselson and Stahl considered to be of such critical importance?
CHAPTER 8
RNA: Transcription and Processing
Learning Outcomes
After completing this chapter, you will be able to
• Describe how the structure of RNA differs from that of DNA.
• Differentiate among the different classes of RNA in a cell.
• Explain the function of promoters and the features necessary to start transcription.
• Diagram the steps in RNA processing from its transcription to its transport out of the nucleus.
• Appraise why the discovery of self-splicing introns is considered to be so important.
• Describe the different types of noncoding RNAs (ncRNAs).
RNA polymerase in action. A very small RNA polymerase (blue), made by the bacteriophage T7, transcribes DNA into a strand of RNA (red). The enzyme separates the DNA double helix (yellow, orange), exposing the template strand to be copied into RNA. [ David S. Goodsell, Scripps Research Institute.]
Outline
8.1 RNA
8.2 Transcription
8.3 Transcription in eukaryotes
8.4 Intron removal and exon splicing
8.5 Small functional RNAs that regulate and protect the eukaryotic genome
Using their newly acquired knowledge of the DNA sequences of entire genomes, scientists have been able to determine the approximate number of genes in several organisms, both simple and complex. At first there were no surprises: the bacterium Escherichia coli has about 4400 genes, the unicellular eukaryote yeast Saccharomyces cerevisiae has about 6300 genes, and the multicellular fruit fly Drosophila melanogaster has about 13,600 genes. Scientists assumed that more complex organisms would require more genes, and so early estimates were that our genome would have 100,000 genes. At a conference focused on genome research in 2000, scientists started an informal betting pool called GeneSweep that would be won by the person who most closely predicted the actual number of genes in the human genome. The entries ranged from ~26,000 to ~150,000 genes. With the release of the first draft sequence, a winner was announced. Surprisingly, the winner was the entrant with the very lowest estimate, 25,947 genes. How could Homo sapiens with their complex brains and sophisticated immune systems have only twice as many genes as the roundworm and approximately the same number of genes as the first sequenced plant genome, the mustard weed Arabidopsis thaliana?
Part of the answer to this question has to do with a remarkable discovery made in the late 1970s. At that time, the proteins of many eukaryotes were found to be encoded in DNA not as continuous stretches (as they are in bacteria and yeast) but in pieces. Thus, the genes of higher eukaryotes are usually composed of pieces called exons (for expressed region) that encode parts of proteins and pieces called introns (for intervening region) that separate exons. As you will learn in this chapter, an RNA copy containing both exons and introns is synthesized from a gene.
A biological machine (called a spliceosome) removes the introns and joins the exons (in a process called RNA splicing) to produce a mature RNA that contains the continuous information needed to synthesize a protein. What do exons and introns have to do with the low human gene count? For now, suffice it to say that the RNA transcribed from a gene can be spliced in alternative ways. Although we have only about 21,000 genes, these genes encode more than 100,000 proteins, thanks to the process of alternative splicing of RNA. Even more surprising is the finding that only a small fraction of the genome actually codes for proteins (a little more than 2 percent for most complex multicellular organisms). The content of genomes will be the subject of future chapters. For now it is important to note that despite having such a small proportion of coding DNA, most of the genome still encodes RNA. The story of this aptly named non-protein-coding RNA (ncRNA) is a work in progress. That story will be introduced in this chapter and developed in succeeding chapters. In this chapter, we see the first steps in the transfer of information from genes to gene products. Within the DNA sequence of any organism’s genome is encoded information specifying each of the gene products that the organism can make. These DNA sequences also contain information specifying when, where, and how much of the product is made. To utilize the information, an RNA copy of the gene must be synthesized in a process called transcription. The transfer of information from gene to gene product takes place in several steps. The first step, which is the focus of this chapter, is to copy (transcribe) the information into a strand of RNA with the use of DNA as a template. In prokaryotes, the information in protein-encoding RNA is almost immediately converted into an amino acid chain (protein) by a process called translation. This second step is the focus of Chapter 9. 
In eukaryotes, transcription and translation are spatially separated: transcription takes place in the nucleus and translation in the cytoplasm. However, before RNAs are ready to be transported into the cytoplasm for translation or other uses, they undergo extensive processing, including the removal of introns and the addition of a special 5′ cap and a 3′ tail of adenine
nucleotides. One fully processed type of RNA, called messenger RNA (mRNA), is the intermediary in the synthesis of proteins. In addition, in both prokaryotes and eukaryotes, there are other types of RNAs that are never translated. These ncRNAs perform many essential roles.
DNA and RNA function during transcription is based on two principles:
1. Complementarity of bases is responsible for determining the sequence of the RNA transcript in transcription. Through the matching of complementary bases, the information encoded in the DNA passes into RNA, and protein complexes associated with ncRNAs are guided to specific regions in the RNA to regulate their expression.
2. Certain proteins recognize particular base sequences in DNA and RNA. These nucleic-acid-binding proteins bind to these sequences and act on them.
We will see these principles at work throughout the detailed discussions of transcription and translation that follow in this chapter and in chapters to come.
Key Concept: The transactions of DNA and RNA take place through the matching of complementary bases and the binding of various proteins to specific sites on the DNA or RNA.
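The first principle, complementarity of bases, can be sketched in a few lines of Python. This is only an illustration: the template sequence is made up, the names `DNA_TO_RNA` and `transcribe` are hypothetical, and the template strand is assumed to be written 3′ to 5′ so that the RNA reads 5′ to 3′ from left to right.

```python
# Base-pairing rules for copying a DNA template into RNA:
# A pairs with U (RNA has uracil in place of thymine), T with A, G with C, C with G.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the RNA (5'->3') complementary to a DNA template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


template = "TACGGATTC"  # made-up template strand, written 3'->5'
coding = "ATGCCTAAG"    # its complementary (nontemplate) DNA strand, 5'->3'

rna = transcribe(template)
print(rna)  # AUGCCUAAG

# The transcript matches the nontemplate strand except that U replaces T.
print(rna == coding.replace("T", "U"))  # True
```

The lookup table is the whole mechanism in this sketch: each template base determines exactly one RNA base, which is how the information in DNA passes into RNA.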
8.1 RNA
Early investigators had good reason for thinking that information is not transferred directly from DNA to protein. In a eukaryotic cell, DNA is found in the nucleus, whereas protein is synthesized in the cytoplasm. An intermediate is needed.
Early experiments suggest an RNA intermediate
In 1957, Elliot Volkin and Lawrence Astrachan made a significant observation. They found that one of the most striking molecular changes that takes place when E. coli is infected with the phage T2 is a rapid burst of RNA synthesis. Furthermore, this phage-induced RNA “turns over” rapidly; that is, its lifetime is brief, on the order of minutes. Its rapid appearance and disappearance suggested that RNA might play some role in the expression of the T2 genome necessary to make more virus particles.
Volkin and Astrachan demonstrated the rapid turnover of RNA by using a protocol called a pulse–chase experiment. To conduct a pulse–chase experiment, the infected bacteria are first fed (pulsed with) radioactive uracil (a molecule needed for the synthesis of RNA but not DNA). Any RNA synthesized in the bacteria from then on is “labeled” with the readily detectable radioactive uracil. After a short period of incubation, the radioactive uracil is washed away and replaced (chased) by uracil that is not radioactive. This procedure “chases” the label out of the RNA because, as the pulse-labeled RNA breaks down, only the unlabeled precursors are available to synthesize new RNA molecules (the labeled nucleotides are “diluted” by the huge excess of unlabeled uracil added in the chase). The RNA recovered shortly after the pulse is labeled, but RNA recovered somewhat later is unlabeled, indicating that the RNA has a very short lifetime.
A similar experiment can be done with eukaryotic cells. Cells are first pulsed with radioactive uracil and, after a short time, they are transferred to medium with unlabeled uracil. In samples taken after the pulse, most of the label is in the nucleus. In samples taken after the chase, the labeled RNA is also found in the cytoplasm (Figure 8-1). Apparently, in eukaryotes, the RNA is synthesized in the nucleus and then moves into the cytoplasm, where proteins are synthesized. Thus, RNA is a good candidate for an information-transfer intermediary between DNA and protein.
Figure 8-1 The pulse–chase experiment showed that mRNA moves into the cytoplasm. Cells are grown briefly in radioactive uracil to label newly synthesized RNA (pulse). Cells are washed to remove the radioactive uracil and are then grown in excess nonradioactive uracil (chase). The red dots indicate the location of the RNA containing radioactive uracil over time. Panels: after the pulse; after the chase (chased with nonradioactive RNA precursors).
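The logic of the chase can be illustrated with a small Python sketch. It is not a model of the actual Volkin and Astrachan data: it assumes that labeled RNA breaks down by simple first-order decay, that no new labeled RNA is made once the chase begins, and that the half-life is 2 minutes (a hypothetical value chosen only because the text says the lifetime is on the order of minutes). The function name `labeled_rna_remaining` is likewise an assumption.

```python
def labeled_rna_remaining(minutes_after_chase, half_life_min=2.0):
    """Fraction of pulse-labeled RNA still present after the chase begins.

    Assumptions: first-order decay with the given half-life, and a perfect
    chase in which all newly synthesized RNA is unlabeled.
    """
    return 0.5 ** (minutes_after_chase / half_life_min)


for t in (0, 2, 4, 10):
    print(f"{t:>2} min after chase: {labeled_rna_remaining(t):.3f} of label remains")
```

With a short half-life, almost no labeled RNA is left a few minutes into the chase, which is the pattern that points to rapid turnover.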
Properties of RNA
Let’s consider the general features of RNA. Although both RNA and DNA are nucleic acids, RNA differs from DNA in several important ways:
[Figure: chemical structures of the sugars deoxyribose and ribose and of the base uracil.]
1. RNA has ribose sugar in its nucleotides, rather than the deoxyribose found in DNA. As the names suggest, the two sugars differ in the presence or absence of just one oxygen atom. The RNA sugar contains a hydroxyl group (OH) bound to the 2′-carbon atom, whereas the DNA sugar has only a hydrogen atom bound to the 2′-carbon atom.
2. RNA is usually a single-stranded nucleotide chain, not a double helix like DNA. A consequence is that RNA is more flexible and can form a much greater variety of complex three-dimensional molecular shapes than can double-stranded DNA. An RNA strand can bend in such a way that some of its own bases pair with each other. Such intramolecular base pairing is an important determinant of RNA shap