CS273A, Lecture 9: Molecular Evolution, Population Genetics (Gill Bejerano, Winter 2020/21)
Slide1
Whiteboard
http://cs273a.stanford.edu [Bejerano Winter 2020/21]
Slide2
CS273A: The Human Genome Source Code
Gill Lecture 9: Molecular Evolution, Population Genetics
Mon, Wed 11:30 AM - 12:50, on Zoom
Prof: Gill Bejerano
CA: Boyoung (Bo) Yoo
Track class on Piazza
Slide3
Announcements
Remember thy honor code
Slide4
Class Topics
(0) Genome context: cells, DNA, central dogma
(1) Genome content / genome function: genes, gene regulation, epigenetics, repeats, SARS-CoV-2
(2) Genome sequencing: technologies, assembly/analysis, technology dependence
(3) Genome evolution: evolution = mutation + selection; main forces of evolution: neutral evolution, negative selection, positive selection
(4) Population genomics: human migration, paternity testing, forensics, cryptogenomics
(5) Genomics of human disease: personal genomics, GxE disease types, deep dive monogenics
(6) Comparative genomics: genomics of amazing animal adaptations, ultraconservation
good morning!
Slide5
Genome Evolution
[Slide background: raw DNA sequence text]
Slide6
Evolution
Vast & fascinating topic. We’ll get a meaningful taste.
Inevitable in a class about the human genome, because “Nothing in Biology Makes Sense Except in the Light of Evolution” (Theodosius Dobzhansky).
One definition: “Changes in the proportions of biological types in a population over time” (Stanford Encyclopedia of Philosophy).
We will mostly discuss: mutation, selection; neutral, negative and positive selection.
We’ll lightly mention: migration, non-random mating, linkage.
We’ll visit and revisit evolution for the remainder.
Slide7
Mutation + Selection = Evolution
Mistakes can happen during DNA replication. Mistakes are oblivious to DNA segment function. But then selection kicks in.
[Figure: the sequence ...ACGTACGACTGACTAGCATCGACTACGA... is passed from chicken to egg to chicken, with example changes (TT, CAT). In junk sequence “anything goes”; in functional sequence many changes are not tolerated.]
This has bad implications (disease) and good implications (adaptation).
Slide8
My Genome is like
A programming language? Not quite…
An Operating System? Not quite…
The disk of an operating system developed in place (for 3 billion years…)? Quite.
Developed HOW?!..
Slide9
Mutation
Slide10
Chromosomal (i.e., big) Mutations
Five types exist:
Deletion
Inversion
Duplication
Translocation
(Nondisjunction)
Fusion/Fission
Slide11
Deletion
Due to breakage
A piece of a chromosome is lost
Slide12
Inversion
Chromosome segment breaks off
Segment flips around backwards
Segment reattaches
This reverses and complements the sequence.
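To make "reverses and complements" concrete, here is a minimal sketch that models an inversion as reverse-complementing one segment of a DNA string. The function names and the example sequence are illustrative assumptions, not taken from the lecture.

```python
# Illustrative sketch: an inversion modelled as reverse-complementing a segment.
# Function names and the example sequence are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Complement every base, then read the result backwards."""
    return seq.translate(COMPLEMENT)[::-1]

def invert_segment(seq: str, start: int, end: int) -> str:
    """Return seq with seq[start:end] replaced by its reverse complement."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]

inverted = invert_segment("AAACGTTTT", 3, 6)  # the segment CGT becomes ACG
```

Inverting the same segment a second time restores the original sequence, since reverse complementation undoes itself.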
Slide13
Duplication
Occurs when a genomic region is repeated
Slide14
Translocation
Involves two chromosomes that aren’t homologous
Part of one chromosome is transferred to another chromosome
Slide15
Nondisjunction
Failure of chromosomes to separate during meiosis
Causes a gamete to have too many or too few chromosomes
Disorders:
Down Syndrome: three copies of chromosome 21
Turner Syndrome: single X chromosome
Klinefelter’s Syndrome: XXY chromosomes
Slide16
Whole Chromosome Fusion/Fission
[Figure: human chromosome 2]
Slide17
Genomic (i.e., small) Mutations
Six types exist:
Substitution (e.g., G→T)
Deletion
Insertion
Inversion
Duplication
Translocation
Slide18
Mutation
Mutation is random but not uniformly random.
For example, it depends a lot on local sequence content.
Some examples we’ve met: simple repeats, interspersed repeats, repeat-mediated deletion / inversion.
Mutation is, however, oblivious to sequence function.
That’s where selection kicks in…
Slide19
Selection
Slide20
Mutation + Selection = Evolution (recap of Slide7)
Slide21
Example
Imagine a single UCA codon in a single-exon coding gene.
Assume these three bases’ only role is to code for Ser.
[Figure: the UCA codon, coding for Ser, within the human genome]
Slide22
If the 3rd position is under neutral evolution
[Figure: Ser codon UC_, with the 3rd position composition in the population shown for generations t, t+1, t+2, t+3, …; bases shown include C, A, T and G]
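One reason the 3rd position of UCA can plausibly evolve neutrally, under the slide's assumption that the codon's only role is to code for Ser, is that every 3rd-position substitution still encodes Ser in the standard genetic code. The sketch below checks this for all nine single-base changes. The table lists only the ten codons needed here, and the function name is hypothetical.

```python
# Standard genetic code entries for UCA and its nine single-base neighbours.
CODE = {
    "UCU": "Ser", "UCC": "Ser", "UCA": "Ser", "UCG": "Ser",  # 3rd position varies
    "UUA": "Leu", "UAA": "Stop", "UGA": "Stop",               # 2nd position varies
    "CCA": "Pro", "ACA": "Thr", "GCA": "Ala",                 # 1st position varies
}

def substitution_effects(codon="UCA"):
    """Map each codon position (1-3) to the products of every single-base change."""
    effects = {}
    for pos in range(3):
        outcomes = []
        for base in "UCAG":
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                outcomes.append(CODE[mutant])
        effects[pos + 1] = outcomes
    return effects

effects = substitution_effects()
# effects[3] contains only "Ser"; positions 1 and 2 always change the product.
```

Changes at positions 1 and 2 alter the amino acid or introduce a stop codon, so selection can act on them. Which regime applies to a real site depends on its biological context.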
Slide23
If the 2nd position is under negative selection
[Figure: Ser codon U_A, with the 2nd position composition in the population shown for generations t, t+1, t+2, t+3, …; bases shown include A, C, T and G]
Slide24
If the 1st position experiences positive selection
[Figure: Ser codon _CA, with the 1st position composition in the population shown for generations t, t+1, t+2, t+3, …; bases shown include C, T, A and G]
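The three regimes can be contrasted with a small deterministic sketch. It assumes a haploid population large enough to ignore drift and a variant allele with relative fitness `fitness_variant` against a baseline of 1.0. The fitness values (1.0, 0.9, 1.1) and the starting frequency are hypothetical choices for illustration, not numbers from the lecture.

```python
def next_frequency(p, fitness_variant, fitness_other=1.0):
    """One generation of selection on a variant allele at frequency p."""
    mean_fitness = p * fitness_variant + (1 - p) * fitness_other
    return p * fitness_variant / mean_fitness

def trajectory(p0, fitness_variant, generations):
    """Frequency of the variant allele over successive generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], fitness_variant))
    return freqs

neutral = trajectory(0.1, 1.0, 50)   # frequency stays where it started
negative = trajectory(0.1, 0.9, 50)  # deleterious variant is purged
positive = trajectory(0.1, 1.1, 50)  # advantageous variant spreads
```

In this idealised model a neutral variant never changes frequency. In a finite population it would still wander because of random drift.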
Slide25
Random Drift
Random drift reflects the fact that population size is finite, so transmission from one generation to the next is probabilistic.
For example, under neutral evolution, imagine a population of 10 alleles, 5 A and 5 T. Each allele in the next generation has a 50% chance of being A and a 50% chance of being T, yet with probability (1/2)^10 = 1/1024 all 10 alleles will be A in the next generation.
Similarly, random drift in finite populations may “derail” negative and positive selection.
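The 1/1024 figure and the effect of drift can be checked with a short sketch. It assumes a simple Wright-Fisher style model in which each allele of the next generation is copied from a uniformly chosen allele of the current one. The function names and the seed are illustrative assumptions.

```python
import random
from fractions import Fraction

def prob_all_one_allele(pop_size, freq):
    """Probability that every allele drawn for the next generation is the same given allele."""
    return freq ** pop_size

def next_generation(population, rng):
    """Each new allele is a copy of a randomly chosen allele from the current generation."""
    return [rng.choice(population) for _ in population]

def generations_until_fixation(population, rng, max_generations=10_000):
    """Resample until one allele has taken over; return (generation count, fixed allele)."""
    for generation in range(max_generations):
        if len(set(population)) == 1:
            return generation, population[0]
        population = next_generation(population, rng)
    return max_generations, None

p_all_a = prob_all_one_allele(10, Fraction(1, 2))  # Fraction(1, 1024)
generations, fixed = generations_until_fixation(list("AAAAATTTTT"), random.Random(0))
```

Even with no selection at all, one of the two alleles eventually takes over the whole population. This is why drift matters most in small populations.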
Slide26
Biological Types
We cautiously defined evolution as “Changes in the proportions of biological types in a population over time”.
We talked about the evolution of a single base pair at a time.
But the same holds for other biological types: gene, enhancer, pathway, individuals.
(Aside: Also note that many other forms of evolution exist.)
Slide27
Genomic Transmission
For repeat copies to accumulate through human generations they must make it into the germline cells (eggs & sperm).
Equally true for any genomic mutation.
[Figure: egg to chicken to egg via cell division. Cell genome = all DNA; DNA strings = chromosomes; chicken ≈ 10^13 copies (DNA) of egg (DNA).]
Slide28
Human Mutation Rate
Recent sequencing analysis suggests ~40-60 new mutations in a child that were not present in either parent.
~1 mutation per genome replication.
Mutations range from the smallest possible (single base pair change) to the largest: whole genome duplication (to be discussed).
Selection does not tolerate all of these mutations, but it sure does tolerate many.
[Figure: chicken, egg, chicken]
Slide29
Class Topics (repeated from Slide4)
Neutral evolution
Slide30
Population Genetics
[Slide background: raw DNA sequence text]
Slide31
We’re diploids
Slide32
STRs identify individuals, paternity
STR = Short Tandem Repeat
Every human is reduced to a unique profile.
Current U.S. forensics use a fixed 20 STR profile.
A unique human being = 20 number pairs (“up to monozygotic twins”).
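A profile of "number pairs" can be sketched as a mapping from STR locus to an unordered pair of repeat counts, one per chromosome copy. The locus names, repeat counts and function names below are hypothetical, and the paternity check ignores mutation and genotyping error.

```python
def normalise(profile):
    """Sort each allele pair so that (12, 9) and (9, 12) compare equal."""
    return {locus: tuple(sorted(pair)) for locus, pair in profile.items()}

def same_profile(a, b):
    """True if two profiles carry the same repeat-count pair at every locus."""
    return normalise(a) == normalise(b)

def consistent_with_parent(child, parent):
    """True if, at every locus, the child shares at least one repeat count with the parent."""
    return all(set(child[locus]) & set(parent[locus]) for locus in child)

# Hypothetical 3-locus profiles (the forensic profile on the slide uses 20 loci).
father = {"locus_1": (9, 12), "locus_2": (15, 15), "locus_3": (7, 8)}
child = {"locus_1": (12, 14), "locus_2": (15, 17), "locus_3": (8, 8)}
stranger = {"locus_1": (10, 11), "locus_2": (16, 18), "locus_3": (6, 9)}
```

Because a diploid child inherits one allele per locus from each parent, a true parent should share at least one repeat count with the child at every locus.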
Slide33
Most mutations are far from deleterious
We just said ~1 mutation per whole genome replication.
Corollary: No two cells are identical.
Corollary: Identical twins are not really identical.
Slide34
Identical twins are not really identical
Our technology is getting good enough to see this.
Slide35
Meiotic crossovers complicate things
In humans: 1 crossover in ~100 Mb
Slide36
... but only quantitatively
Slide37
Identity by descent (IBD) stretches identify relatives
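As a rough illustration of what an "IBD stretch" looks like computationally, the sketch below scans two aligned haplotypes and reports runs of consecutive matching sites that reach a minimum length. This is a deliberate simplification: real IBD detection must deal with diploid genotypes, phasing and sequencing errors. The function name, threshold and example strings are assumptions.

```python
def shared_stretches(hap_a, hap_b, min_length):
    """Return (start, end) index pairs, end exclusive, of runs where both haplotypes agree."""
    stretches = []
    start = None
    for i, (a, b) in enumerate(zip(hap_a, hap_b)):
        if a == b:
            if start is None:
                start = i          # a new matching run begins here
        else:
            if start is not None and i - start >= min_length:
                stretches.append((start, i))
            start = None           # the run is broken by a mismatch
    end = min(len(hap_a), len(hap_b))
    if start is not None and end - start >= min_length:
        stretches.append((start, end))
    return stretches

long_runs = shared_stretches("AAAATAAAAAA", "AAAACAAAAAA", 5)
```

Close relatives are expected to share longer matching runs, because fewer meiotic crossovers have had the chance to break up the segments they inherited from a common ancestor.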
Slide38
Slide39
Let’s go population-wide
Let’s keep >0 mutations per whole genome replication.
But let’s forget sex, diploidy, and crossover. We can draw similar qualitative conclusions without them.
Every human has a big genome.
Every offspring is almost identical to their single parent, except for a handful of mutations.
(Not to scale: the genome is 3×10^9 bp, while changes can be as small as a 1 bp SNP!)
SNP = Single Nucleotide Polymorphism
Slide40
Inferring population migration patterns
Now imagine people lived on one continent for a long time, then recently a handful of people left continent A and successively populated continents B, C and D.
What would the current genome pool in A, B, C, D look like?
Lots more genomic variation in continent A than in B, C, D (bottleneck effect).
Continent D shares private variation with continent C, not seen in continents A or B.
Etc.
That’s the essence of population migration pattern reconstruction from genomic data.
[Figure: continents A, B, C, D]
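The thought experiment can be played out with a toy model, following the simplification of asexual inheritance with a handful of new mutations per offspring. Everything here is a hypothetical assumption: the population size of 40, the 3 founders, the generation counts, and the rule that each genome gains exactly one new variant per generation with no lineages lost.

```python
import itertools
import random

def mutate(population, ids, generations):
    """Each generation, every genome gains one brand-new variant ID."""
    for _ in range(generations):
        population = [genome | {next(ids)} for genome in population]
    return population

def bottleneck(population, n_founders, size, rng):
    """A few founders leave; clonal copies of them fill the new continent."""
    founders = rng.sample(population, n_founders)
    return [founders[i % n_founders] for i in range(size)]

def variants(population):
    """All variant IDs seen anywhere in the population."""
    return set().union(*population)

rng = random.Random(0)
ids = itertools.count()
SIZE, FOUNDERS = 40, 3

a = mutate([frozenset()] * SIZE, ids, 30)               # long history in A
b = mutate(bottleneck(a, FOUNDERS, SIZE, rng), ids, 3)  # a few founders reach B
c = mutate(bottleneck(b, FOUNDERS, SIZE, rng), ids, 3)  # then C
d = bottleneck(c, FOUNDERS, SIZE, rng)                  # then D
# Let every continent keep evolving up to the present day.
a, b, c, d = (mutate(p, ids, g) for p, g in ((a, 9), (b, 6), (c, 3), (d, 3)))

cd_private = (variants(c) & variants(d)) - variants(a) - variants(b)
```

In this toy run A carries the most variants and each successively founded continent carries fewer, while `cd_private` holds the variants that arose in C before D was founded and are absent from A and B.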
Slide41
Population Sequencing: 1000 Genomes Project
Slide42
Why humans are so similar
Out of Africa
Oppenheimer S. Phil. Trans. R. Soc. B 2012;367:770-784
Slide43
Low dimensional embedding
Imagine you measured common SNPs in many Europeans.
Any two individuals are separated by some SNP differences.
Imagine you wanted to represent each person as a dot, and embed all dots into a 2D space, such that the 2D distance between any pair was proportional to their SNP distance.
What would the embedding look like?
Slide44
Global Ancestry Inference
Nature. 2008 November 6; 456(7218): 98-101.
Turns out: a PCA embedding of the distances between genomes (SNPs) of many Europeans onto a 2D plane reconstructs the map of Europe!
Why? Travel was hard, dangerous and unpopular for the longest time.
Corollary: People married (procreated) with people who lived nearby.
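To show the mechanics of such an embedding, here is a dependency-free sketch of a 2D PCA on a small genotype matrix (rows are individuals, columns are SNPs coded 0/1/2). It uses power iteration on the individual-by-individual Gram matrix. This is not the pipeline of the cited Nature paper, and the function names and tiny example data are hypothetical.

```python
import math

def center_columns(genotypes):
    """Subtract each SNP's mean so that the PCA captures differences between people."""
    n = len(genotypes)
    means = [sum(col) / n for col in zip(*genotypes)]
    return [[g - m for g, m in zip(row, means)] for row in genotypes]

def top_component(gram, iterations=200):
    """Power iteration: leading eigenvalue and eigenvector of a symmetric PSD matrix."""
    n = len(gram)
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    value = 0.0
    for _ in range(iterations):
        new = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in new))
        if norm < 1e-12:                     # no variation left to explain
            return 0.0, [0.0] * n
        vec = [x / norm for x in new]
        value = norm
    return value, vec

def pca_2d(genotypes):
    """Return one (PC1, PC2) coordinate pair per individual."""
    x = center_columns(genotypes)
    n = len(x)
    gram = [[sum(p * q for p, q in zip(x[i], x[j])) for j in range(n)] for i in range(n)]
    coords = [[0.0, 0.0] for _ in range(n)]
    for axis in range(2):
        value, vec = top_component(gram)
        for i in range(n):
            coords[i][axis] = vec[i] * math.sqrt(value)
        # Deflate: remove the component just found before looking for the next one.
        gram = [[gram[i][j] - value * vec[i] * vec[j] for j in range(n)] for i in range(n)]
    return coords

# Two hypothetical groups that differ at every SNP.
coords = pca_2d([[2, 2, 0, 0], [2, 2, 0, 0], [0, 0, 2, 2], [0, 0, 2, 2]])
```

In this toy input the first principal component puts the two groups on opposite sides of zero. With real data and gradual geographic mixing, the first two components can trace out geography in the way the slide describes.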
Slide45
Genome painting
If human populations are so well defined, we can sample many and define sets of SNPs that separate different ancestries from each other.
E.g., DK variants not seen elsewhere in Europe or beyond.
Slide46
Identity by descent (IBD) stretches identify relatives
What if you did IBD only on population level SNPs?
Slide47
Ancestry Painting
[Figure labels: ?, Danish, French, Spanish, Mexican]
Slide48
Ancient DNA
Find ancient remains (e.g., bone, teeth).
Sequence them. Fear degradation, contamination, post-mortem mutations.
Look for similarity blocks to extant sequenced individuals.
Consider individuals’ locations and known migrations.
Revise best estimate of migration and mating patterns.
Repeat.
Theoretically can sequence 0.4-1.5 million year old DNA.
DNA degrades faster in warmer places (Africa vs Europe).
Human samples are mostly 10,000-40,000 years old.
Slide49
The Neanderthal
Slide50
The Neanderthal Genome
From bones, compared genomes of three different Neanderthals with five genomes from modern humans from different areas of the world.
Figure 1, R. E. Green et al., Science 328, 710-722 (2010)
Slide51
Neanderthal Genome
Slide52
Neanderthal heritage
A major risk locus for COVID coincides with genomic variants some humans have inherited from Neanderthals.
Slide53
Denisovan: another human relative
Slide54
Coalescent Theory
The coalescent is a probability model for the tree underlying a sample of homologous DNA sequences drawn from a within-species population.
The focus of interest can be the underlying genealogical tree, mutation or recombination rates, or demographic parameters such as historic population sizes or migration rates.
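A minimal sketch of the basic (Kingman) coalescent may help. It assumes a constant-size, randomly mating haploid population, with time measured in units of N generations. While k lineages remain, the waiting time to the next merger is exponential with rate k(k-1)/2, so the expected time to the most recent common ancestor of n samples is 2(1 - 1/n). The function names, seed and sample sizes are illustrative assumptions.

```python
import random

def expected_tmrca(n):
    """Expected time to the most recent common ancestor of n samples (coalescent units)."""
    return 2.0 * (1.0 - 1.0 / n)

def simulate_tmrca(n, rng):
    """Draw one random genealogy and return the time at which all n lineages have merged."""
    t = 0.0
    for k in range(n, 1, -1):
        rate = k * (k - 1) / 2       # number of lineage pairs that could merge
        t += rng.expovariate(rate)   # waiting time until the next merger
    return t

rng = random.Random(42)
runs = 20_000
mean_tmrca = sum(simulate_tmrca(10, rng) for _ in range(runs)) / runs
```

The simulated mean should land near `expected_tmrca(10)`, which is 1.8. Most of that time is spent waiting for the last two lineages to merge, which is why deep branches dominate such trees.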
Slide55
Human population migrations
Out of Africa, Replacement
Single mother of all humans (Eve) ~190,000 yr
Single father of all humans (Adam) ~340,000 yr
Humans out of Africa ~50,000 years ago replaced others (e.g., Neanderthals)
Multiregional Evolution
Generally debunked; however, ~5% of the human genome in Europeans and Asians is Neanderthal or Denisovan
Recent most likely migration & mating pattern estimates. They continue to be revised.
Slide56
Discoveries Continue
Slide57
Class Topics (repeated from Slide4)
Negative selection