Uploaded On 2020-10-22

Presentation Transcript

Slide1

http://cs273a.stanford.edu [Bejerano Winter 2018/19]

CS273A

The Human Genome Source Code

Gill Lecture 15: Comparative Genomics II

TTh 1:30-2:50pm, mostly always M106*

Prof: Gill Bejerano

CAs: Boyoung (Bo) Yoo & Yatish Turakhia

* Track class on Piazza

Slide2

Announcements

Slide3

TTATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATACATATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTCAGTAATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTCCGTGCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACTAGCTTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATGATAATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAAAAGCTGCATAACCACTTTAACTAATACTTTCAACATTTTCAGTTTGTATTACTTCTTATTCAAATGTCATAAAAGTATCAACAAAAAATTGTTAATATACCTCTATACTTTAACGTCAAGGAGAAAAAACTATAATGACTAAATCTCATTCAGAAGAAGTGATTGTACCTGAGTTCAATTCTAGCGCAAAGGAATTACCAAGACCATTGGCCGAAAAGTGCCCGAGCATAATTAAGAAATTTATAAGCGCTTATGATGCTAAACCGGATTTTGTTGCTAGATCGCCTGGTAGAGTCAATCTAATTGGTGAACATATTGATTATTGTGACTTCTCGGTTTTACCTTTAGCTATTGATTTTGATATGCTTTGCGCCGTCAAAGTTTTGAACGATGAGATTTCAAGTCTTAAAGCTATATCAGAGGGCTAAGCATGTGTATTCTGAATCTTTAAGAGTCTTGAAGGCTGTGAAATTAATGACTACAGCGAGCTTTACTGCCGACGAAGACTTTTTCAAGCAATTTGGTGCCTTGATGAACGAGTCTCAAGCTTCTTGCGATAAACTTTACGAATGTTCTTGTCCAGAGATTGACAAAATTTGTTCCATTGCTTTGTCAAATGGATCATATGGTTCCCGTTTGACCGGAGCTGGCTGGGGTGGTTGTACTGTTCACTTGGTTCCAGGGGGCCCAAATGGCAACATAGAAAAGGTAAAAGAAGCCCTTGCCAATGAGTTCTACAAGGTCAAGTACCCTAAGATCACTGATGCTGAGCTAGAAAATGCTATCATCGTCTCTAAACCAGCATTGGGCAGCTGTCTATATGAATTAGTCAAGTATACTTCTTTTTTTTACTTTGTTCAGAACAACTTCTCATTTTTTTCTACTCATAACTTTAGCATCACAAAATACGCAATAATAACGAGTAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTTTCCTACGCATAA
TAAGAATAGGAGGGAATATCAAGCCAGACAATCTATCATTACATTTAAGCGGCTCTTCAAAAAGATTGAACTCTCGCCAACTTATGGAATCTTCCAATGAGACCTTTGCGCCAAATAATGTGGATTTGGAAAAAGAGTATAAGTCATCTCAGAGTAATATAACTACCGAAGTTTATGAGGCATCGAGCTTTGAAGAAAAAGTAAGCTCAGAAAAACCTCAATACAGCTCATTCTGGAAGAAAATCTATTATGAATATGTGGTCGTTGACAAATCAATCTTGGGTGTTTCTATTCTGGATTCATTTATGTACAACCAGGACTTGAAGCCCGTCGAAAAAGAAAGGCGGGTTTGGTCCTGGTACAATTATTGTTACTTCTGGCTTGCTGAATGTTTCAATATCAACACTTGGCAAATTGCAGCTACAGGTCTACAACTGGGTCTAAATTGGTGGCAGTGTTGGATAACAATTTGGATTGGGTACGGTTTCGTTGGTGCTTTTGTTGTTTTGGCCTCTAGAGTTGGATCTGCTTATCATTTGTCATTCCCTATATCATCTAGAGCATCATTCGGTATTTTCTTCTCTTTATGGCCCGTTATTAACAGAGTCGTCATGGCCATCGTTTGGTATAGTGTCCAAGCTTATATTGCGGCAACTCCCGTATCATTAATGCTGAAATCTATCTTTGGAAAAGATTTACAATGATTGTACGTGGGGCAGTTGACGTCTTATCATATGTCAAAGTCATTTGCGAAGTTCTTGGCAAGTTGCCAACTGACGAGATGCAGTAACACTTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCACAAACTTTAAAACACAGGGACAAAATTCTTGATATGCTTTCAACCGCTGCGTTTTGGATACCTATTCTTGACATGATATGACTACCATTTTGTTATTGTTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATGTTTTCAATGTAAGAGATTTCGATTATCTTATAGTTCATACATGCTTCAACTACTTAATAAATGATTGTATGATAATAAAG

Genome Evolution

Slide4

Life’s Amazing Diversity – On Your Laptop Now

Mammals (250+)
Birds (130+)
Reptiles (35+)
Amphibians (5+)
Fish (200+)

Genome papers barely scratch the surface of the mysteries genomes hold.

Slide5

Our closest living relative species

Slide6

What human-chimp changes do we find?

Small, Medium, Large

Slide7

What functional instructions can change?

Human/chimp genome: ~3 × 10^9 bp

Rough composition:

Genes 2%
Non-coding RNAs 2%
Regulatory DNA 10-15%
Repeats 40%
Other 40%
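As a back-of-the-envelope check, the rough percentages on this slide can be converted into approximate base-pair counts. The sketch below assumes a genome size of exactly 3 × 10^9 bp and uses 12.5%, the midpoint of the 10-15% range, for regulatory DNA. Both are simplifications made only for illustration, and the slide's categories are approximate and do not sum to exactly 100%.

```python
# Rough composition figures from the slide. The 12.5% midpoint for
# regulatory DNA is an assumption made only for this illustration.
GENOME_SIZE_BP = 3_000_000_000

composition_percent = {
    "Genes": 2.0,
    "Non-coding RNAs": 2.0,
    "Regulatory DNA": 12.5,
    "Repeats": 40.0,
    "Other": 40.0,
}


def bases_per_category(genome_size_bp, percents):
    """Convert percentages of the genome into approximate base-pair counts."""
    return {name: round(genome_size_bp * pct / 100) for name, pct in percents.items()}


approx_bp = bases_per_category(GENOME_SIZE_BP, composition_percent)
for name, bp in approx_bp.items():
    print(f"{name:16s} ~{bp / 1e6:,.0f} Mbp")
```

Under these assumptions, genes account for roughly 60 Mbp of the genome, while repeats alone cover about 1,200 Mbp.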

Slide8

Humans and Chimpanzees Possess Many Vastly Different Phenotypes

A: Chimp B: Human

[Varki, A. and Altheide, T., Genome Res., 2005]

Slide9

Genetic basis of human phenotypes?

Phenotype

Genotype

Number of rearrangements

Most mutations are near/neutral.

Slide10

The Genotype - Phenotype divide

Can we find evolutionary patterns that are distinct enough to be phenotypically revealing?

Species A, Species B

Problem #1: Too many nucleotide changes between any pair of related species (or individuals). The vast majority of these are near/neutral.

Slide11

Genotype -> Phenotype screens

Define a “dramatic” (non-neutral) genomic scenario: hCONDEL [McLean, Pollen, Reno et al., 2011]

Chimp: conserved

H: deleted!

Problem #2: What is the phenotype?

Slide12

Testing is a humbling experience

“Wild rides”: often not what we expected, often not what we can understand.

Did we have the right timepoint?

Did we find the right processes?

Slide13

What about a tree of related species?

What if we could find evolutionary patterns that were distinct enough to be phenotypically revealing?

ancestor

Species A, Species B, ..., Species H

Genomes: Inherited with Modifications.

Traits: Come and Go.

Slide14

Fixation, Positive & Negative Selection

Neutral Drift

Positive Selection

Negative Selection

Time
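The three regimes named on this slide can be illustrated with a small Wright-Fisher style simulation of a single allele. This is a minimal sketch, not the course's own material: the haploid model, the function name `simulate_allele_frequency`, and all parameter values are assumptions chosen for illustration.

```python
import random


def simulate_allele_frequency(pop_size, start_freq, selection, generations, seed=0):
    """Wright-Fisher style simulation of one allele in a haploid population.

    selection > 0 favors the allele (positive selection), selection < 0
    disfavors it (negative selection), and selection == 0 is neutral drift.
    selection must be greater than -1. Returns the allele frequency at
    generation 0 and after each simulated generation.
    """
    rng = random.Random(seed)
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Selection shifts the expected frequency before sampling.
        weighted = freq * (1.0 + selection)
        expected = weighted / (weighted + (1.0 - freq))
        # Drift: each of pop_size offspring draws its allele at random.
        carriers = sum(1 for _ in range(pop_size) if rng.random() < expected)
        freq = carriers / pop_size
        trajectory.append(freq)
    return trajectory


# Hypothetical parameters: 200 individuals, allele starting at 10%.
for label, s in [("neutral drift", 0.0), ("positive selection", 0.05),
                 ("negative selection", -0.05)]:
    final = simulate_allele_frequency(200, 0.1, s, 300, seed=42)[-1]
    print(f"{label:20s} final frequency: {final:.2f}")
```

Once the frequency reaches 0 or 1 it stays there: the allele is lost or fixed. Rerunning with different seeds shows how much of the outcome is chance, especially under neutral drift.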

Slide15

ancestral trait information

Trait information is no longer under selection

Erodes away over evolutionary time

ancestor

What happens when an ancestral trait “goes”?

Phenotype

Genome

Slide16

ancestral trait information

Trait information is no longer under selection

Erodes away over evolutionary time

ancestor

Phenotype

Genome

A lot of DNA and many traits vary between any two species.

Slide17

ancestral trait information

Trait information is no longer under selection

Erodes away over evolutionary time

ancestor

Phenotype

Genome

A lot of DNA and many traits vary between any two species.

What about independent trait loss?

Vitamin C synthesis, tail, body hair, dentition features, etc.

Slide18

ancestral trait information

Trait information is no longer under selection

Erodes away over evolutionary time

ancestor

Phenotype

Genome

Slide19

The P -> G screen

matches trait presence/absence pattern

[Hiller et al., 2012a]

Slide20

The PG screenhttp://cs273a.stanford.edu [Bejerano Winter 2018/19]

20

Capture the independent genomic switch from purifying selection  neutral evolution in all and only the trait loss species.Robust to: Different trait disabling times.Different trait disabling mutations.

Slide21

Forward Genetics: Search for mutations that segregate with a trait of interest

Forward Genomics: Search for regions that are lost only in species lacking the trait

phenotype

genotype

Branding ;-)

But does it work?

Slide22

Vitamin C Synthesis

synthesize vitamin C: rats & mice

cannot synthesize vitamin C: human

Slide23

The Vitamin C synthesis “phenotree”

vitamin C synthesis was lost 3-4 times independently in mammalian evolution

Fwd Genomics asks: Do one or more genomic loci look like THAT?

Slide24

We quantify divergence by comparing sequences to the reconstructed ancestral sequence

Mutation in species 1 or 2?

Insertion in species 1 or deletion in species 2?

reconstruct ancestral sequence

species 1: ACCCTATCGATTGCA
species 2: TCCGTATCG-TT-CA
outgroup: ACTCT-TCGATT-AA
ancestor: ACCCTATCGATT-CA

species 1: 14 identical bases, 93%

species 2: 11 identical bases, 79% -> more diverged
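The example on this slide can be reproduced in a few lines. The sketch below reads the slide's alignment as species 1 = `ACCCTATCGATTGCA`, species 2 = `TCCGTATCG-TT-CA`, and outgroup = `ACTCT-TCGATT-AA`, then infers the ancestor column by column with a simple parsimony rule. Both the parsimony rule and the gap handling (columns that are gaps in both sequences are skipped) are assumptions, chosen because they reproduce the slide's ancestor and its 93% and 79% figures. The actual pipeline may well differ.

```python
def reconstruct_ancestor(species1, species2, outgroup):
    """Column-wise parsimony for two aligned ingroup species and an outgroup."""
    ancestor = []
    for a, b, o in zip(species1, species2, outgroup):
        if a == b:
            ancestor.append(a)    # ingroup species agree
        elif a == o:
            ancestor.append(a)    # the change happened in species 2
        elif b == o:
            ancestor.append(b)    # the change happened in species 1
        else:
            ancestor.append("N")  # unresolved: all three differ
    return "".join(ancestor)


def percent_identity(ancestor, species):
    """Return (identical bases, percent identity) relative to the ancestor.

    Columns that are gaps in both sequences are skipped. This rule is an
    assumption that happens to reproduce the numbers on the slide.
    """
    identical = compared = 0
    for anc, sp in zip(ancestor, species):
        if anc == "-" and sp == "-":
            continue
        compared += 1
        if anc == sp:
            identical += 1
    return identical, 100.0 * identical / compared


species1 = "ACCCTATCGATTGCA"
species2 = "TCCGTATCG-TT-CA"
outgroup = "ACTCT-TCGATT-AA"

ancestor = reconstruct_ancestor(species1, species2, outgroup)
print("ancestor :", ancestor)
for name, seq in [("species 1", species1), ("species 2", species2)]:
    count, pct = percent_identity(ancestor, seq)
    print(f"{name}: {count} identical bases, {pct:.0f}%")
```

The outgroup is what resolves the two questions on the slide. Where species 1 and 2 disagree, the one that matches the outgroup is taken to carry the ancestral state.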

Slide25

Sequencing errors mimic divergence

ancestor: ACCCTATCGATT-CAATGG
species 1: ACCCTATCGATTGCAAGGG (89% identical bases)
species 2: TCCGTAACG--T-CTATCG (61% identical bases)

sequence quality scores

high sequencing error rate -> treat species 2 as missing data

Slide26

Assembly gaps mimic divergence

conserved region

species 1: ????????? (Sanger reads, assembly gap) -> treat species 1 as missing data

species 2

species 3

species 4

species 5
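Both artifacts (sequencing errors and assembly gaps) lead to the same remedy: record the species as missing data for that region rather than as diverged. A minimal sketch of such a filter follows. The function name, the Phred-style quality cutoff of 20, and the 10% tolerance are all hypothetical values chosen for illustration and are not taken from the slides.

```python
def divergence_or_missing(percent_id, quality_scores, has_assembly_gap,
                          min_quality=20, max_low_quality_fraction=0.1):
    """Return percent identity, or None when the measurement is unreliable.

    min_quality and max_low_quality_fraction are hypothetical thresholds.
    """
    if has_assembly_gap:
        # The region overlaps an assembly gap: absence of sequence is not loss.
        return None
    if quality_scores:
        low = sum(1 for q in quality_scores if q < min_quality)
        if low / len(quality_scores) > max_low_quality_fraction:
            # Too many low-quality bases: mismatches may be sequencing errors.
            return None
    return percent_id


# Hypothetical inputs: a clean region, a low-quality one, and an assembly gap.
print(divergence_or_missing(89.0, [40] * 10, False))
print(divergence_or_missing(61.0, [10] * 5 + [40] * 5, False))
print(divergence_or_missing(0.0, [40] * 10, True))
```

Returning `None` rather than a low percent identity keeps unreliable species from being counted either for or against a trait pattern later on.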

Slide27

Reconstruct the evolutionary history of all conserved regions, coding and non-coding

544,549 conserved regions

Reconstruct ancestral sequence (reconstruct ancestral locus)

Measure extant species divergence: 85%, 70%, 93%, ...

Avoid: low quality sequence, assembly gaps

matrix: 33 species x 544,549 regions

Seek perfect phenotree match

Slide28

We quantify the match to the vitamin C pattern by counting the number of species that violate the pattern

[Figure: example regions plotted on a percent identity axis from 0 to 100, labeled "1 violation" and "2 violations"]
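One way to make "counting the number of species that violate the pattern" concrete is sketched below. For a single region it looks for a percent-identity threshold that best separates trait-loss species (expected to be diverged) from trait-preserving species (expected to be conserved), and reports the smallest number of misplaced species. The threshold search, the function name, and the example values are assumptions made for illustration. The published method may score matches differently.

```python
def count_violations(percent_ids, has_trait):
    """Minimum number of species violating the trait pattern for one region.

    percent_ids: species -> percent identity to the ancestor, or None if missing.
    has_trait:   species -> True if the species still has the trait.
    Trait-loss species are expected below the threshold (diverged) and
    trait-preserving species at or above it (conserved).
    """
    known = {s: p for s, p in percent_ids.items() if p is not None}
    # Every observed value, plus infinity, covers all possible splits.
    candidates = sorted(set(known.values())) + [float("inf")]
    best = len(known)
    for threshold in candidates:
        violations = 0
        for species, pid in known.items():
            looks_diverged = pid < threshold
            # A violation is a diverged trait-preserving species
            # or a conserved trait-loss species.
            if looks_diverged == has_trait[species]:
                violations += 1
        best = min(best, violations)
    return best


# Made-up percent identities for one region; "bat" is treated as missing data.
has_trait = {"mouse": True, "rat": True, "cow": True,
             "human": False, "guinea pig": False, "bat": False}
region = {"mouse": 95.0, "rat": 93.0, "cow": 90.0,
          "human": 62.0, "guinea pig": 70.0, "bat": None}

print(count_violations(region, has_trait))
print(count_violations(dict(region, cow=65.0), has_trait))
```

In the first call every trait-loss species is more diverged than every trait-preserving species, so the region is a perfect match. In the second, the hypothetical cow value falls among the trait-loss species and costs one violation. Species with missing data are ignored.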

Slide29

Regions matching the vitamin C trait are clustered

[Figure: the 544,549 conserved regions binned by no. of violating species, with ticks 0 to 10, running from "perfect match" to "no match"; figure label: 8]

-> these conserved regions are all exons of a single gene

Slide30

This gene is more diverged in all non-vitamin C synthesizing species

Slide31

What is the function of this gene?

Gulo - gulonolactone (L-) oxidase

encodes the enzyme responsible for vitamin C biosynthesis

Vitamin C pattern

33 genomes x 544,549 regions

Note: No likely shared disabling mutation.

We learned about both evolution and function.

Slide32

The Power of Forward Genomics

Vitamin C pattern

Gulo - gulonolactone (L-) oxidase

33 genomes x 544,549 regions

Forward genomics works.

Can it work for continuous traits?

With only two independent losses?

And many unknown values?

Slide33

Bile

Bile is a fluid produced by the liver that aids the digestion of lipids in the small intestine.

Slide34

Bile Phospholipids

Different mammals have remarkably different levels of biliary phospholipids:

Slide35

ABCB4 is a phospholipid transporter

Slide36

Find “Cure” Models for Human Disease

Human ABCB4 mutations lower patient biliary phospholipid levels to guinea pig levels but are detrimental.

Our discovery: Guinea pig and horse have inactivated the Abcb4 gene in their natural state. How can they do it?

create KO gene: try to fix/treat

Natural KO: find nature’s cure!

Slide37

“Reverse Genomics” of Enhancers

Slide38

“Reverse Genomics” of Enhancers

(Marcovitz et al., MBE, 2016)

Slide39

“Reverse Genomics” of Enhancers

(Marcovitz et al., MBE, 2016)

Slide40

We uncover many enhancer-trait correlations