Evolution of the genes and genomes of 76 arthropod species


Presentation Transcript

Slide1

Evolution of the genes and genomes of 76 arthropod species

Gregg Thomas

Indiana University

@greggwcthomas

Arthropod Genomics Symposium, 06.09.17

Slide2

Arthropods are the largest group of multicellular organisms


Slide3

Arthropods are the largest group of multicellular organisms

https://fivethirtyeight.com/features/the-bugs-of-the-world-could-squish-us-all


Slide4

Arthropods exhibit great phenotypic and behavioral diversity


Slide5

The i5k pilot project has sequenced 27 insect genomes

https://www.hgsc.bcm.edu/arthropods/i5k

Slide6

27 genomes sequenced as part of the i5k pilot project

Slide7

27 genomes sequenced as part of the i5k pilot project

+ 49 previously sequenced arthropod genomes

Slide8

27 genomes sequenced as part of the i5k pilot project

+ 49 previously sequenced arthropod genomes

Spanning 21 arthropod orders

[Figure: arthropod phylogeny with clade labels (4 species), (2 species), (7 species), (24 species), (6 species), (5 species), (14 species)]

Slide9

Questions

What are the relationships among species and orders?

What are the patterns of genome evolution?

What did the genome of the last insect common ancestor (LICA) look like?

Slide10

Questions

What are the relationships among species and orders?

What are the patterns of genome evolution?

What did the genome of the last insect common ancestor (LICA) look like?

Slide11

Orthology prediction from OrthoDB

Rob Waterhouse

38,195 ortho-groups across 76 arthropod species

Slide12

Orthologs for phylogenetics

We rely on single-copy orthologs for species tree reconstruction to minimize gene tree discordance due to duplications and losses

How many are in our data?

Slide13

Orthologs for phylogenetics

We rely on single-copy orthologs for species tree reconstruction to minimize gene tree discordance due to duplications and losses

How many are in our data?

0
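The count of zero follows from requiring exactly one copy in every one of the 76 species. A minimal sketch of that filter is below, assuming a hypothetical table of copy numbers per ortho-group and species. The group IDs, species names, counts, and the `max_exceptions` parameter are all invented for illustration and are not OrthoDB data or the authors' pipeline.

```python
def single_copy_groups(counts, species, max_exceptions=0):
    """Return ortho-group IDs with exactly one copy in all but
    at most `max_exceptions` of the listed species."""
    kept = []
    for group, per_species in counts.items():
        # A species is an exception if it is missing the gene or has extra copies.
        exceptions = sum(1 for sp in species if per_species.get(sp, 0) != 1)
        if exceptions <= max_exceptions:
            kept.append(group)
    return kept


# Hypothetical toy data: three species, three ortho-groups.
species = ["sp1", "sp2", "sp3"]
counts = {
    "OG_a": {"sp1": 1, "sp2": 1, "sp3": 2},  # one duplicated species
    "OG_b": {"sp1": 1, "sp2": 0, "sp3": 1},  # one species missing the gene
    "OG_c": {"sp1": 3, "sp2": 1, "sp3": 4},  # multi-copy in two species
}

strict = single_copy_groups(counts, species)
relaxed = single_copy_groups(counts, species, max_exceptions=1)
print(strict, relaxed)
```

In the toy data no group passes the strict filter, while allowing a single exception recovers two groups. That mirrors the situation in the talk, where a group can be single-copy in all but one species.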

Slide14

EOG8DFS3J

Single-copy in all but one species (2 copies in Plutella xylostella)

Slide15

EOG8DFS3J

Single-copy in all but one species (2 copies in Plutella xylostella)

Problem: Hemiptera not monophyletic

Problem: Lepidoptera and Trichoptera within Diptera

Slide16

As the number of species increases, the number of sequenced single-copy genes decreases


Slide17

As the number of species increases, the number of sequenced single-copy genes decreases

How can we turn our species-rich data into sequence-rich data?

Slide18

As the number of species increases, the number of sequenced single-copy genes decreases

How can we turn our species-rich data into sequence-rich data?

Construct a backbone phylogeny by using single-copy orthologs among orders rather than species

Slide19

Backbone Phylogeny Construction

Phylum | # Orders | # single-copy orthologs
Arthropoda | 21 | 150

Slide20

Backbone Phylogeny Construction

2 alignment methods: MUSCLE, PASTA

Maximum Likelihood gene trees

3 species tree methods: Consensus, Concatenation, Coalescent

Phylum | # Orders | # single-copy orthologs
Arthropoda | 21 | 150

Slide21

The backbone phylogeny based on:
-150 genes
-PASTA alignment
-ASTRAL

[Figure: backbone phylogeny with clade labels (4 species), (2 species), (7 species), (24 species), (6 species), (5 species), (14 species)]

Slide22

The backbone phylogeny based on:
-150 genes
-PASTA alignment
-ASTRAL

Monophyletic Crustacea??

[Figure: backbone phylogeny with clade labels (4 species), (2 species), (7 species), (24 species), (6 species), (5 species), (14 species)]

Slide23

Monophyletic Crustacea?

Our inferred topology…

Slide24

Monophyletic Crustacea?

Our inferred topology…

…differs from other inferred topologies

Slide25

Slide26

We observe 4 of the 15 possible Pancrustacea topologies
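The figure of 15 matches the number of rooted, fully resolved trees on four lineages, which is (2n-3)!! for n = 4. The transcript does not name the lineages, so treating this as a four-lineage rooted problem is an assumption. A small sketch of the count:

```python
def rooted_binary_topologies(n_taxa):
    """Number of rooted, fully bifurcating topologies on n_taxa labeled tips.

    This is the double factorial (2n-3)!! = 1 * 3 * 5 * ... * (2n-3).
    """
    total = 1
    for k in range(3, 2 * n_taxa - 2, 2):
        total *= k
    return total


for n in (2, 3, 4, 5):
    print(n, rooted_binary_topologies(n))
```

The count grows quickly with the number of lineages, which is one reason even a small set of deep lineages leaves room for methods to disagree.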

Slide27

We observe 4 of the 15 possible Pancrustacea topologies

What happens if we decrease the number of species?

Slide28

Slide29

And if we decrease the number of species again?

Slide30

Slide31

Monophyletic Crustacea?

Even with more genes, methods still disagree on the correct topology

Of the 15 possible topologies, we recovered 8

Half the methods support a monophyletic Crustacea

Slide32

Monophyletic Crustacea?

Even with more genes, methods still disagree on the correct topology

Of the 15 possible topologies, we recovered 8

Half the methods support a monophyletic Crustacea

Slide33

Multi-species order phylogeny construction

2 alignment methods: MUSCLE, PASTA

Maximum Likelihood gene trees

3 species tree methods: Consensus, Concatenation, Coalescent

Order | # Species | # single-copy orthologs
Araneae | 4 | 1627
Hemiptera | 7 | 2053
Hymenoptera | 24 | 2121
Coleoptera | 6 | 3880
Lepidoptera | 5 | 3660
Diptera | 14 | 1324

Slide34

The Arthropod phylogeny

Araneae: 1627 genes
Hemiptera: 2053 genes
Hymenoptera: 2121 genes
Coleoptera: 3880 genes
Lepidoptera: 3660 genes
Diptera: 1324 genes

Slide35

All methods agree

Araneae: 1627 genes
Hemiptera: 2053 genes
Hymenoptera: 2121 genes
Coleoptera: 3880 genes
Lepidoptera: 3660 genes
Diptera: 1324 genes

Slide36

Disagreement between methods

Araneae: 1627 genes
Hemiptera: 2053 genes
Hymenoptera: 2121 genes
Coleoptera: 3880 genes
Lepidoptera: 3660 genes
Diptera: 1324 genes

Slide37

Questions

What are the relationships among species and orders?

What are the patterns of genome evolution?

What did the genome of the last insect common ancestor (LICA) look like?

Slide38

What are the rates of evolution?

Amino acid substitution rates

Gene gain/loss rates

Slide39

Evolutionary rates require a time tree

Used a non-parametric method to smooth the tree

Used several fossil calibrations from Misof et al.

Slide40

Arthropod Time Tree

LICA 350 mya

Holometabola 311 mya

Slide41

Substitution rates per site per year


Slide42

Gene duplications can lead to important functional evolution


Slide43

Gene family analysis: Example

Tips: observed variables

xi: hidden variables

[Figure: example tree with observed gene counts at the tips: 1, 3, 1, 1, 1, 0, 0]

Slide44

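Gene family analysis: Example

Tips: observed variables

xi: hidden variables

Our goal is to infer the states of the internal nodes of the tree

[Figure: example tree with observed gene counts at the tips (1, 3, 1, 1, 1, 0, 0) and inferred counts at the internal nodes (1, 1, 1, 1, 0)]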

Slide45

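Gene family analysis: Example

Tips: observed variables

xi: hidden variables

Our goal is to infer the states of the internal nodes of the tree

Then we can count changes along each lineage

[Figure: example tree with observed gene counts at the tips (1, 3, 1, 1, 1, 0, 0), inferred counts at the internal nodes (1, 1, 1, 1, 0), and changes of +2 and -1 marked on two branches]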
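Once internal states are in hand, counting changes along each lineage is a subtraction between each node and its parent. The sketch below assumes a hypothetical tree shape and node names, because the transcript keeps only the numbers from the figure and not its topology. It also takes the internal states as given rather than inferring them, so it illustrates only the counting step.

```python
# Hypothetical topology as a child -> parent map; not recovered from the slide.
parent = {
    "A": "n1", "B": "n1",
    "C": "n2", "D": "n2",
    "n1": "n3", "n2": "n3",
    "F": "n4", "G": "n4",
    "E": "n5", "n4": "n5",
    "n3": "root", "n5": "root",
}

# Gene counts: tips are observed, internal nodes are assumed already inferred.
counts = {
    "A": 1, "B": 3, "C": 1, "D": 1, "E": 1, "F": 0, "G": 0,
    "n1": 1, "n2": 1, "n3": 1, "n4": 0, "n5": 1, "root": 1,
}


def branch_changes(parent, counts):
    """Change in gene count along each branch (child count minus parent count)."""
    return {child: counts[child] - counts[par] for child, par in parent.items()}


changes = branch_changes(parent, counts)
nonzero = {node: delta for node, delta in changes.items() if delta != 0}
print(nonzero)
```

With these assumed states, the only branches that change are the one leading to tip B (a gain of two genes) and the one leading to the ancestor of F and G (a loss of one), matching the +2 and -1 in the example.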

Slide46

Genes gained/lost per year


Slide47

Genes gained/lost per year

Substitutions per site per year

No correlation between gain/loss rates and substitution rates

Slide48

Rapidly evolving gene families

# of rapidly evolving families


Slide49

Rapidly evolving gene families

# of rapidly evolving families

Several families related to venom and silk production rapidly expanding among spiders

Jessica Garb

Slide50

Rapidly evolving gene families

# of rapidly evolving families

German cockroach has highest number of rapidly evolving families, despite low gene gain/loss rate

EOG8D294J rapidly evolving only in Blattella germanica
-Gained 34 genes
-response to light stimulus
-locomotor rhythm

Slide51

Rapidly evolving gene families

# of rapidly evolving families

German cockroach has highest number of rapidly evolving families, despite low gene gain/loss rate

EOG8D294J rapidly evolving only in Blattella germanica
-Gained 34 genes
-response to light stimulus
-locomotor rhythm

Slide52

Tip-specific gene families

Families present only in a single species

Spikes occur in both species with low-quality AND highly annotated genomes

# of tip specific families

Slide53

Order-specific gene families

Families that are found only in that order AND in every species in that order

# of order specific families

Slide54

Order-specific gene families

Families that are found only in that order AND in every species in that order

# of order specific families

Large number of Lepidoptera specific families
-5 families with odorant/olfactory functions
-3 families involved in response to stress
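The two definitions used in this part of the talk (tip-specific: present in a single species; order-specific: present only in one order and in every species of that order) translate directly into set comparisons. The sketch below is illustrative only. The species names, family names, counts, and the `order_of` mapping are hypothetical and not drawn from the i5k data.

```python
def tip_specific(counts):
    """Families present in exactly one species; returns {family: species}."""
    result = {}
    for fam, per_species in counts.items():
        present = [sp for sp, n in per_species.items() if n > 0]
        if len(present) == 1:
            result[fam] = present[0]
    return result


def order_specific(counts, order_of):
    """Families found only in one order AND in every species of that order.

    Returns {family: order}.
    """
    result = {}
    for fam, per_species in counts.items():
        present = {sp for sp, n in per_species.items() if n > 0}
        orders = {order_of[sp] for sp in present}
        if len(orders) != 1:
            continue  # absent everywhere, or present in more than one order
        order = next(iter(orders))
        members = {sp for sp, o in order_of.items() if o == order}
        if present == members:
            result[fam] = order
    return result


# Hypothetical toy data.
order_of = {"moth1": "Lepidoptera", "moth2": "Lepidoptera",
            "fly1": "Diptera", "fly2": "Diptera"}
counts = {
    "fam1": {"moth1": 2, "moth2": 1, "fly1": 0, "fly2": 0},
    "fam2": {"moth1": 4, "moth2": 0, "fly1": 0, "fly2": 0},
    "fam3": {"moth1": 1, "moth2": 1, "fly1": 1, "fly2": 0},
}

print(tip_specific(counts))
print(order_specific(counts, order_of))
```

Under these definitions an order represented by a single species would report its tip-specific families as order-specific too, so the two tallies overlap for sparsely sampled orders.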

Slide55

Questions

What are the relationships among species and orders?

What are the patterns of genome evolution?

What did the genome of the last insect common ancestor (LICA) look like?

Slide56

What does the ancestral insect look like?


Slide57

What does the ancestral insect look like?

How can we infer characteristics about the genome of the last insect common ancestor (LICA)?


Slide58

How can we infer characteristics of the genome of LICA?

[Figure: example gene family tree with observed gene counts at the tips, hidden states x1 to x19 at the internal nodes, and the LICA node marked]

Slide59

How can we infer characteristics of the genome of LICA?

How many genes were present in the LICA genome?

[Figure: example gene family tree with observed gene counts at the tips, hidden states x1 to x19 at the internal nodes, and the LICA node marked]

Slide60

How can we infer characteristics of the genome of LICA?

How many genes were present in the LICA genome?

9,601 genes

[Figure: example gene family tree with observed gene counts at the tips, hidden states x1 to x19 at the internal nodes, and the LICA node marked]
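One way to arrive at a genome-wide total like this is to sum the inferred LICA copy number over every gene family. Whether the talk computed its number exactly this way is an assumption, and the family names, node names, and counts in the sketch are invented.

```python
def ancestral_gene_count(node_states, node="LICA"):
    """Sum the inferred copy number at `node` across all gene families."""
    return sum(states[node] for states in node_states.values())


# Hypothetical inferred states for three families at two internal nodes.
node_states = {
    "famA": {"LICA": 2, "x1": 2},
    "famB": {"LICA": 0, "x1": 1},
    "famC": {"LICA": 1, "x1": 1},
}

total = ancestral_gene_count(node_states)
print(total)
```

A sum of this kind can only include families that survive in at least one sampled species, so families that have since gone extinct contribute nothing to it.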

Slide61

Estimates of ancestral genome size are biased because of extinct gene families

LICA


Slide62

Corrected # of genes in the LICA genome: 14,615

[Figure: # of genes, with LICA marked]

Slide63

How can we infer characteristics of the genome of LICA?

Which families were ‘born’ during the transition to insects?

[Figure: example gene family tree with zero copies at many tips and internal nodes, one or more copies at the remaining tips and internal nodes, and a +1 gain marked on one branch]

Slide64

How can we infer characteristics of the genome of LICA?

Which families were ‘born’ during the transition to insects?

147 novel insect families

[Figure: example gene family tree with zero copies at many tips and internal nodes, one or more copies at the remaining tips and internal nodes, and a +1 gain marked on one branch]
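A family can be read as 'born' on the branch into LICA if its inferred count is zero at LICA's parent node and at least one at LICA itself. This reading of the +1 in the figure is an assumption, as are the node name `pre_LICA`, the family names, and the counts used in the sketch.

```python
def families_born_at(node_states, node, parent_node):
    """Families absent at `parent_node` but present at `node`, sorted by name."""
    return sorted(
        fam for fam, states in node_states.items()
        if states[parent_node] == 0 and states[node] > 0
    )


# Hypothetical inferred states at LICA and at the node directly ancestral to it.
node_states = {
    "famA": {"pre_LICA": 0, "LICA": 1},  # gained on the branch into LICA
    "famB": {"pre_LICA": 1, "LICA": 1},  # already present before insects
    "famC": {"pre_LICA": 0, "LICA": 0},  # arose later, within insects
    "famD": {"pre_LICA": 0, "LICA": 2},  # gained on the branch into LICA
}

novel = families_born_at(node_states, node="LICA", parent_node="pre_LICA")
print(novel)
```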

Slide65

Novel insect families correspond to insect lifestyle adaptations

Changes in exoskeleton development
-7 chitin and cuticle production families

Slide66

Novel insect families correspond to insect lifestyle adaptations

Changes in exoskeleton development
-7 chitin and cuticle production families

Ability to sense in a terrestrial environment
-1 visual learning and behavior family
-2 odorant binding families
-5 families involved in neural activity

Slide67

Novel insect families correspond to insect lifestyle adaptations

Changes in exoskeleton development
-7 chitin and cuticle production families

Ability to sense in a terrestrial environment
-1 visual learning and behavior family
-2 odorant binding families
-5 families involved in neural activity

Unique development
-1 larval behavior family
-4 imaginal disk development families

Slide68

Novel insect families correspond to insect lifestyle adaptations

Changes in exoskeleton development
-7 chitin and cuticle production families

Ability to sense in a terrestrial environment
-1 visual learning and behavior family
-2 odorant binding families
-5 families involved in neural activity

Unique development
-1 larval behavior family
-4 imaginal disk development families

Flight
-3 wing morphogenesis families

Slide69

All data has been made available in our online tool

https://cgi.soic.indiana.edu/~grthomas/i5k/i5k_phylo.html

Slide70

All data has been made available in our online tool

https://cgi.soic.indiana.edu/~grthomas/i5k/i5k_phylo.html

Slide71

Acknowledgments

Matthew Hahn
Stephen Richards
Rob Waterhouse
Jessica Garb
Elias Dohmen
Ariel Chipman
The i5k community
The Hahn lab + Clara Boothby

Gene family website: https://cgi.soic.indiana.edu/~grthomas/i5k/i5k_phylo.html

i5k website: http://i5k.github.io/