A null model for phenotype-fitness landscapes and the distribution of mutation effects on fitness: using random matrix theory

Uploaded On 2020-08-29


Guillaume Martin, Institut des Sciences de l'Evolution (ISEM, UMR 5554), Université Montpellier II, CNRS, IRD



Presentation Transcript

Slide1

A null model for phenotype-fitness landscapes and the distribution of mutation effects on fitness: using random matrix theory

Guillaume Martin
Institut des Sciences de l’Evolution (ISEM), UMR 5554, Université Montpellier II - CNRS - IRD

Martin G. (2014) « Fisher’s geometrical model emerges as a property of complex integrated phenotypic networks », Genetics 197(1): p. 237-255

Slide2

The DFE: definition

DFE (Distribution of Fitness Effects): distribution of the change in fitness produced by random mutations (sometimes refers to beneficial ones only).

[Figure: distribution of mutation effects, showing deleterious mutations and beneficial mutations; % and effect of each type]

Slide3
The DFE: implications

Adaptation from de novo mutations: all models of adaptation from new alleles
Mutation-selection balance = null model of standing variance: distribution of fitness at equilibrium
Cost of adaptation and evolution in heterogeneous environments: DFE across environmental contexts
Epistasis/dominance and evolution of the genetic system (sex, inbreeding etc.): DFE across genetic backgrounds

Slide4
The DFE: implications

The challenge:

Given:
a set of genotypes
their initial frequency, their fitnesses
the demographic stochasticity
population genetics => fate of the population (adaptation, demography)

Dynamical system with fixed number of components
Dynamical system with dynamical number of components
MUTATION => need to predict the above data for these « yet to appear » types ~ DFE

Slide5
DFE: empirical measurement

Generating mutants:
Single random mutations in gene or genome (transposon inserts, site-directed mutagenesis etc.)
Gene deletion sets (cover ~ all genes)
Mutation-accumulation experiments

Measuring mutant « fitness »:
Survival, growth rate, competitive index => one or two mutants
Molecular tags/deep sequencing => joint growth of many mutants

See e.g. Hietpas et al. (2011) PNAS & (2013) Evolution.

Slide6

The Good, the Bad and the Ugly

[Figure: empirical DFEs (beneficial, deleterious, lethal) in several organisms]
VSV, TEV: Carrasco et al. 2007 J. Virol.; Sanjuan & Elena PNAS 2004
φX174: Domingo-Calap 2009 PLoS Genetics
E. coli: Elena et al. 1998 Genetica
Yeast: Szafraniec et al. 2003 Genetics

Special issue: Phil. Trans. R. Soc. Lond. B. Biol. Sci. (2010)

Slide7
DFE: patterns

Surprisingly simple in some aspects:
Majority = deleterious, average = deleterious
Skewed to the right
A portion of lethals
Why such apparent generalities?

But context-dependent:
Epistasis is pervasive
Variance (and mean to a lesser extent) depend on environment/context (like sexes)

Predict this context-dependence

Slide8
Two main research programs

1. Heuristic models: « top-down »
Focus on DFE or quantitative traits
Set of simplifying assumptions
Aim at generality
Parameterized from the DFE itself: « effective » parameters
Ex: Fisher’s (1930) geometrical model (FGM) ~ Lande’s (1980) models

2. Mechanistic models: « bottom-up »
Focus on basic underlying traits (ex: metabolic fluxes in cells)
Optimality assumptions
Specific to particular known species x environment
Set of empirically known + parameterized relationships
Ex: Flux-Balance Analysis (FBA)

Slide9
1. Fisher’s Geometrical Model (FGM)

Here: anisotropic version

[Figure: fitness as a function of Trait 1 and Trait 2, showing the phenotype z, the optimal phenotype, the parental phenotype and the mutant phenotypes; labels w_o, w_mut and s]

ASSUMPTIONS
Existence of an optimum for a set of traits
Quadratic / Gaussian fitness function
Centered Gaussian effects of mutation on phenotype
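The three assumptions above are enough to simulate a DFE. The following is a minimal sketch, not the author's code: it assumes a Gaussian fitness function w(z) = exp(-|z|^2 / 2) with the optimum at the origin, and independent centered Gaussian mutation effects with per-trait variances `lambdas`. The function name, parameter names and numerical values are all hypothetical.

```python
import numpy as np

def simulate_fgm_dfe(z_parent, lambdas, n_mutants=100_000, seed=0):
    """Sample selection coefficients s = log(w_mut / w_parent).

    Assumptions (illustrative only): fitness w(z) = exp(-|z|^2 / 2),
    optimum at the origin, mutation effects dz ~ N(0, diag(lambdas)).
    """
    rng = np.random.default_rng(seed)
    z_parent = np.asarray(z_parent, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    # Centered Gaussian mutation effects on each trait.
    dz = rng.normal(size=(n_mutants, z_parent.size)) * np.sqrt(lambdas)
    z_mut = z_parent + dz
    log_w_parent = -0.5 * np.sum(z_parent ** 2)
    log_w_mut = -0.5 * np.sum(z_mut ** 2, axis=1)
    return log_w_mut - log_w_parent

n_traits = 5
lambdas = np.full(n_traits, 0.01)  # equal variances: isotropic special case
s_opt = simulate_fgm_dfe(np.zeros(n_traits), lambdas)
s_off = simulate_fgm_dfe(np.array([0.3, 0.0, 0.0, 0.0, 0.0]), lambdas)

print("parent at optimum : fraction beneficial =", np.mean(s_opt > 0))
print("parent off optimum: fraction beneficial =", np.mean(s_off > 0))
```

With the parent at the optimum every mutation is deleterious or neutral; moving the parent away from the optimum lets some mutations become beneficial, while the majority stays deleterious.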

Slide10
1. Fisher’s Geometrical Model (FGM)

Empirical validations:
Gamma shape of DFEs in optimal conditions
Epistasis distribution among random pairs of mutations
Rate of adaptation across different genetic backgrounds

Still few tests: virus, E. coli, yeast, Drosophila.
Still issues about variation across genes, across environments

Perfeito et al. Evolution (2013)
Martin et al. Nature Genetics (2007)

Slide11
1. Fisher’s Geometrical Model (FGM)

General criticism / limits:
Too heuristic/simple to be realistic
Single optimum
Only tests on predictions, not on premises
« Traits » not satisfyingly defined
No mechanism

Slide12
2. Flux-Balance Analysis (FBA)

Metabolic fluxes in a cell => growth rate
Metabolic network => stoichiometric equations (all metabolites)
Steady state (metabolic equilibrium)
Linear programming: constrain the solution to optimize some criterion or set of criteria (growth rate, ATP yield etc.)
DFE: remove a gene = remove a reaction in the system => new steady state = mutant fitness

Slide13
2. Flux-Balance Analysis (FBA)

Empirical validations (reviewed in Harcombe et al. 2013, PLoS Comp. Biol.):
Metabolic fluxes in model microbes, red blood cell
Growth rates of gene deletion sets
Epistasis between gene deletions

Fong & Palsson, Nature Genetics (2004)
Harcombe et al., PLoS Comp. Biol. (2013)

Slide14
2. Flux-Balance Analysis (FBA)

General criticism / limits (for the DFE purpose):
Too specific/mechanistic to be generalized: non-model species, complex/unknown metabolic environments, multicellulars
Single optimum
Effect of other mutations than gene deletions?
Effects on other things than metabolism

Slide15
Common points / differences

Some genotype - phenotype - fitness mapping:
FGM: a priori
FBA: empirically parameterized (partially)

Optimization:
FGM: at some (small) distance from optimum
FBA: some trait is optimized (ATP, growth rate etc.)

Pleiotropy:
FGM: assumed dimensionality (measurable from DFEs)
FBA: each deletion changes many fluxes via the network

 

Slide16

Can we link the two approaches?

[Diagram linking: System’s Biology; Experimental evolution; set of general qualitative premises; statistical treatment (laws of large numbers); macroscopic outcome; simplified phenotype - fitness mapping; DFE]

Slide17

General Assumptions

1. Pleiotropy (many components jointly affected by each mutation)
2. Weak unbiased mutation effects on basic functions (local analysis around the parent phenotype)
3. Phenotypic integration (pyramidal integration from basic functions to fitness)
4. Optimization (existence of a locally unique optimum)

Slide18

PLEIOTROPY

High pleiotropy & small world networks
E. coli, yeast metabolic network
Many components
Some are hubs: « small world » property, « scale free » property

[Figure: a "hairball" depiction of the E. coli metabolic network extracted from KEGG and visualized using Cytoscape. Reactions are in magenta and chemical compounds are in green. Source: http://www.kavrakilab.org/bioinformatics/metapath]

Slide19

Hierarchy: phenotypic networks are « integrated »

[Figure; source: Yan K et al. PNAS (2010)]

FBA: optimization functions
growth rate = f(metabolite fluxes)
Much fewer metabolites enter the function than the total system

Hierarchy and integration: many « mutable » traits => fewer « optimized » traits

Slide20

UNIQUE LOCAL OPTIMUM

Proximity of a local optimum:
Experimental evolution: fitness curves saturate
FBA: flux distributions are close to « optimal » in evolved systems

[Figure: mean Malthusian relative fitness (0.0 to 0.6) against generations (0 to 10000). Adaptation to glucose minimal medium in E. coli, Lenski & Travisano (1994)]
[Figure; source: Schuetz et al. Science (2012)]

Slide21

Model Assumptions

1. Pleiotropy
2. Weak unbiased mutations: general distribution, unbiased
3. Phenotypic integration
4. Local optimum: around a local optimum

[Diagram: network schematic with the four assumptions numbered 1 to 4]

Slide22

Key local approximations

(2) Mutations have mild effects => linear approximation around the parent, where B is the matrix containing all « pathway coefficients » from mutable to optimized traits

(4) Nearby fitness optimum => there is a basis for the optimized traits where the fitness function is quadratic

 

Slide23

Large number approximation 1: CLT

Local analysis & Central Limit Theorem: optimized traits are Gaussian

M: mutational covariance between optimized traits, determined by
the mutational covariances between mutable traits
B: pathway coefficients between mutable and optimized traits

=> Anisotropic Fisher’s model:
Gaussian effects of mutation
Quadratic fitness function on traits
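The central limit argument can be checked numerically. This is a hedged sketch under assumptions that are not in the slides: mutation effects on mutable traits (`dx`) are independent, uniform and of unit variance, the linear approximation reads dy = dx B, and the coefficients in `B` are drawn at random. With unit-variance independent effects the covariance between optimized traits is then B^T B.

```python
import numpy as np

rng = np.random.default_rng(1)
n_mutable, n_optimized, n_mutants = 200, 3, 20_000

# Hypothetical pathway coefficients from mutable to optimized traits.
B = rng.normal(size=(n_mutable, n_optimized)) / np.sqrt(n_mutable)

# Deliberately non-Gaussian, unbiased effects on mutable traits (unit variance).
dx = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n_mutants, n_mutable))

# Linear approximation around the parent: effects on optimized traits.
dy = dx @ B

M_theory = B.T @ B                  # covariance implied by unit-variance dx
M_sample = np.cov(dy, rowvar=False)

def excess_kurtosis(v):
    """Zero for a Gaussian; -1.2 for a uniform distribution."""
    v = v - v.mean()
    return np.mean(v ** 4) / np.mean(v ** 2) ** 2 - 3.0

kurt = [excess_kurtosis(dy[:, j]) for j in range(n_optimized)]
print("max |M_sample - M_theory| =", np.abs(M_sample - M_theory).max())
print("excess kurtosis of optimized traits:", np.round(kurt, 3))
```

Although each mutable trait has a clearly non-Gaussian effect distribution, the effects on the optimized traits are close to Gaussian because each one sums many small contributions.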

Slide24
Isotropy and the spectral distribution of M

Spectral distribution of M: distribution of the eigenvalues of M across traits

[Figure: isofitness curves under isotropy and under anisotropy]

Isotropy: all eigenvalues of M are equal
Anisotropy: the eigenvalues differ across traits
Direction, not just distance, has an effect => less predictable from only fitness data
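The role of direction can be made concrete with the first two moments of the DFE. The sketch below assumes a quadratic log-fitness -|z|^2 / 2 in a basis where M is diagonal with eigenvalues `lambdas`, so that s = sum_i(-z_i sqrt(lambda_i) x_i - lambda_i x_i^2 / 2) with standard normal x_i. Under that assumption the mean is -sum(lambda_i) / 2 and the variance is sum(lambda_i^2) / 2 + sum(z_i^2 lambda_i). The numerical values are hypothetical.

```python
import numpy as np

def dfe_mean_var(z, lambdas):
    """Mean and variance of s for a parent at position z (diagonal basis of M)."""
    z = np.asarray(z, dtype=float)
    lambdas = np.asarray(lambdas, dtype=float)
    mean = -0.5 * lambdas.sum()
    var = 0.5 * np.sum(lambdas ** 2) + np.sum(z ** 2 * lambdas)
    return mean, var

iso = np.array([0.02, 0.02])      # isotropic: equal eigenvalues
aniso = np.array([0.035, 0.005])  # anisotropic: same total, unequal eigenvalues

z_a = np.array([0.3, 0.0])        # same distance to the optimum ...
z_b = np.array([0.0, 0.3])        # ... in two different directions

for name, lam in (("isotropic", iso), ("anisotropic", aniso)):
    print(name, "direction a:", dfe_mean_var(z_a, lam))
    print(name, "direction b:", dfe_mean_var(z_b, lam))
```

In the isotropic case the two directions give the same moments, so the distance to the optimum is all that matters. In the anisotropic case the variance depends on which direction the parent is displaced in.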

Slide25

Large number approximation 2: random matrix theory

Distribution of the eigenvalues of M:
Many coefficients, no idea of their values
Consider them distributed: large random matrices
Theory of spectral distributions = Random Matrix Theory
Many known results (a branch of probability theory, applied to nuclear physics, statistics, neurology etc.)
Overlooked in ecology/evolution, I think

Slide26

Large number approximation 2: random matrix theory

Pleiotropy plus integration:
The spectral distribution of M converges to the Marçenko-Pastur distribution when B contains iid elements with zero mean
A form of central limit theorem: independent of the distribution of the elements in B (same general conditions as in central limit theorem)
Convergence is fast

V. A. Marçenko and L. A. Pastur (1967). Distribution of eigenvalues in certain sets of random matrices. Mat. Sb. (N.S.), 72 (114): 507–536.
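A quick numerical illustration of this convergence, as a sketch under stated assumptions: `B` has iid zero-mean, unit-variance (here uniform) entries, M is taken as B^T B / n_mutable, and the ratio is written c = n_optimized / n_mutable, a convention chosen for this example rather than taken from the talk. The Marçenko-Pastur law with unit scale then has support [(1 - sqrt(c))^2, (1 + sqrt(c))^2] and mean 1.

```python
import numpy as np

def mp_edges(c, scale=1.0):
    """Support edges of the Marcenko-Pastur law for ratio c <= 1."""
    return scale * (1 - np.sqrt(c)) ** 2, scale * (1 + np.sqrt(c)) ** 2

rng = np.random.default_rng(2)
n_mutable, n_optimized = 2000, 200
c = n_optimized / n_mutable

# Non-Gaussian iid entries with zero mean and unit variance.
B = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n_mutable, n_optimized))
M = B.T @ B / n_mutable
eig = np.linalg.eigvalsh(M)

lo, hi = mp_edges(c)
print(f"MP support : [{lo:.3f}, {hi:.3f}]")
print(f"eigenvalues: [{eig.min():.3f}, {eig.max():.3f}], mean {eig.mean():.3f}")
```

The empirical eigenvalues fill the predicted support even though the entries are uniform rather than Gaussian, which is the distribution-free behaviour the slide describes.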

Slide27

Marçenko-Pastur approximation

Problem: B contains non-iid elements
[...]: iid elements with zero mean
[...]: some fixed covariance matrix

=> Approximation of the limit spectral distribution (LSD) by a Marçenko-Pastur law with equivalent dimensions

We have: [...] where [...] and some scale [...]
When [...], with [...]: coefficient of variation of the eigenvalues of [...]

 

Slide28
Marçenko-Pastur approximation

Sketch of derivation: use the S transform of the LSD of [...], approximate it to leading order in [...], retrieve the S transform of an MP law
The S transform uniquely determines the LSD

When [...], with [...]: coefficient of variation of the eigenvalues of [...]

 

Slide29

The Marçenko-Pastur distribution

[Figure: spectral distributions for different distributions of the elements in B: uniform, normal, mixture of the two; one panel labelled « less isotropic »]

Key ratio shaping the distribution of eigenvalues: [...]: number of mutable / optimized traits
Covariance between pathway coefficients & mutable traits => reduction to [...]
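The effect of the key ratio can be sketched numerically. The assumptions are again illustrative: iid zero-mean, unit-variance entries in `B` and M = B^T B / n_mutable. For the Marçenko-Pastur law with c = n_optimized / n_mutable, the coefficient of variation of the eigenvalues is sqrt(c), so it shrinks as the number of mutable traits per optimized trait grows. All names and sizes below are hypothetical.

```python
import numpy as np

def eigenvalue_cv(n_mutable, n_optimized, sampler, rng):
    """Coefficient of variation of the eigenvalues of M = B^T B / n_mutable."""
    B = sampler(rng, (n_mutable, n_optimized))
    eig = np.linalg.eigvalsh(B.T @ B / n_mutable)
    return eig.std() / eig.mean()

# Two element distributions with zero mean and unit variance.
samplers = {
    "normal": lambda rng, shape: rng.normal(size=shape),
    "uniform": lambda rng, shape: rng.uniform(-np.sqrt(3), np.sqrt(3), size=shape),
}

rng = np.random.default_rng(3)
n_optimized = 50
results = {}
for ratio in (2, 10, 50):  # mutable traits per optimized trait
    for name, sampler in samplers.items():
        results[(ratio, name)] = eigenvalue_cv(ratio * n_optimized, n_optimized, sampler, rng)
    print(f"ratio {ratio:>2}: normal CV = {results[(ratio, 'normal')]:.3f}, "
          f"uniform CV = {results[(ratio, 'uniform')]:.3f}, "
          f"sqrt(1/ratio) = {np.sqrt(1 / ratio):.3f}")
```

The two element distributions give nearly the same spread, and the spread falls as the ratio grows, so the eigenvalues become more nearly equal.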

Slide30
High developmental integration => isotropy

Developmental integration: [...] => ISOTROPIC Fisher’s model
All traits ~ equivalent for selection & mutation
Fairly general conclusion, as general conditions for:
Distribution of mutation effects on mutable traits
Developmental function relating mutable to optimized traits
But things can be slightly more complicated

 

Slide31

Effect of non zero mean (bias)

Problem: B contains elements with non zero mean
« Equivalent »: iid elements with zero mean, plus some fixed matrix of small rank (here [...])
Use a result from Benaych-Georges & Nadakuditi (2011), Advances in Mathematics 227(1)
Phase transition of the first eigenvalues out of the MP law as the coefficient of variation of the elements decreases (bias)
Here: just the maximum eigenvalue
And we may use the result in the MP approximation limit

Slide32

Phase transition

[...] iff [...]
[...]: vector of means of the coefficients in B
[...]: matrix of covariance between coefficients in B

 

Slide33
Developmental bias => anisotropy

So far, pathway coefficients = unbiased
If there is bias in the distribution of coefficients (sufficiently so): a predictable « phase transition » to anisotropy = a single leading direction
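The emergence of a single leading direction can be illustrated empirically. This sketch assumes Gaussian coefficients with unit variance and a common mean `mean_coef`, and M = B^T B / n_mutable. It compares the two largest eigenvalues with the upper edge of the Marçenko-Pastur bulk. It does not reproduce the threshold condition from the cited result, and the values used are hypothetical.

```python
import numpy as np

def top_eigen_vs_bulk(mean_coef, n_mutable=1000, n_optimized=100, seed=4):
    """Largest and second-largest eigenvalues of M, plus the MP upper edge."""
    rng = np.random.default_rng(seed)
    # Biased coefficients: common mean plus unit-variance noise.
    B = mean_coef + rng.normal(size=(n_mutable, n_optimized))
    eig = np.linalg.eigvalsh(B.T @ B / n_mutable)  # ascending order
    c = n_optimized / n_mutable
    bulk_edge = (1 + np.sqrt(c)) ** 2
    return eig[-1], eig[-2], bulk_edge

top0, second0, edge = top_eigen_vs_bulk(0.0)   # unbiased coefficients
top_b, second_b, _ = top_eigen_vs_bulk(0.5)    # strongly biased coefficients

print(f"MP upper edge: {edge:.2f}")
print(f"unbiased: largest = {top0:.2f}, second = {second0:.2f}")
print(f"biased  : largest = {top_b:.2f}, second = {second_b:.2f}")
```

With unbiased coefficients the largest eigenvalue sits at the edge of the bulk. With a strong bias one eigenvalue detaches far above it while the others stay in the bulk, which corresponds to one dominant direction.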

Slide34
DFE

Cumulant generating function of the DFE: [...]
Link to spectral distribution of M:
[...]: Shannon transform of the spectral distribution of M
[...]: fitness distance of the parent genotype to the optimal genotype

Below phase transition ([...]), Marçenko-Pastur limit:
[...]: analytic form, function of [...]
Approximately isotropic: deviation is of order [...]

Beyond phase transition ([...]):
Convolution of the Marçenko-Pastur limit for the smallest eigenvalues and the contribution from the dominant eigenvalue
Anisotropy: analytic DFE, depends on fitness distances in two subspaces

 

Slide35
Effect of the transition on DFEs

[Figure: simulation vs. analytic DFEs]

The isotropic Fisher model is accurate
No directionality effects

Developmental bias => anisotropic FGM
Fatter tails
Directionality critical: behaves ~ as if one leading dimension

In all cases: network model converges to a much simpler model with 3 to 5 measurable parameters (FGM)

Slide36
Empirical DFEs

Engineered SNPs in two ribosomal genes of Salmonella + estimate s
NB: non-syn had same DFE as syn mutations (!!)
Lind et al. Science (2010)

[Figure: FGM fits to the DFEs of the two genes. No anisotropy detected: [...]. Weak anisotropy detected: [...]]

Slide37
Conclusions

Try to justify Fisher’s model « from first principles »: it emerges from a much more complex network model, given a few qualitative properties
Perspective: analyze empirical network models

Variation of DFE among genes/modules:
[...] varies but this has little effect, [...] varies?
scale [...] varies or averages up?
Beyond or below phase transition = essential vs. non-essential genes? Cause of parallel evolution?
Perspective: study different gene-specific DFEs

A clearer definition of traits in the FGM: which pleiotropy is important for adaptation ([...], not [...])

 

Slide38
Thank you!

Collaborators on Fisher’s Geometric model:
Luis M. Chevin (CEFE Montpellier) (+ proofread the article)
Thomas Lenormand (CEFE Montpellier)
Sylvain Gandon (CEFE Montpellier)
David Waxman (Fudan University, Shanghai)
Ophélie Ronce (ISEM Montpellier)

For details see: Martin G. « Fisher’s geometrical model emerges as a property of complex integrated phenotypic networks », Genetics 197(1): p. 237-255

Slide39

Marçenko-Pastur Law

When [...], [...], Marçenko-Pastur law:
Parameters: ratio [...] and scale [...] set by the mean of [...]
PDF: [...] with [...]

Convergence to isotropy as [...] with higher phenotypic integration

 

Slide40

Marçenko-Pastur Approximation under anisotropy

[...] as long as [...] and [...]
ratio [...] and scale [...]
[...]: coefficient of variation of the eigenvalues of [...]
Tools: approximate some generating functions associated with [...]
Equivalent pleiotropy: [...]

Slide41
« Phase transition » to anisotropy

So far key assumption: [...] => Marçenko-Pastur law
If the pathway coefficients have a strongly biased distribution: phase transition
[...]: ~ a coefficient of variation of the [...]
If [...]: the maximal eigenvalue rises above the bulk
Tools: simple application of Benaych-Georges & Nadakuditi (2011) for [...]

 

Slide42

 

 

 

« Phase transition » to anisotropy

Slide43

 

« Phase transition » to anisotropy in the general case

Anisotropy: for a given total distance, different directions give different DFEs

 

 

 

 

Slide44

Empirical parameterization

Isotropic case or anisotropic case: we can estimate [...] in permissive conditions

Ex: random single nucleotide substitutions in two ribosomal genes in Salmonella
Data: Lind & Andersson (2010), Science (pooled syn and non-syn mutations)

[Figure: anisotropic fit and isotropic fit to the DFE of each gene]

Strikingly similar [...] and [...] in the two genes; one suggests a significant [...]
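One simple way to parameterize the isotropic case is a method-of-moments fit. This is a sketch, not the fitting procedure used in the talk. It assumes the parent sits at the optimum of an isotropic model, so that -s follows a gamma distribution with shape n/2 and scale lambda, where n is the number of traits and lambda the mutational variance per trait. The data are synthetic, and `fit_isotropic_fgm`, `true_n` and `true_lambda` are hypothetical names.

```python
import numpy as np

def fit_isotropic_fgm(s):
    """Moment estimates assuming -s ~ Gamma(shape = n / 2, scale = lambda)."""
    cost = -np.asarray(s, dtype=float)
    mean, var = cost.mean(), cost.var(ddof=1)
    shape = mean ** 2 / var   # gamma shape = mean^2 / variance
    scale = var / mean        # gamma scale = variance / mean
    return {"n_traits": 2.0 * shape, "lambda": scale}

# Synthetic DFE with known parameters, to check that the estimator recovers them.
rng = np.random.default_rng(5)
true_n, true_lambda = 6, 0.01
synthetic_s = -rng.gamma(shape=true_n / 2, scale=true_lambda, size=50_000)

est = fit_isotropic_fgm(synthetic_s)
print(est)
```

On real data the estimates would also absorb measurement error in s, which this sketch ignores.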

Slide45
Applications: phase transition and essential genes

Essential genes: lethal when deleted. Genes/modules that sample a set of strongly biased pathway coefficients [...]: potential to create lethals
Non-essential genes: sample unbiased pathway distributions ([...])

May all have roughly the same [...] and [...] because:
Same set of traits under optimized selection (genes differ in their subspace)
Randomly sampling pathway coefficients among all pathways ([...] averages up)

Testable:
Look for essential genes
Produce single mutants and estimate [...] as above

 

Slide46
Applications: predictions in the isotropic case

The model may explain why:
Gamma shape of the DFE in permissive conditions
permissive = local analysis about [...]: our most assumption-free predictions
Isotropic model seems to work
Genomic DFE is an average of all modules: if [...] in most of them, the genomic DFE is ~ isotropic

Parallel evolution in response to some stresses (e.g. antibiotic resistance etc.)
Those genes that have [...] have strong response in one direction
If stress requires this direction: they respond more than average
Testable: estimate [...] in the genes involved in parallel evolution

 

Slide47
Convergence to isotropy in the general case

Method: approximate the cumulant generating function of [...] by that of the isotropic distribution when [...]
Cumulant generating function (CGF) of the DFE: [...]
Ignoring [...]: [...]
[...]: Shannon transform of the spectral distribution of M
Isotropy: Gamma approx.

Slide48

Stochastic representation of the DFE

The DFE is fully determined by:
The eigenvalues of M
The parental positions in the diagonal system

Cumulant generating function (CGF) of the DFE: [...]
The CGF fully characterizes the distribution (as the pdf does)
The derivatives of the CGF at zero provide all the cumulants of the DFE
The first three cumulants are the mean, the variance and the third central moment (which sets the skewness)
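The statements above can be checked on a small example. The sketch assumes notation that may differ from the talk: log-fitness is -|z|^2 / 2 in the basis where M is diagonal with eigenvalues lambda_i, and the parent sits at z. The stochastic representation is then s = sum_i(-z_i sqrt(lambda_i) x_i - lambda_i x_i^2 / 2) with standard normal x_i, and a Gaussian integral gives the CGF C(t) = sum_i[-log(1 + lambda_i t) / 2 + t^2 z_i^2 lambda_i / (2 (1 + lambda_i t))]. The values of `lambdas` and `z` are hypothetical.

```python
import numpy as np

def dfe_cgf(t, z, lambdas):
    """CGF of s under the quadratic/Gaussian assumptions stated above."""
    return float(np.sum(-0.5 * np.log1p(lambdas * t)
                        + 0.5 * t ** 2 * z ** 2 * lambdas / (1.0 + lambdas * t)))

def sample_dfe(z, lambdas, n_mutants, rng):
    """Stochastic representation: one standard normal draw per trait."""
    x = rng.normal(size=(n_mutants, lambdas.size))
    return np.sum(-z * np.sqrt(lambdas) * x - 0.5 * lambdas * x ** 2, axis=1)

rng = np.random.default_rng(6)
lambdas = np.array([0.03, 0.01, 0.005, 0.005])  # eigenvalues of M
z = np.array([0.2, 0.1, 0.0, 0.3])              # parental position
s = sample_dfe(z, lambdas, 200_000, rng)

# CGF at one point: Monte Carlo estimate vs. closed form.
t = 5.0
cgf_mc = np.log(np.mean(np.exp(t * s)))
cgf_formula = dfe_cgf(t, z, lambdas)

# Cumulants from finite-difference derivatives of the CGF at zero.
h = 1e-4
mean_from_cgf = (dfe_cgf(h, z, lambdas) - dfe_cgf(-h, z, lambdas)) / (2 * h)
var_from_cgf = (dfe_cgf(h, z, lambdas) - 2 * dfe_cgf(0.0, z, lambdas)
                + dfe_cgf(-h, z, lambdas)) / h ** 2

print("CGF(5): Monte Carlo", cgf_mc, "formula", cgf_formula)
print("mean  : sample", s.mean(), "from CGF", mean_from_cgf)
print("var   : sample", s.var(), "from CGF", var_from_cgf)
```

The sum of log(1 + lambda_i t) terms is, up to the factor -1/2 and the number of traits, the Shannon transform of the spectral distribution evaluated at t, which is how the eigenvalue distribution enters the DFE in this representation.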

 

Slide49
The DFE: empirical measurement

1. Create a set of single random mutants:
Mutation accumulation in highly inbred conditions -> natural
Site-directed mutagenesis -> SNPs
Transposon mutagenesis -> indels
Select resistance mutations and evaluate them in permissive conditions -> tricky (covariance between environments)

Slide50

Transcriptional regulatory networks

[Figure: transcriptional regulatory networks of yeast and E. coli]

Many components
Pyramidal hierarchy