
Population genetics Dr Gavin Band - PowerPoint Presentation


Wellcome Trust Advanced Courses: Genomic Epidemiology in Africa, 21st – 26th June 2015, Africa Centre for Health and Population Studies, University of KwaZulu-Natal, Durban, South Africa



Presentation Transcript

Slide1

Population genetics

Dr Gavin Band

Wellcome Trust Advanced Courses; Genomic Epidemiology in Africa, 21st – 26th June 2015

Africa Centre for Health and Population Studies, University of KwaZulu-Natal, Durban, South Africa

Slide2

Introductions

Meta-analysis and power of genetic studies

Genetics

GWAS results and interpretation

GWAS QC

Basic principles of measuring disease in populations

Principal components analyses

Basic genotype data summaries and analyses

GWAS association analyses

Bioinformatics

Public databases and resources for genetics

whole genome sequencing and fine-mapping

Epidemiology

population genetics

Slide3

Let’s imagine we’ve collected and sequenced some samples...

ATAGAAAGACCAGACTCCATCGCTAGCAGCTACGCTAGAGTTA

ATTGAAAGACCATACTCCATCGCTAGCAGC-ACGCTAGAGTTA

ATAGAAAGACCAGACTCCATCGCAAGCAGC-ACCCTAGCGTTA

ATAGAAAGACCAGACTCCATCGCAAGCAGCTACGCTAGAGTTA

... (K samples)

ATAGATAGACCATACTGCATCGCAAGCAGCTACGCTAGCGTTA

Slide4

Let’s imagine we’ve collected and sequenced some samples...

ATAGAAAGACCAGACTCCATCGCTAGCAGCTACGCTAGAGTTA

ATTGAAAGACCATACTCCATCGCTAGCAGC-ACGCTAGAGTTA

ATAGAAAGACCAGACTCCATCGCAAGCAGC-ACCCTAGCGTTA

ATAGAAAGACCAGACTCCATCGCAAGCAGCTACGCTAGAGTTA

ATAGATAGACCATACTGCATCGCAAGCAGCTACGCTAGCGTTA

Highlighted on the slide: SNPs; insertion / deletion polymorphism
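As a rough illustration of how the variable sites on this slide could be picked out, here is a minimal Python sketch. It assumes the five sequences are already aligned, with `-` marking a gap; the function name `variable_sites` is hypothetical and not part of the course code.

```python
# The five aligned sequences shown on the slide; '-' marks a gap.
seqs = [
    "ATAGAAAGACCAGACTCCATCGCTAGCAGCTACGCTAGAGTTA",
    "ATTGAAAGACCATACTCCATCGCTAGCAGC-ACGCTAGAGTTA",
    "ATAGAAAGACCAGACTCCATCGCAAGCAGC-ACCCTAGCGTTA",
    "ATAGAAAGACCAGACTCCATCGCAAGCAGCTACGCTAGAGTTA",
    "ATAGATAGACCATACTGCATCGCAAGCAGCTACGCTAGCGTTA",
]


def variable_sites(aligned):
    """Return (snps, indels): 0-based alignment columns that vary.

    A column counts as an indel if any sample has a gap there,
    and as a SNP if it shows more than one base and no gap.
    """
    snps, indels = [], []
    for pos, column in enumerate(zip(*aligned)):
        states = set(column)
        if len(states) == 1:
            continue  # monomorphic column: every sample agrees
        if "-" in states:
            indels.append(pos)
        else:
            snps.append(pos)
    return snps, indels


snps, indels = variable_sites(seqs)
print("SNP columns:", snps)      # [2, 5, 12, 16, 23, 33, 38]
print("Indel columns:", indels)  # [30]
```

Real data would need an alignment step first, and multi-base indels would span several adjacent columns; this sketch treats each column separately.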

Slide5

Let’s imagine we’ve collected and sequenced some samples...

ATAGAAAGACCAGACTCCATCGCTAGCAGCTACGCTAGAGTTA

ATTGAAAGACCATACTCCATCGCTAGCAGC-ACGCTAGAGTTA

ATAGAAAGACCAGACTCCATCGCAAGCAGC-ACCCTAGCGTTA

ATAGAAAGACCAGACTCCATCGCAAGCAGCTACGCTAGAGTTA

ATAGATAGACCATACTGCATCGCAAGCAGCTACGCTAGCGTTA

Slide6

Yoruba from Ibadan, Nigeria

Utah residents, ancestrally Northern and Western European

24 haplotypes (12 individuals) 100 SNPs on chromosome 20

Slide7

What should we expect to observe?

How can we interpret observed patterns?

What processes generated this data?

Key questions

Slide8

Key ancestral processes

Genetic drift

Mutation

Recombination

(and selection)

Slide9

A simple model of a population

Present

Past

G generations

2N chromosomes

Slide10

Present

Past

G generations

A simple model of a population

Slide11

Present

Past

G generations

A simple model of a population

Slide12

Present

Past

G generations

A simple model of a population

Slide13

Present

Past

G generations

A simple model of a population

Slide14

Genetic drift

Slide15

Present

Past

G generations

Genetic drift

π = 1.49

π = 0.35

(π is the mean number of pairwise differences)

Genetic drift reduces diversity

(it makes everyone look the same)
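The statistic π can be computed directly from a set of haplotypes. The following is a minimal sketch, assuming haplotypes are given as equal-length strings of alleles; the toy data and the function name are illustrative and are not the haplotypes drawn on the slide.

```python
from itertools import combinations


def mean_pairwise_differences(haplotypes):
    """Average number of sites at which two haplotypes differ,
    taken over all unordered pairs of haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    total = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in pairs
    )
    return total / len(pairs)


# Toy example: four haplotypes typed at four biallelic sites (0/1 alleles).
toy = ["0101", "0101", "1101", "0000"]
print(mean_pairwise_differences(toy))  # 1.5
```

If every haplotype is identical the function returns 0, which is the limit drift pushes a population towards in the absence of mutation.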

Slide16

Present

Past

G generations

r² = 0.33 (between two SNPs marked on the slide)

r² = 0.51 (between two SNPs marked on the slide)

Genetic drift

Genetic drift creates correlations between alleles

(it increases LD)

Slide17

Present

Past

G generations

p(1-p) = 0.24

Genetic drift

Genetic drift decreases heterozygosity

p(1-p) = 0.16

Slide18

Size matters

In a smaller population:

- Genetic drift acts faster. E.g.: approximate variance in allele frequency after s generations

K=100

50 generations

Slide19

Size matters

In a smaller population:

- Genetic drift acts faster. E.g.: approximate variance in allele frequency after s generations

- There is more relatedness. E.g.:

1/2N: probability two samples coalesce (i.e. have the same parent) in the previous generation

2N: the expected time to the most recent common ancestor of two samples
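The variance formula itself is not reproduced in the text above. As a sketch only, assuming the standard Wright-Fisher model with 2N chromosomes, the variance in allele frequency after s generations starting from frequency p0 is p0(1 - p0)(1 - (1 - 1/2N)^s), which is approximately p0(1 - p0) s / 2N when s is much smaller than 2N. The simulation below (function names are illustrative) compares that expression with simulated drift in a small and a large population.

```python
import random


def drift_variance(p0, two_n, s):
    """Wright-Fisher variance of allele frequency after s generations."""
    return p0 * (1 - p0) * (1 - (1 - 1 / two_n) ** s)


def simulate_frequency(p0, two_n, s, rng):
    """Follow one allele through s generations of binomial resampling."""
    p = p0
    for _ in range(s):
        # Each of the 2N chromosomes picks a parent at random.
        copies = sum(rng.random() < p for _ in range(two_n))
        p = copies / two_n
    return p


def simulated_variance(p0, two_n, s, replicates, seed=1):
    """Sample variance of the final frequency over independent replicates."""
    rng = random.Random(seed)
    finals = [simulate_frequency(p0, two_n, s, rng) for _ in range(replicates)]
    mean = sum(finals) / replicates
    return sum((x - mean) ** 2 for x in finals) / (replicates - 1)


for two_n in (20, 200):
    sim = simulated_variance(0.5, two_n, 10, 400)
    print(two_n, round(sim, 4), round(drift_variance(0.5, two_n, 10), 4))
```

The same model gives the two quantities listed on the slide: two samples share a parent in the previous generation with probability 1/2N, so the waiting time to their common ancestor is geometric with mean 2N generations.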

Slide20

Example: a bottleneck

Slide21

Yoruba from Ibadan, Nigeria

Utah residents, ancestrally Northern and Western European

24 haplotypes (12 individuals) 100 SNPs on chromosome 20

Slide22

Genetic drift summary

Genetic drift decreases diversity by causing haplotypes to fluctuate in frequency, so that alleles are lost and everyone starts looking the same. This creates correlations between alleles along chromosomes (i.e. it creates LD).

Genetic drift acts faster in smaller populations. In the same way, individuals in smaller populations tend to be more closely related.

Simple population genetic models are definitely wrong, but still useful in understanding genetic variation.

Slide23

An acknowledgement

To make these slides I’ve used a modified version of code originally written by Graham Coop. I’ll make this code available on the course materials site, but the original code is here:

https://github.com/cooplab/popgen-notes/

Graham’s group website www.gcbias.org is also a good place to look for information on population genetics topics.

Slide24

Ancestral processes

Mutation: 2μ

Recombination: 2r

Coalesce: 1/2N

If only drift were operating, we’d all look identical to each other. Something must be acting against drift.

Slide25

Present

Past

G generations

Mutation

2N chromosomes

Genetic drift means most mutations that arise are lost.

Some survive and contribute to genetic variation in the population

Slide26

Ancestral processes

Mutation: 2μ

Recombination: 2r

Coalesce: 1/2N

If only drift were operating, we’d all look identical to each other. Something must be acting against drift.

Slide27

Paternal (father)

Recombination

Maternal (mother)

No recombination

Recombination

Slide28

Recombination breaks down the correlation between alleles

Recombination

Slide29

Recombination in humans has a complex, interesting structure

Slide30

Recombination clusters along chromosomes

Studies have shown that recombination is not uniform along chromosomes

centiMorgans per Mb

Slide31

Hotspots can break down correlations over short distances

Hotspots and haplotypes

Slide32

Hotspots and haplotypes

Recombination hotspots lead to regions of strong correlation separated by regions of low LD

Recombination rate

Slide33

Measuring correlations

In genetics, correlation between alleles is called linkage disequilibrium (LD).

There are several measures of LD.

Understanding LD in natural populations is important for genomic epidemiology.

Slide34

Linkage equilibrium

Haplotypes: AB, Ab, aB, ab

Alleles: A, a, B, b

Here, haplotype frequencies are determined by SNP allele frequencies (they are in equilibrium).

$f_{AB} = f_A f_B$

Slide35

Linkage disequilibrium

Haplotypes: AB, Ab, aB, ab

Here, haplotype frequencies differ from those expected if the SNPs are independent (they are in disequilibrium).

$f_{AB} \neq f_A f_B$

Slide36

Measuring LD

$D \approx 0$ when near linkage equilibrium

$D \neq 0$ when there is linkage disequilibrium

Two commonly-used measures:

$r^2$ = the (squared) correlation between the two SNPs
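The defining formulas are not reproduced in the text above. For reference, the usual textbook definitions for two biallelic SNPs with alleles A/a and B/b are sketched below; they are assumed here rather than copied from the slide, with $f_a = 1 - f_A$ and $f_b = 1 - f_B$.

```latex
\begin{align*}
D   &= f_{AB} - f_A f_B \\
D'  &= \frac{D}{D_{\max}}, \qquad
D_{\max} =
\begin{cases}
\min(f_A f_b,\; f_a f_B) & \text{if } D > 0 \\
\min(f_A f_B,\; f_a f_b) & \text{if } D < 0
\end{cases} \\
r^2 &= \frac{D^2}{f_A f_a f_B f_b}
\end{align*}
```

Under linkage equilibrium $f_{AB} = f_A f_B$, so $D = 0$ and both $D'$ and $r^2$ are zero.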

Slide37

Haplotypes and LD

1 2 3 4

r² is less than one unless SNP A is a perfect surrogate of SNP B in the sample

|D′| is less than one if and only if all four haplotypes are present in the sample

So |D′| is 1 unless visible recombination has occurred

Slide38

Haplotypes and LD

1 2 3 4

r² is less than one unless SNP A is a perfect surrogate of SNP B in the sample

|D′| is less than one if and only if all four haplotypes are present in the sample

So |D′| is 1 unless visible recombination has occurred

r² = 1, |D′| = 1

r² < 1, |D′| = 1

r² < 1, |D′| < 1
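The three cases listed on this slide can be reproduced with a few lines of Python. This is a minimal sketch using the standard definitions of D, D′ and r² (an assumption, since the slide's formulas are not in the text); the function name and the toy haplotype samples are illustrative, and both SNPs are assumed to be polymorphic in the sample.

```python
def ld_stats(haplotypes):
    """Return (D, D', r^2) for a sample of two-SNP haplotypes.

    Each haplotype is a two-character string: 'A' or 'a' at the first
    SNP, 'B' or 'b' at the second. Both SNPs must be polymorphic.
    """
    n = len(haplotypes)
    f_hap_ab = sum(h == "AB" for h in haplotypes) / n  # frequency of AB
    f_a = sum(h[0] == "A" for h in haplotypes) / n     # frequency of allele A
    f_b = sum(h[1] == "B" for h in haplotypes) / n     # frequency of allele B

    d = f_hap_ab - f_a * f_b
    if d > 0:
        d_max = min(f_a * (1 - f_b), (1 - f_a) * f_b)
    else:
        d_max = min(f_a * f_b, (1 - f_a) * (1 - f_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (f_a * (1 - f_a) * f_b * (1 - f_b))
    return d, d_prime, r2


two_haplotypes = ["AB", "AB", "ab", "ab"]                # r^2 = 1, |D'| = 1
three_haplotypes = ["AB", "AB", "Ab", "ab"]              # r^2 < 1, |D'| = 1
four_haplotypes = ["AB", "AB", "Ab", "aB", "ab", "ab"]   # r^2 < 1, |D'| < 1

for sample in (two_haplotypes, three_haplotypes, four_haplotypes):
    d, d_prime, r2 = ld_stats(sample)
    print(sorted(set(sample)), round(abs(d_prime), 3), round(r2, 3))
```

With only two or three of the four possible haplotypes present |D′| stays at 1, whereas r² drops below 1 as soon as the two SNPs stop being perfect surrogates for each other.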

Slide39

Recombination and LD

Slide40

Population genetic processes summary

Genetic drift decreases diversity and heterozygosity, and increases levels of LD. It acts faster in smaller populations.

Mutations occur at a rate of about 60 per diploid genome per generation, but most are lost due to drift.

Recombination breaks down correlations between alleles. It occurs in a highly nonuniform manner, clustered into recombination hotspots.

Slide41

Population size matters

We’ve seen that in larger populations we have to go further back in time to find the common ancestor. Consequently there is more opportunity for:

- Mutation, increasing genetic diversity

- Recombination, decreasing correlation between alleles

Slide42

The human genome is very large, and broken up into essentially independent chunks by recombination. This gives us many observations of the ancestral process, and considerable power to understand ancestry. We will give two examples.

The power of population genetic inference from a large genome

Slide43

An example

Li and Durbin, “Inference of human population history from individual whole-genome sequences”, Nature 2011

Years in the past

Idea: a single genome gives us many observations of the ancestral process. As for the bottleneck example, more coalescence => smaller population size.

Slide44

Human population history

The recent migration of Europeans from Africa has led to small effective population sizes

Slide45

Differences between populations

The overall pattern of LD is conserved

The different ancestral histories lead to different levels of LD

Slide46

Population genetics

Genetic drift generates correlations between alleles

Recombination breaks them down

The ancestral population size and history determine the amount of diversity and how it is structured

Natural selection can generate strong differences between populations

Slide47

Real populations are more complex: admixture

http://admixturemap.paintmychromosomes.com

Slide48

Real populations are more complex: natural selection

When a beneficial mutation arises it spreads quickly through the population, generating strong correlations between alleles

Slide49

Natural Selection

Big differences in the patterns of diversity between populations can be generated by natural selection

Slide50

Differences between populations

Big differences in the patterns of diversity between populations can be generated by natural selection

Slide51

Yoruba from Ibadan, Nigeria

Utah residents, ancestrally Northern and Western European

24 haplotypes (12 individuals) 100 SNPs on chromosome 20

Slide52

Differences in patterns of LD

An experiment:

Take genome-wide SNP data collected from a European population (A)

Take each SNP and find the SNP that is most correlated with it (and remember how correlated it is)

Go to another European population (B) and compare the correlation between the two SNPs in the new population

(Measure correlation as r²)
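The steps of this experiment can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the code used in the practical: each SNP is a list of 0/1 alleles (one entry per haplotype), both populations are typed at the same SNPs in the same order, and all function names and toy data are hypothetical.

```python
def r_squared(x, y):
    """Squared correlation between two SNPs coded as 0/1 allele lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic SNP carries no correlation signal
    return cov * cov / (var_x * var_y)


def best_partner(snps, i):
    """Index and r^2 of the SNP most correlated with SNP i."""
    others = [k for k in range(len(snps)) if k != i]
    j = max(others, key=lambda k: r_squared(snps[i], snps[k]))
    return j, r_squared(snps[i], snps[j])


def compare_populations(snps_a, snps_b):
    """For each SNP, pair it with its best partner in population A and
    report (i, j, r^2 in A, r^2 in B) for that same pair."""
    rows = []
    for i in range(len(snps_a)):
        j, r2_a = best_partner(snps_a, i)
        rows.append((i, j, r2_a, r_squared(snps_b[i], snps_b[j])))
    return rows


# Toy data: three SNPs typed on four haplotypes in each population.
pop_a = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
pop_b = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 1, 0, 1]]
for row in compare_populations(pop_a, pop_b):
    print(row)
```

On genome-wide data the search for the best partner would normally be restricted to nearby SNPs, since comparing every pair is quadratic in the number of SNPs.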

Slide53

Differences in patterns of LD

Across Europe

Within Kenya

We will look at this in the practical

Slide54

Thanks!

Slide55

Recombination and physical distance

r² = 1

r² = 0.9

r² = 0.5

r² = 0.1

Correlations decay with distance (due to recombination)
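One way to see why correlations decay with distance is the classical result that, under random mating and ignoring drift, the LD coefficient D between two loci shrinks by a factor (1 - c) each generation, where c is the recombination fraction between them. That result is assumed here rather than taken from the slide, and the function names are illustrative. Loci that are further apart have larger c, so their LD decays faster.

```python
import math


def expected_d(d0, recomb_fraction, generations):
    """Expected D after a number of generations of random mating,
    starting from d0, with recombination fraction c between the loci."""
    return d0 * (1 - recomb_fraction) ** generations


def half_life(recomb_fraction):
    """Generations needed for D to fall to half its starting value."""
    return math.log(0.5) / math.log(1 - recomb_fraction)


# Compare closely linked loci with loosely linked ones.
for c in (0.0001, 0.001, 0.01):
    print(c, round(expected_d(0.25, c, 100), 4), round(half_life(c)))
```

In real populations drift keeps regenerating LD, so observed r² values reflect a balance between the two processes instead of decaying all the way to zero.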

Slide56

Looking at patterns of LD

Low r²

High r²

LD patterns are complicated

Assume similar physical spacing

Slide57

Recombination clusters along chromosomes

Studies have shown that recombination is not uniform along chromosomes

Slide58

The power of population genetic inference from a large genome

Slide59

Yoruba from Ibadan, Nigeria

Utah residents, ancestrally Northern and Western European

24 haplotypes (12 individuals) 100 SNPs on chromosome 20

Slide60

LD and Recombination

There are lots of ways to measure LD

Recombination is not uniform along chromosomes

Much of the recombination happens in hotspots, and these demarcate the breakdown in correlations

Correlations do persist across hot spots

Slide61

Differences between populations

The overall pattern of LD is conserved

The different ancestral histories lead to different levels of LD

Slide62

Population structure in Africa

There is evidence for widespread population structure across Africa

Slide63

Population structure in Africa

And population differences between groups from the same region

Slide64

Luhya in Webuye, Kenya

Maasai in Kinyawa, Kenya

24 haplotypes (12 individuals) 100 SNPs on chromosome 20

Slide65

Slide66

LD terminology

‘Causal’ variant – a variant that has a functional effect on a trait (such as disease).

Linkage disequilibrium – the pattern of correlations between alleles along a chromosome.

Tag SNP – a SNP that is in LD with a variant of interest (and that we may have typed directly).

Slide67

Summary

Different ancestral histories have led to different patterns of diversity

Natural selection can generate strong differences in haplotype patterns

Population structure across Africa, and between groups in Africa, will lead to differences in the structure of LD

Slide68

Slide69

Slide70

Genetic drift

Allele frequencies change by chance over time

Slide71

Genetic diversity

180 haplotypes (90 individuals) from Luhya in Webuye, Kenya, typed at 6856 SNPs in a 10 Mb region on chromosome 20