Slide1
Understanding GWAS Chip Design – Linkage Disequilibrium and HapMap
Peter Castaldi
January 29, 2013
Slide2
Objectives
Introduce the concept of linkage disequilibrium (LD)
Describe how the HapMap project provides publicly available information on genetic variation and LD structure
Review how LD enables genome-wide screens with only a subset of genome-wide SNP markers
Describe the design of chip-based genotype assays
Slide3
Human Genome
3 billion base pairs, 23 paired chromosomes
99.9% sequence similarity between individuals
~12 million variant sites
Slide4
What are the Different Types of Genetic Variation?
Single base pair change (ACGT → ATGT), aka Single Nucleotide Polymorphism (SNP)
~12 million across the genome
Insertions/Deletions (TGGTTTCTA → TGGT---TA)
Can be of variable size
Trinucleotide repeats (microsatellites)
Highly polymorphic, less common than SNPs
Responsible for certain clinical disorders (Huntington’s, Fragile X, myotonic dystrophy)
Slide5
SNPs in detail
SNPs can have up to four possible alleles (A, C, G, T), but most have only two alleles present in human populations
Each person has two SNP alleles (one for each copy of the chromosome)
when both copies are the same, you’re homozygous (e.g. AA, CC, GG, TT); when they’re different (e.g. AT), you’re heterozygous.
Each allele has a frequency in which it appears in a given population
major allele (more common), minor allele (less common)
they sum to 1 (or 100%)
Slide6
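The allele-frequency definitions from the SNP slide above can be made concrete with a small sketch. The genotype list is made-up illustration data and the function names are hypothetical, not part of any genotyping toolkit.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} for one SNP, given diploid genotypes such as "AT"."""
    counts = Counter()
    for genotype in genotypes:
        counts.update(genotype)  # each person contributes two alleles
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def zygosity(genotype):
    """Classify a two-letter genotype as homozygous or heterozygous."""
    return "homozygous" if genotype[0] == genotype[1] else "heterozygous"

# Made-up genotypes for five people at a single bi-allelic SNP.
genotypes = ["AA", "AT", "AT", "TT", "AA"]
freqs = allele_frequencies(genotypes)
major_allele = max(freqs, key=freqs.get)  # more common allele
minor_allele = min(freqs, key=freqs.get)  # less common allele

print(freqs, major_allele, minor_allele)
print([zygosity(g) for g in genotypes])
```

Because every allele observed at the site is counted, the frequencies sum to 1 by construction.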
SNPs are Used as Genetic Markers for GWAS Chips
Properties of SNPs that make them good markers for GWAS
densely spaced across the genome
usually bi-allelic (only 2 alleles in the population, simplifies statistical tests)
GWAS chips can effectively represent most common variation with just a subset of SNPs
with ~500,000 SNPs, most common variation can be captured
this is because there is significant correlation between neighboring SNPs
Slide7
Linkage Disequilibrium Causes Correlation Between Neighboring SNPs
Mendel’s law of independent assortment states that alleles at different loci are transmitted independently across generations (linkage equilibrium).
This is not the case when two genetic loci are physically close to each other.
When alleles at two physically close loci are not assorted independently, this is called linkage disequilibrium.
Slide8
Linkage Equilibrium Arises Because of Meiotic Recombination
http://kenpitts.net/hbio/8cell_repro/meiosis_pics.htm
Slide9
Linkage and Recombination
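[Figure: gametogenesis diagram showing paternal DNA (chromosomes from the paternal grandfather and paternal grandmother) and maternal DNA, with alleles X/x, Y/y, Z/z at three loci; recombined gametes carry mixed combinations such as X y z and X Y z.]
Slide10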
Recombination Breaks Up Chromosomal Segments Over Generations
Recombination is not uniform across the genome (recombination hotspots).
SNPs within the yellow region are correlated with each other and form haplotypes.
Because of this correlation, one can often use a single SNP from a haplotype to represent all the SNP variation within a haplotype.
Slide11
Haplotype Structure Reflects Evolutionary History
The structure of haplotype blocks varies across racial groups
African populations have shorter LD blocks than other populations, reflecting the longer evolutionary history of those populations
Slide12
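One way to see why a longer population history leaves shorter LD blocks is the standard textbook decay model, in which the LD coefficient D between two loci shrinks by a factor of (1 - c) each generation, where c is the recombination fraction between them. The sketch below assumes that idealized model (random mating, large population, no selection or drift); the function name and the numbers are hypothetical illustrations, not values from the slides.

```python
def ld_after_generations(d0, recomb_fraction, generations):
    """Expected D after a number of generations, assuming D_t = D_0 * (1 - c)**t."""
    if not 0.0 <= recomb_fraction <= 0.5:
        raise ValueError("recombination fraction must be between 0 and 0.5")
    return d0 * (1.0 - recomb_fraction) ** generations

d0 = 0.25   # hypothetical starting LD (the maximum when both allele frequencies are 0.5)
c = 0.001   # hypothetical recombination fraction between two nearby SNPs

for t in (0, 100, 1000, 5000):
    print(t, round(ld_after_generations(d0, c, t), 4))
```

Under this model, the more generations that have elapsed, the closer together two SNPs must be (smaller c) to remain correlated, which is consistent with older populations showing shorter blocks.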
~500,000 SNP Markers Can Reasonably Represent Most of the Common Genetic Variation in European Genomes
GWAS relies upon linkage disequilibrium and the ubiquitous nature of SNP markers to enable genome-wide surveys of the impact of common variation on disease susceptibility
Pe’er et al. Nat Genet. 2006
Slide13
The HapMap Project is a catalog of human variation across populations
The Human Genome Project provided the complete human sequence for a small number of individuals
To get an accurate sense of variable sites, data from many individuals is needed
HapMap has three iterations (http://hapmap.ncbi.nlm.nih.gov/)
dense genotype data from multiple population groups
CEU – individuals of Northern and Western European ancestry from Utah
YRI – Yorubans from Nigeria
JPT – Japanese from Tokyo
CHB – Han Chinese from Beijing
Slide14
Data from the HapMap Project Enabled GWAS Chip Design
Information from HapMap used in chip design
panel of potential SNPs to use in a genotype chip
population-specific LD structure to allow the identification of tag SNPs that effectively tag haplotypes
Slide15
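The tag SNP idea from the chip-design slide above can be illustrated with a simple greedy selection over pairwise r-squared values. This is only a sketch of one common heuristic, not the procedure used for any actual chip; the SNP names, the r-squared values, and the 0.8 threshold are all assumptions chosen for illustration.

```python
def pick_tag_snps(snps, r2, threshold=0.8):
    """Greedily choose tag SNPs so every SNP has r2 >= threshold with some tag.

    snps: list of SNP names. r2: dict mapping (snp_a, snp_b) pairs to r-squared;
    pairs that are absent are treated as uncorrelated (r2 = 0).
    """
    def linked(a, b):
        if a == b:
            return True  # a SNP always tags itself
        return r2.get((a, b), r2.get((b, a), 0.0)) >= threshold

    untagged = set(snps)
    tags = []
    while untagged:
        # Pick the SNP that would cover the most still-untagged SNPs.
        best = max(snps, key=lambda s: sum(linked(s, t) for t in untagged))
        tags.append(best)
        untagged -= {t for t in untagged if linked(best, t)}
    return tags

# Hypothetical SNPs and pairwise r-squared values forming two LD blocks.
snps = ["s1", "s2", "s3", "s4", "s5"]
r2 = {
    ("s1", "s2"): 0.95, ("s1", "s3"): 0.90, ("s2", "s3"): 0.85,
    ("s4", "s5"): 0.92, ("s3", "s4"): 0.20,
}
tags = pick_tag_snps(snps, r2)
print(tags)
```

In this toy input, two tags are enough to represent five SNPs, which is the same economy that lets a chip cover common variation with a subset of markers.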
Using Linkage Disequilibrium to find Genes
Linkage disequilibrium (LD) means that sites of genetic variation can serve as “markers” for larger chromosomal segments.
Correlation between markers is quantified with r-squared and D’.
Slide16
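For reference, the r-squared and D’ measures named on the LD slide above can be computed from two-locus haplotype counts using their standard definitions. The sketch below assumes phased haplotype counts are available and reports the absolute value of D’; the function name and the example counts are hypothetical.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, |D'|, r2) for two bi-allelic loci from haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n   # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / n   # frequency of allele B at locus 2

    # D: observed haplotype frequency minus the expectation under independence.
    D = p_AB - p_A * p_B

    # D' rescales D by the largest magnitude possible given the allele frequencies.
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(D) / d_max if d_max > 0 else 0.0

    # r2: squared correlation between the alleles at the two loci.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = D * D / denom if denom > 0 else 0.0
    return D, d_prime, r2

# Hypothetical haplotype counts (AB, Ab, aB, ab).
print(ld_stats(50, 0, 0, 50))    # only two haplotypes present
print(ld_stats(40, 10, 10, 40))  # partial LD
print(ld_stats(50, 25, 0, 25))   # one haplotype missing, unequal allele frequencies
```

The third example illustrates why both measures are reported: D’ can reach 1 whenever one of the four haplotypes is absent, while r-squared reaches 1 only when one SNP fully predicts the other, which is the property that matters for tagging.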
GWAS identify novel disease loci, but additional localization is often necessary
Slide17
Genotype Chip Technology
http://science-education.nih.gov/newsnapshots/TOC_Chips/Chips_RITN/How_Chips_Work_1/how_chips_work_1.html
Slide18
Kang et al. The American Journal of Human Genetics, Volume 74, Issue 3, 2004, 495-510
Slide19
Summary
Genetic material is transmitted across generations in blocks called haplotypes.
Linkage disequilibrium and haplotype blocks allow for SNP tagging approaches that enable GWAS chips to capture common genetic variation with a subset of genetic markers.
Haplotype structure varies across ancestral groups.
The HapMap project catalogs human genetic variation and LD structure across populations.