
Genetics of Race Bioinformatics Inquiry through Sequencing - PowerPoint Presentation

Uploaded On 2022-06-11



Presentation Transcript

Slide1

Genetics of Race

Bioinformatics Inquiry through Sequencing

Adapted from: http://ase.tufts.edu/chemistry/walt/sepa/geneticsofrace.html

Uploaded: January 8, 2017

Slide2

Genetics of Race

Lesson 1:

Introduction

Goals:

Introduce module topic

Provide necessary background

Pose research question and hypothesize

Slide3

Terms associated with race

What is race?

What is ancestry?

What features do people use to define someone’s race?

Slide4

Why is it important to talk about race?

Part of our personal identity, may reflect our heritage

Can change how we interact with the world

Talking openly allows us to understand other perspectives

Why is it hard to talk about race?

Can we talk about race from a scientific perspective?

Slide5

Why is race hard to talk about?

Personal

Difficult to define

Is it based on language, place of origin, physical traits, cultural heritage?

Definitions change with time and place

Racial categories never seem to lead to anything good

Slavery, the Holocaust

Why talk about race?

Talking openly allows us to understand other perspectives, think through concepts

Slide6

Can you predict who you’re most genetically similar to?

Guiding Question

Are people of the same race more genetically similar?

Image Credit: http://www.pbs.org/race/

Slide7

Traits are inherited and expressed via DNA

DNA sequences

Proteins

Traits

Skin color

Height

Eye color

Public domain images from pixabay.com

Slide8

Different ways to represent DNA

5’-ATTAGCTAGAC-3’

3D structure

2D chemical structure

Sequence

Public domain images from Wikipedia

Slide9

Similarity of DNA

The human genome is over 3 billion base pairs long

Two random people are 99.9% identical

However, that still leaves 3 MILLION base pairs that can be different
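The arithmetic behind the 3 million figure can be checked directly. This is a minimal sketch that uses the slide's rounded numbers (3 billion base pairs, 99.9% identity); the variable names are illustrative only.

```python
from fractions import Fraction

# Rounded figures taken from the slide; exact fractions avoid float rounding.
GENOME_LENGTH_BP = 3_000_000_000
FRACTION_IDENTICAL = Fraction("0.999")

# Base pairs that can differ between two random people.
differing_bp = int(GENOME_LENGTH_BP * (1 - FRACTION_IDENTICAL))

print(f"{differing_bp:,} base pairs can differ")
```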

Slide10

How can we determine genetic similarity?

Sequence the entire genome of everyone in the class?

Problems:

Way too long

Image from the National Human Genome Research Institute: https://www.genome.gov/25020001/online-education-kit-bioinformatics-finding-genes/

Slide11

mtDNA = mitochondrial DNA

Maternally inherited

Used to study ancestry

We will be looking at DNA sequences that do not provide any health information

What is mitochondrial DNA?

Slide12

What are mitochondria?

Powerhouse of the cell

Generates chemical energy for the cell

Many mitochondria in each cell

Microbial origin

Cell

Mitochondrion

Slide13

Tracking ancestry

AGTG

AGTG

AGTG

AGTG

ACTG

AGAT

AGTG

Two thousand years pass by…

AGTG

AGTG

AGTG

Slide14

Mitochondrial DNA and ancestry

Image from http://www.mitomap.org

How to interpret this map:

Numbers (black text) refer to years before present day

Letters and numbers (color text) refer to different ancestral lineages

Slide15

Genetic similarity example

Which pair of people is the most genetically similar?

Which pair of people is the least genetically similar?

AGTG

ACTG

AGAG

AGAT

AGTG

Emma

Laura

Marcus

Leslie

Martin

Slide16

Genetic similarity

Who is most genetically similar?

People who have identical sequences

4/4 bases same = 100% similarity

Who is least genetically similar?

ACTG and AGAT

They have only 1 base in common: the A at the beginning

1/4 bases same = 25% similarity

AGTG

AGTG

ACTG

AGAT

Emma

Marcus

Martin

Leslie
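The "bases same / total bases" calculation on this slide can be sketched in a few lines. This assumes the two sequences are the same length and already lined up position by position, as in the four-letter examples; the function name percent_similarity is a placeholder, not something from the module.

```python
def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Percent of positions where two equal-length sequences have the same base."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)

# The two cases worked on the slide.
print(percent_similarity("AGTG", "AGTG"))  # identical sequences
print(percent_similarity("ACTG", "AGAT"))  # only the first A matches
```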

Slide17

Experimental Overview

collect samples

extract mitochondrial DNA

Next Gen. Sequencing (at BioSeq lab)

Analyze and interpret results

Sample collection: In class

You are most similar to Mike___

You are least similar to Maya___

          Maya     Mike     Selina   Deanne   Hector
Maya      100%     94.44%   94.44%   88.24%   100%
Mike      94.44%   100%     88.89%   82.35%   94.44%
Selina    94.44%   88.89%   100%     82.35%   94.44%
Deanne    88.24%   82.35%   82.35%   100%     88.24%
Hector    100%     94.44%   94.44%   88.24%   100%

Slide18

Wrap-up

What does race mean? What does ancestry mean?

Why are we looking at mitochondrial DNA? What method are we using to look at the DNA?

What are we going to do next time?

Slide19

Genetics of RACE

Lesson 2:

Sample Collection

Goals:

Review research question and desired target (mtDNA)

Understand what’s happening during procedure

Collect cheek cells and extract DNA

Slide20

Experimental Overview

collect samples

extract mitochondrial DNA

Next Gen. Sequencing (at BioSeq lab)

Analyze and interpret results

Sample collection: In class

You are most similar to Mike___

You are least similar to Maya___

          Maya     Mike     Selina   Deanne   Hector
Maya      100%     94.44%   94.44%   88.24%   100%
Mike      94.44%   100%     88.89%   82.35%   94.44%
Selina    94.44%   88.89%   100%     82.35%   94.44%
Deanne    88.24%   82.35%   82.35%   100%     88.24%
Hector    100%     94.44%   94.44%   88.24%   100%

Slide21

Quick micropipette basics

Holding the micropipette

Correct hand position

Always keep pipette vertical, not horizontal

Changing volume

Each pipette is designed for a specific volume range

Don’t move beyond limits of pipette

Using tips

Use the pipette to pick up a pipette tip

Eject the pipette tip

Transferring volume

Notice two stops

Slide22

Pipetting example

Select the P-1000 pipette

Set the pipette to 225 µL

Pipette 225 µL of water into your tube

[Figure: P-1000 volume display, with the digit windows labeled 1000’s place, 100’s place, and 10’s place]

Slide23

Protocol

Items you should have at your bench:

Sterile foam tipped swabs

Sterile microcentrifuge tubes

Extraction solution

Neutralizing solution

Pipettes and tips

Biohazard waste container

Shared equipment you will use:

Vortex mixers

Timer

Hot plate set to 95 °C

Slide24

Buccal swab

Swab the inside of your cheek for 10 seconds.


Slide25

Begin experiment!

We will continue once everyone’s samples are ready for the water bath or hot plate

Slide26

Experiment Reflection

What is in the tubes right now?

Why are we placing the tubes in warm water?

What will the results look like?

Slide27

What happens to your sample next?

What region of DNA are we looking at?

How do we focus on this region while ignoring the other parts of DNA?


Slide28

What happens to your sample next?

We will be sequencing part of your mitochondrial DNA

We will amplify this region for sequencing by using a process called PCR

Slide29

Project review


Can you predict who you’re most genetically similar to?

Are people of the same race more genetically similar?

Image Credit: http://www.pbs.org/race/

Slide30

Wrap up

What was the goal of today’s experiment?

What steps were performed to extract the DNA?

How will this DNA be used to determine genetic relatedness?


Slide31

Genetics of Race

Lesson 3:

Sequencing by synthesis

Goals:

Review central dogma and limits of DNA

Understand history and recent advances in sequencing

Understand the process (sequencing by synthesis) used to generate data in this module

Slide32

What is DNA?

10010110 (coding) is to [image] as 5’ GATTACA 3’ (DNA) is to LIFE

Slide33

The Central Dogma

DNA: 5’ GATTACA 3’

goes to

Protein: responsible for most of the structure and function of an organism

Slide34

Different ways to represent DNA

5’-ATTAGCTAGAC-3’

3D structure

2D chemical structure

Sequence

DNA sequencing simply means reading the sequence of the DNA

Slide35

Very small yet very big

DNA is tiny

Each letter in the DNA sequence is less than one nanometer

Genomic information is massive

3 BILLION letters in the human genome

LIFE


Slide36

So what?

What can we do with DNA sequencing?


Are the following scenarios “Sci-Fi” or “Reality”?

Slide37

Identify whether someone is more or less likely to commit a crime

Sci-fi.

Slide38

Figure out the identity of ancient human remains

Reality.

Slide39

Use preserved DNA to re-create extinct plants and animals

Sci-fi.

Slide40

Use human DNA to create a clone with the same personality

Sci-fi.

Slide41

Track disease by monitoring toilet waste from airplanes

Reality.

Slide42

Next-Generation Sequencing makes these advances possible

1990-2003: Human Genome Project (“Sanger sequencing” technology)

1 human genome in 13 years

~40 sequencing institutions

$3,000,000,000 per genome

Today (“Next-generation sequencing” technology)

16 human genomes in 3 days

1 sequencing system

$1,000 per genome
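To put the two columns side by side, the fold-change in cost and time per genome can be worked out from the slide's own figures. This is a rough sketch: it assumes 365 days per year and treats the 16 genomes in a 3-day run as an average of 3/16 of a day each.

```python
# Figures from the slide.
hgp_cost_per_genome = 3_000_000_000   # dollars, Human Genome Project
ngs_cost_per_genome = 1_000           # dollars, next-generation sequencing

hgp_days_per_genome = 13 * 365        # assumption: 365 days per year
ngs_days_per_genome = 3 / 16          # 16 genomes finished in a 3-day run

cost_fold = hgp_cost_per_genome // ngs_cost_per_genome
speed_fold = hgp_days_per_genome / ngs_days_per_genome

print(f"Cost per genome fell about {cost_fold:,}-fold")
print(f"Time per genome fell about {speed_fold:,.0f}-fold")
```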

Slide43


Slide44

DNA Sequencing by Synthesis

Bioinformatics Inquiry through Sequencing

Slide45

DNA synthesis

Template strand

DNA polymerase

Primer

Nucleotides

ATGAGCTTAGCTA

TACTCG

T A C T G G C A T

Slide46

DNA synthesis

ATGAGCTTAGCTA

TACTCG

T A C T G G C A T

Slide47

DNA synthesis

ATGAGCTTAGCTA

TACTCGAATCGAT

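The base-pairing rule behind these three slides (A pairs with T, G pairs with C) can be written as a short lookup. This is a sketch only: the new strand is written position by position under the template, as on the slide, with no reversal for strand direction, and the function name is a placeholder.

```python
# Watson-Crick pairing: each template base determines the base added opposite it.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesize_complement(template: str) -> str:
    """Return the strand built opposite the template, written position by position."""
    return "".join(COMPLEMENT[base] for base in template)

# Template strand from the slide; the result matches the completed strand shown.
print(synthesize_complement("ATGAGCTTAGCTA"))
```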

Slide48

Sequencing By Synthesis

Sequence DNA by observing the synthesis of a complementary strand

MiSeq sequencer

Slide49

INLET

DNA is attached to the surface of a flow cell

OUTLET


DNA is fixed in place while various chemicals wash over it

The camera takes pictures of DNA synthesis while it happens

Slide50

Make identical copies

Because nucleotides are so small, they are difficult to see, even when attached to fluorescent dyes

The sequencer copies the sample sequence to form a large group of identical sequences.

This group is called a cluster

The fluorescent signal from a cluster is much greater than the fluorescent signal from a single strand

Slide51

OUTLET

DNA is fixed to the surface of a flow cell

INLET


Slide52

Fluorescent Dye

A

T

G

C

When excited by a laser, fluorescent dyes emit brightly colored light

Each nucleotide is attached to a unique fluorescent dye

Blue → A

Red → T

Green → G

Yellow → C

Slide53

Sequencing By Synthesis

K. Voelkerding et al., Clinical Chemistry, 2009

Slide54

Fluorescent Dye

Laser excites fluorophore.

Camera captures color

Each color indicates a specific base

(support.Illumina.com)

Slide55

Tracking colors

The sequencer uses a camera to identify the color of each nucleotide as it gets added

Write the first letter of the color you see: Red (R), Blue (B), Yellow (Y) or Green (G)

DNA synthesis happens fast and uncontrollably

Slide56

Blocking group controls speed

3’-ATGC-5’

5’-TA x C ??

Deblocker

Slide57

Speed can now be controlled


Without blocking groups

With blocking groups

Slide58

Sequencing by Synthesis

Slide59

Primer and polymerase attach

Slide60

1. Add nucleotide

Slide61

2. Image fluorescence

Slide62

3. Remove dye

Slide63

Sequencing by synthesis

Slide64

Sequencing by synthesis

Slide65

Sequencing by synthesis

Slide66

Sequencing by synthesis

Slide67

Sequencing by synthesis

Slide68

Sequencing by synthesis

Slide69

Complete sequence
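The three-step cycle (add nucleotide, image fluorescence, remove dye) can be modeled as a loop over the template. This is a toy sketch, not the sequencer's actual software: the color assignments come from the Fluorescent Dye slide, the template ATGC is the short example from the blocking-group slide, and the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DYE_COLOR = {"A": "Blue", "T": "Red", "G": "Green", "C": "Yellow"}

def sequence_by_synthesis(template: str) -> list[str]:
    """Return the color the camera would record in each cycle."""
    images = []
    for template_base in template:
        # 1. Add nucleotide: the blocking group allows exactly one base per cycle.
        added_base = COMPLEMENT[template_base]
        # 2. Image fluorescence: the dye color identifies the base just added.
        images.append(DYE_COLOR[added_base])
        # 3. Remove dye and blocking group so the next cycle can begin
        #    (nothing needs to be stored for this step in the model).
    return images

print(sequence_by_synthesis("ATGC"))
```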

Slide70

What the camera sees:

Image 1

Image 2

Image 3

Image 4

Image 5

Blue → A

Red → T

Green → G

Yellow → C

G G A C T

Slide71

What the camera sees:

Image 1

Image 2

Image 3

Image 4

Image 5

Blue → A

Red → T

Green → G

Yellow → C

G T C G A
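Turning a series of camera images into a sequence is a lookup from color to base. The sketch below uses the color key from the slides; the color lists are assumptions, worked backwards from the answers G G A C T and G T C G A, because the images themselves are not part of this transcript.

```python
BASE_FOR_COLOR = {"Blue": "A", "Red": "T", "Green": "G", "Yellow": "C"}

def call_bases(colors: list[str]) -> str:
    """Translate one color per image into the corresponding base sequence."""
    return "".join(BASE_FOR_COLOR[color] for color in colors)

# Assumed image colors, chosen to be consistent with the slide answers.
first_cluster = ["Green", "Green", "Blue", "Yellow", "Red"]
second_cluster = ["Green", "Red", "Yellow", "Green", "Blue"]

print(call_bases(first_cluster))
print(call_bases(second_cluster))
```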

Slide72

Modeling Activity

Everyone gets a role:

DNA Polymerase

Primer

Sequence complementary to primer

Laser

Camera

Nucleotides of the original sequence

Deblocker

Nucleotides


Slide73

Image Credits

Applications

Criminal baby: toddlerhalloweencostumes.com

Jurassic world: http://www.jurassicworld.com/

Airplane poop: http://www.wired.com/2015/08/airplane-poop-help-track-global-disease-outbreaks/

Slide74

Genetics of Race

Lesson 4:

Data Analysis

Goals:

Introduce DNA Assembly and Alignment

Practice determining genetic similarity

Review target DNA and limitations

Slide75

Experiment Overview

1: ATCCAGGAGATACGTCT

2: ACTTAGTACATACATAT

3: AGATACTACAAACTTAT

4: ATCTAGTACACTCAAAA

5: AACTAGTACATACATAT


Slide76

What does DNA sequencing data actually look like?

DNA sequencers usually produce short “reads”, which are fragments of a much larger sequence

Read 1:

GCCTACGGGTGGCAACAGTGGGGAATCTTCCGCAATGGACGAAAGTCTGACGGAGCAACGCCGCGTGAGTGATGACGGCCTTCGGGTTGTAAAGCTCTGTTAATCGGGACGAAAGGTCTTCTTGCGAATAGTTAGAAGAATTGACGGTACCGGAATAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGCGCGCGCAGGCGGA

Read 2:

TGGCCTACAGTAGTCACTGTCTCTTATACACATCTCCGAGCCCACGAGACTGCAGCTAATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAAAAAAGAAGTAATGCAGGGGTGTGAGTGACTAAGAGGAGAGTGGTATGACATAAAACTAAGAAAACAACTAAAACAAGGGGAGGGCACAATATAACGTATCTCTGAGATGGTACTATGTGTCTGTGTAGCATCTGACATAATAACGTCCATATTCA

Read 31,164:

CCTACGGGTGGCTGCAGTGGGGAATCTTCCGCAATGGACGAAAGTCTGACGGAGCAACGCCGCGTGAGTGATGACGGCCTTCGGGTTGTAAAGCTCTGTTAATCGGGACGAAAGGTCCTCTTGCGAATAGTTAGAGGAATTGACGGTACCGGAATAGAAAGCCACGGCTAACTACGTGCCAGCAGCCGCGGTAATACGTAGGTGGCAAGCGTTGTCCGGAATTATTGGGCGTAAAGCGCGCGCAGGCGGAT


Slide77

How do we turn these reads into usable information?

Step 1: Assemble the “reads” into a complete sequence


Slide78

Types of Sequencing Analysis

De Novo Assembly

Used the first time a gene or genome is ever sequenced

Stitching together short sequences

Resequencing

“Sequencing again”

Allows for comparison with a reference

Read Assembly Activity
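"Stitching together short sequences" can be illustrated with a greedy overlap merge: repeatedly join the two reads whose end and start overlap the most. This is a teaching sketch under simplifying assumptions (error-free reads, all from the same strand, a minimum overlap of 3 bases); real assemblers are far more involved. The three reads are hypothetical fragments cut from the ATTAGCTAGAC example sequence used earlier in the deck.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0

def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair of reads until no overlaps remain."""
    reads = list(reads)
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, left in enumerate(reads):
            for j, right in enumerate(reads):
                if i != j:
                    n = overlap(left, right, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to stitch together
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Hypothetical reads; shuffled so the order gives nothing away.
assembled = greedy_assemble(["CTAGAC", "ATTAGC", "AGCTAG"])
print(assembled)
```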

Slide79

Why do we need computers?

How long were the sequences we just looked at?

How long did aligning them by hand take?

How long is the human genome?

How long would aligning that by hand take?

Bioinformatics: use of computers for analyzing complex biological data.

Slide80

Understanding the DNA sequence

Once a sequence is assembled, how do you know where the DNA sequence actually comes from or what it does?

Search DNA sequences using BLAST. Like Google, but for DNA.

GGTGTGCAGTTTGCAGTGGACAGTTAACTGAGCTTTAACAATGTCTGATCTTTTTTTAAAGATTTTATTCATTCATTCATGACAGACACAGAGAGAGAGAGAGAGAGGCAGAGATACAGGCAGAGGGAGAAGCAGGCCCCATGCAGGGAGCCCACTGTGGGACTCGATCCTGGGACTCCAGGGTGATGTCCTGGGTCGAAGGCAGGCACTCAACTGCTAAGCCACCCAGGCATCCCAACAATGTCTGATCTTAATATCAAAGCTTGCTATATACTTACATATTTCATCTAATGTAGGATGTCAATGTTTACAAAACAAATATTATGTTATGAACTGCTATAAACATGCTGCCACATAAACTGAACTTCTTTTTTTTTAATTTATTTATGATAGTCAGAGAGAGAGAGAGAGGCAGAGACACAGGCAGAGGGAGAAGCAGGCTCCATGCACCGGAAGCCCGACGTGGGATTCGATCCCGGGTCTCCAGGATTGCGCCCCGGGCCAAAGGCAGGCGCTAAACCGCTGCACCACCCAGGGATCCCTAAACTGAACTTCTCAACAAAAGAAAAGCACAACTGATGAGCTCACTTCCCCCAAAGCCAGAAAATTAAATTTTCAATAAATTCTGTTTTGGGTACTCATAAACCTAAACTCATGTTTTCATAAATGTTTTCAAGTTGTCAATATGTTTTTTCAACTGTTAACACTGAAACAACAAAGGACCTTCACAATATCATGAGGGCATCAGAAGGCAGGGACAGCTGGCCTTGCGGCGGCCAAGGGACTAGCCCATCCCTCCACGTGCTAG

Slide81

Using BLAST

https://blast.ncbi.nlm.nih.gov/Blast.cgi


Just copy in the sequence you want to identify

Then hit BLAST

Slide82

Results identify the DNA

BLAST: like Google but for DNA

Image credit: boxercab, https://www.flickr.com/photos/boxercab/286178208/in/set-72157594356190532/

Slide83

Sequence alignment

Reference: 5’ AGTGCTAGCTTAGCTAGCTCAACGAT 3’

Read 1: 5’ AGCTTA 3’

Read 2: 5’ AGCTAA 3’

ATCGCGGATCGATTA

||| |||| |||| |

ATCCCGGAACGATAA

Which of these sequences has better alignment: Read 1 or Read 2?
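One way to answer the question is to slide each read along the reference and count matching bases at every position. This sketch allows mismatches but no gaps, which is a simplification of what alignment software does; the function name is a placeholder.

```python
def best_ungapped_alignment(reference: str, read: str) -> tuple[int, int]:
    """Return (offset, matches) for the best placement of `read` on `reference`."""
    best_offset, best_matches = 0, -1
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        matches = sum(a == b for a, b in zip(window, read))
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches

reference = "AGTGCTAGCTTAGCTAGCTCAACGAT"
read_1_result = best_ungapped_alignment(reference, "AGCTTA")
read_2_result = best_ungapped_alignment(reference, "AGCTAA")

print("Read 1:", read_1_result)
print("Read 2:", read_2_result)
```

Read 1 matches the reference at all 6 positions, while the best placement of Read 2 matches at 5 of 6, so Read 1 has the better alignment.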

Slide84

Sequence alignment


Slide85

Genetic similarity example

Who is most genetically similar?

Who is least genetically similar?

To determine this, we must align each pair of sequences!

#   Student   Sequence
1   Maya      ACGGA
2   Mike      ACCGA
3   Selina    ACGGA
4   Deanne    ACCGT

Slide86

Genetic similarity

Align Maya & Mike

Maya: ACGGA

Mike: ACCGA

# of matches = 4

Total length = 5

Similarity = 4/5 x 100% = 80%

Slide87

Genetic similarity

Align Maya & Selina

Maya: ACGGA

Selina: ACGGA

# of matches = 5

Total length = 5

Similarity = 5/5 x 100% = 100%

Slide88

Genetic similarity

Align Maya & Deanne

Maya: ACGGA

Deanne: ACCGT

# of matches = 3

Total length = 5

Similarity = 3/5 x 100% = 60%

Slide89

Genetic similarity across the whole class

          Maya    Mike    Selina   Deanne
Maya      100%    80%     100%     60%
Mike
Selina
Deanne

Does 100% match mean that both individuals are the same?

Does a lower score mean that people are definitely not related?

There are lots of limitations to what we can conclude from genetic data!
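Filling in the rest of the class table means repeating the same pairwise calculation for every pair of students. A minimal sketch using the four example sequences from this lesson, assuming equal-length, pre-aligned sequences (the helper name is a placeholder):

```python
from itertools import product

sequences = {"Maya": "ACGGA", "Mike": "ACCGA", "Selina": "ACGGA", "Deanne": "ACCGT"}

def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Matches divided by total length, as a percentage."""
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)

# Similarity for every ordered pair of students, including each with themselves.
matrix = {
    (row, col): percent_similarity(sequences[row], sequences[col])
    for row, col in product(sequences, repeat=2)
}

# Print the table in the same layout as the slide.
print(" " * 8 + "".join(f"{name:>8}" for name in sequences))
for row in sequences:
    print(f"{row:<8}" + "".join(f"{matrix[(row, col)]:>7.0f}%" for col in sequences))
```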

Slide90

Wrap-up

How are short DNA reads pieced together into a complete sequence?

How is genetic similarity determined?


Slide91

Genetics Of Race

Lesson 5:

Results

Goals:

Review module, research question, and hypotheses

Return and discuss results (including limitations)

Connect project to broader social context

Slide92

Anonymity

Please do not reveal people’s results unless you both consent

Use ID code rather than name

Alternate samples are used if your data was not available


Slide93

Can you predict who you’re most genetically similar to?

Guiding Question

Are people of the same race more genetically similar?

Image Credit: http://www.pbs.org/race/

Slide94

Experimental Overview

collect samples

extract mitochondrial DNA

Next Gen. Sequencing (at BioSeq lab)

Analyze and interpret results

Sample collection: In class

You are most similar to Mike___

You are least similar to Maya___

          Maya     Mike     Selina   Deanne   Hector
Maya      100%     94.44%   94.44%   88.24%   100%
Mike      94.44%   100%     88.89%   82.35%   94.44%
Selina    94.44%   88.89%   100%     82.35%   94.44%
Deanne    88.24%   82.35%   82.35%   100%     88.24%
Hector    100%     94.44%   94.44%   88.24%   100%

Slide95

Hypothesis

Write down on the back of your paper your expectations for your most similar peer and your least similar peer.


Slide96

Were Your Predictions Correct?

Who was most similar to someone they predicted?

Who was least similar to someone they predicted?

Who was surprised by these results?

Slide97

Interpreting the Data

What was the range of genetic similarity in this experiment?

“Most similar” typically 99%+

“Least similar” typically 94% - 99%

Sometimes more than one “most similar”

ATCC

GTCC

ATCG

75% similar

75% similar

Slide98

Interpreting the Data

Not all “most similar” and “least similar” peers will be exclusive pairs.

How can I be most similar to someone, but they aren’t most similar to me?

Example

1 Maya ATTT

2 Mike ATCT

3 You CTCC
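The three example sequences show why "most similar" does not have to be mutual. A short sketch that finds each person's closest match by counting matching positions (the helper names are placeholders):

```python
sequences = {"Maya": "ATTT", "Mike": "ATCT", "You": "CTCC"}

def matching_bases(seq_a: str, seq_b: str) -> int:
    """Number of positions where the two sequences agree."""
    return sum(a == b for a, b in zip(seq_a, seq_b))

def most_similar(name: str) -> str:
    """The other person whose sequence shares the most bases with `name`."""
    others = [other for other in sequences if other != name]
    return max(others, key=lambda other: matching_bases(sequences[name], sequences[other]))

for person in sequences:
    print(f"{person} is most similar to {most_similar(person)}")
```

You share 2 of 4 bases with Mike and only 1 with Maya, so Mike is your closest match, but Mike shares 3 of 4 bases with Maya, so his closest match is Maya rather than you.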

Slide99

Interpreting the Data

What if “most similar” is the same race?

What are limitations of our experiment?

Small sample pool

We only sequenced a relatively small region

Contamination and other sequencing errors

Chance – some “most similar” people will be of the same race just due to probability


Slide100

Ancestry and relatedness

CANNOT conclude that you’re actually related (unless you mean you shared an ancestor thousands of years ago)

Rate of mutation is about 0.01 mutations per generation

This means 1 mutation every 100 generations (~2,000 years!)

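The step from 0.01 mutations per generation to roughly 2,000 years can be spelled out. The generation length is not stated on the slide; 20 years per generation is an assumption implied by its figures (2,000 years / 100 generations).

```python
mutations_per_generation = 0.01   # rate given on the slide
years_per_generation = 20         # assumption implied by the slide's ~2,000 years

# One mutation is expected once every 1 / rate generations.
generations_per_mutation = round(1 / mutations_per_generation)
years_per_mutation = generations_per_mutation * years_per_generation

print(f"1 mutation every {generations_per_mutation} generations "
      f"(about {years_per_mutation:,} years)")
```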

Slide101

Tracking migration

AGTG

AGTG

AGTG

AGTG

AGTG

AGTG

AGTG

ACTG

AGAT

AGTG

Two thousand years pass by…

Slide102

Mitochondrial DNA and Ancestry


How to interpret this map:

Numbers (black text) refer to years before present day

Letters and numbers (color text) refer to different ancestral lineages

Image from http://www.mitomap.org

Slide103

Perspectives on race

American Anthropological Association:

“Evidence from the analysis of genetics (e.g., DNA) indicates that most physical variation, about 94%, lies within so-called racial groups. Conventional geographic "racial" groupings differ from one another only in about 6% of their genes. This means that there is greater variation within "racial" groups than between them. …These facts render any attempt to establish lines of division among biological populations both arbitrary and subjective.”

Slide104

Perspectives on race

Human Genome Project:

“DNA studies do not indicate that separate classifiable subspecies (races) exist within modern humans…People who have lived in the same geographic region for many generations may have some alleles in common, but no allele will be found in all members of one population and in no members of any other.”

Slide105

Is race meaningful…

… genetically?

… in society?

… in the justice system?

… on surveys (like the US census or standardized tests)?


Slide106

Wrap up


Could this experiment have been possible without next generation sequencing? How might it have been different?

What other kinds of experiments could you do with this technology?