Big Data in Biology: A focus on genomics

celsa-spraggs
Uploaded On 2016-10-31

Presentation Transcript

Slide1

Big Data in Biology: A focus on genomics

Slide2

Bioinformatics and Genomics

Applications:

Personalized cancer medicines

Disease determination

Pathway Analysis

Biomarker Discovery

Slide3

An Interesting Point

“One article estimated that the output from genomics may soon dwarf data heavyweights such as YouTube.”

“I don't know if a million genomes is the right number, but clearly we need more than we've got,” says Marc Williams, director of the Geisinger Genomic Medicine Institute.

Slide4

Stephens, Z. D. et al. PLoS Biol. 13, e1002195 (2015)

Slide5

Genomics in the Past

DNA has four different bases: A, C, G, and T

Exons (1%): parts of the DNA that code for proteins

Look at nucleotides

~13,000 single nucleotide variants.

Roughly 2% of these will affect protein composition
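As a rough worked example of the two figures above, the arithmetic can be sketched in Python. This assumes the ~13,000 single nucleotide variants and the 2% fraction describe the same set of variants, which the slide implies but does not state; the function name is hypothetical.

```python
def protein_affecting_variants(variant_count: int, affecting_fraction: float) -> int:
    """Estimate how many variants change protein composition.

    Assumption: the fraction applies uniformly to the given variant count.
    """
    return round(variant_count * affecting_fraction)


# Figures taken from the slide: ~13,000 variants, roughly 2% affecting proteins.
estimate = protein_affecting_variants(13_000, 0.02)
print(estimate)
```

Under these assumptions only a few hundred of the variants would be expected to alter a protein, which is why most observed variation is hard to interpret.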

Unfortunately, research used cell cultures or animal models.

However: many of these associations were made with low levels of evidence.

Slide6

Genomics Continued

Structural variants: deletions, duplications, and translocations.

Much harder to detect than single mutations

Many genes do not code for proteins but can still regulate protein production; the function of many of these regions is still not well understood.

Capturing all such variation is desirable, but it is not the best goal in the short term.

TL;DR: genomics is hard.

Slide7

Applications

Iceland deCODE Project: medical history records and genome data of 150,000 people

Led to Discovery of:

Genetic risk factors

Breast cancer

Alzheimer’s

Also found 10,000 people who are missing both copies of a gene, across 1,500 different genes.

Drug responsiveness:

ADHD medicine works for only 1 of 10 preschoolers, cancer drugs are effective for 25% of patients, and depression drugs work for 6 of 10 patients.

Personalized Medicine

Slide8

Issues with Bioinformatics

The Icelandic work was helped by a homogeneous population.

1000 Genomes project captured some diversity, but mainly captured Caucasian populations.

“Because they come from the genetic mother ship, so to speak, people of African ancestry carry a lot more genetic variants than non-Africans… Variants that seem unusual in Caucasians might be common in Africans, and may not actually cause disease,” says Isaac Kohane, a bioinformatician at Harvard Medical School in Boston, Massachusetts.

Reference genome: the comparison tool that many researchers use is flawed.

1st iteration: random donors of unidentified ethnicity.

Currently it incorporates more human genomic diversity.

Slide9

Solutions

Relationships between doctors and researchers to create models linking diseases and genetics.

Harvesting genomes produces up to 40 Petabytes (PB) per year.
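To give a feel for that volume, the yearly figure can be converted into a sustained data rate. This is a minimal sketch, assuming decimal units (1 PB = 10^15 bytes), a 365-day year, and data arriving evenly over the year; none of these assumptions come from the slides, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a non-leap year
BYTES_PER_PB = 10**15               # assumes decimal petabytes
BYTES_PER_GB = 10**9                # assumes decimal gigabytes


def sustained_rate_gb_per_s(petabytes_per_year: float) -> float:
    """Average throughput in GB/s needed to absorb a yearly data volume."""
    bytes_per_second = petabytes_per_year * BYTES_PER_PB / SECONDS_PER_YEAR
    return bytes_per_second / BYTES_PER_GB


rate = sustained_rate_gb_per_s(40)
print(f"{rate:.2f} GB/s")
```

Under these assumptions, 40 PB per year works out to a bit over 1 GB of new data every second, around the clock.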

Computational power: the more variables and the more people you add, the harder the analysis gets.
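One way to picture this growth is to count the work items for a hypothetical analysis. The sketch below is an illustration only, not a method described in the slides: it assumes a genotype table with one cell per person per variant, and an analysis that tests every pair of variants. The cohort size (150,000) and variant count (13,000) are borrowed from earlier slides purely as example inputs.

```python
from math import comb


def analysis_workload(people: int, variants: int) -> dict[str, int]:
    """Count work items for a hypothetical cohort analysis."""
    return {
        # One genotype value per person per variant.
        "genotype_cells": people * variants,
        # Testing each variant on its own grows linearly with variants.
        "single_variant_tests": variants,
        # Testing every unordered pair of variants grows quadratically.
        "pairwise_variant_tests": comb(variants, 2),
    }


workload = analysis_workload(people=150_000, variants=13_000)
for name, count in workload.items():
    print(f"{name}: {count:,}")
```

Doubling the number of variants doubles the single-variant tests but roughly quadruples the pairwise tests, which is the kind of growth that pushes this work toward parallel computation.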

Silicon Valley lure: the people needed for bioinformatics must be able to harness massively parallel computation.

Slide10

Conclusion

Two Main Issues:

Difficulty of bioinformatics due to genomics

Computational power and the need for collaboration

Yet solving these problems could easily lead to incredible improvements in medicine.