A Zero-Knowledge Based Introduction to Biology

Uploaded On 2017-03-23


Karthik Jagadeesh, Bo Yoo, 28 September 2016


Presentation Transcript

Slide1

A Zero-Knowledge Based Introduction to Biology

Bo Yoo

January 09, 2020

Slide2

Announcements

Please sign up for Piazza

CA Office Hours (starting Monday 1/13): Mondays 2PM-4PM in Beckman B383 (third floor)

Beckman Center access through main gate only

Slide3

Announcements

Attendance will be taken starting next Tuesday (1/14) lecture

There will be a sign-up sheet by the door

You get two free absences

Attendance is 5% of your final grade

Slide4

Announcements

Homework 1 will be released next Thursday (1/16)

Due 11:59PM 2/4 (via email)

You have 3 late days (can use on homework only)

Read the instructions carefully (what files to submit etc.)

Post questions on Piazza instead of emailing us

Include question number on the subject line

3 questions

All three questions will require UCSC Genome Browser (tutorial 1/16)

Slide5

Q: What is your genome?

Slide6

Q: What is your genome?

A: The sum of your hereditary information.

Slide7

DNA: “Blueprints” for a cell

Genetic information encoded in long strings

Deoxyribonucleic acid (DNA) is built from four bases: adenine (A), thymine (T), guanine (G), and cytosine (C)

Slide8

Nucleobase Complementary Pairing

Purines: Adenine (A), Guanine (G)

Pyrimidines: Cytosine (C), Thymine (T)

Complementary pairs: A with T, G with C

Slide9

DNA Double Helix

Slide10

DNA Packaging

Slide11

From DNA to Organism

You are composed of ~10 trillion cells

Slide12

From DNA to Organism

Cell

Slide13

From DNA to Organism

Cell

Protein

Proteins do most of the work in biology

Slide14

Human Genome

3 billion base pairs: A, T, G, C

Full DNA sequence in virtually all cells

DNA is the blueprint for life: a cookbook with many “recipes” for proteins (genes)

Proteins do most of the work in biology

Slide15

Protein coding genes

In humans: a set of 20-25K genes that are eventually translated into proteins

The number of genes differs by species!

Seemingly less complex organisms may have a larger number of genes

E.g. human (20-25K genes) vs. rice (51K genes)

How are proteins made from DNA?

Slide16

Central Dogma of Biology

Slide17

Gene Transcription

DNA -> RNA

Slide18

DNA (deoxyribonucleic acid) vs RNA (ribonucleic acid)

Deoxyribose in DNA

Ribose in RNA

Slide19

RNA Nucleobases

Purines: Adenine (A), Guanine (G)

Pyrimidines: Cytosine (C), Uracil (U)

Slide20

5 prime and 3 prime

Slide21

Genes are transcribed from the template strand

Slide22

Gene Transcription

(DNA -> RNA)

5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’

Slide23

Gene Transcription

5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’

Slide24

Gene Transcription

5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’

Strands are separated (DNA helicase)

Slide25

Gene Transcription

5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’

RNA: G A U U A C A
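The strands in this diagram can be used as a small worked example. The following is a minimal sketch, not a model of real RNA polymerase: it assumes the template strand is given as a plain string written 3’->5’, and the names `DNA_TO_RNA` and `transcribe` are made up for illustration.

```python
# Pairing rules for building RNA along a DNA template (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA, read 5'->3', built along a template given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


template = "CTAATGT"  # template strand from the slide, written 3'->5'
coding = "GATTACA"    # coding strand from the slide, written 5'->3'

rna = transcribe(template)
print(rna)  # GAUUACA

# The RNA matches the coding strand with every T swapped for U.
print(rna == coding.replace("T", "U"))  # True
```

Reading the template in the 3’->5’ direction means the RNA comes out already in 5’->3’ order, so no reversal step is needed in this sketch.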

An RNA copy of the 5’→3’ sequence is created from the 3’→5’ template

Slide26

Gene Transcription

5’ G A T T A C A . . . 3’
3’ C T A A T G T . . . 5’

pre-mRNA: 5’ G A U U A C A . . . 3’

Slide27

Genes can be found on both strands

Coding and template strands are relative to the gene

A gene can be on the minus strand

In general, genomic sequences are written 5’->3’
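Because both strands are conventionally written 5’->3’, a gene on the minus strand is read as the reverse complement of the plus strand. A minimal sketch, assuming an uppercase sequence containing only A, C, G and T (the function name `reverse_complement` is illustrative):

```python
# Translation table implementing the pairing rules A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand of seq, also written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


plus = "GATTACA"                   # plus strand, 5'->3'
minus = reverse_complement(plus)   # minus strand, 5'->3'
print(minus)  # TGTAATC

# Taking the reverse complement twice returns the original strand.
print(reverse_complement(minus) == plus)  # True
```

Reading `TGTAATC` backwards gives `CTAATGT`, which is the lower strand as it is drawn 3’->5’ in the diagrams.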

5’ G A T T A C A 3’
3’ C T A A T G T 5’

Slide28

RNA Processing

Figure labels: 5’ cap, 5’ UTR, exon, intron, 3’ UTR, poly(A) tail, mRNA

Slide29

Gene Structure

Figure labels (5’ to 3’): promoter, 5’ UTR, exons, introns, 3’ UTR; coding, non-coding

Slide30

Gene Translation

RNA -> Protein

Slide31

From RNA to Protein

Proteins are long strings of amino acids joined by peptide bonds

Translation from RNA sequence to amino acid sequence performed by ribosomes

20 amino acids

3 RNA letters (a codon) are required to specify a single amino acid

Slide32

Amino Acid

Alanine

Arginine

Asparagine

Aspartate

Cysteine

Glutamate

Glutamine

Glycine

Histidine

Isoleucine

Leucine

Lysine

Methionine

Phenylalanine

Proline

Serine

Threonine

Tryptophan

Tyrosine

Valine

There are 20 standard amino acids

Slide33

Translation

Slide34

Translation

5’ . . . A U U A U G G C C U G G A C U U G A . . . 3’

A U U: UTR

A U G: Met (start codon)

G C C: Ala

U G G: Trp

A C U: Thr

U G A: stop codon

Slide35

Translation
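The example mRNA from the translation diagram can be decoded mechanically. The sketch below is a simplification, not how the ribosome actually locates its start site: it assumes translation begins at the first AUG in the string and stops at the first in-frame stop codon. The names `CODON_TABLE` and `translate` are illustrative; the table is the standard genetic code in one-letter amino acid codes, with `*` marking stop codons.

```python
from itertools import product

# Standard genetic code, listed in U, C, A, G order for each codon position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# The mRNA from the slide: AUU (UTR), AUG, GCC, UGG, ACU, UGA.
print(translate("AUUAUGGCCUGGACUUGA"))  # MAWT
```

`MAWT` is Met, Ala, Trp, Thr in one-letter codes. With 4 bases and 3 positions there are 64 codons, so most of the 20 amino acids are specified by more than one codon.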

The ribosome (a complex of protein and RNA) synthesizes a protein by reading the mRNA in triplets (codons). Each codon is translated to an amino acid.

Slide36

Central Dogma of Biology

Slide37

Most of Our Genome Does Not Code for Proteins!

Slide38

What does the rest of the genome do?

3 billion base pairs in our genome

1-2% coding (codes for proteins)

10-20% regulatory
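To put those percentages in absolute terms, here is a small back-of-the-envelope calculation. It only uses the round numbers quoted on the slide; `GENOME_BP` and `bp_range` are names made up for this sketch.

```python
GENOME_BP = 3_000_000_000  # ~3 billion base pairs, as quoted on the slide


def bp_range(low_pct: int, high_pct: int, genome_bp: int = GENOME_BP):
    """Convert a percentage range of the genome into a base-pair range."""
    return (genome_bp * low_pct // 100, genome_bp * high_pct // 100)


coding = bp_range(1, 2)        # 1-2% coding
regulatory = bp_range(10, 20)  # 10-20% regulatory

print(coding)      # (30000000, 60000000)
print(regulatory)  # (300000000, 600000000)
```

So the coding portion is on the order of 30-60 million base pairs, roughly a tenth of the regulatory portion.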

These regulatory elements give rise to differentiation

1 million regulatory elements (switches) enable:

Precise control for turning genes on/off

Diverse cell types (lung, heart, skin)

Analogy: making specific recipes (genes) for a full meal from a large cookbook (genome) at a given time

Slide39

Gene Expression Regulation

Determines when each gene should be expressed

Why? Every cell has the same DNA, but each cell expresses different proteins.

Signal transduction: one signal is converted to another. The cascade has “master regulators” that turn on many proteins, each of which in turn turns on many more proteins

Slide40

Different Cell Types

Subsets of the DNA sequence determine the identity and function of different cells

Slide41

Regulatory Elements

Expression modulated by regulatory elements

Enhancers, promoters, silencers

Regulate transcription (DNA -> RNA) of a gene

CS analogy:

Genes are like variable assignments (a = 7)

Regulatory elements are control flow, complex logic

Slide42

Regulatory Elements

Transcription factors (TFs):

Proteins that recognize sequence motifs in enhancers, promoters

Combinatorial switches that turn genes on/off

Complex assists or inhibits formation of the RNA polymerase machinery

Slide43

Transcription Factor Binding Sites

Short, degenerate DNA sequences recognized by particular transcription factors

For complex organisms, cooperative binding of multiple transcription factors required to initiate transcription
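"Degenerate" here means that some positions of a binding site tolerate more than one base. One common way to write such a site is with IUPAC ambiguity codes (for example S for C or G). The sketch below is only an illustration: the motif and the sequence are made-up examples, `find_sites` is a hypothetical helper, and real binding-site prediction usually relies on weighted models rather than exact pattern matching.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_sites(motif: str, seq: str):
    """Return (position, matched text) for every match, overlaps included."""
    body = "".join(IUPAC[code] for code in motif)
    # A lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + body + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


motif = "TGASTCA"              # illustrative motif: S matches C or G
seq = "CCTGACTCAGGTGAGTCATT"   # made-up sequence for the example

print(find_sites(motif, seq))  # [(2, 'TGACTCA'), (11, 'TGAGTCA')]
```

Both hits differ at the degenerate position but are matched by the same motif. This sketch scans only the strand given; the opposite strand would need a second pass over the reverse complement.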

Binding Sequence Logo

Slide44

Signal Transduction

Figure labels: Transcription Factor A, TF A, Binding Site, Gene B

Slide45

Repeats

Sequences that repeat many times in the genome

About 50% of the genome

Slide46

Repeats

Interspersed repeats (transposable elements)

Using some unknown mechanism to multiply themselves and move around in the genome

Slide47

Repeats

Most repeat events are neutral

Most copies are inactive (e.g. 5’ truncation) when they arrive at a new location

Makes genome sizes grow

To make it through generations, the repeats must be in the germline cells (eggs and sperm)

Slide48

Repeats

Simple repeats

Every possible motif of mono-, di-, tri- and tetranucleotide repeats is vastly overrepresented in the human genome.

These are called microsatellites.

Longer repeating units are called minisatellites.

The really long ones are called satellites.
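Runs of a short repeated unit like these can be located with a regular expression. This is a minimal sketch: the unit length limit of 4 follows the mono- to tetranucleotide range on this slide, while the minimum of 3 repeats and the name `find_microsatellites` are assumptions made for illustration.

```python
import re


def find_microsatellites(seq: str, max_unit: int = 4, min_repeats: int = 3):
    """Return (start, unit, repeat count) for each tandem run of a short unit."""
    # The lazy group picks the shortest repeating unit; the backreference
    # \1 then requires at least min_repeats - 1 further copies of it.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq)
    ]


print(find_microsatellites("AAAAAAAAA"))  # [(0, 'A', 9)]
print(find_microsatellites("CACACACAC"))  # [(0, 'CA', 4)]
print(find_microsatellites("CAACAACAA"))  # [(0, 'CAA', 3)]
```

Only whole copies of the unit are counted, so the trailing `C` in `CACACACAC` is not part of the reported run.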

AAAAAAAAA

CACACACAC

CAACAACAA

Slide49

Still a lot that we don’t know

Slide50

Mutations in the Genome

Over our lifetime, our DNA replicates trillions of times with the help of DNA polymerase

But even polymerase is “imperfect”: every now and then (roughly 1 in every 100,000 bp), DNA polymerase makes a mistake in replication, resulting in “mutations”

There are other sources of mutation, including smoking, sunlight and radiation

Slide51

Single Nucleotide Changes

Slide52

Single Nucleotide Changes

Slide53

Mutation: Structural Abnormalities

Slide54

How does the genome influence human disease?

Slide55

Bejerano Lab

Disease Implications

Figure labels: SHH, MUTATIONS, Brain, Limb, Other

Slide56

Limb Enhancer 1Mb away from Gene

Figure labels: SHH, limb, on, off

Slide57

Enhancer Deletion

Figure labels: SHH, limb, DELETE, Limb, on, off

Slide58

Enhancer 1bp Substitution

Figure labels: SHH, limb, MUTATIONS, Limb, on, off

Lettice et al. HMG 2003 12: 1725-35

Slide59

Genome Wide Association Study (GWAS):

80% of GWAS SNPs are noncoding (hard to interpret)

Active area of research

Slide60

Evolution = Mutation + Selection

Slide61

Human Mutation Rate

Recent sequencing analysis suggests ~40-60 new mutations in a child that were not present in either parent.

Mutations range from the smallest possible (single base pair change) to the largest: whole genome duplication (to be discussed).

Selection does not tolerate all of these mutations, but it sure does tolerate some.

Slide62

Selection

Figure labels: time, harmful mutation, beneficial mutation, neutral mutation

Slide63

Summary

All hereditary information is encoded in double-stranded DNA

Each cell in an organism has the same DNA

DNA -> RNA -> protein

Proteins have many diverse roles in the cell

Gene regulation diversifies protein products within different cells

Slide64

Summary

A very small portion of the genome actually codes for proteins; a lot of it is repeats and regulatory elements

Mutations and repeats that make it into the germline cells get passed down the generations

Evolution happens through mutation and selection