
Forensics and CS - PowerPoint Presentation

Uploaded On 2017-01-30



Presentation Transcript

Slide1

Forensics and CS

Philip Chan

Slide2

CSI: Crime Scene Investigation

www.cbs.com/shows/csi/

high tech forensics tools

DNA profiling

Use as evidence in court cases

Slide3

DNA

Deoxyribonucleic Acid

Each person is unique in DNA (except for twins)

DNA samples can be collected at crime scenes

About 0.1% of human DNA varies from person to person

Slide4

Forensics Analysis

Focus on loci (locations) of the DNA

Values at those loci (the DNA profile) are recorded for comparing DNA samples.

Two DNA profiles from the same person have matching values at all loci.

Are more or fewer loci more accurate for identification?

Tradeoffs?

FBI uses 13 core loci

http://www.cstl.nist.gov/biotech/strbase/fbicore.htm

Slide8

We do not want to wrongly accuse someone

How can we find out how likely it is that another person has the same DNA profile?

How many people are in the world?

How low does the probability need to be so that a DNA profile is unique in the world?

Low probability doesn’t mean impossible

Just very unlikely

Slide12

Review of basic probability

Joint probability of two independent events

P(A,B) = P(A) * P(B)

Independent events: knowing one event does not provide information about the other event

P(Die1=1, Die2=1)

= P(Die1=1) * P(Die2=1)

= 1/6 * 1/6 = 1/36

Slide14

Enumerating the events

        1     2     3     4     5     6
1      1,1   1,2
2
3
4
5
6

36 events, each is equally likely, so 1/36

Slide15

Joint probability

P(Die1=even, Die2=6)

= 1/2 * 1/6 = 1/12

P(Die1=1, Die2=5, Die3=4)

= (1/6)^3

= 1/216

Slide18
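The product rule above can be checked by brute force: list every equally likely outcome and count the ones that match. The following is a minimal Java sketch of that check; the class and method names are made up for illustration and are not part of the original slides.

```java
// Hypothetical helper: verifies the joint probabilities by enumeration.
final class DiceJointProbability {

    // Fraction of the 36 two-dice outcomes where die 1 is even and die 2 shows 6.
    static double evenThenSix() {
        int matches = 0;
        int total = 0;
        for (int die1 = 1; die1 <= 6; die1++) {
            for (int die2 = 1; die2 <= 6; die2++) {
                total++;
                if (die1 % 2 == 0 && die2 == 6) {
                    matches++;
                }
            }
        }
        return (double) matches / total;
    }

    // Fraction of the 216 three-dice outcomes that are exactly (1, 5, 4).
    static double oneFiveFour() {
        int matches = 0;
        int total = 0;
        for (int die1 = 1; die1 <= 6; die1++) {
            for (int die2 = 1; die2 <= 6; die2++) {
                for (int die3 = 1; die3 <= 6; die3++) {
                    total++;
                    if (die1 == 1 && die2 == 5 && die3 == 4) {
                        matches++;
                    }
                }
            }
        }
        return (double) matches / total;
    }

    public static void main(String[] args) {
        System.out.println("P(even, 6)  = " + evenThenSix());
        System.out.println("P(1, 5, 4)  = " + oneFiveFour());
    }
}
```

Counting gives 3 matches out of 36 for the first case and 1 out of 216 for the second, which agrees with multiplying the individual probabilities.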

DNA profile probability

How to estimate?

Assuming loci are independent

P(Locus1=value1, Locus2=value2, ...)

= P(Locus1=value1) * P(Locus2=value2) * ...

How to estimate P(Locus1=value1)?

Take a random sample of size N from the population and

find out how many people out of N have value1 at Locus1

Slide22
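The estimate described above can be sketched in Java as follows. This is only an illustration under stated assumptions: locus values are coded as integers, each profile is an `int[]` with one entry per locus, the sample is non-empty, and the names `locusFrequency` and `profileProbability` are hypothetical.

```java
// Hypothetical sketch: estimating a profile's probability from a random sample.
final class ProfileProbability {

    // Estimates P(locus = value) as the fraction of sampled profiles
    // that have the given value at the given locus.
    static double locusFrequency(int[][] sample, int locus, int value) {
        int count = 0;
        for (int[] profile : sample) {
            if (profile[locus] == value) {
                count++;
            }
        }
        return (double) count / sample.length;
    }

    // Multiplies the per-locus estimates, which is valid only under the
    // assumption that the loci are independent.
    static double profileProbability(int[][] sample, int[] profile) {
        double probability = 1.0;
        for (int locus = 0; locus < profile.length; locus++) {
            probability *= locusFrequency(sample, locus, profile[locus]);
        }
        return probability;
    }

    public static void main(String[] args) {
        // Toy sample of N = 4 people with 2 loci each (made-up values).
        int[][] sample = { {1, 2}, {1, 3}, {2, 2}, {1, 2} };
        System.out.println(profileProbability(sample, new int[] {1, 2}));
    }
}
```

In the toy sample, 3 of 4 people have value 1 at the first locus and 3 of 4 have value 2 at the second, so the estimate is 3/4 * 3/4.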

Database of DNA profiles

Id      Locus1  Locus2  Locus3  …  Locus13
A5212
A6921
…

Slide23

Problem Formulation

Given

A sample profile (e.g. collected from the crime scene)

A database of known profiles

Find

The probability of the sample profile if it matches a known profile in the database

Slide24

Breaking Down the Problem

Find

The probability of the sample profile if it matches a known profile in the database

What are the subproblems?

Subproblem 1

Find whether the sample profile matches

1a: check entries in the database

1b: check loci in each entry

Subproblem 2

Calculate the probability of the profile

Slide27

Simpler Problem for 1a (very common)

Given

an array of integers (e.g. student IDs)

an integer (e.g. an ID)

Find

whether the integer is in the array

(e.g. whether you can enter your dorm)

int[] directory; // student IDs

int id; // to be found

Slide28

Linear Search

Slide29

Linear/Sequential Search

Check one by one

Stop if you find it

Stop if you run out of items to check

Not found

Slide30
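The linear search steps above can be sketched in Java as follows. The variable names `directory` and `id` come from the earlier slide; the class name, the method name, and the choice to return -1 for "not found" are assumptions made for this sketch.

```java
// Hypothetical sketch of linear/sequential search.
final class LinearSearchSketch {

    // Returns the index of id in directory, or -1 if id is not present.
    static int linearSearch(int[] directory, int id) {
        for (int i = 0; i < directory.length; i++) {
            if (directory[i] == id) {
                return i;          // stop as soon as the item is found
            }
        }
        return -1;                 // ran out of items to check: not found
    }

    public static void main(String[] args) {
        int[] directory = {42, 7, 19};
        System.out.println(linearSearch(directory, 19));
        System.out.println(linearSearch(directory, 5));
    }
}
```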

Number of Checks (speed of algorithm)

Consider N items in the array

Best-case scenario

When does it occur? How many checks?

First item; 1 check

Worst-case scenario

When does it occur? How many checks?

Last item or not there; N checks

Average-case scenario

Average of all cases

(1 + 2 + … + N) / N = [N(N+1)/2] / N = (N+1)/2

Slide33
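The best, worst, and average counts above can be confirmed by instrumenting a linear search. The sketch below is illustrative only; it assumes each of the N items is searched for exactly once when computing the average, and all names are hypothetical.

```java
// Hypothetical sketch: counting the checks made by linear search.
final class LinearSearchChecks {

    // Number of array items examined before the search stops.
    static int countChecks(int[] items, int target) {
        int checks = 0;
        for (int item : items) {
            checks++;
            if (item == target) {
                break;
            }
        }
        return checks;
    }

    // Average number of checks when every item in an array of
    // n distinct items is searched for exactly once.
    static double averageChecks(int n) {
        int[] items = new int[n];
        for (int i = 0; i < n; i++) {
            items[i] = i;
        }
        long total = 0;
        for (int target = 0; target < n; target++) {
            total += countChecks(items, target);
        }
        return (double) total / n;
    }

    public static void main(String[] args) {
        System.out.println(averageChecks(100)); // compare with (N+1)/2
    }
}
```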

Matching DNA profiles

Each profile has 13 loci

Do we always need to check all 13 loci to decide if a match occurs or not?

Slide34
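One way to think about the question above: a single mismatching locus is enough to rule out a match, so only a full match requires looking at every locus. A minimal Java sketch, assuming integer-coded locus values and a hypothetical `matches` method:

```java
// Hypothetical sketch: comparing two profiles locus by locus.
final class ProfileMatch {

    // Returns true only if every locus has the same value in both profiles.
    // The loop returns at the first mismatch, so non-matching profiles
    // usually need fewer than 13 checks.
    static boolean matches(int[] sample, int[] known) {
        if (sample.length != known.length) {
            return false;
        }
        for (int locus = 0; locus < sample.length; locus++) {
            if (sample[locus] != known[locus]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] sample = {5, 2, 1, 2};
        System.out.println(matches(sample, new int[] {5, 2, 1, 2}));
        System.out.println(matches(sample, new int[] {6, 9, 2, 1}));
    }
}
```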

Can we do better? Faster algorithm?

What if the array is sorted, i.e. the items are in order?

E.g. a phone book

Slide35

Binary Search

Slide36

Binary Search

1. Check the item at midpoint

2. If found, done

3. Otherwise, eliminate half and repeat 1 and 2

Slide37

Breaking down the problem

While more items and not found in the mid point

What are the two subproblems?

Eliminate half of the items

Determine the mid point

Slide39

Number of checks (Speed of algorithm)

Best-case scenario

When does it occur? How many checks?

In the middle; 1 check

Worst-case scenario

When does it occur? How many checks?

Keep dividing into two halves until a half has only one item

? checks

Slide43

Number of checks (Speed of algorithm)

T(1) = 1

T(N) = T(N/2) + 1

= [ T(N/4) + 1 ] + 1

= [ [ T(N/8) + 1 ] + 1 ] + 1

= … any pattern?

= T(N/2^k) + k

N/2^k gets smaller and eventually becomes 1

solve for k

Slide52

Number of Checks (Speed of Algorithm)

N/2^k = 1

N = 2^k

k = log_2 N

T(N) = T(N/2^k) + k

= T(1) + log_2 N

= 1 + log_2 N

Slide56

N (Linear search) vs log N + 1 (Binary search)

N              log_2 N + 1
100            7.6
1,000          11.0
10,000         14.3
100,000        17.6
1,000,000      20.9
10,000,000     24.3
100,000,000    27.6

Slide57
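The comparison above can be reproduced with a few lines of Java. This is a sketch, not the author's code: `binaryChecks` evaluates the formula 1 + log_2 N with floating point, while `recurrenceChecks` follows the recurrence T(1) = 1, T(N) = T(N/2) + 1 with integer division, so the two agree exactly only when N is a power of 2. All names are hypothetical.

```java
// Hypothetical sketch: worst-case checks for linear vs. binary search.
final class SearchCheckTable {

    // Formula from the derivation: 1 + log_2 N.
    static double binaryChecks(long n) {
        return Math.log(n) / Math.log(2) + 1;
    }

    // Same quantity obtained by unrolling T(N) = T(N/2) + 1 with T(1) = 1.
    static int recurrenceChecks(long n) {
        int checks = 1;            // T(1) = 1
        while (n > 1) {
            n /= 2;                // integer division halves the region
            checks++;
        }
        return checks;
    }

    public static void main(String[] args) {
        for (long n = 100; n <= 100_000_000L; n *= 10) {
            // Linear search needs up to n checks in the worst case.
            System.out.printf("%d linear: %d binary: %.1f%n", n, n, binaryChecks(n));
        }
    }
}
```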

Before using Binary Search

The array needs to be sorted (in order)

Slide58

Sorting

Slide59

Sorting (arranging the items in a desired order)

How is the phone book arranged?

Why?

Why not arranged by numbers?

Order

Alphabetical

Low to high numbers

DNA profile with 13 loci?

Slide61

Sorting

Imagine you have a thousand numbers in an array

How would you systematically sort them?

Slide62

Selection Sort (ascending)

Find/select the smallest item

Swap the smallest item with the first item

Find/select the second smallest item

Swap the second smallest item with the second item

…

Slide64

Example

6 7 2 5 1

1 7 2 5 6

1 2 7 5 6

1 2 5 7 6

1 2 5 6 7

Slide73

Breaking down the problem

Get all the items in ascending order

Get one item at the wanted position/index

What are the two subproblems?

Find the smallest item

Swap the smallest item with the item at the wanted position

Slide76

Algorithm Summary (Selection Sort)

For each “desired” position

Between the “desired” position and the end

Find the smallest item

Swap the smallest item with the item at the “desired” position

Slide77
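The selection sort summary above can be sketched in Java as follows. The class and variable names are illustrative assumptions; the array in `main` is the one from the worked example.

```java
import java.util.Arrays;

// Hypothetical sketch of selection sort (ascending).
final class SelectionSortSketch {

    // Sorts items in place.
    static void selectionSort(int[] items) {
        // For each "desired" position (the last one needs no work).
        for (int desired = 0; desired < items.length - 1; desired++) {
            // Find the smallest item between the desired position and the end.
            int smallest = desired;
            for (int i = desired + 1; i < items.length; i++) {
                if (items[i] < items[smallest]) {
                    smallest = i;
                }
            }
            // Swap the smallest item with the item at the desired position.
            int temp = items[desired];
            items[desired] = items[smallest];
            items[smallest] = temp;
        }
    }

    public static void main(String[] args) {
        int[] items = {6, 7, 2, 5, 1};
        selectionSort(items);
        System.out.println(Arrays.toString(items));
    }
}
```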

Number of comparisons (Speed of Algorithm)

Consider counting

Number of comparisons between array items

Best-case scenario (least # of comparisons)

When does it occur? How many comparisons?

Worst-case scenario (most # of comparisons)

When does it occur? How many comparisons?

Same number of comparisons

For all cases (i.e. best case = worst case)

Slide81

Number of comparisons (Speed of Algorithm)

To find the smallest item

How many comparisons?

N-1

To find the second smallest item

How many comparisons?

N-2

Total # of comparisons

(N-1) + (N-2) + … + 1

= N(N-1)/2 = (N^2 - N)/2

Slide86
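The total above can be checked by counting comparisons while sorting. The Java sketch below is an illustration with hypothetical names; it sorts a copy of the input so the caller's array is left unchanged.

```java
// Hypothetical sketch: counting item-to-item comparisons in selection sort.
final class SelectionSortComparisons {

    // Runs selection sort on a copy of items and returns how many
    // comparisons between array items were made.
    static long countComparisons(int[] items) {
        int[] copy = items.clone();
        long comparisons = 0;
        for (int desired = 0; desired < copy.length - 1; desired++) {
            int smallest = desired;
            for (int i = desired + 1; i < copy.length; i++) {
                comparisons++;
                if (copy[i] < copy[smallest]) {
                    smallest = i;
                }
            }
            int temp = copy[desired];
            copy[desired] = copy[smallest];
            copy[smallest] = temp;
        }
        return comparisons;
    }

    // Closed form from the derivation: N(N-1)/2.
    static long formula(long n) {
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        int[] unsorted = {6, 7, 2, 5, 1};
        int[] sorted = {1, 2, 5, 6, 7};
        System.out.println(countComparisons(unsorted) + " vs " + formula(5));
        System.out.println(countComparisons(sorted) + " vs " + formula(5));
    }
}
```

Both the unsorted and the already sorted input produce the same count, which illustrates why the best case equals the worst case for this algorithm.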

Selection Sort

Not the fastest sorting algorithm

Learn faster algorithms in more advanced courses.

Slide87

Revisiting Binary Search

Slide88

Binary Search

While more items and not found in the mid point

Eliminate half of the items

Determine the mid point

Slide89

Eliminate half of the array

How to specify the focus region?

Hint: index/position

Start and end

Slide91

How to determine if the region has items (is not empty) with start and end?

Start <= end

Slide93

How do we adjust start and end?

What are the two different cases?

Item is before the middle item

Start: no change

End: position before the mid point

Item is after the middle item

Start: position after the mid point

End: no change

Slide100

How to determine the mid point with start and end?

(start + end) / 2

Integer division will eliminate the fractional part

Slide102

Algorithm Summary

Initialize start, end, and mid point (I)

While region has items and item is not at the mid point (C)

Eliminate half of the items by adjusting start or end (U)

Update the mid point (U)

If region has items

Position is mid point

else

Position is -1

Slide103
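The algorithm summary above maps almost line for line onto a Java loop. The following is a minimal sketch under the assumption that the array is already sorted in ascending order; the class and method names are hypothetical.

```java
// Hypothetical sketch of binary search following the summary above.
final class BinarySearchSketch {

    // Returns the position of item in sorted, or -1 if it is not present.
    static int binarySearch(int[] sorted, int item) {
        // Initialize start, end, and mid point (I).
        int start = 0;
        int end = sorted.length - 1;
        int mid = (start + end) / 2;   // integer division drops the fraction

        // While region has items and item is not at the mid point (C).
        while (start <= end && sorted[mid] != item) {
            // Eliminate half of the items by adjusting start or end (U).
            if (item < sorted[mid]) {
                end = mid - 1;         // item is before the middle item
            } else {
                start = mid + 1;       // item is after the middle item
            }
            // Update the mid point (U).
            mid = (start + end) / 2;
        }

        // If the region still has items, the item is at the mid point.
        return (start <= end) ? mid : -1;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 2, 5, 6, 7};
        System.out.println(binarySearch(sorted, 6));
        System.out.println(binarySearch(sorted, 3));
    }
}
```

The `start <= end` test is evaluated first in the loop condition, so `sorted[mid]` is never read once the region is empty.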

Overall Summary

Slide104

Overall Summary

DNA samples from crime scene

Identify people using known DNA profiles

If there is a match

estimate probability of DNA profile

Matching a sample to known DNA profiles

Linear/sequential search [N checks]

Binary search [log_2 N + 1 checks]

Faster but needs sorted data/profiles

Selection Sort [(N^2 - N)/2 comparisons]