CS 840 Unit 1: Models, Lower Bounds and getting around Lower Bounds - PowerPoint Presentation

Uploaded On 2020-11-06

Presentation Transcript

Slide1

CS 840 Unit 1: Models, Lower Bounds and getting around Lower Bounds

Searching: Given a large set of distinct keys, preprocess them so searches can be performed as quickly as possible

Slide2

CS 840 Unit 1: Models, Lower Bounds and getting around Lower bounds

Searching: Given a large set of distinct keys, preprocess them so searches can be performed as quickly as possible.

Start with a sorted array: n lg n (or so) comparisons and (usually) a similar number of moves to create.

Searching can be done by binary search. The runtime follows from the recursion T(n) = 1 + T(n/2) with T(1) = 0. Solving this, it takes at most ⌈lg n⌉ comparisons, 1 more if we handle unsuccessful searches as well (T(1) = 1).
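The recursion can be made concrete with a small sketch. The function name `binary_search` and its return shape are illustrative assumptions, not taken from the slides. It narrows the range with two-way (≥) tests until one candidate remains, then spends one extra equality test so unsuccessful searches are handled too.

```python
def binary_search(keys, target):
    """Search a sorted list of distinct keys.

    Returns (found, index, comparisons). `index` is the single candidate
    position left after narrowing; `comparisons` counts key comparisons.
    """
    if not keys:
        return False, 0, 0
    lo, hi = 0, len(keys)          # candidate positions are lo .. hi-1
    comparisons = 0
    while hi - lo > 1:             # T(n) = 1 + T(n/2), T(1) = 0
        mid = (lo + hi) // 2
        comparisons += 1
        if target >= keys[mid]:    # one two-way branch per step
            lo = mid
        else:
            hi = mid
    comparisons += 1               # the "1 more" for unsuccessful searches
    return keys[lo] == target, lo, comparisons
```

With 1000 keys the narrowing loop runs at most ⌈lg 1000⌉ = 10 times, so no search uses more than 11 comparisons.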

Slide4

Binary Search is Optimal

Model: count comparisons, each a two-way branch (a test such as ≥ or >).

No other operations on query value.

Each comparison, at best on average, cuts the number of possible answers in half.

Start with n (or 2n + 1) possible outcomes and, at best on average, divide by 2 each time.

⌈lg n⌉ + 1 comparisons are necessary in the worst case, closer to lg n + 1 on average, for a search that gives the location of the value or says it is not there and where it fits.
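The counting argument can be sketched numerically: k two-way comparisons distinguish at most 2^k outcomes, so at least ⌈lg(outcomes)⌉ comparisons are needed. The helper name `min_comparisons` is an assumption made for illustration only.

```python
def min_comparisons(outcomes):
    """Smallest k with 2**k >= outcomes, i.e. ceil(lg outcomes), for outcomes >= 1."""
    return (outcomes - 1).bit_length()

n = 1000
print(min_comparisons(n))          # successful searches only: n outcomes
print(min_comparisons(2 * n + 1))  # n hits plus n + 1 gaps for misses
```

For n = 1000 this gives 10 and 11. When n is not a power of two, the second figure agrees with the ⌈lg n⌉ + 1 bound stated above.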

Slide5

Sorting: an aside

To sort we have to determine which of n! permutations to apply.

n! is about √(2πn) (n/e)^n (Stirling's approximation).

Taking the lg we get n lg n – n lg e + ½ lg n + O(1), where lg e ≈ 1.4426…

Slide6

Sorting: an aside

To sort we have to determine which of n! permutations to apply.

n! is about √(2πn) (n/e)^n (Stirling's approximation).

Taking the lg we get n lg n – n lg e + ½ lg n + O(1), where lg e ≈ 1.4426…

Mergesort is pretty close at n ⌈lg n⌉ – 2^⌈lg n⌉ + 1 comparisons (to be picky).

But a very old method from the early 60's is better (on comparison count).
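To see how close mergesort comes to the lower bound, here is a small numeric sketch. The function names are assumptions for illustration. It compares lg n! (computed via `math.lgamma`), the Stirling estimate n lg n – n lg e + ½ lg n + ½ lg 2π, and the mergesort worst-case count n⌈lg n⌉ – 2^⌈lg n⌉ + 1.

```python
import math

def lg_factorial(n):
    """lg(n!) computed from the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2)

def stirling_lg_factorial(n):
    """Stirling estimate: n lg n - n lg e + (1/2) lg n + (1/2) lg(2*pi)."""
    return (n * math.log2(n) - n * math.log2(math.e)
            + 0.5 * math.log2(n) + 0.5 * math.log2(2 * math.pi))

def mergesort_worst_case(n):
    """Worst-case comparisons n*ceil(lg n) - 2**ceil(lg n) + 1, for n >= 1."""
    k = (n - 1).bit_length()   # ceil(lg n)
    return n * k - 2 ** k + 1

for n in (10, 1000, 1_000_000):
    print(n, lg_factorial(n), stirling_lg_factorial(n), mergesort_worst_case(n))
```

For n = 1000 the mergesort count is 8977 comparisons, against a lower bound of a little over 8529.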

Slide8

What’s good or bad about Binary Search?

Good:

Optimal # compares, under the model

Minimal space (OK, close to), if keys are large

Gives # elements smaller, so implicit reference to a larger record

"Generalizes" to search trees, and balanced search trees

Bad:

Can't/don't use any tricks with key value

So what else could we do?

Slide12

But is that how you would look up a name in a phone book? (If you know what a phone book was.)

Names from Aa to Zz, and you want Zeke.

Don't start in the middle; interpolate, and look in location ≈ n(Zeke – Aa)/(Zz – Aa).

Aa ………………………………………………………… ↑ try here for Zeke

It's called interpolation search.

Assumption: values are uniformly distributed in a range, so we are dealing with the expected case.
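A minimal sketch of interpolation search, assuming sorted, distinct integer keys. The function name, the return shape, and the choice to count only interpolated probes are illustrative assumptions, not from the slides. Each round probes the position suggested by linear interpolation between the ends of the current range.

```python
def interpolation_search(keys, target):
    """Search a sorted list of distinct integers.

    Returns (found, index, probes). If the target is absent, `index` is
    where it would fit. `probes` counts interpolated positions examined.
    """
    lo, hi = 0, len(keys) - 1
    probes = 0
    while lo <= hi and keys[lo] <= target <= keys[hi]:
        if lo == hi:
            pos = lo
        else:
            # Location ~ n * (target - smallest) / (largest - smallest)
            pos = lo + (target - keys[lo]) * (hi - lo) // (keys[hi] - keys[lo])
        probes += 1
        if keys[pos] == target:
            return True, pos, probes
        if keys[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    # Everything left of lo is smaller and everything right of hi is larger.
    if lo <= hi and target > keys[hi]:
        return False, hi + 1, probes
    return False, lo, probes
```

On perfectly evenly spaced keys such as `list(range(0, 1000, 10))`, a present key is found on the first probe. On uniformly random keys the expected number of rounds grows far more slowly than lg n.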

Slide14

Interpolation Search: expected search cost

An easy analogy: think of searching for an element about mid range.

Values in the structure have independent probability ½ of coming before it. Like flipping a coin n times.

Yes, the expected number before it is n/2, but how much do we miss by? What is the expected value of |heads – tails|? It's about √n.
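The coin-flip estimate can be checked exactly with a short sketch. The function name `expected_abs_diff` is an assumption for illustration. It sums |heads – tails| over the binomial distribution for n fair flips and prints it next to √(2n/π), a constant times √n.

```python
import math

def expected_abs_diff(n):
    """Exact E|heads - tails| for n fair coin flips."""
    total = sum(math.comb(n, k) * abs(2 * k - n) for k in range(n + 1))
    return total / 2 ** n

for n in (100, 10_000):
    print(n, expected_abs_diff(n), math.sqrt(2 * n / math.pi))
```

Both columns grow in proportion to √n, which is all the analysis needs.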

Slide16

Interpolation Search: expected search cost, cont’d

So we expect to miss by √n (or in general √(current range)).

So expected cost T(n) = 2 + T(√n).

(Actually we will tend to be close on one side.)

How many times do you have to take √ to get from n down to a constant?

√ cuts the length of a binary representation in half, so it cuts lg n in half.

So lg lg n rounds (actually about 2 lg lg n comparisons) are expected.

Note: this also says where a "missing value" would fit.
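The recurrence T(n) = 2 + T(√n) can be unrolled with a few lines of code. This is a sketch; the name `sqrt_steps` and the stopping point (range size at most 2) are assumptions made for illustration.

```python
import math

def sqrt_steps(n):
    """Count how many integer square roots shrink n to at most 2."""
    steps = 0
    while n > 2:
        n = math.isqrt(n)
        steps += 1
    return steps

for n in (2 ** 16, 2 ** 32, 10 ** 9):
    steps = sqrt_steps(n)
    print(n, steps, 2 * steps)   # rounds, and comparisons at 2 per round
```

For n = 2^32 the count is 5 = lg lg n, and for n = 10^9 it is also 5, i.e. about 10 comparisons at 2 per round.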

Slide17

What is good/bad about interpolation search?

Good:

O(lg lg n) expected time (2 lg lg n ≈ 10 for n = 1,000,000,000)

Gives # elements smaller, so implicit reference to a larger record

There is a generalization to search trees and balanced search trees: O(lg n) amortized update, O(lg lg n) expected search (Mehlhorn and Tsakalidis, ICALP 1985, LNCS vol 194)

Bad:

Relies on knowing and computing with the distribution of keys

Any updating is "tricky"

So what else could we do?