Slide1
CS 177
Week 13: Searching and Sorting
Slide2
Searching for a number
Let's say that I give you a list of numbers, and I ask you, “Is 37 on this list?”
As a human, you have no problem answering this question, as long as the list is reasonably short
What if the list is an array, and I want you to write a Java program to find some number?
Slide3
Search algorithm
Easy!
We just look through every element in the array until we find it or run out
If we find it, we return the index, otherwise we return -1
public static int find( int[] array, int number ) {
    for( int i = 0; i < array.length; i++ )
        if( array[i] == number )
            return i;
    return -1;
}
Slide4
How long does it take?
We talked about Big Oh notation last week
Now we have some way to measure how long this algorithm takes
How long, if n is the length of the array?
O(n) time because we have to look through every element in the array, in the worst case
Slide5
Can we do better?
Is there any way to go smaller than O(n)?
What complexity classes even exist that are smaller than O(n)?
O(1)
O(log n)
Well, on average, we only need to check half the numbers, that’s ½ n which is still O(n)
Darn…
Slide6
We can’t do better unless…
We can do better with more information
For example, if the list is sorted, then we can use that information somehow
How?
We can play a High-Low game
Slide7
Binary search
Repeatedly divide the search space in half
We’re looking for 37, let’s say
Check the middle: 54 (Too high)
Check the middle: 23 (Too low)
Check the middle: 31 (Too low)
Check the middle: 37 (Found it!)
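The halving steps above can be sketched in Java. This is a minimal sketch, assuming the array is already sorted in ascending order; the class name `BinarySearchSketch` and the method name `binarySearch` are illustrative and do not come from the slides. Like the linear `find` method, it returns the index of the number, or -1 if the number is not there.

```java
class BinarySearchSketch {
    // Assumption of this sketch: array is sorted in ascending order.
    static int binarySearch(int[] array, int number) {
        int low = 0;
        int high = array.length - 1;
        while (low <= high) {
            // Middle of the remaining search space
            int mid = low + (high - low) / 2;
            if (array[mid] == number)
                return mid;          // found it
            else if (array[mid] > number)
                high = mid - 1;      // too high: keep the lower half
            else
                low = mid + 1;       // too low: keep the upper half
        }
        return -1; // search space is empty: the number is not in the array
    }
}
```

Each trip through the loop throws away half of what is left, which is why the number of checks grows so slowly with the length of the array.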
Slide8
So, is that faster than linear search?
How long can it take?
What if you never find what you’re looking for?
Well, then, you’ve narrowed it down to a single spot in the array that doesn’t have what you want
And what’s the maximum amount of time that could have taken?
Slide9
Running time for binary search
We cut the search space in half every time
At worst, we keep cutting n in half until we get 1
The running time is O(log n)
For 64 items log n = 6, for 128 items log n = 7, for 256 items log n = 8, for 512 items log n = 9, …
Slide10
Guessing game
We can apply this idea to a guessing game
First we tell the computer that we are going to guess a number between 1 and n
We guess, and it tries to narrow down the number
It should only take log n tries
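One way to check the claim about the number of tries is to count the guesses a halving strategy makes. The sketch below is an illustration only: the class `GuessingGameSketch` and the method `guessesNeeded` are hypothetical names, and it assumes the guesser always picks the middle of the remaining range and is told "too high" or "too low" after each guess.

```java
class GuessingGameSketch {
    // Counts the guesses a halving strategy needs to find secret in [1, n].
    // Returns -1 if secret is outside that range.
    static int guessesNeeded(int n, int secret) {
        int low = 1;
        int high = n;
        int guesses = 0;
        while (low <= high) {
            int guess = low + (high - low) / 2; // always guess the middle
            guesses++;
            if (guess == secret)
                return guesses;
            else if (guess > secret)
                high = guess - 1;   // too high
            else
                low = guess + 1;    // too low
        }
        return -1;
    }
}
```

For example, calling `guessesNeeded(1000000, secret)` for any secret between 1 and 1,000,000 never needs more than 20 guesses.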
log₂(1,000,000) is only about 20
Slide11
Interview question
This is a classic interview question asked by Microsoft, Amazon, and similar companies
Imagine that you have 9 red balls
One of them is just slightly heavier than the others, but so slightly that you can’t feel it
You have a very accurate two pan balance you can use to compare balls
Find the heaviest ball in the smallest number of weighings
Slide12
What’s the smallest possible number?
It’s got to be 8 or fewer
We could easily test one ball against every other ball
There must be some cleverer way to divide them up
Something that is related somehow to binary search
Slide13
That’s it!
We can divide the balls in half each time
If those all balance, it must be the one we left out to begin with
Slide14
Nope, we can do better
How?
The key is that you can actually cut the number of balls into three parts each time
We weigh 3 against 3, if they balance, then we know the 3 left out have the heavy ball
When it’s down to 3, weigh 1 against 1, again knowing that it’s the one left out that’s heavy if they balance
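The three-way split can be simulated in code. This is only a sketch under stated assumptions: the balls are represented as a hypothetical `int[]` of weights in which exactly one entry is larger than the rest, the method `weigh` stands in for the two pan balance, and the static counter `weighings` is there just to count how often the balance is used. None of these names come from the slides.

```java
class HeavyBallSketch {
    static int weighings = 0; // how many times the balance has been used

    // Simulated balance: compares two groups of the same size.
    // Returns a positive number if the left group is heavier,
    // a negative number if the right group is heavier, 0 if they balance.
    static int weigh(int[] w, int leftStart, int rightStart, int size) {
        weighings++;
        int left = 0;
        int right = 0;
        for (int i = 0; i < size; i++) {
            left += w[leftStart + i];
            right += w[rightStart + i];
        }
        return Integer.compare(left, right);
    }

    // Returns the index of the heavy ball, assuming exactly one exists.
    static int findHeavy(int[] w) {
        int start = 0;
        int size = w.length;
        while (size > 1) {
            int third = (size + 2) / 3; // group size, rounded up
            int result = weigh(w, start, start + third, third);
            if (result > 0) {
                size = third;               // heavy ball is in the left group
            } else if (result < 0) {
                start += third;             // heavy ball is in the right group
                size = third;
            } else {
                start += 2 * third;         // balanced: it is in the group left out
                size -= 2 * third;
            }
        }
        return start;
    }
}
```

With 9 balls, the loop runs twice: once on groups of 3 and once on single balls.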
Slide15
Thinking outside the box, er, ball
The cool thing is that we are trisecting the search space each time
This means that it takes log₃ n weighings to find the heaviest ball
We can do 9 balls in 2 weighings, 27 balls in 3 weighings, 81 balls in 4 weighings, etc.
Slide16
Sorting
Searching is really useful
The idea of O(log n) time makes all sorts of real world applications work
Google, for example
But, we can’t do binary search unless our list is sorted
As with searching, computer scientists have devoted a lot of thought to figuring out the best way to do sorting
Slide17
Sorting
The importance of sorting should be evident to you by now
Applications:
Sorting a column in Excel
Organizing your iTunes playlists by artist name
Ranking a high school graduating class
Finding a median score to report on an exam
Countless others…
Slide18
But, is it interesting?
Yes!
It’s tricky
No, it’s not! Give me 100 names written on 100 index cards and I can sort them, no problem
One way to remind yourself that it’s tricky is by increasing the problem size
What if I gave you 1,000,000 names written on 1,000,000 index cards?
You might need some organizational system
Slide19
Computers are stupid
A computer can’t “jump” to the M section, unless you explicitly create an M section or something
For most common sorts, the computer has to compare two numbers (or Strings or whatever) at a time
Based on that comparison, it has to take another step in the algorithm
Remember, we can swap things around in an array
Slide20
Bubble sort is a classic sorting algorithm
It is very simple to understand
It is very simple to code
It is not very fast
The idea is simply to go through your array, swapping out of order elements until nothing is out of order
Slide21
Code for a single pass
One “pass” of the bubble sort algorithm goes through the array once, swapping out of order elements
for( int j = 0; j < array.length - 1; j++ )
    if( array[j] > array[j + 1] ) {
        int temp = array[j];
        array[j] = array[j + 1];
        array[j + 1] = temp;
    }
Slide22
Single pass example
Run through the whole array, swapping any entries that are out of order
7 45 0 54 37 108 51
7 and 45: No swap
45 and 0: Swap
45 and 54: No swap
54 and 37: Swap
54 and 108: No swap
108 and 51: Swap
Slide23
How many passes do we need?
How bad could it be?
What if the array was in reverse-sorted order?
One pass would only move the largest number to the bottom
We would need n – 1 passes to sort the whole array
7 6 5 4 3 2 1
6 7 5 4 3 2 1
6 5 7 4 3 2 1
6 5 4 7 3 2 1
6 5 4 3 7 2 1
6 5 4 3 2 7 1
6 5 4 3 2 1 7
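To see the n – 1 figure concretely, the passes can be counted in code. The sketch below is a hypothetical helper, not code from the slides: `singlePass` performs one pass and reports whether it swapped anything, and `sortAndCountPasses` repeats passes until one of them makes no swaps, returning how many passes actually moved something.

```java
class BubblePassesSketch {
    // Performs one pass over the array; returns true if any swap happened.
    static boolean singlePass(int[] array) {
        boolean swapped = false;
        for (int j = 0; j < array.length - 1; j++)
            if (array[j] > array[j + 1]) {
                int temp = array[j];
                array[j] = array[j + 1];
                array[j + 1] = temp;
                swapped = true;
            }
        return swapped;
    }

    // Sorts the array in place and returns the number of passes that swapped.
    // One extra pass with no swaps is run at the end to confirm the order.
    static int sortAndCountPasses(int[] array) {
        int passes = 0;
        while (singlePass(array))
            passes++;
        return passes;
    }
}
```

For the reverse-sorted array 7 6 5 4 3 2 1, `sortAndCountPasses` returns 6, which is n – 1 for n = 7. An array that is already sorted returns 0.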
Slide24
Full bubble sort code
The full Java method for bubble sort would require us to have at least n – 1 passes
Alternatively, we could keep a flag to indicate that no swaps were needed on a given pass
for( int i = 0; i < array.length - 1; i++ )
    for( int j = 0; j < array.length - 1; j++ )
        if( array[j] > array[j + 1] ) {
            int temp = array[j];
            array[j] = array[j + 1];
            array[j + 1] = temp;
        }
Slide25
Ascending sort
The bubble sort we saw sorts integers in ascending order
What if you wanted to sort them in descending order?
Only a single change is needed to the inner loop:
for( int j = 0; j < array.length - 1; j++ )
    if( array[j] < array[j + 1] ) {
        int temp = array[j];
        array[j] = array[j + 1];
        array[j + 1] = temp;
    }
Slide26
What’s the running time of bubble sort?
The outer loop runs n – 1 times
The inner loop runs n – 1 times
The inner loop has a constant amount of work inside of it, call it c
(n – 1)(n – 1)c = cn² – 2cn + c, which is…
O(n²)
Hmm, not great, let’s try another sort
Slide27
Insertion sort
Instead of “bubbling” down the largest (or smallest) number, keep the first k elements sorted, and keep increasing k
Philosophically, not that different from bubble sorting
The nice thing is that we can stop sorting whenever the new thing we added is in place
Slide28
Insertion sort code
The nice thing is that each inner loop runs at most i times
for( int i = 1; i < array.length; i++ )
    for( int j = i; j > 0; j-- ) //count back
        if( array[j - 1] > array[j] ) {
            int temp = array[j];
            array[j] = array[j - 1];
            array[j - 1] = temp;
        }
        else
            break;
Slide29
What’s the running time of insertion sort?
The outer loop runs n – 1 times
Well, each inner loop runs a maximum of i times, where i is the current iteration of the outer loop
1 + 2 + 3 + … + (n – 1) = ? = (n)(n – 1)/2 = ½n² – ½n, which is…
O(n²)
Slide30
Better than quadratic?
Is there a way to sort things that is better than quadratic time?
Yes!
Merge sort
Keep dividing your list in half, over and over, until you get down to two lists with one element in each
Merge the lists together, sorting them as you do, and merge the sorted list of 2 with another sorted list of 2, then merge lists of 4, and keep going until you have merged everything together
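The divide-and-merge description above can be sketched in Java. This is one possible recursive version, offered as an illustration rather than as the course's own code: the names `MergeSortSketch`, `mergeSort`, and `merge` are made up for this sketch, and it returns a new sorted array instead of sorting in place.

```java
import java.util.Arrays;

class MergeSortSketch {
    // Returns a sorted array with the same elements; the input is not modified.
    static int[] mergeSort(int[] array) {
        if (array.length <= 1)
            return array; // a list of 0 or 1 elements is already sorted
        int mid = array.length / 2;
        // Divide the list in half and sort each half
        int[] left = mergeSort(Arrays.copyOfRange(array, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(array, mid, array.length));
        return merge(left, right);
    }

    // Merges two sorted arrays into one sorted array.
    static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        // Repeatedly take the smaller front element of the two lists
        while (i < left.length && j < right.length)
            result[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        // Copy whatever is left over in either list
        while (i < left.length)
            result[k++] = left[i++];
        while (j < right.length)
            result[k++] = right[j++];
        return result;
    }
}
```

Each call to `merge` does work proportional to the combined length of its two inputs, and the halving produces about log n levels of merging.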
It takes O(n log n), which is the best you can do for a comparison based sort
Slide31
Bucket sort paradigm
You use bucket sort when you know that your data is in a narrow range, like, the numbers between 1 and 10 or even 1 and 100
As long as the range of possible values is in the neighborhood of the length of your list, bucket sort can do well
Example: 150 students with integer grades between 1 and 100
Doesn’t work for sorting doubles or Strings
Slide32
Bucket sort algorithm
Make an array with enough elements to hold every possible value in your range of values
If you need 1 – 100, make an array with length 100
Sweep through your original list of numbers; when you see a particular value, increment the entry at the corresponding index in the value array
To get your final sorted list, sweep through your value array and, for every entry with value k > 0, print its index k times
Slide33
Bucket sort example
We know our values will be in the range [1,10]
Our example array: 6 2 10 6 1 2 7 2
Our values array:
Value: 1 2 3 4 5 6 7 8 9 10
Count: 1 3 0 0 0 2 1 0 0 1
The result: 1 2 2 2 6 6 7 10
Slide34
Bucket sort in code
Here’s bucket sort in code with a range of [min, max]:
int[] values = new int[max - min + 1];
for( int i = 0; i < array.length; i++ )
    values[array[i] - min]++;
int count = 0;
for( int i = 0; i < values.length; i++ ) {
    for( int j = 0; j < values[i]; j++ ) {
        array[count] = i + min;
        count++;
    }
}
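To try the code above, it can be wrapped in a method and run on a small array. The wrapper below is a sketch: the class name `BucketSortSketch` and the method name `bucketSort` are assumptions made for this example, and the method assumes every element of the array lies in [min, max]. The test data is the eight values 6 2 10 6 1 2 7 2 with a range of [1, 10].

```java
import java.util.Arrays;

class BucketSortSketch {
    // Sorts array in place, assuming every element is in [min, max].
    static void bucketSort(int[] array, int min, int max) {
        int[] values = new int[max - min + 1];
        // Count how many times each value appears
        for (int i = 0; i < array.length; i++)
            values[array[i] - min]++;
        // Write each value back as many times as it was counted
        int count = 0;
        for (int i = 0; i < values.length; i++)
            for (int j = 0; j < values[i]; j++) {
                array[count] = i + min;
                count++;
            }
    }

    public static void main(String[] args) {
        int[] array = {6, 2, 10, 6, 1, 2, 7, 2};
        bucketSort(array, 1, 10);
        System.out.println(Arrays.toString(array)); // prints [1, 2, 2, 2, 6, 6, 7, 10]
    }
}
```

A value outside [min, max] would index past the ends of the `values` array, so a real program would want to check the range first.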
Slide35
How long does bucket sort take?
It takes O(n) time to scan through the original array
But, now we have to take into account the number of values we expect
So, let’s say we have m possible values
It takes O(m) time to scan back through the value array, with O(n) additional updates to the original array
Time: O(n + m)