CMPT 225 Algorithm Analysis: Big O Notation
Presentation Transcript

1. CMPT 225 Algorithm Analysis: Big O Notation

2. Objectives
- Determine the running time of simple algorithms
  - Best case
  - Average case
  - Worst case
- Profile algorithms
- Understand O notation's mathematical basis
- Use O notation to measure running time

John Edgar

3. Algorithm Analysis
- Algorithms can be described in terms of
  - Time efficiency
  - Space efficiency
- Choosing an appropriate algorithm can make a significant difference in the usability of a system
  - Government and corporate databases with many millions of records, which are accessed frequently
  - Online search engines
  - Real-time systems where near-instantaneous response is required, from air traffic control systems to computer games

4. Comparing Algorithms
- There are often many ways to solve a problem
  - Different algorithms that produce the same results, e.g. there are numerous sorting algorithms
- We are usually interested in how an algorithm performs when its input is large
  - In practice, with today's hardware, most algorithms perform well with small input
  - There are exceptions to this, such as the Traveling Salesman Problem

5. Measuring Algorithms
- It is possible to count the number of operations that an algorithm performs
  - By a careful visual walkthrough of the algorithm, or
  - By inserting code in the algorithm to count and print the number of times that each line executes (profiling)
- It is also possible to time algorithms
  - Compare system time before and after running an algorithm
  - E.g., in C++: #include <ctime>

6. Timing Algorithms
- It may be useful to time how long an algorithm takes to run
  - In some cases it may be essential to know how long an algorithm takes on some system, e.g. air traffic control systems
- But is this a good general comparison method?
  - Running time is affected by a number of factors other than algorithm efficiency

7. Running Time Is Affected By
- CPU speed
- Amount of main memory
- Specialized hardware (e.g. graphics card)
- Operating system
- System configuration (e.g. virtual memory)
- Programming language
- Algorithm implementation
- Other programs
- System tasks (e.g. memory management)
- ...

8. Counting
- Instead of timing an algorithm, count the number of instructions that it performs
- The number of instructions performed may vary based on
  - The size of the input
  - The organization of the input
- The number of instructions can be written as a cost function on the input size

9. A Simple Example (C++)

    void printArray(int *arr, int n) {
        for (int i = 0; i < n; ++i) {
            cout << arr[i] << endl;
        }
    }

Operations performed on an array of length 10:
- Declare and initialize i: once
- Perform the comparison, print an array element, and increment i: 10 times each
- Make the final comparison when i = 10: once

10. Cost Functions
- Instead of choosing a particular input size, we will express a cost function for input of size n
- Assume that the running time, t, of an algorithm is proportional to the number of operations
- Express t as a function of n
  - Where t is the time required to process the data using some algorithm A
- Denote a cost function as tA(n)
  - i.e. the running time of algorithm A, with input size n

11. A Simple Example (C++)

    void printArray(int *arr, int n) {
        for (int i = 0; i < n; ++i) {
            cout << arr[i] << endl;
        }
    }

Operations performed on an array of length n:
- Declare and initialize i: 1
- Perform the comparison, print an array element, and increment i (n times): 3n
- Make the final comparison when i = n: 1

t = 3n + 2

12. Input Varies
- The number of operations usually varies based on the size of the input
  - Though not always: consider array lookup
- In addition, algorithm performance may vary based on the organization of the input
  - For example, consider searching a large array: if the target is the first item in the array, the search will be very quick

13. Best, Average and Worst Case
- Algorithm efficiency is often calculated for three broad cases of input
  - Best case
  - Average (or "usual") case
  - Worst case
- This analysis considers how performance varies for different inputs of the same size

14. Analyzing Algorithms
- It can be difficult to determine the exact number of operations performed by an algorithm
  - Though it is often still useful to do so
- An alternative to counting all instructions is to focus on an algorithm's barometer instruction
  - The barometer instruction is the instruction that is executed the most times in an algorithm
  - The number of times that the barometer instruction is executed is usually proportional to the algorithm's running time

15. Comparisons
Let's analyze and compare some different algorithms:
- Linear search
- Binary search
- Selection sort
- Insertion sort

16. Cost Functions for Searching

17. Searching
- It is often useful to find out whether or not a list contains a particular item
  - Such a search can either return true or false, or the position of the item in the list
- If the array isn't sorted, use linear search
  - Start with the first item, and go through the array comparing each item to the target
  - If the target item is found, return true (or the index of the target element)

18. Linear Search (C++)

    int linSearch(int* arr, int n, int target) {
        for (int i = 0; i < n; i++) {
            if (target == arr[i]) {
                return i;
            }
        } //for
        return -1; //target not found
    }

- The function returns as soon as the target item is found
- It returns -1 to indicate that the item has not been found

19. Linear Search Barometer Instruction
- Iterate through an array of n items searching for the target item
- The barometer instruction is equality checking (or comparisons for short): target == arr[i]
  - There are actually two other barometer instructions; what are they?
- How many comparisons does linear search do?

20. Linear Search Comparisons
- Best case
  - The target is the first element of the array
  - Make 1 comparison
- Worst case
  - The target is not in the array, or the target is at the last position in the array
  - Make n comparisons in either case
- Average case
  - Is it (best case + worst case) / 2, so (n + 1) / 2?

21. Linear Search: Average Case
- There are two situations when the worst case arises
  - When the target is the last item in the array
  - When the target is not there at all
- To calculate the average cost, we need to know how often these two situations arise
- We can make assumptions about this
  - Though any of these assumptions may not hold for a particular use of linear search

22. Assumptions
- Assume that the target is not in the array half the time
  - Therefore, half the time the entire array has to be searched
- Assume that there is an equal probability of the target being at any array location, if it is in the array
  - That is, there is a probability of 1/n that the target is at some location i

23. Cost When Target Not Found
- Work done if the target is not in the array: n comparisons
- This occurs with probability 0.5

24. Cost When Target Is Found
- Work done if the target is in the array:
  - 1 comparison if the target is at the 1st location, occurring with probability 1/n (second assumption)
  - 2 comparisons if the target is at the 2nd location, also occurring with probability 1/n
  - i comparisons if the target is at the ith location
- Take the weighted average of the values to find the total expected number of comparisons (E):
  - E = 1*(1/n) + 2*(1/n) + 3*(1/n) + ... + n*(1/n), or
  - E = (n + 1) / 2

25. Average Case Cost
- Target is not in the array: n comparisons
- Target is in the array: (n + 1) / 2 comparisons
- Take a weighted average of the two amounts:
  - (n * 1/2) + ((n + 1) / 2 * 1/2)
  - = (n / 2) + ((n + 1) / 4)
  - = (2n / 4) + ((n + 1) / 4)
  - = (3n + 1) / 4
- Therefore, on average, we expect linear search to perform (3n + 1) / 4 comparisons*
  - *recall the assumptions we made: not in the array half the time, uniform distribution if in the array

26. Searching Sorted Arrays
- If we sort the array first, we can reduce linear search's average cost to around n / 2
  - Once a value equal to or greater than the target is found, the search can end
- So, if a sequence contains 8 items, on average, linear search compares 4 of them; if a sequence contains 1,000,000 items, linear search compares 500,000 of them, etc.
- However, if the array is sorted, it is possible to do much better than this

27. Binary Search Sketch
The array is sorted, and contains 16 items indexed from 0 to 15:

    index: 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
    value: 07 11 15 21 29 32 44 45 57 61 64 73 79 81 86 92

Search for 32: guess that the target item is in the middle, that is index = 15 / 2 = 7

28. Binary Search Sketch
Search for 32 (continued):
- The value at index 7 is 45, which is greater than 32, so the target must be in the lower half of the array
- Everything in the upper half of the array can be ignored, halving the search space
- Repeat the search, guessing the midpoint of the lower subarray: index 6 / 2 = 3

29. Binary Search Sketch
Search for 32 (continued):
- The value at index 3 is 21, which is less than 32, so the target must be in the upper half of the subarray
- Repeat the search, guessing the midpoint of the new search space: index 5
  - The midpoint = (lower subarray index + upper index) / 2
- The value at index 5 is 32: the target is found, so the search can terminate

30. Binary Search
- Requires that the array is sorted
  - In either ascending or descending order: make sure you know which!
- A divide and conquer algorithm
  - Each iteration divides the problem space in half
  - Ends when the target is found or the problem space consists of one element

31. Binary Search Algorithm (C++)

    int binSearch(int* arr, int n, int target) {
        int lower = 0;
        int upper = n - 1;   //index of the last element in the array
        int mid = 0;
        while (lower <= upper) {
            mid = (lower + upper) / 2;
            if (target == arr[mid]) {
                return mid;
            } else if (target > arr[mid]) {
                lower = mid + 1;
            } else { //target < arr[mid]
                upper = mid - 1;
            }
        } //while
        return -1; //target not found
    }

Note the if, else if, else structure.

32. Analyzing Binary Search
- The algorithm consists of three parts
  - Initialization (setting lower and upper)
  - The while loop, including a return statement on success
  - The return statement that executes on failure
- Initialization and the return on failure require the same amount of work regardless of input size
- The number of times that the while loop iterates depends on the size of the input

33. Binary Search Iteration
- The while loop contains an if, else if, else statement
  - The first if condition is met when the target is found, and is therefore performed at most once each time the algorithm is run
- The algorithm usually performs 5 operations for each iteration of the while loop
  - Checking the while condition
  - Assignment to mid
  - Equality comparison with the target
  - Inequality comparison
  - One other operation (setting either lower or upper)

34. Binary Search: Best Case
- In the best case the target is the midpoint element of the array
  - Requiring one iteration of the while loop

35. Binary Search: Worst Case
- What is the worst case for binary search?
  - Either the target is not in the array, or it is found when the search space consists of one element
- How many times does the while loop iterate in the worst case?

36. Analyzing the Worst Case
- Each iteration of the while loop halves the search space
- For simplicity, assume that n is a power of 2, so n = 2^k (e.g. if n = 128, k = 7)
- The first iteration halves the search space to n/2
- After the second iteration the search space is n/4
- After the kth iteration the search space consists of just one element, since n/2^k = n/n = 1
- Because n = 2^k, k = log2 n
- Therefore at most log2 n iterations of the while loop are made in the worst case!

37. Average Case
- Is the average case more like the best case or the worst case?
- What is the chance that an array element is the target?
  - 1/n the first time through the loop
  - 1/(n/2) the second time through the loop
  - ... and so on ...
- It is more likely that the target will be found as the search space becomes small
  - That is, when the while loop nears its final iteration
- We can conclude that the average case is more like the worst case than the best case

38. Binary Search vs Linear Search

    n            (3n+1)/4     log2(n)
    10           8            3
    100          76           7
    1,000        751          10
    10,000       7,501        13
    100,000      75,001       17
    1,000,000    750,001      20
    10,000,000   7,500,001    24

39. Simple Sorting

40. Simple Sorting
- As an example of algorithm analysis, let's look at two simple sorting algorithms
  - Selection sort, and
  - Insertion sort
- Calculate an approximate cost function for these two sorting algorithms by analyzing how many operations are performed by each algorithm
  - This will include an analysis of how many times the algorithms' loops iterate

41. Selection Sort
- Selection sort is a simple sorting algorithm that repeatedly finds the smallest item
- The array is divided into a sorted part and an unsorted part
- Repeatedly swap the first unsorted item with the smallest unsorted item
  - Starting with the element with index 0, and
  - Ending with the last but one element (index n - 2)

42. Selection Sort

    23 41 33 81 07 19 11 45   find smallest unsorted - 7 comparisons
    07 41 33 81 23 19 11 45   find smallest unsorted - 6 comparisons
    07 11 33 81 23 19 41 45   find smallest unsorted - 5 comparisons
    07 11 19 81 23 33 41 45   find smallest unsorted - 4 comparisons
    07 11 19 23 81 33 41 45   find smallest unsorted - 3 comparisons
    07 11 19 23 33 81 41 45   find smallest unsorted - 2 comparisons
    07 11 19 23 33 41 81 45   find smallest unsorted - 1 comparison
    07 11 19 23 33 41 45 81

43. Selection Sort Comparisons

    Unsorted elements    Comparisons to find smallest
    n                    n-1
    n-1                  n-2
    ...                  ...
    3                    2
    2                    1
    1                    0
    Total:               n(n-1)/2

44. Selection Sort Algorithm (C++)

    void selectionSort(int *arr, int n) {
        for (int i = 0; i < n-1; ++i) {         //outer loop: n-1 times
            int smallest = i;
            // Find the index of the smallest element
            for (int j = i + 1; j < n; ++j) {   //inner loop body: n(n-1)/2 times
                if (arr[j] < arr[smallest]) {
                    smallest = j;
                }
            }
            // Swap the smallest with the current item
            int temp = arr[i];
            arr[i] = arr[smallest];
            arr[smallest] = temp;
        }
    }

45. Selection Sort Cost Function
- The outer loop is evaluated n-1 times
  - 7 instructions (including the loop statements), so the cost is 7(n-1)
- The inner loop is evaluated n(n-1)/2 times
  - There are 4 instructions, but one is only evaluated some of the time
  - Worst case cost is 4(n(n-1)/2)
- Some constant amount (k) of work is performed, e.g. initializing the outer loop
- Total cost: 7(n-1) + 4(n(n-1)/2) + k
  - Assumption: all instructions have the same cost

46. Selection Sort Summary
- In broad terms, and ignoring the actual number of executable statements, selection sort
  - Makes n(n-1)/2 comparisons, regardless of the original order of the input
  - Performs n-1 swaps
- Neither of these operations is substantially affected by the organization of the input

47. Insertion Sort
- Another simple sorting algorithm
  - Divides the array into sorted and unsorted parts
- The sorted part of the array is expanded one element at a time
  - Find the correct place in the sorted part to place the 1st element of the unsorted part, by searching through all of the sorted elements
  - Move the elements after the insertion point up one position to make space

48. Insertion Sort

    23 41 33 81 07 19 11 45   treat the first element as the sorted part
    23 41 33 81 07 19 11 45   locate position for 41 - 1 comparison
    23 33 41 81 07 19 11 45   locate position for 33 - 2 comparisons
    23 33 41 81 07 19 11 45   locate position for 81 - 1 comparison
    07 23 33 41 81 19 11 45   locate position for 07 - 4 comparisons
    07 19 23 33 41 81 11 45   locate position for 19 - 5 comparisons
    07 11 19 23 33 41 81 45   locate position for 11 - 6 comparisons
    07 11 19 23 33 41 45 81   locate position for 45 - 2 comparisons

49. Insertion Sort Algorithm (C++)

    void insertionSort(int *arr, int n) {
        for (int i = 1; i < n; ++i) {              //outer loop: n-1 times
            int temp = arr[i];
            int pos = i;
            // Shuffle up all sorted items > arr[i]
            while (pos > 0 && arr[pos - 1] > temp) {
                arr[pos] = arr[pos - 1];
                pos--;
            } //while
            // Insert the current item
            arr[pos] = temp;
        }
    }

How many times does the inner loop body run?
- Maximum: up to i times for each outer iteration, n(n-1)/2 in total
- Minimum: just the test for each outer loop iteration, n in total

50. Insertion Sort Cost

    Sorted elements    Worst-case search    Worst-case shuffle
    0                  0                    0
    1                  1                    1
    2                  2                    2
    ...                ...                  ...
    n-1                n-1                  n-1
    Total:             n(n-1)/2             n(n-1)/2

51. Insertion Sort Best Case
- The efficiency of insertion sort is affected by the state of the array to be sorted
- In the best case the array is already completely sorted!
  - No movement of array elements is required
  - Requires n comparisons

52. Insertion Sort Worst Case
- In the worst case the array is in reverse order
  - Every item has to be moved all the way to the front of the array
- The outer loop runs n-1 times
  - In the first iteration, one comparison and move
  - In the last iteration, n-1 comparisons and moves
  - On average, n/2 comparisons and moves
- For a total of n(n-1)/2 comparisons and moves

53. Insertion Sort: Average Case
- What is the average case cost?
  - Is it closer to the best case or the worst case?
- If random data are sorted, insertion sort is usually closer to the worst case
  - Around n(n-1)/4 comparisons
- What is average input for a sorting algorithm in any case?

54. O Notation

55. Algorithm Summary
- Linear search: (3n + 1)/4 - average case, given certain assumptions
- Binary search: log2 n - worst case; the average case is similar to the worst case
- Selection sort: n(n-1)/2 - all cases
- Insertion sort: n(n-1)/2 - worst case; the average case is similar to the worst case

56. Algorithm Comparison
- Let's compare these algorithms for some arbitrary input size (say n = 1,000)
- In order of the number of comparisons:
  - Binary search
  - Linear search
  - Insertion sort best case
  - Quicksort (next week) average and best cases
  - Selection sort all cases, insertion sort average and worst cases, quicksort worst case

57. Algorithm Growth Rate
- What do we want to know when comparing two algorithms?
- The most important thing is how quickly the time requirements increase with input size
  - e.g. if we double the input size, how much longer does an algorithm take?
- Here are some graphs ...

58. Small n
Hard to see what is happening with n so small ...

59. Not Much Bigger n
n^2 and n(n-1)/2 are growing much faster than any of the others

60. n from 10 to 1,000,000
Hmm! Let's try a logarithmic scale ...

61. n from 10 to 1,000,000
Notice how clusters of growth rates start to emerge

62. O Notation Introduction
- Exact counting of operations is often difficult (and tedious), even for simple algorithms
  - And is often not much more useful than estimates, due to the relative importance of other factors
- O notation is a mathematical language for evaluating the running time of algorithms
- O notation evaluates the growth rate of an algorithm

63. Example of a Cost Function
Cost function: tA(n) = n^2 + 20n + 100
Which term in the function is most important (dominates)? It depends on the size of n:
- n = 2: tA(n) = 4 + 40 + 100; the constant, 100, is the dominating term
- n = 10: tA(n) = 100 + 200 + 100; 20n is the dominating term
- n = 100: tA(n) = 10,000 + 2,000 + 100; n^2 is the dominating term
- n = 1000: tA(n) = 1,000,000 + 20,000 + 100; n^2 is the dominating term

64. Big O Notation
- O notation approximates a cost function in a way that allows us to estimate growth rate
- The approximation is usually good enough
  - Especially when considering the efficiency of an algorithm as n gets very large
- Count the number of times that an algorithm executes its barometer instruction
  - And determine how the count increases as the input size increases

65. Why Big O?
- An algorithm is said to be order f(n), denoted as O(f(n))
- The function f(n) is the algorithm's growth rate function
- If a problem of size n requires time proportional to n, then the problem is O(n)
  - i.e. if the input size is doubled, then the running time is doubled

66. Big O Notation Definition
- An algorithm is order f(n) if there are positive constants k and m such that tA(n) ≤ k*f(n) for all n ≥ m
  - If so, we would say that tA(n) is O(f(n))
- The requirement n ≥ m expresses that the time estimate is correct if n is sufficiently large

67. Or In English ...
- The idea is that a cost function can be approximated by another, simpler, function
  - The simpler function has 1 variable, the data size n
- This function is selected such that it represents an upper bound on the value of tA(n)
- Saying that the time efficiency of algorithm A, tA(n), is O(f(n)) means that
  - A cannot take more than O(f(n)) time to execute, and
  - The cost function tA(n) grows at most as fast as f(n)

68. Big O Example
- Consider an algorithm with a cost function of 3n + 12
- If we can find constants m and k such that k*n ≥ 3n + 12 for all n ≥ m, then the algorithm is O(n)
- Find values of k and m so that this is true:
  - k = 4 and m = 12, then
  - 4n ≥ 3n + 12 for all n ≥ 12

69. Another Big O Example
- Consider an algorithm with a cost function of 2n^2 + 10n + 6
- If we can find constants m and k such that k*n^2 ≥ 2n^2 + 10n + 6 for all n ≥ m, then the algorithm is O(n^2)
- Find values of k and m so that this is true:
  - k = 3 and m = 11, then
  - 3n^2 ≥ 2n^2 + 10n + 6 for all n ≥ 11

70. And Another Graph

71. The general idea is ...
- When using Big O notation, instead of giving a precise formulation of the cost function for a particular data size
- Express the behaviour of the algorithm as the data size n grows very large, and so ignore
  - lower order terms, and
  - constants

72. O Notation Examples
- All these expressions are O(n): n, 3n, 61n + 5, 22n - 5, ...
- All these expressions are O(n^2): n^2, 9n^2, 18n^2 + 4n - 53, ...
- All these expressions are O(n log n): n(log n), 5n(log 99n), 18 + (4n - 2)(log (5n + 3)), ...

73. Arithmetic and O Notation
- O(k * f) = O(f) if k is a constant
  - e.g. O(23 * log n) simplifies to O(log n)
- O(f + g) = max[O(f), O(g)]
  - e.g. O(n + n^2) simplifies to O(n^2)
- O(f * g) = O(f) * O(g)
  - e.g. O(m * n) equals O(m) * O(n), unless there is some known relationship between m and n that allows us to simplify it, e.g. m < n

74. Typical Growth Rate Functions
- O(1) - constant time: the time is independent of n, e.g. list look-up
- O(log n) - logarithmic time: usually the log is to the base 2, e.g. binary search
- O(n) - linear time, e.g. linear search
- O(n log n) - e.g. quicksort, mergesort (next week)
- O(n^2) - quadratic time, e.g. selection sort
- O(n^k) - polynomial time (where k is some constant)
- O(2^n) - exponential time, very slow!

75. Note on Constant Time
- We write O(1) to indicate something that takes a constant amount of time
  - e.g. finding the minimum element of an ordered array takes O(1) time, since the min is either the first or the last element of the array
- Important: constants can be huge, so in practice O(1) is not necessarily efficient
- It tells us that the algorithm will run at the same speed no matter the size of the input we give it

76. Worst, Average and Best Case
- The O notation growth rate of some algorithms varies depending on the input
- Typically we consider three cases:
  - Worst case: usually (relatively) easy to calculate, and therefore commonly used
  - Average case: often difficult to calculate
  - Best case: usually easy to calculate, but less important than the other cases

77. O Notation Running Times
- Linear search
  - Best case: O(1)
  - Average case: O(n)
  - Worst case: O(n)
- Binary search
  - Best case: O(1)
  - Average case: O(log n)
  - Worst case: O(log n)

78. O Notation Running Times
- Selection sort
  - Best case: O(n^2)
  - Average case: O(n^2)
  - Worst case: O(n^2)
- Insertion sort
  - Best case: O(n)
  - Average case: O(n^2)
  - Worst case: O(n^2)

79. Summary (January 2010, Greg Mori)

80. Summary
- Analyzing algorithm running time
  - Record actual running time (e.g. in seconds): sensitive to many system / environment conditions
  - Count instructions: summarize the coarse behaviour of the instruction count
  - O notation
- Note that all are parameterized by problem size ("n")
- Analyze best, worst, and "average" case

81. Summary
- Sorting algorithms
  - Insertion sort
  - Selection sort
- Running times of sorting algorithms

82. Readings
Carrano Ch. 9