Computer Science CPSC 322 Lecture 11: SLS wrap up, Intro To Planning

Presentation Transcript

Computer Science CPSC 322
Lecture 11: SLS wrap up, Intro To Planning

Lecture Overview
Recap of Lecture 10
SLS algorithms
Comparing SLS algorithms
SLS variants
Planning intro (time permitting)

Local Search
Solving CSPs is NP-hard. The search space for many CSPs is huge: exponential in the number of variables. Even arc consistency with domain splitting is often not enough.
Idea: consider the space of complete assignments of values to variables (all possible worlds). Neighbours of a current node are similar variable assignments. Move from one node to another according to a function that scores how good each assignment is.
This is a useful method in practice: it is the best available method for many constraint satisfaction and constraint optimization problems.

Given the set of variables {V1, …, Vn}, each with domain dom(Vi):
The start node is any assignment {V1/v1, …, Vn/vn}.
The neighbours of a node with assignment A = {V1/v1, …, Vn/vn} are the nodes whose assignments differ from A in exactly one value.
Example: from the assignment V1 = v1, V2 = v1, …, Vn = v1, the neighbours include V1 = v2, V2 = v1, …, Vn = v1; V1 = v4, V2 = v1, …, Vn = v1; V1 = v1, V2 = vn, …, Vn = v1; V1 = v1, V2 = v1, …, Vn = v2; and so on.
Only the current node is kept in memory at each step. Local search does NOT backtrack!
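
As a concrete illustration (not from the slides), here is a minimal Python sketch of this neighbour relation, assuming an assignment is represented as a dict mapping each variable to its value and `domains` maps each variable to the list of values in its domain:

```python
def neighbours(assignment, domains):
    """Generate all assignments that differ from `assignment` in exactly one variable's value."""
    for var, current_value in assignment.items():
        for value in domains[var]:
            if value != current_value:
                neighbour = dict(assignment)   # copy the current complete assignment
                neighbour[var] = value         # change exactly one variable
                yield neighbour
```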

Iterative Best Improvement
How do we determine which neighbour node to select? Iterative best improvement: select the neighbour that optimizes some evaluation function.
Which strategy would make sense? Select the neighbour with the minimal number of constraint violations.
Evaluation function: h(n) = number of constraint violations in state n.
Greedy descent: evaluate h(n) for each neighbour, pick the neighbour n with minimal h(n).
Hill climbing: the equivalent algorithm for maximization problems; here, maximize the number of satisfied constraints.
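
A sketch of one greedy-descent step under these definitions, reusing the `neighbours` generator sketched above. It assumes each constraint is represented as a (scope, check) pair, where `scope` is the set of variables the constraint involves and `check` tests the full assignment; this representation is an illustrative assumption, not part of the slides:

```python
def violations(assignment, constraints):
    """h(n): the number of constraints unsatisfied by this complete assignment."""
    return sum(1 for _scope, check in constraints if not check(assignment))

def greedy_descent_step(assignment, domains, constraints):
    """One iterative-best-improvement step: move to the neighbour with minimal h."""
    return min(neighbours(assignment, domains),
               key=lambda n: violations(n, constraints))
```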

Problems with Iterative Best Improvement
It gets misled by locally optimal points (local maxima/minima).
[Figure: an evaluation function over two variables X1 and X2, with several local optima.]
Most research in local search is about finding effective mechanisms for escaping from local minima/maxima.

Stochastic Local Search for CSPs
Start node: a random assignment. Goal: an assignment with zero unsatisfied constraints.
Heuristic function h: the number of unsatisfied constraints; lower values of the function are better.
Stochastic local search is a mix of:
Greedy descent: move to the neighbour with the lowest h.
Random walk: take some random steps, i.e., move to a neighbour with some randomness.
Random restart: reassign values to all variables.
It uses greedy steps to find local minima, and randomness to avoid getting trapped in them.
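
A compact sketch of this mix, building on the helpers above; the probabilities `walk_prob` and `restart_prob`, and the overall control flow, are illustrative choices rather than anything prescribed by the slides:

```python
import random

def sls_for_csp(variables, domains, constraints, max_steps,
                walk_prob=0.2, restart_prob=0.01):
    """Greedy descent mixed with random walk steps and occasional random restarts."""
    def random_assignment():
        return {v: random.choice(domains[v]) for v in variables}

    current = random_assignment()                  # start node: random assignment
    for _ in range(max_steps):
        if violations(current, constraints) == 0:  # goal: zero unsatisfied constraints
            return current
        r = random.random()
        if r < restart_prob:
            current = random_assignment()          # random restart: reassign all variables
        elif r < restart_prob + walk_prob:
            current = random.choice(list(neighbours(current, domains)))   # random walk step
        else:
            current = greedy_descent_step(current, domains, constraints)  # greedy step
    return None                                    # give up after max_steps
```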

Lecture Overview
Recap of Lecture 10
SLS algorithms
Comparing SLS algorithms
SLS variants
Planning intro (time permitting)

Which randomized method would work best in each of these two search spaces?
A. Greedy descent with random steps best on 1; greedy descent with random restart best on 2.
B. Greedy descent with random steps best on 2; greedy descent with random restart best on 1.
C. Equivalent.
[Figures 1 and 2: an evaluation function over a one-variable state space.]

Which randomized method would work best in each of these two search spaces?
Answer B: greedy descent with random steps is best on 2; greedy descent with random restart is best on 1.
But these examples are simplified extreme cases for illustration; in practice, you don't know what your search space looks like. Usually, integrating both kinds of randomization works best.
[Figures 1 and 2: an evaluation function over a one-variable state space.]

Greedy descent vs. Random sampling
Greedy descent is good for finding local minima but bad for exploring new parts of the search space.
Random restart (aka random sampling) is good for exploring new parts of the search space but bad for finding local minima.
A mix of the two, plus some additional random choices, can work very well.

General Local Search Algorithm
Sometimes select the best neighbour during the local search; sometimes select a neighbour at random (a random step).

General Local Search Algorithm
Sometimes do a random restart.

General Local Search Algorithm
Sometimes select the best neighbour during the local search; sometimes select a neighbour at random (a random step); sometimes do a random restart.

Stochastic Local Search for CSPs: details
Examples of ways to add randomness to local search for a CSP.
One-stage selection of variable and value: sometimes choose the best neighbour; sometimes choose a random variable-value pair.
Two-stage selection (first select a variable V, then a new value for V):
Selecting the variable: sometimes choose the variable that participates in the largest number of conflicts; sometimes choose a random variable that participates in some conflict; sometimes choose a random variable.
Selecting the value: sometimes choose the best value for the chosen variable; sometimes choose a random value for the chosen variable.

Different ways of selecting neighbours
One-stage selection: consider all assignments that differ in exactly one variable. How many of those are there for N variables and domain size d?
A. O(Nd)
B. O(d^N)
C. O(N^d)
D. O(N+d)

Different ways of selecting neighbours
One-stage selection: consider all assignments that differ in exactly one variable. How many of those are there for N variables and domain size d? Answer: A, O(Nd).
Two-stage selection: first choose a variable (e.g., the one involved in the most conflicts), then the best value. How many checks?
A. O(Nd)
B. O(d^N)
C. O(N^d)
D. O(N+d)

Different ways of selecting neighbours
One-stage selection: consider all assignments that differ in exactly one variable. For N variables and domain size d there are O(Nd) of them.
Two-stage selection: first choose a variable (e.g., the one involved in the most conflicts), then the best value. This has lower computational complexity, O(N+d) checks, but makes less progress per step. A small worked comparison follows below.
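
A small worked comparison of the number of checks per step under the two selection schemes (N = 50 and d = 10 are arbitrary example numbers, not from the slides):

```python
# Per-step cost of the two neighbour-selection schemes, for illustrative sizes.
N, d = 50, 10                    # hypothetical: 50 variables, 10 values per domain

one_stage_checks = N * (d - 1)   # score every single-variable-change neighbour: O(Nd)
two_stage_checks = N + d         # scan variables once, then the chosen variable's values: O(N+d)

print(one_stage_checks)          # 450
print(two_stage_checks)          # 60
```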

Random Walk (one-stage)
One way to add randomness to local search for a CSP: sometimes choose the variable-value pair according to the scoring function, sometimes choose a random one.
Neighbours are generated as assignments that differ in one variable's value. How many neighbours are there, given n variables with domains of d values?
E.g., in the 8-queens problem, given the assignment shown [figure: 8-queens board with columns V1–V8], how many neighbours are there, and which one would you choose?

Random Walk (one-stage)
One way to add randomness to local search for a CSP: sometimes choose the variable-value pair according to the scoring function, sometimes choose a random one.
Neighbours are generated as assignments that differ in one variable's value: given n variables with domains of d values, there are n(d-1) neighbours.
E.g., in the 8-queens problem, given the assignment shown [figure: 8-queens board with columns V1–V8], there are 8 × 7 = 56 neighbours.
Choosing according to the scoring function: e.g., V2 = 1, or V5 = 2, or V5 = 8, etc.
Choosing at random: any of the 56 neighbours.

Random Walk (two-stage selection)
Another strategy: select a variable first, then a value.
Selecting the variable: alternate among selecting the one that participates in the largest number of conflicts, a random variable that participates in some conflict, and a random variable.
Selecting the value: alternate between selecting the value that minimizes the number of conflicts and a random value.
[Figure: 8-queens board annotated with the number of conflicts each variable participates in.]
AIspace 2a: greedy descent with the min-conflict heuristic, a technique that tends to work well in practice.

Random Walk (two-stage selection)
Another strategy: select a variable first, then a value.
Selecting the variable: alternate among selecting the one that participates in the largest number of conflicts (here V5), a random variable that participates in some conflict (here one of V4, V5, V8), and a random variable.
Selecting the value: alternate between selecting the value that minimizes the number of conflicts (here V5 = 1) and a random value.
[Figure: 8-queens board annotated with the number of conflicts each variable participates in.]
AIspace 2a: greedy descent with the min-conflict heuristic, a technique that tends to work well in practice.
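
A sketch of the min-conflict flavour of this two-stage step (most-conflicted variable first, then the best value for it), using the same hypothetical (scope, check) constraint representation as the earlier sketches:

```python
def conflicts(var, assignment, constraints):
    """Number of unsatisfied constraints whose scope involves `var`."""
    return sum(1 for scope, check in constraints
               if var in scope and not check(assignment))

def two_stage_step(assignment, domains, constraints):
    """Pick the most-conflicted variable, then the value for it with the fewest violations."""
    conflicted = [v for v in assignment if conflicts(v, assignment, constraints) > 0]
    if not conflicted:
        return assignment                       # no conflicts left: this is a solution
    var = max(conflicted, key=lambda v: conflicts(v, assignment, constraints))

    def with_value(value):
        candidate = dict(assignment)
        candidate[var] = value
        return candidate

    best_value = min(domains[var],
                     key=lambda value: violations(with_value(value), constraints))
    return with_value(best_value)
```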

What you can look at in AIspace
You can play with these alternatives in the AIspace Hill applet (there is a description for each algorithm): random sampling, random walk, greedy descent, greedy descent min-conflict, greedy descent with random walk, greedy descent with random restart, greedy descent with all options, and so on.
One-stage selection of variable and value: sometimes choose the best neighbour; sometimes choose a random variable-value pair.
Two-stage selection (first select a variable V, then a new value for V):
Selecting the variable: sometimes choose the variable that participates in the largest number of conflicts; sometimes choose a random variable that participates in some conflict; sometimes choose a random variable.
Selecting the value: sometimes choose the best value for the chosen variable; sometimes choose a random value for the chosen variable.

What you can look at in AIspace
Greedy descent with all options: this one allows you to define all the others.
One-stage selection of variable and value: sometimes choose the best neighbour; sometimes choose a random variable-value pair.
Two-stage selection (first select a variable V, then a new value for V):
Selecting the variable: sometimes choose the variable that participates in the largest number of conflicts; sometimes choose a random variable that participates in some conflict; sometimes choose a random variable.
Selecting the value: sometimes choose the best value for the chosen variable; sometimes choose a random value for the chosen variable.

Lecture Overview
Recap of Lecture 10
SLS algorithms
Comparing SLS algorithms
SLS variants
Planning intro (time permitting)

Evaluating SLS algorithms
SLS algorithms are randomized: the time taken until they solve a problem is a random variable. It is entirely normal to have runtime variations of two orders of magnitude in repeated runs, e.g., 0.1 seconds in one run and 10 seconds in the next, on the same problem instance.
Sometimes an SLS algorithm doesn't terminate at all: this is called stagnation. If an SLS algorithm sometimes stagnates, what is its mean runtime (across many runs)? Infinity! In practice, one often counts timeouts as some fixed large value X.
Still, summary statistics such as mean or median runtime don't tell the whole story; e.g., they would penalize an algorithm that often finds a solution quickly but sometimes stagnates.

Comparing SLS algorithms
A better way to evaluate empirical performance: runtime distributions.
Perform many runs (e.g., below left: 1000 runs).

Comparing SLS algorithms
A better way to evaluate empirical performance: runtime distributions.
Perform many runs (e.g., below left: 1000 runs). Consider the empirical distribution of the runtimes: sort the empirical runtimes, compute the fraction (probability) of runs solved within each runtime, and rotate the graph 90 degrees. E.g., below: the longest run took 12 seconds.
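
A minimal sketch of building such an empirical runtime distribution from a list of per-run runtimes (the sample runtimes and run counts below are made up for illustration):

```python
def runtime_distribution(runtimes, total_runs=None):
    """Return (sorted runtimes of solved runs, fraction of all runs solved within each runtime)."""
    solved = sorted(runtimes)
    n = total_runs if total_runs is not None else len(solved)
    fractions = [(i + 1) / n for i in range(len(solved))]   # P(solved within x steps/time)
    return solved, fractions

# Hypothetical runtimes (seconds) of the runs that succeeded, out of 10 runs total;
# the remaining runs stagnated, so the curve plateaus below 1:
times, fraction_solved = runtime_distribution([0.1, 0.4, 0.4, 1.2, 3.0, 12.0], total_runs=10)
print(list(zip(times, fraction_solved)))
```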

Comparing runtime distributions
x axis: runtime (or number of steps).
y axis: proportion (or number) of runs solved within that runtime, i.e., the fraction of solved runs, P(solved within x steps / amount of time).

Comparing runtime distributions
x axis: runtime (or number of steps); y axis: proportion (or number) of runs solved within that runtime (fraction of solved runs, i.e., P(solved within x steps / amount of time)).
Which algorithm is most likely to solve the problem within 7 steps?
A. blue
B. red
C. green

Comparing runtime distributions
x axis: runtime (or number of steps); y axis: proportion (or number) of runs solved within that runtime.
Which algorithm is most likely to solve the problem within 7 steps? Answer: B, red.

Comparing runtime distributions
Which algorithm has the best median performance? I.e., which algorithm takes the fewest steps to be successful in 50% of the runs?
A. blue
B. red
C. green

Comparing runtime distributions
Which algorithm has the best median performance? I.e., which algorithm takes the fewest steps to be successful in 50% of the runs? Answer: A, blue.

Comparing runtime distributions
x axis: runtime (or number of steps); y axis: proportion (or number) of runs solved within that runtime (fraction of solved runs, i.e., P(solved within x steps / amount of time)). A log scale is typically used for the x axis.
In this example, red solves 28% of runs within 10 steps and then stagnates; blue solves 57% of runs within 80 steps and then stagnates; green is slow but does not stagnate.
Crossover points: if we run for fewer than 10 steps, red is the best algorithm; if we run longer than 80 steps, green is the best algorithm.

Runtime distributions in AIspace
Look at some algorithms and their runtime distributions in AIspace: greedy descent, random sampling, random walk, greedy descent with random walk.
Use "Simple scheduling problem 2" in AIspace: select Batch Run in the top menu to see the runtime distribution of each algorithm on this problem. You can change the number of runs per batch, as well as the length of each run.

Runtime distributions in AIspace
Let's look at runtime distributions for greedy descent, random sampling, random walk, and greedy descent with random walk on "Simple scheduling problem 2" in AIspace.
[Figure: a sample set of runtime distributions for greedy descent, greedy descent with random restart, random walk, and random restart.]
Select Batch Run in the top menu to see the runtime distribution of each algorithm on this problem. You can change the number of runs per batch, as well as the length of each run.

Lecture Overview
Recap of Lecture 10
SLS algorithms
Comparing SLS algorithms
SLS variants
Planning intro (time permitting)

Simulated Annealing
Annealing: a metallurgical process in which metals are hardened by being cooled slowly.
Analogy: start with a high "temperature", i.e., a high tendency to take random steps. Over time, cool down: become more likely to follow the scoring function. The temperature is reduced over time, according to an annealing schedule.
Key idea: change the degree of randomness over time.

Simulated Annealing: algorithm
Here's how it works: you are at a node n. Pick a variable at random and a new value at random; this generates a neighbour n'. If n' is an improvement, i.e. (A or B?), adopt it.
A. h(n) – h(n') > 0
B. h(n) – h(n') < 0

Simulated Annealing: algorithm
Here's how it works: you are at a node n. Pick a variable at random and a new value at random; this generates a neighbour n'. If n' is an improvement, i.e., h(n) – h(n') > 0, adopt it.
If it isn't an improvement, adopt it probabilistically, depending on the difference h(n) – h(n') and a temperature parameter T: move to n' with probability e^((h(n) – h(n'))/T).

Simulated Annealing: algorithm
If there is no improvement, move to n' with probability e^((h(n) – h(n'))/T).
The higher T, the higher the probability of selecting a non-improving node, for the same h(n) – h(n').
The larger |h(n) – h(n')| (i.e., the more negative h(n) – h(n') is), the lower the probability of selecting a non-improving node, for the same T.
[Table: the acceptance probability for different values of T and of the difference h(n) – h(n') between n' and n.]
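
A compact sketch of this acceptance rule, reusing the `violations` helper and the representation assumed in the earlier sketches. The geometric cooling schedule (multiplying T by a constant each step) is just one common choice of annealing schedule, not one prescribed by the slides:

```python
import math
import random

def simulated_annealing(variables, domains, constraints,
                        T=10.0, cooling=0.99, max_steps=100000):
    """Random single-variable moves; non-improving moves accepted with probability e^((h(n)-h(n'))/T)."""
    current = {v: random.choice(domains[v]) for v in variables}
    for _ in range(max_steps):
        h_current = violations(current, constraints)
        if h_current == 0:
            return current                               # all constraints satisfied
        var = random.choice(list(variables))             # pick a variable at random...
        candidate = dict(current)
        candidate[var] = random.choice(domains[var])     # ...and a new value at random: this is n'
        delta = h_current - violations(candidate, constraints)   # h(n) - h(n')
        if delta > 0 or random.random() < math.exp(delta / T):
            current = candidate                          # accept improvements, and worse moves probabilistically
        T *= cooling                                     # annealing schedule: lower the temperature over time
    return None
```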

Properties of simulated annealing search
If T decreases slowly enough, then simulated annealing search will find a global optimum with probability approaching 1. But "slowly enough" usually means longer than the time required by an exhaustive search of the assignment space.
Finding a good annealing schedule is "an art" and very much problem dependent.
Simulated annealing is widely used in VLSI layout, airline scheduling, etc.

Tabu Search
Mark partial assignments as tabu (taboo): this prevents repeatedly visiting the same (or similar) local minima.
Maintain a queue of k variable/value pairs that are taboo. E.g., when changing a variable V's value from 2 to 4, we cannot change it back to 2 for the next k steps.
k is a parameter that needs to be optimized empirically.
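
A sketch of this tabu bookkeeping, a bounded queue of forbidden variable/value pairs, under the same representation assumptions as the earlier sketches:

```python
from collections import deque

def tabu_step(current, domains, constraints, tabu, k):
    """One greedy step that skips moves whose (variable, value) pair is currently taboo."""
    best = None
    for var, old_value in current.items():
        for value in domains[var]:
            if value == old_value or (var, value) in tabu:
                continue                                  # skip taboo (or no-op) moves
            candidate = dict(current)
            candidate[var] = value
            score = violations(candidate, constraints)
            if best is None or score < best[0]:
                best = (score, var, old_value, candidate)
    if best is None:
        return current                                    # every move is currently taboo
    _, var, old_value, candidate = best
    tabu.append((var, old_value))                         # forbid changing var back to its old value...
    if len(tabu) > k:
        tabu.popleft()                                    # ...for roughly the next k steps
    return candidate

# Usage sketch: tabu = deque(); then repeatedly call tabu_step(current, domains, constraints, tabu, k=10).
```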

Population Based SLS
Often we have more memory available than what is required for maintaining just the current node (e.g., the best node so far plus a tabu list).
Key idea: maintain a population of k individuals. At every stage, update the population; whenever one individual is a solution, report it.
Examples: stochastic beam search, genetic algorithms. These are not required for this course, but you can see how they work in Ch. 4.9 if interested.

SLS for Constraint Optimization Problems
Constraint satisfaction problems: hard constraints, all of which need to be satisfied; all models are equally good.
Constraint optimization problems: hard constraints that need to be satisfied, plus soft constraints that should be satisfied as well as possible. Constraints can be weighted: minimize h(n) = the sum of the weights of the constraints unsatisfied in n. Hard constraints get a very large weight, and some soft constraints can be more important than other soft constraints, i.e., get a larger weight.
All the local search methods discussed work just as well for constraint optimization: all they need is an evaluation function h.
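
A sketch of such a weighted evaluation function; here each constraint is assumed to be a (weight, check) pair, which replaces the plain violation count used in the earlier sketches:

```python
HARD = 10**6   # a very large weight standing in for a hard constraint

def weighted_h(assignment, weighted_constraints):
    """h(n): the sum of the weights of the constraints unsatisfied in n."""
    return sum(weight for weight, check in weighted_constraints
               if not check(assignment))

# Hypothetical exam-scheduling-style constraint list (the check functions are placeholders):
# weighted_constraints = [(HARD, rooms_not_double_booked),
#                         (10, no_back_to_back_exams),
#                         (2, exams_spread_out)]
```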

Example of a constraint optimization problem: exam scheduling
Hard constraints: cannot have an exam in too small a room; cannot have multiple exams in the same room in the same time slot; ...
Soft constraints: students should not have to write two exams back to back (important); students should not have multiple exams on the same day; it would be nice if students had their exams spread out; ...

SLS limitations
Typically there is no guarantee of finding a solution even if one exists: SLS algorithms can sometimes stagnate, i.e., get caught in one region of the search space and never terminate. This behaviour is very hard to analyze theoretically.
SLS is not able to show that no solution exists: it simply won't terminate, and you don't know whether the problem is infeasible or the algorithm has stagnated.

SLS pros: anytime algorithms
When do you stop? When you know the solution found is optimal (e.g., no constraint violations), or when you're out of time and have to act NOW.
Anytime algorithm: maintain the node with the best h found so far (the "incumbent"); given more time, it can improve its incumbent.
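
A minimal sketch of this incumbent bookkeeping, wrapped around any of the step functions sketched earlier (the `step` argument is a placeholder for, e.g., `greedy_descent_step`):

```python
def anytime_search(start, domains, constraints, step, max_steps):
    """Run `step` repeatedly, always remembering the best node (the "incumbent") seen so far."""
    incumbent, best_h = start, violations(start, constraints)
    current = start
    for _ in range(max_steps):                # or: loop until we are out of time and must act
        current = step(current, domains, constraints)
        h = violations(current, constraints)
        if h < best_h:
            incumbent, best_h = current, h    # improve the incumbent
        if best_h == 0:
            break                             # optimal for a CSP: no constraint violations
    return incumbent, best_h
```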

SLS pros: dynamically changing problems
The problem may change over time; this is particularly important in scheduling. E.g., consider the schedule for an airline, with thousands of flights and thousands of personnel assignments: a storm can render the schedule infeasible.
Goal: repair the schedule with the minimum number of changes. This is often easy for SLS starting from the current schedule, while other techniques usually require more time and might find a solution requiring many more changes.

Successful application of SLS
Scheduling of the Hubble Space Telescope: reduced the time to schedule 3 weeks of observations from one week to around 10 seconds.

Example: SLS for RNA secondary structure design
An RNA strand is made up of four bases: cytosine (C), guanine (G), adenine (A), and uracil (U). Which 2D/3D structure an RNA strand folds into is important for its function.
Predicting the structure for a given strand is "easy": O(n^3). But what if we want a strand that folds into a certain structure? That is hard.
Idea: local search over strands. Search for one that folds into the right structure. Evaluation function for a strand: run the O(n^3) prediction algorithm and evaluate how different the result is from our target structure.
[Figure: an example RNA strand and its secondary structure; prediction (strand to structure) is easy, design (structure to strand) is hard.]
One of the best algorithms to date is the local search algorithm RNA-SSD, developed at UBC [Andronescu, Fejes, Hutter, Condon, and Hoos, Journal of Molecular Biology, 2004].

Learning Goals for Local Search
Implement local search for a CSP: implement different ways to generate neighbours, and implement scoring functions to solve a CSP by local search through either greedy descent or hill climbing.
Implement SLS with random steps (1-step and 2-step versions) and random restart.
Compare SLS algorithms with runtime distributions.
Explain the functioning of simulated annealing.