Slide 1
Foundations of Constraint Processing
CSCE421/821, Spring 2019
www.cse.unl.edu/~choueiry/S19-421-821/
All questions to Piazza
Berthe Y. Choueiry (Shu-we-ri)
Avery Hall, Room 360
Evaluation of (Deterministic) BT Search Algorithms
Slide 2
Outline
Evaluation of (deterministic) BT search algorithms [Dechter, 6.6.2]
  CSP parameters
  Comparison criteria
  Theoretical evaluations
  Empirical evaluations
Slide 3
CSP parameters
Binary: n, a, p1, t; Non-binary: n, a, p1, k, t
  Number of variables: n
  Domain size: a, d
  Degree of a variable: deg
  Arity of the constraints: k
  Constraint tightness: t = |forbidden tuples| / |all tuples|
  Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / emax, where e is the number of constraints
Slide 4
Comparison criteria
Number of nodes visited (#NV)
  Every time you call label
Number of backtracks (#BT)
  Every un-assignment of a variable in unlabel
Number of constraint checks (#CC)
  Every time you call check(i,j)
CPU time
  Be as honest and consistent as possible
Optional: some specific criterion for assessing the quality of the improvement proposed
Presentation of values:
  Descriptive statistics of criterion: average (also median, mode, max, min)
  (Qualified) run-time distribution
  Solution-quality distribution
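As a minimal sketch of how these counters might be instrumented, the following chronological backtracking search finds all solutions under a static variable ordering and increments #NV in label, #BT in unlabel, and #CC in check. The data layout (domains as lists, constraints as a dict of allowed value pairs) and the choice of one label call per attempted value are assumptions made for illustration, not the course's reference implementation.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    nv: int = 0  # nodes visited: one per call to label
    bt: int = 0  # backtracks: one per un-assignment in unlabel
    cc: int = 0  # constraint checks: one per call to check


def solve_all(domains, allowed):
    """Chronological BT over variables 0..n-1 in static order.

    domains: list of value lists, one per variable.
    allowed: dict mapping (j, i) with j < i to the set of permitted
             (value_j, value_i) pairs; absent pairs are unconstrained.
    Returns (solutions, counters).
    """
    n = len(domains)
    assignment = [None] * n
    stats = Counters()
    solutions = []

    def check(i, j):
        # Check the constraint between current variable i and past variable j.
        stats.cc += 1
        return (assignment[j], assignment[i]) in allowed[(j, i)]

    def label(i, value):
        # Assign value to variable i and test it against past variables,
        # stopping at the first violated constraint.
        stats.nv += 1
        assignment[i] = value
        return all(check(i, j) for j in range(i) if (j, i) in allowed)

    def unlabel(i):
        # Domain of variable i is exhausted: un-assign it and go back.
        stats.bt += 1
        assignment[i] = None

    def search(i):
        if i == n:
            solutions.append(tuple(assignment))
            return
        for value in domains[i]:
            if label(i, value):
                search(i + 1)
        unlabel(i)

    search(0)
    return solutions, stats
```

Calling `solve_all([[0, 1]] * 3, {(0, 1): neq, (1, 2): neq})` with `neq = {(0, 1), (1, 0)}` returns the two solutions together with the three counters, which can then be reported alongside CPU time.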
Slide 5
Theoretical evaluations
Comparing NV and/or CC
Common assumptions:
  for finding all solutions
  static/same orderings
Slide 6
Empirical evaluation: data sets
Use real-world data (anecdotal evidence)
Use benchmarks
  csplib.org
  Solver competition benchmarks
Use randomly generated problems
  Various models of random generators
  Guaranteed with a solution
  Uniform or structured
Slide 7
Empirical evaluations: random problems
Various models exist (use Model B)
  Models A, B, C, E, F, etc.
Vary parameters: <n, a, t, p1>
  Number of variables: n
  Domain size: a, d
  Constraint tightness: t = |forbidden tuples| / |all tuples|
  Proportion of constraints (a.k.a. constraint density, constraint probability): p1 = e / emax
Issues:
  Uniformity
  Difficulty (phase transition)
  Solvability of instances (for incomplete search techniques)
Slide 8
Model B
1. Input: n, a, t, p1
2. Generate n nodes
3. Generate a list of n(n-1)/2 tuples of all combinations of 2 nodes
4. Choose e elements from the above list as constraints between the n nodes
5. If the graph is not connected, throw it away and go back to step 4; else proceed
6. Generate a list of a^2 tuples of all combinations of 2 values
7. For each constraint, randomly choose a number of tuples from the list to guarantee tightness t for the constraint
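The steps above can be sketched in Python as follows. This is an illustrative sketch, not the course's generator: the function names, the rounding of e = p1 * n(n-1)/2 and of the number of forbidden tuples, the retry limit, and the representation (a dict from variable pair to its set of forbidden value pairs) are all assumptions.

```python
import itertools
import random


def is_connected(n, edges):
    """Return True if the graph on nodes 0..n-1 (n >= 1) is connected."""
    adjacency = {v: set() for v in range(n)}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n


def model_b(n, a, t, p1, rng=None, max_tries=1000):
    """Generate one binary CSP: {(i, j): set of forbidden (vi, vj) pairs}."""
    rng = rng or random.Random()
    pairs = list(itertools.combinations(range(n), 2))      # n(n-1)/2 node pairs
    e = round(p1 * len(pairs))                              # from p1 = e / emax
    tuples = list(itertools.product(range(a), repeat=2))    # a^2 value pairs
    n_forbidden = round(t * len(tuples))                    # from t = |forbidden| / |all|
    for _ in range(max_tries):
        scopes = rng.sample(pairs, e)                       # step 4
        if is_connected(n, scopes):                         # step 5
            break
    else:
        raise ValueError("no connected constraint graph found; p1 may be too low")
    # Step 7: the same number of forbidden tuples for every constraint.
    return {scope: set(rng.sample(tuples, n_forbidden)) for scope in scopes}
```

Passing a seeded `random.Random` as `rng` makes an instance reproducible, which helps when instances have to be stored and reused across algorithms.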
Slide 9
Phase transition [Cheeseman et al. '91]
[Figure: cost of solving plotted against the order parameter, with regions labeled "mostly solvable problems" and "mostly un-solvable problems" on either side of the critical value of the order parameter]
Significant increase of cost around critical value
In CSPs, order parameter is constraint tightness & ratio
Algorithms compared around phase transition
Slide 10
Tests
Fix n, a, p1 and vary t in {0.1, 0.2, …, 0.9}
Fix n, a, t and vary p1 in {0.1, 0.2, …, 0.9}
For each data point (for each value of t/p1)
  Generate (at least) 50 instances
  Store all instances
Make measurements
  #NV, #CC, CPU time, #messages, etc.
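One way such a test plan might be organized is sketched below for the case where t varies. The callables `generate` and `measure` are hypothetical placeholders for an instance generator (with n, a, p1 fixed) and a solver wrapper. Recording a per-instance seed is an assumed way of "storing" instances so they can be regenerated, and the column names are illustrative.

```python
import csv

FIELDS = ["t", "instance", "seed", "NV", "CC", "cpu_seconds"]


def run_sweep(generate, measure, t_values, instances_per_point=50, base_seed=0):
    """Return one row of measurements per (t, instance) pair.

    generate(t, seed) -> instance            (hypothetical generator)
    measure(instance) -> dict with the keys "NV", "CC", "cpu_seconds"
                                             (hypothetical solver wrapper)
    """
    rows = []
    for point, t in enumerate(t_values):
        for k in range(instances_per_point):
            # A distinct seed per instance lets the instance be regenerated.
            seed = base_seed + point * instances_per_point + k
            instance = generate(t, seed)
            rows.append({"t": t, "instance": k, "seed": seed, **measure(instance)})
    return rows


def save_rows(rows, path):
    """Write the measurements to a CSV file readable by Excel or R."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


# The nine data points t = 0.1, 0.2, ..., 0.9.
T_VALUES = [k / 10 for k in range(1, 10)]
```

Running every algorithm on the same stored instances keeps the later comparison paired, instance by instance.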
Slide 11
Comparing two algorithms A1 and A2
Store all measurements in Excel
Use Excel, R, SAS, etc. for statistical measurements
Use the t-test, paired test
  Comparing measurements: A1, A2 are significantly different
  Comparing ln measurements: A1 is significantly better than A2
For Excel: Microsoft button, Excel Options, Add-Ins, Analysis ToolPak, Go, check the box for Analysis ToolPak, Go. Install…

           #CC           ln(#CC)
           A1     A2     A1     A2
  i1       100    200    …      …
  i2       …
  i3       …
  …
  i50
Slide 12
t-test in Excel
Using ln values
  p = ttest(array1, array2, tails, type)
    tails = 1 or 2
    type = 1 (paired)
  t = tinv(p, df)
    degrees of freedom = #instances - 1 for the paired test
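For readers working outside Excel, the paired t statistic on ln values can be computed directly. The sketch below uses only the Python standard library. The function name is hypothetical, and it assumes that all measurements are strictly positive and that the paired differences are not all identical.

```python
import math
import statistics


def paired_t_on_logs(a1, a2):
    """Paired t statistic computed on ln-transformed measurements.

    a1[k] and a2[k] must be measurements (e.g. #CC) of the two algorithms
    on the same instance k. A positive result means A1's values are larger.
    """
    diffs = [math.log(x) - math.log(y) for x, y in zip(a1, a2)]
    n = len(diffs)
    mean = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return mean / (sd / math.sqrt(n))
```

The returned value is the t that is compared against the critical value for the chosen number of tails.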
Slide 13
t-test with 95% confidence
One-tailed test
  Interested in direction of change
  When t > 1.645, A1 is larger than A2
  When t < -1.645, A2 is larger than A1
  When -1.645 ≤ t ≤ 1.645, A1 and A2 do not differ significantly
  |t| = 1.645 corresponds to p = 0.05 for a one-tailed test
Two-tailed test
  Although it tells direction, not as accurate as the one-tailed test
  When t > 1.96, A1 is larger than A2
  When t < -1.96, A2 is larger than A1
  When -1.96 ≤ t ≤ 1.96, A1 and A2 do not differ significantly
  |t| = 1.96 corresponds to p = 0.05 for a two-tailed test
p = 0.05 is a US Supreme Court ruling: any statistical analysis needs to be significant at the 0.05 level to be admitted in court
Slide 14
Computing the 95% confidence interval
The t-test can be used to test the equality of the means of two normal populations with unknown, but equal, variance.
We usually use the t-test
Assumptions
  Normal distribution of data
    Sampling distribution of the mean approaches a normal distribution (holds when #instances ≥ 30)
  Equality of variances
Sampling distribution: distribution calculated from all possible samples of a given size drawn from a given population
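A 95% confidence interval for the mean of the paired differences can be sketched as below. It uses the normal approximation, whose two-sided critical value is about 1.96, and therefore assumes at least 30 or so instances. For smaller samples the t distribution's critical value would be larger. The function name is hypothetical.

```python
import math
import statistics


def mean_confidence_interval(diffs, confidence=0.95):
    """Confidence interval for the mean of paired differences.

    diffs: per-instance differences, e.g. ln(#CC of A1) - ln(#CC of A2).
    Uses the normal approximation, so it is meant for roughly 30+ values.
    """
    n = len(diffs)
    mean = statistics.fmean(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error
```

If the interval excludes 0, the two algorithms differ significantly at the corresponding two-tailed level.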
Slide 15
Alternatives to the t test
To relax the normality assumption, a non-parametric alternative to the t test can be used, and the usual choices are:
  for independent samples, the Mann-Whitney U test
  for related samples, either the binomial test or the Wilcoxon signed-rank test
To test the equality of the means of more than two normal populations, an Analysis of Variance can be performed
To test the equality of the means of two normal populations with known variance, a Z-test can be performed
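As an illustration of the binomial option for related samples, the following sketch implements a two-sided sign test: under the null hypothesis, each non-tied instance is equally likely to favor either algorithm. The function name and the choice to drop ties are assumptions. The Wilcoxon signed-rank test, which also uses the magnitudes of the differences, is not shown.

```python
import math


def sign_test(a1, a2):
    """Two-sided sign (binomial) test p-value for paired measurements.

    Counts the instances where a1 is larger and where a2 is larger, drops
    ties, and computes the probability of a split at least this uneven
    when both outcomes have probability 1/2.
    """
    wins = sum(x > y for x, y in zip(a1, a2))
    losses = sum(x < y for x, y in zip(a1, a2))
    n = wins + losses
    if n == 0:
        return 1.0  # all pairs tied: no evidence of a difference
    k = min(wins, losses)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

A p-value below 0.05 would indicate a significant difference at the level used on the earlier slides.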
Slide 16
Alerts
For choosing the value of t in general, check http://www.socr.ucla.edu/Applets.dir/T-table.html
For a sound statistical analysis, consult the Help Desk of the Department of Statistics at UNL, held at least twice a week at Avery Hall.
Acknowledgments: Dr. Makram Geha, Department of Statistics @ UNL. All errors are mine.