Slide 1: Machine Learning with Discriminative Methods
Lecture 02 – PAC Learning and tail bounds intro
CS 790-134, Spring 2015
Alex Berg
Slide 2: Today’s lecture
- PAC Learning
- Tail bounds
- …
Slide 3: Rectangle learning
[Figure: positive (+) and negative (−) examples in the plane, with an axis-aligned rectangle hypothesis H around the positives]
A hypothesis is any axis-aligned rectangle; points inside the rectangle are labeled positive.
Slide 4: Rectangle learning – Realizable case
[Figure: the same positive/negative examples, with the hypothesis rectangle H]
The actual boundary is also an axis-aligned rectangle, “The Realizable Case” (no approximation error).
Slide 5: Rectangle learning – Realizable case
[Figure: as before; the actual boundary is an axis-aligned rectangle (“The Realizable Case”, no approximation error); one point is misclassified by the hypothesis H]
A mistake for the hypothesis H! Measure ERROR by the probability of making a mistake.
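In symbols (standard notation; the slide states this in words): with target concept c, hypothesis h, and examples drawn from a distribution D,

\mathrm{error}(h) \;=\; \Pr_{x \sim D}\big[\, h(x) \neq c(x) \,\big].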
Slide 6: Rectangle learning – a strategy for a learning algorithm
[Figure: hypothesis H, the output of the learning algorithm so far]
Make the smallest rectangle consistent with all the data so far.
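A minimal sketch of this strategy, assuming points in the plane with labels ±1 (the slides contain no code; the function names here are illustrative):

def tightest_rectangle(points, labels):
    """Smallest axis-aligned rectangle containing every positive example,
    or None if no positive example has been seen yet."""
    positives = [p for p, label in zip(points, labels) if label == +1]
    if not positives:
        return None
    xs = [x for x, _ in positives]
    ys = [y for _, y in positives]
    return (min(xs), max(xs), min(ys), max(ys))

def predict(rect, point):
    """Label a point +1 iff it lies inside the hypothesis rectangle."""
    if rect is None:
        return -1  # no positives seen yet: predict negative everywhere
    xmin, xmax, ymin, ymax = rect
    x, y = point
    return +1 if (xmin <= x <= xmax and ymin <= y <= ymax) else -1

Because every positive training example lies inside the true rectangle, this hypothesis never extends beyond it: its mistakes are only false negatives, which is what the analysis on the following slides exploits.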
Slide 7: Rectangle learning – making a mistake
[Figure: the current hypothesis H, the smallest rectangle consistent with the data so far; a new positive point falls outside it]
The current hypothesis makes a mistake on a new data item…
Slide 8: Rectangle learning – making a mistake
[Figure: as before; the new point lies in the region between the true rectangle and the hypothesis]
The current hypothesis (the smallest rectangle consistent with all the data so far) makes a mistake on a new data item… Use the probability of such a mistake (this is our error measure) to find a bound on how likely it was that we had not yet seen a training example in this region…
Slide 9: A very subtle formulation… from the Kearns and Vazirani reading
R′ = result of the algorithm so far (after m samples)
R = actual decision boundary
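A sketch of the analysis (the details are in the Kearns and Vazirani reading; the constants below follow the standard presentation there): since R′ is the tightest fit to positive examples, all of which lie in R, we have R′ ⊆ R, and the error of R′ is the probability mass of the region R \setminus R′. Cover that region with four strips along the sides of R, each of mass \epsilon/4. If the error exceeds \epsilon, then at least one strip contains no training example, so by the union bound

\Pr\big[\mathrm{error}(R') > \epsilon\big] \;\le\; 4\,(1 - \epsilon/4)^m \;\le\; 4\,e^{-\epsilon m/4},

which is at most \delta as soon as m \ge (4/\epsilon)\ln(4/\delta).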
Slide 10: From the Kearns and Vazirani reading
Slide 11: PAC Learning
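The formal statement on this slide is an image; for reference, a standard form of the definition (after Valiant): a concept class C is PAC-learnable if there is an algorithm that, for every target concept c \in C, every distribution D over inputs, and every \epsilon, \delta \in (0,1), given a number of i.i.d. labeled examples polynomial in 1/\epsilon and 1/\delta, outputs a hypothesis h with

\Pr\big[\, \mathrm{error}(h) \le \epsilon \,\big] \;\ge\; 1 - \delta.

“Probably” (with probability at least 1 − \delta) “Approximately Correct” (error at most \epsilon).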
Slide 12: Flashback: learning/fitting is a process (from Raginsky’s notes)
Estimating the probability that a tossed coin comes up heads:
- the i’th coin toss
- the estimator based on n tosses
- the event that the estimate is within epsilon
- the event that the estimate is not within epsilon
The probability of being bad is inversely proportional to the number of samples (the underlying computation is an example of a tail bound).
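Reconstructing the notation (the slide’s equations are images; this follows Raginsky’s notes): let X_1, \dots, X_n be i.i.d. Bernoulli(p) coin tosses, and let

\hat{p}_n \;=\; \frac{1}{n}\sum_{i=1}^{n} X_i

be the estimator based on n tosses. Since \mathrm{Var}(\hat{p}_n) = p(1-p)/n, Chebyshev’s inequality (next slides) gives

\Pr\big[\,|\hat{p}_n - p| > \epsilon\,\big] \;\le\; \frac{p(1-p)}{n\epsilon^2} \;\le\; \frac{1}{4n\epsilon^2},

the sense in which the probability of being bad is inversely proportional to the number of samples.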
Slide 13: Markov’s Inequality (from Raginsky’s notes)
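The inequality itself (the slide body is an image): for a nonnegative random variable X and any t > 0,

\Pr[X \ge t] \;\le\; \frac{\mathbb{E}[X]}{t}.

One-line proof: \mathbb{E}[X] \;\ge\; \mathbb{E}\big[X\,\mathbf{1}\{X \ge t\}\big] \;\ge\; t\,\Pr[X \ge t].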
Slide 14: Chebyshev’s Inequality (from Raginsky’s notes)
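Chebyshev’s inequality follows by applying Markov’s inequality to the nonnegative variable (X - \mathbb{E}[X])^2: for any t > 0,

\Pr\big[\,|X - \mathbb{E}[X]| \ge t\,\big] \;=\; \Pr\big[(X - \mathbb{E}[X])^2 \ge t^2\big] \;\le\; \frac{\mathrm{Var}(X)}{t^2}.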
Slide 15: Not quite good enough… (from Raginsky’s notes)
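A plausible reading of this slide, given the next-class assignment: Chebyshev’s bound decays only like 1/n, while Chernoff-style bounds decay exponentially in n. For the coin example, Hoeffding’s inequality gives

\Pr\big[\,|\hat{p}_n - p| \ge \epsilon\,\big] \;\le\; 2\,e^{-2n\epsilon^2}.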
Slide 16: For next class
Read the Wikipedia page for the Chernoff bound: http://en.wikipedia.org/wiki/Chernoff_bound
Read at least the first five pages of Raginsky’s introductory notes on tail bounds: http://maxim.ece.illinois.edu/teaching/fall14/notes/concentration.pdf
Come to class with questions! It is fine to have questions, but first spend some time working through the reading and problems. Feel free to post questions to the Sakai discussion board!