Presentation Transcript

Slide 1

SVM QP & Midterm Review

Rob Hall 10/14/2010

Slide 2

This Recitation

Review of Lagrange multipliers (basic undergrad calculus)

Getting to the dual for a QP

Constrained norm minimization (for SVM)

Midterm review

Slide 3

Minimizing a quadratic

“Positive definite”
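Presumably the standard setup (the formula itself is not in the transcript, so this is an assumption):

$$f(x) = \tfrac{1}{2}\, x^\top A x - b^\top x, \qquad A \succ 0 \;\;\big(\text{“positive definite”: } z^\top A z > 0 \text{ for all } z \ne 0\big).$$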

Slide 4

Minimizing a quadratic

“Gradient”

“Hessian”

So just solve:
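Filling in the likely equations, given the captions above:

$$\nabla f(x) = A x - b \;\;(\text{the “gradient”}), \qquad \nabla^2 f(x) = A \;\;(\text{the “Hessian”}),$$

$$\nabla f(x) = 0 \;\Longrightarrow\; A x = b \;\Longrightarrow\; x^* = A^{-1} b.$$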

Slide 5

Constrained Minimization

“Objective function”

Constraint

Same quadratic, shown with contours of a linear constraint function

Slide 6

Constrained Minimization

New optimality condition

Theoretical justification for this case (linear constraint):

Remain feasible

Decrease f

Taylor’s theorem

Otherwise, one may choose a direction d so that:
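A sketch of the justification, assuming a linear constraint $g(x) = a^\top x - c = 0$: if $\nabla f(x)$ is not a multiple of $a$, project the negative gradient onto the constraint, $d = -\big(I - \tfrac{a a^\top}{\|a\|^2}\big)\nabla f(x) \ne 0$. Then $a^\top d = 0$ (the step remains feasible) and, by Taylor's theorem,

$$f(x + \varepsilon d) = f(x) + \varepsilon\, \nabla f(x)^\top d + O(\varepsilon^2) = f(x) - \varepsilon \|d\|^2 + O(\varepsilon^2) < f(x)$$

for small $\varepsilon > 0$, so $x$ is not optimal. Hence at a constrained minimizer $\nabla f(x) = \lambda\, \nabla g(x)$ for some $\lambda$: the new optimality condition.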

Slide 7

The Lagrangian

“The Lagrangian”

“Lagrange multiplier”

New optimality condition

feasibility

Stationary points satisfy:
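With one common sign convention (assumed here):

$$\mathcal{L}(x, \lambda) = f(x) + \lambda\, g(x)$$

$$\nabla_x \mathcal{L} = \nabla f(x) + \lambda\, \nabla g(x) = 0, \qquad \frac{\partial \mathcal{L}}{\partial \lambda} = g(x) = 0 \;\;\text{(feasibility)}.$$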

Slide 8

Dumb Example

Maximize area of rectangle, subject to perimeter = 2c

1. Write function

2. Write

Lagrangian

3. Take partial derivatives

4. Solve system (if possible)
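Worked out, with $x, y$ the side lengths (a reconstruction of the slide's four steps):

$$\text{1.}\;\; f(x, y) = xy \quad \text{s.t.} \quad 2x + 2y = 2c$$

$$\text{2.}\;\; \mathcal{L}(x, y, \lambda) = xy + \lambda(2x + 2y - 2c)$$

$$\text{3.}\;\; \frac{\partial \mathcal{L}}{\partial x} = y + 2\lambda = 0, \qquad \frac{\partial \mathcal{L}}{\partial y} = x + 2\lambda = 0, \qquad \frac{\partial \mathcal{L}}{\partial \lambda} = 2x + 2y - 2c = 0$$

$$\text{4.}\;\; x = y = -2\lambda = \tfrac{c}{2}: \text{ the optimal rectangle is a square, with area } \tfrac{c^2}{4}.$$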

Slide 9

Inequality Constraints

Linear equality constraint: solution must be on the line.

Linear inequality constraint: solution must be in the halfspace.

Lagrangian (as before)

Slide 10

Inequality Constraints

2 cases:

Constraint “inactive” (why?)

Constraint “active”/“tight” (why?)

Slide 11

Inequality Constraints

2 cases:

Constraint “inactive”

Constraint “active”/“tight”

“Complementary Slackness”
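In symbols, a standard statement of the two cases (assuming min $f(x)$ s.t. $g(x) \le 0$): either the constraint is inactive ($g(x^*) < 0$, so $\lambda = 0$ and the unconstrained minimizer already lies in the halfspace), or it is active/tight ($g(x^*) = 0$ with $\lambda \ge 0$, and the solution sits on the boundary). Either way,

$$\lambda \ge 0, \qquad g(x^*) \le 0, \qquad \lambda\, g(x^*) = 0 \quad \text{(complementary slackness)}.$$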

Slide 12

Duality

Lagrangian

Lagrangian dual function

Dual problem
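The definitions, reconstructed in standard notation, with $x(\lambda)$ the minimizer of the Lagrangian at a fixed $\lambda$:

$$d(\lambda) = \min_x \mathcal{L}(x, \lambda) = \mathcal{L}(x(\lambda), \lambda), \qquad \text{dual problem:} \;\; \max_{\lambda \ge 0} d(\lambda).$$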

Intuition:

λ   | x(λ)                    | f(x(λ))         | g(x(λ))   | d(λ)
0   | unconstrained minimizer | min f           | maybe > 0 | min f
1   | near the minimizer      | near min f      | maybe > 0 | non-decreasing
λ*  | constrained minimizer   | constrained min | must be 0 | > min f

Largest value of d will be the constrained minimum.

Slide 13

SVM

“Hard margin” SVM


Learn a classifier of the form:

Distance of point from decision boundary
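In symbols (the standard hard-margin formulation, assumed to match the slide): the classifier is $\hat{y}(x) = \mathrm{sign}(w^\top x + b)$, and a point's distance from the decision boundary $\{x : w^\top x + b = 0\}$ is $|w^\top x + b| / \|w\|$, so maximizing the margin is equivalent to

$$\min_{w,\, b} \; \tfrac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i\big(w^\top x_i + b\big) \ge 1 \;\; \text{for all } i.$$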

Note: only feasible if the data are linearly separable.

Slide 14

Norm Minimization

Vector of Lagrange multipliers

Constraint rearranged to g(w) ≤ 0

Scaled to simplify the math

Matrix with y_i on the diagonal and 0 elsewhere

Slide 15

SVM Dual


Take derivative:

Leads to:

And:
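Presumably, filling in the missing equations:

$$\nabla_w \mathcal{L} = w - X^\top Y \alpha = 0 \;\Longrightarrow\; w = X^\top Y \alpha = \sum_i \alpha_i\, y_i\, x_i,$$

and, by complementary slackness, $\alpha_i\big(1 - y_i\, w^\top x_i\big) = 0$ for every $i$.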

Remark: w is a linear combination of the x_i with positive Lagrange multipliers, i.e., those points where the constraint is tight: the support vectors.

Slide 16

SVM Dual


Using both results we have:
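Substituting $w = X^\top Y \alpha$ back into the Lagrangian (a reconstruction consistent with the remarks below):

$$d(\alpha) = \alpha^\top \mathbf{1} - \tfrac{1}{2}\, \alpha^\top Y X X^\top Y \alpha, \qquad \max_{\alpha} d(\alpha) \;\; \text{s.t.} \;\; \alpha \ge 0.$$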

Remarks:

Result is another quadratic to maximize, which only has non-negativity constraints

No b here -- one may embed x into a higher dimension by taking (x, 1); then the last component of w plays the role of b

“Kernel trick” here (next class)
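Not from the slides: a minimal numerical sanity check of this dual, maximizing $d(\alpha)$ subject to $\alpha \ge 0$ by projected gradient ascent. The toy data, step size, and iteration count are illustrative assumptions.

```python
# Hard-margin SVM dual: max  1'a - (1/2) a' (Y X X' Y) a   s.t.  a >= 0,
# solved by projected gradient ascent on separable toy data.
import numpy as np

rng = np.random.default_rng(0)
# Two separable Gaussian blobs; embed as (x, 1) so the last weight acts as b.
X = np.vstack([rng.normal(-2.0, 0.5, (20, 2)), rng.normal(2.0, 0.5, (20, 2))])
X = np.hstack([X, np.ones((40, 1))])
y = np.array([-1.0] * 20 + [1.0] * 20)

G = (y[:, None] * X) @ (y[:, None] * X).T      # Gram matrix Y X X' Y
alpha = np.zeros(40)
for _ in range(5000):
    grad = 1.0 - G @ alpha                     # gradient of the dual objective
    alpha = np.maximum(alpha + 1e-3 * grad, 0.0)  # ascent step, project onto a >= 0

w = (alpha * y) @ X                            # w = sum_i alpha_i y_i x_i
print("support vectors:", int(np.sum(alpha > 1e-6)))
print("training accuracy:", float(np.mean(np.sign(X @ w) == y)))
```

Only a handful of the multipliers end up nonzero: those are the support vectors, matching the remark on slide 15.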

Slide 17

Midterm

Basics: classification, regression, density estimation

Bayes risk

Bayes optimal classifier (or regressor)

Why can’t you have it in practice?

Goal of ML: To minimize a risk = expected loss

Why can't you do it in practice?

Minimize some estimate of risk
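In symbols (standard definitions, not spelled out in the transcript): the risk of a predictor $f$ is $R(f) = \mathbb{E}\big[\ell(f(X), Y)\big]$, and the Bayes optimal classifier is $f^*(x) = \arg\max_y P(Y = y \mid X = x)$. Neither is available in practice because the true distribution of $(X, Y)$ is unknown, so one minimizes an estimate such as the empirical risk $\hat{R}(f) = \tfrac{1}{n}\sum_i \ell(f(x_i), y_i)$.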

Slide 18

Midterm

Estimating a density:

MLE: maximizing a likelihood

MAP / Bayesian inference

Parametric distributions

Gaussian, Bernoulli, etc.

Nonparametric estimation:

Kernel density estimator

Histogram
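As a concrete instance (a standard example, assumed rather than taken from the slide), the Gaussian MLE:

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^n x_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n \big(x_i - \hat{\mu}\big)^2.$$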

Slide 19

Midterm

Classification:

Naïve Bayes: assumptions / failure modes

Logistic regression:

Maximizing a log-likelihood

Log-loss function (see the sketch after this list)

Gradient ascent

SVM

Kernels

Duality
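A sketch of the logistic-regression pieces (standard forms with labels $y_i \in \{0, 1\}$, assumed here):

$$\ell(w) = \sum_i \Big[ y_i \log \sigma(w^\top x_i) + (1 - y_i) \log\big(1 - \sigma(w^\top x_i)\big) \Big], \qquad \sigma(z) = \frac{1}{1 + e^{-z}},$$

$$\nabla \ell(w) = \sum_i \big(y_i - \sigma(w^\top x_i)\big)\, x_i \quad \text{(follow this uphill: gradient ascent)}.$$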

Slide 20

Midterm

Nonparametric classification:

Decision trees

KNN

Strengths/weaknesses compared to parametric methods

Slide 21

Midterm

Regression:

Linear regression

Penalized regression (ridge regression, lasso, etc.; ridge shown after this list)

Nonparametric regression:

Kernel smoothing
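For instance, ridge regression (the standard closed form, assumed here):

$$\hat{w}_{\text{ridge}} = \arg\min_w \|Xw - y\|^2 + \lambda \|w\|^2 = \big(X^\top X + \lambda I\big)^{-1} X^\top y.$$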

Slide 22

Midterm

Model selection:

MSE = bias^2 + variance (decomposition shown after this list)

Tradeoff: bias vs. variance

Model complexity

How to do model selection:

Estimate the risk

Cross validation
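The decomposition referenced above, for an estimator $\hat{\theta}$ of $\theta$:

$$\mathrm{MSE}(\hat{\theta}) = \mathbb{E}\big[(\hat{\theta} - \theta)^2\big] = \big(\mathbb{E}[\hat{\theta}] - \theta\big)^2 + \mathrm{Var}(\hat{\theta}) = \text{bias}^2 + \text{variance}.$$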
