2018-09-21

##### Description

Pattern Recognition and Machine Learning, Chapter 1: Introduction. Example: handwritten digit recognition; polynomial curve fitting; the sum-of-squares error function; polynomial fits of increasing order. ID: 674441



## Pattern Recognition and Machine Learning

Chapter 1: Introduction

Slide 2: Example

Handwritten Digit Recognition

Slide 3: Polynomial Curve Fitting

Slide 4: Sum-of-Squares Error Function

Slide 5: 0th Order Polynomial

Slide 6: 1st Order Polynomial

Slide 7: 3rd Order Polynomial

Slide 8: 9th Order Polynomial

Slide 9: Over-fitting

Root-Mean-Square (RMS) Error: $E_{\mathrm{RMS}} = \sqrt{2 E(\mathbf{w}^\star)/N}$
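The curve-fitting example on these slides can be reproduced in a few lines of NumPy. This is an illustrative sketch, not the slides' actual data: the sample size, seed, and noise level are assumptions. It fits polynomials of increasing order $M$ by least squares and evaluates the sum-of-squares error $E(\mathbf{w}) = \tfrac{1}{2}\sum_n \{y(x_n, \mathbf{w}) - t_n\}^2$ and the RMS error:

```python
import numpy as np

# Noisy samples of sin(2*pi*x), as in the chapter's running example
# (assumed data: 10 points, Gaussian noise with std 0.3).
rng = np.random.default_rng(0)
N = 10
x = np.linspace(0, 1, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=N)

def fit_polynomial(x, t, M):
    """Least-squares fit of an order-M polynomial; returns coefficients w."""
    X = np.vander(x, M + 1, increasing=True)   # columns 1, x, x^2, ..., x^M
    w, *_ = np.linalg.lstsq(X, t, rcond=None)
    return w

def sum_of_squares_error(w, x, t):
    """E(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2."""
    y = np.vander(x, len(w), increasing=True) @ w
    return 0.5 * np.sum((y - t) ** 2)

for M in (0, 1, 3, 9):
    w = fit_polynomial(x, t, M)
    E = sum_of_squares_error(w, x, t)
    E_rms = np.sqrt(2 * E / N)                 # root-mean-square error
    print(f"M={M}: training E_RMS = {E_rms:.4f}")
```

Training error decreases monotonically as $M$ grows, and with $M = 9$ and ten points the fit interpolates the data exactly; the over-fitting the slides illustrate only shows up on held-out test data.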

Slide 10: Polynomial Coefficients

Slide 11: Data Set Size: 9th Order Polynomial

Slide 12: Data Set Size: 9th Order Polynomial

Slide 13: Regularization

Penalize large coefficient values
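Penalizing large coefficients amounts to minimizing $\widetilde{E}(\mathbf{w}) = \tfrac{1}{2}\sum_n \{y(x_n, \mathbf{w}) - t_n\}^2 + \tfrac{\lambda}{2}\lVert\mathbf{w}\rVert^2$, which has the closed-form solution $\mathbf{w} = (\lambda\mathbf{I} + \mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{t}$. A minimal sketch (data and the value of $\lambda$ are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 10
x = np.linspace(0, 1, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=N)

def fit_ridge(x, t, M, lam):
    """Minimize 1/2 * sum (y - t)^2 + lam/2 * ||w||^2 in closed form."""
    X = np.vander(x, M + 1, increasing=True)
    A = lam * np.eye(M + 1) + X.T @ X
    return np.linalg.solve(A, X.T @ t)

w_unreg = fit_ridge(x, t, 9, 0.0)          # plain least squares
w_reg = fit_ridge(x, t, 9, np.exp(-5))     # penalized fit

print("max |w| without penalty:", np.abs(w_unreg).max())
print("max |w| with penalty:   ", np.abs(w_reg).max())
```

Even a small $\lambda$ shrinks the wild 9th-order coefficients dramatically, which is exactly the effect tabulated on the "Polynomial Coefficients" slide.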

Slide 14: Regularization

Slide 15: Regularization

Slide 16: Regularization: $E_{\mathrm{RMS}}$ vs. $\ln \lambda$

Slide 17: Polynomial Coefficients

Slide 18: Probability Theory

Apples and Oranges

Slide 19: Probability Theory

Marginal Probability

Conditional Probability

Joint Probability

Slide 20: Probability Theory

Sum Rule

Product Rule

Slide 21: The Rules of Probability

Sum Rule: $p(X) = \sum_{Y} p(X, Y)$

Product Rule: $p(X, Y) = p(Y \mid X)\, p(X)$

Slide 22: Bayes' Theorem

$$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)}$$

posterior $\propto$ likelihood $\times$ prior
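The apples-and-oranges example can be worked through numerically. A small sketch (the box priors and fruit counts are illustrative assumptions): apply the product rule to build the joint, the sum rule to marginalize, and Bayes' theorem to invert the conditioning:

```python
# Priors over boxes and conditional probabilities of drawing each fruit
# (illustrative numbers: red box holds 2 apples and 6 oranges,
#  blue box holds 3 apples and 1 orange).
p_box = {"red": 0.4, "blue": 0.6}
p_fruit_given_box = {
    "red":  {"apple": 0.25, "orange": 0.75},
    "blue": {"apple": 0.75, "orange": 0.25},
}

# Product rule: p(B, F) = p(F | B) p(B)
p_joint = {(b, f): p_fruit_given_box[b][f] * p_box[b]
           for b in p_box for f in ("apple", "orange")}

# Sum rule (marginalization): p(F) = sum_B p(B, F)
p_orange = sum(p_joint[(b, "orange")] for b in p_box)

# Bayes' theorem: p(B | F) = p(F | B) p(B) / p(F)
p_red_given_orange = p_joint[("red", "orange")] / p_orange

print(f"p(orange) = {p_orange:.2f}")                  # 0.45
print(f"p(red | orange) = {p_red_given_orange:.3f}")  # 0.667
```

Observing an orange raises the posterior probability of the red box from the prior 0.4 to 2/3: the likelihood has updated the prior.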

Slide 23: Probability Densities

Slide 24: Transformed Densities

Slide 25: Expectations

Conditional Expectation (discrete)

Approximate Expectation (discrete and continuous)
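The approximate expectation on this slide is the Monte Carlo estimate $\mathbb{E}[f] \simeq \frac{1}{N}\sum_{n=1}^{N} f(x_n)$ with the $x_n$ drawn from $p(x)$. A minimal sketch (the choice of $p$ as a standard normal and $f(x) = x^2$ is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# E[f] ~= (1/N) * sum_n f(x_n), with x_n ~ p(x).
# Here p is a standard normal, f(x) = x^2, so the true value is
# E[x^2] = var + mean^2 = 1.
samples = rng.normal(loc=0.0, scale=1.0, size=100_000)
approx = np.mean(samples ** 2)
print(f"Monte Carlo estimate of E[x^2]: {approx:.3f}")
```

The estimate converges to the true expectation at rate $O(1/\sqrt{N})$, which is why the same formula works for both discrete and continuous distributions.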

Slide 26: Variances and Covariances

Slide 27: The Gaussian Distribution

Slide 28: Gaussian Mean and Variance

Slide 29: The Multivariate Gaussian

Slide 30: Gaussian Parameter Estimation

Likelihood function

Slide 31: Maximum (Log) Likelihood

Slide 32: Properties of $\mu_{\mathrm{ML}}$ and $\sigma^2_{\mathrm{ML}}$
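The key property here is that $\mu_{\mathrm{ML}}$ is unbiased while $\sigma^2_{\mathrm{ML}}$ is biased low: $\mathbb{E}[\sigma^2_{\mathrm{ML}}] = \frac{N-1}{N}\sigma^2$. This can be checked empirically; a sketch with assumed parameters (true distribution, sample size, number of repetitions are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, var_true, N = 0.0, 1.0, 2   # tiny samples make the bias obvious

# ML estimates computed on many independent data sets of size N.
data = rng.normal(mu_true, np.sqrt(var_true), size=(200_000, N))
mu_ml = data.mean(axis=1)
var_ml = ((data - mu_ml[:, None]) ** 2).mean(axis=1)

# E[var_ml] = (N - 1)/N * var_true, so for N = 2 the average ML
# variance is only half the true variance.
print(f"average ML variance: {var_ml.mean():.3f}  (true: {var_true})")
print(f"predicted bias factor (N - 1)/N = {(N - 1) / N}")
```

This bias is the root cause of the over-fitting seen in maximum-likelihood curve fitting: the fitted noise variance systematically underestimates the truth.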

Slide 33: Curve Fitting Re-visited

Slide 34: Maximum Likelihood

Determine $\mathbf{w}_{\mathrm{ML}}$ by minimizing the sum-of-squares error, $E(\mathbf{w})$.

Slide 35: Predictive Distribution

Slide 36: MAP: A Step towards Bayes

Determine $\mathbf{w}_{\mathrm{MAP}}$ by minimizing the regularized sum-of-squares error, $\widetilde{E}(\mathbf{w})$.

Slide 37: Bayesian Curve Fitting

Slide 38: Bayesian Predictive Distribution

Slide 39: Model Selection

Cross-Validation
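Cross-validation chooses the model order by held-out error rather than training error. A sketch of S-fold cross-validation applied to the polynomial example (data, fold count, and candidate orders are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 40
x = rng.uniform(0, 1, N)
t = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=N)

def cv_error(x, t, M, S=4):
    """Average held-out mean-squared error over S folds."""
    idx = np.arange(len(x))
    errs = []
    for fold in np.array_split(idx, S):
        train = np.setdiff1d(idx, fold)          # all points outside the fold
        X_tr = np.vander(x[train], M + 1, increasing=True)
        w, *_ = np.linalg.lstsq(X_tr, t[train], rcond=None)
        X_va = np.vander(x[fold], M + 1, increasing=True)
        errs.append(np.mean((X_va @ w - t[fold]) ** 2))
    return float(np.mean(errs))

scores = {M: cv_error(x, t, M) for M in range(10)}
best_M = min(scores, key=scores.get)
print("cross-validation pick: M =", best_M)
```

Unlike the training error, the cross-validation score rises again for orders that over-fit, so the minimum identifies a sensible model complexity.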

Slide 40: Curse of Dimensionality

Slide 41: Curse of Dimensionality

Polynomial curve fitting, $M = 3$

Gaussian densities in higher dimensions
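The polynomial illustration of the curse can be made concrete by counting coefficients: a full polynomial of order $M$ over $D$ input variables has one coefficient per monomial of total degree at most $M$, i.e. $\binom{D+M}{M}$ of them. A short check (the choice of dimensions to print is arbitrary):

```python
from math import comb

# Number of coefficients in a full order-M polynomial over D variables:
# all monomials of total degree <= M, i.e. C(D + M, M).
M = 3
for D in (1, 3, 10, 100):
    print(f"D = {D:>3}: {comb(D + M, M)} coefficients")
```

For fixed $M$ the count grows like $D^M$, so even a modest cubic model becomes unmanageable in high-dimensional input spaces.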

Slide 42: Decision Theory

Inference step: determine either $p(\mathbf{x}, \mathcal{C}_k)$ or $p(\mathcal{C}_k \mid \mathbf{x})$.

Decision step: for given $\mathbf{x}$, determine the optimal $t$.

Slide 43: Minimum Misclassification Rate

Slide 44: Minimum Expected Loss

Example: classify medical images as 'cancer' or 'normal'. Loss matrix: rows indexed by Truth, columns by Decision.

Slide 45: Minimum Expected Loss

Regions $\mathcal{R}_j$ are chosen to minimize $\mathbb{E}[L] = \sum_{k}\sum_{j}\int_{\mathcal{R}_j} L_{kj}\, p(\mathbf{x}, \mathcal{C}_k)\, d\mathbf{x}$
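Given class posteriors, the minimum-expected-loss decision picks, for each $\mathbf{x}$, the column $j$ minimizing $\sum_k L_{kj}\, p(\mathcal{C}_k \mid \mathbf{x})$. A sketch of the cancer/normal example (the loss values are illustrative assumptions, chosen so a missed cancer costs far more than a false alarm):

```python
import numpy as np

# Loss matrix L[truth, decision]; illustrative values.
classes = ["cancer", "normal"]
L = np.array([[0, 1000],   # truth = cancer: decide cancer (0) or normal (1000)
              [1,    0]])  # truth = normal: decide cancer (1) or normal (0)

def decide(posterior):
    """Pick the decision j minimizing sum_k p(C_k | x) * L[k, j]."""
    expected_loss = posterior @ L
    return classes[int(np.argmin(expected_loss))]

# Even a small posterior probability of cancer triggers the 'cancer' call.
print(decide(np.array([0.01, 0.99])))      # cancer
print(decide(np.array([0.0001, 0.9999])))  # normal
```

With these losses the decision boundary sits at $p(\text{cancer} \mid \mathbf{x}) = 1/1001$, not at $1/2$: the asymmetric loss matrix shifts the boundary away from the minimum-misclassification-rate one.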

Slide 46: Reject Option

Slide 47: Why Separate Inference and Decision?

- Minimizing risk (loss matrix may change over time)
- Reject option
- Unbalanced class priors
- Combining models

Slide 48: Decision Theory for Regression

Inference step: determine $p(\mathbf{x}, t)$.

Decision step: for given $\mathbf{x}$, make optimal prediction, $y(\mathbf{x})$, for $t$.

Loss function: $\mathbb{E}[L] = \iint L\big(t, y(\mathbf{x})\big)\, p(\mathbf{x}, t)\, d\mathbf{x}\, dt$

Slide 49: The Squared Loss Function
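For the squared loss $L(t, y(\mathbf{x})) = \{y(\mathbf{x}) - t\}^2$, the optimal prediction follows by setting the functional derivative of the expected loss to zero, a step worth spelling out:

```latex
\mathbb{E}[L] = \iint \{y(\mathbf{x}) - t\}^2\, p(\mathbf{x}, t)\, d\mathbf{x}\, dt,
\qquad
\frac{\delta \mathbb{E}[L]}{\delta y(\mathbf{x})}
  = 2 \int \{y(\mathbf{x}) - t\}\, p(\mathbf{x}, t)\, dt = 0
\;\Rightarrow\;
y(\mathbf{x}) = \frac{\int t\, p(\mathbf{x}, t)\, dt}{p(\mathbf{x})}
  = \mathbb{E}_t[t \mid \mathbf{x}].
```

So the minimum-expected-squared-loss prediction is the conditional mean of the target, the regression function.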

Slide 50: Generative vs Discriminative

Generative approach: model $p(\mathbf{x} \mid \mathcal{C}_k)$ and $p(\mathcal{C}_k)$, then use Bayes' theorem to obtain $p(\mathcal{C}_k \mid \mathbf{x})$.

Discriminative approach: model $p(\mathcal{C}_k \mid \mathbf{x})$ directly.

Slide 51: Entropy

Important quantity in:

- coding theory
- statistical physics
- machine learning

Slide 52: Entropy

Coding theory: x discrete with 8 possible states; how many bits to transmit the state of x?

All states equally likely: $H[x] = -\sum_x p(x)\log_2 p(x) = 3$ bits.
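The coding-theory reading of entropy is easy to verify numerically. A minimal sketch (the non-uniform distribution below is an illustrative choice with probabilities that are powers of two):

```python
import numpy as np

def entropy_bits(p):
    """H[x] = -sum_i p_i * log2(p_i); terms with p_i = 0 contribute 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# 8 equally likely states: 3 bits are needed to transmit the state.
print(entropy_bits([1/8] * 8))                     # 3.0

# A non-uniform distribution over the same 8 states has lower entropy,
# so a variable-length code can use fewer bits on average.
p = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]
print(entropy_bits(p))                             # 2.0
```

The second distribution needs only 2 bits per symbol on average, achieved by assigning short codewords to likely states and long ones to rare states.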

Slide 53: Entropy

Slide 54: Entropy

In how many ways can N identical objects be allocated to M bins?

Entropy is maximized when the distribution is uniform, $p_i = 1/M$.

Slide 55: Entropy

Slide 56: Differential Entropy

Put bins of width $\Delta$ along the real line.

Differential entropy is maximized (for fixed variance $\sigma^2$) when $p(x)$ is Gaussian, in which case $H[x] = \frac{1}{2}\left(1 + \ln(2\pi\sigma^2)\right)$.

Slide 57: Conditional Entropy

Slide 58: The Kullback-Leibler Divergence

Slide 59: Mutual Information
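These last two quantities are closely linked: mutual information is the KL divergence between the joint distribution and the product of its marginals, $I[x, y] = \mathrm{KL}\big(p(x, y)\,\|\,p(x)\,p(y)\big)$. A sketch for discrete distributions (the example probabilities are illustrative assumptions):

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * ln(p_i / q_i), in nats; 0*ln(0) := 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def mutual_information(p_xy):
    """I[x, y] = KL(p(x, y) || p(x) p(y)) for a joint probability table."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    return kl_divergence(p_xy.ravel(), (p_x * p_y).ravel())

# KL is zero iff the distributions match, and it is not symmetric.
p, q = [0.5, 0.5], [0.9, 0.1]
print(kl_divergence(p, p))                                   # 0.0
print(kl_divergence(p, q), kl_divergence(q, p))

# Independent variables have (numerically) zero mutual information.
print(mutual_information(np.outer([0.3, 0.7], [0.6, 0.4])))
```

The asymmetry of the printed pair shows KL is not a metric, and the zero mutual information for the outer-product joint confirms that $I[x, y] = 0$ exactly when $x$ and $y$ are independent.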