
Slide1

CHAPTER 3: Bayesian Decision Theory

Source: Alpaydın, with modifications by Christoph F. Eick

Remark: Belief Networks will be covered in April. Utility theory will be covered as part of reinforcement learning.

Slide2

Probability and Inference

Result of tossing a coin is $\in$ {Heads, Tails}
Random variable $X \in \{1, 0\}$
Bernoulli: $P\{X = 1\} = p_0$
Sample: $\mathcal{X} = \{x^t\}_{t=1}^{N}$
Estimation: $p_0 = \#\{\text{Heads}\} / \#\{\text{Tosses}\} = \sum_t x^t / N$
Prediction of next toss: Heads if $p_0 > 1/2$, Tails otherwise

$P(X = k) = p_0^{\,k}\,(1 - p_0)^{1-k}$ for $k \in \{0, 1\}$

In the theory of probability and statistics, a Bernoulli trial is an experiment whose outcome is random and can be either of two possible outcomes, "success" and "failure".
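
A minimal sketch of this estimate-and-predict procedure (illustrative Python; the sample is simulated, and the value 0.6 standing in for the unknown parameter is an assumption, not from the slides):

import numpy as np

# Simulated sample of N coin tosses: x^t = 1 for Heads, 0 for Tails
rng = np.random.default_rng(0)
sample = rng.binomial(1, 0.6, size=100)    # 0.6 plays the role of the unknown p0

# Estimation: p0 = #{Heads} / #{Tosses} = sum_t x^t / N
p0 = sample.mean()

# Prediction of next toss: Heads if p0 > 1/2, Tails otherwise
prediction = "Heads" if p0 > 0.5 else "Tails"
print(f"p0 = {p0:.2f} -> predict {prediction}")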

Slide3

Binomial Distribution
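
Presumably the standard binomial pmf for the number of heads $k$ in $N$ independent tosses:

$$P(X = k) = \binom{N}{k}\, p_0^{\,k}\, (1 - p_0)^{N-k}, \qquad k = 0, 1, \ldots, N$$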

Slide4

Classification

Credit scoring: Inputs are income and savings. Output is low-risk vs. high-risk.
Input: $\mathbf{x} = [x_1, x_2]^T$; Output: $C \in \{0, 1\}$
Prediction: choose $C = 1$ if $P(C = 1 \mid x_1, x_2) > 1/2$, and choose $C = 0$ otherwise

Slide5

Bayes’ Rule

$$\underbrace{P(C \mid x)}_{\text{posterior}} = \frac{\overbrace{p(x \mid C)}^{\text{likelihood}}\;\overbrace{P(C)}^{\text{prior}}}{\underbrace{p(x)}_{\text{evidence}}}$$

see: http://en.wikipedia.org/wiki/Bayes'_theorem
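
A minimal numeric sketch of the rule for two classes (illustrative Python; the prior and likelihood values are assumptions):

# Bayes' rule: P(C|x) = p(x|C) * P(C) / p(x)
priors = {0: 0.7, 1: 0.3}         # P(C): assumed class priors
likelihoods = {0: 0.2, 1: 0.9}    # p(x|C): assumed likelihoods of the observed x

# Evidence: p(x) = sum over C of p(x|C) * P(C)
evidence = sum(likelihoods[c] * priors[c] for c in priors)

# Posteriors: P(C|x); they sum to 1
posteriors = {c: likelihoods[c] * priors[c] / evidence for c in priors}
print(posteriors)                 # {0: 0.3414..., 1: 0.6585...}

# Prediction (slide 4): choose the class with the larger posterior
prediction = max(posteriors, key=posteriors.get)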

Slide6

Bayes’ Rule: K > 2 Classes
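
Presumably the $K$-class form of Bayes’ rule:

$$P(C_i \mid x) = \frac{p(x \mid C_i)\, P(C_i)}{p(x)} = \frac{p(x \mid C_i)\, P(C_i)}{\sum_{k=1}^{K} p(x \mid C_k)\, P(C_k)}$$

with $P(C_i) \ge 0$ and $\sum_{i=1}^{K} P(C_i) = 1$; for prediction, choose the class $C_i$ with the maximum posterior $P(C_i \mid x)$.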

Remember: the disease/symptom example.

Slide7

Losses and Risks

Actions: $\alpha_i$
Loss of $\alpha_i$ when the state is $C_k$: $\lambda_{ik}$
Expected risk (Duda and Hart, 1973):
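
In the usual notation, the expected risk of taking action $\alpha_i$ given input $x$ is

$$R(\alpha_i \mid x) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid x)$$

and the optimal policy chooses the action with minimum expected risk.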

Remark: $\lambda_{ik}$ is the cost of choosing $i$ when $k$ is correct! If we use accuracy/error, then $\lambda_{ik} :=$ if $i = k$ then $0$ else $1$!

Slide8

Losses and Risks: 0/1 Loss

For minimum risk, choose the most probable class.
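
With 0/1 loss, i.e. $\lambda_{ik} = 0$ if $i = k$ and $1$ otherwise, the expected risk reduces to

$$R(\alpha_i \mid x) = \sum_{k \neq i} P(C_k \mid x) = 1 - P(C_i \mid x),$$

so minimizing the risk is equivalent to maximizing the posterior $P(C_i \mid x)$.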

Remark: This strategy is not optimal in other cases.

Slide9

Losses and Risks: Reject

Risk for reject:
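
A sketch of the standard formulation (assuming, as in Alpaydın, a reject action $\alpha_{K+1}$ with constant loss $\lambda$, $0 < \lambda < 1$; the homework below generalizes this with per-class reject costs):

$$R(\alpha_{K+1} \mid x) = \sum_{k=1}^{K} \lambda\, P(C_k \mid x) = \lambda$$

Decision rule: choose $C_i$ if $P(C_i \mid x) > P(C_k \mid x)$ for all $k \neq i$ and $P(C_i \mid x) > 1 - \lambda$; reject otherwise.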

Slide10

Example and Homework!

$C_1$ = has cancer; $C_2$ = does not have cancer; $\lambda_{12} = 9$, $\lambda_{21} = 72$

Homework:
a) Determine the optimal decision-making strategy.
   Inputs: $P(C_1 \mid x)$, $P(C_2 \mid x)$
   Decision-making strategy: …
b) Now assume we also have a reject option and the cost of making no decision is 3: $\lambda_{\text{reject},2} = 3$, $\lambda_{\text{reject},1} = 3$.
   Inputs: $P(C_1 \mid x)$, $P(C_2 \mid x)$
   Decision-making strategy: …

Ungraded homework: to be discussed Feb. 6!

Slide11

Homework:
a) Determine the optimal decision-making strategy.
Input: $P(C_1 \mid x)$

$R(a_1 \mid x) = 9\,P(C_2 \mid x)$
$R(a_2 \mid x) = 72\,P(C_1 \mid x)$
$R(a_{\text{reject}} \mid x) = 3$

Setting $R(a_1 \mid x) = R(a_2 \mid x)$ we obtain $9\,P(C_2 \mid x) = 72\,P(C_1 \mid x)$, i.e. $P(C_2 \mid x)/P(C_1 \mid x) = 8$; additionally using $P(C_1 \mid x) + P(C_2 \mid x) = 1$ we obtain $P(C_1 \mid x) = 1/9$ and $P(C_2 \mid x) = 8/9$, and the risk-minimizing decision rule becomes:
IF $P(C_1 \mid x) > 1/9$ THEN choose $C_1$ ELSE choose $C_2$

b) Now assume we also have a reject option and the cost of making no decision is 3: $\lambda_{\text{reject},2} = 3$, $\lambda_{\text{reject},1} = 3$.
Input: $P(C_1 \mid x)$

First we equate $R(a_{\text{reject}} \mid x)$ with $R(a_1 \mid x)$ and with $R(a_2 \mid x)$:
if $P(C_2 \mid x) \ge 1/3$, i.e. $P(C_1 \mid x) \le 2/3$, reject should be preferred over class 1; if $P(C_1 \mid x) \ge 1/24$, reject should be preferred over class 2. Combining this with the previous decision rule we obtain:
IF $P(C_1 \mid x) \in [0, 1/24]$ THEN choose class 2
ELSE IF $P(C_1 \mid x) \in [2/3, 1]$ THEN choose class 1
ELSE reject
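
A minimal sketch of the resulting three-way decision rule (illustrative Python):

def decide(p_c1: float) -> str:
    """Risk-minimizing decision given the posterior P(C1|x).

    Costs from the example: lambda_12 = 9, lambda_21 = 72,
    reject cost = 3; P(C2|x) = 1 - P(C1|x).
    """
    if p_c1 <= 1 / 24:
        return "class2"    # choosing C2 is cheaper than rejecting
    if p_c1 >= 2 / 3:
        return "class1"    # choosing C1 is cheaper than rejecting
    return "reject"        # rejecting beats both classification actions

print(decide(0.02), decide(0.5), decide(0.9))   # class2 reject class1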

Slide12

Discriminant Functions

$K$ decision regions $\mathcal{R}_1, \ldots, \mathcal{R}_K$
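
Presumably the usual setup: one discriminant function $g_i(x)$ per class, $i = 1, \ldots, K$, with

$$\text{choose } C_i \text{ if } g_i(x) = \max_k g_k(x), \qquad \mathcal{R}_i = \{\, x \mid g_i(x) = \max_k g_k(x) \,\},$$

where $g_i(x)$ can be taken as $-R(\alpha_i \mid x)$, or with 0/1 loss simply as $P(C_i \mid x)$ or $p(x \mid C_i)\, P(C_i)$.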