
Machine Learning - PowerPoint Presentation

karlyn-bohler
Uploaded On 2016-03-07



Presentation Transcript

Slide1

Machine Learning

Lecture 6: K-Nearest Neighbor Classifier

G53MLE Machine Learning, Dr Guoping Qiu

Slide2

Objects, Feature Vectors, Points

[Figure: elliptical blobs (objects) numbered 1 to 16, each represented by a feature vector X(i) and plotted as a point in the plane with axes x_1 and x_2.]

Slide3

Nearest Neighbours

[Figure: two points X(i) and X(j) in the feature space with axes x_1 and x_2.]

X(i) = (x_1(i), x_2(i), …, x_n(i))

X(j) = (x_1(j), x_2(j), …, x_n(j))

Slide4

Nearest Neighbour Algorithm

Given training data (X(1),D(1)), (X(2),D(2)), …, (X(N),D(N))

Define a distance metric between points in the input space. A common measure is the Euclidean distance:

d(X(i), X(j)) = sqrt( (x_1(i) - x_1(j))^2 + (x_2(i) - x_2(j))^2 + … + (x_n(i) - x_n(j))^2 )
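
As a minimal sketch, the Euclidean distance between two feature vectors can be computed as the square root of the summed squared coordinate differences. The function name `euclidean_distance` and the two sample points are illustrative and do not come from the slides.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Two hypothetical 2-D points forming a 3-4-5 right triangle
print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))  # 5.0
```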

Slide5

K-Nearest Neighbour Model

Given a test point X, find the K nearest training inputs to X

Denote these points as

(X(1), D(1)), (X(2), D(2)), …, (X(K), D(K))
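
The neighbour search can be sketched in a few lines of Python, assuming the training data is a list of (X, D) pairs and Euclidean distance is the metric. The helper name `k_nearest` and the toy points are hypothetical; `math.dist` (Python 3.8+) returns the Euclidean distance between two points.

```python
import math

def k_nearest(train, x, k):
    """Return the k training pairs (X, D) whose inputs are closest to x."""
    # sorted() is stable, so equally distant points keep their training order
    return sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]

# Hypothetical training set with two classes
train = [((1.0, 1.0), "A"), ((2.0, 2.0), "A"), ((8.0, 8.0), "B"), ((9.0, 9.0), "B")]
print(k_nearest(train, (1.5, 1.0), 2))
```

Sorting all N training points costs O(N log N) per query, which is fine for small data sets; it is a brute-force search, not an optimized one.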

Slide6

K-Nearest Neighbour Model

Classification: the class label assigned to X is

Y = most common class in the set {D(1), D(2), …, D(K)}
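
The voting step can be sketched with `collections.Counter`. The function name `majority_class` and the sample labels are illustrative. The slides do not say how ties are broken; this sketch assumes the label seen first wins.

```python
from collections import Counter

def majority_class(labels):
    """Most common label; on a tie, the label encountered first wins."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical labels of K = 3 neighbours
print(majority_class(["A", "B", "B"]))  # B
```

Choosing an odd K avoids ties in two-class problems.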

Slide7

K-Nearest Neighbour Model

Example: Classify whether a customer will respond to a survey question using a 3-Nearest Neighbor classifier

| Customer | Age | Income | No. credit cards | Response |
|----------|-----|--------|------------------|----------|
| John     | 35  | 35K    | 3                | No       |
| Rachel   | 22  | 50K    | 2                | Yes      |
| Hannah   | 63  | 200K   | 1                | No       |
| Tom      | 59  | 170K   | 1                | No       |
| Nellie   | 25  | 40K    | 4                | Yes      |
| David    | 37  | 50K    | 2                | ?        |

Slide8

K-Nearest Neighbour Model

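Example: 3-Nearest Neighbors

Euclidean distance from each customer to David, with income measured in thousands:

| Customer | Response | Distance to David |
|----------|----------|-------------------|
| John     | No       | 15.16             |
| Rachel   | Yes      | 15                |
| Hannah   | No       | 152.23            |
| Tom      | No       | 122               |
| Nellie   | Yes      | 15.74             |

Slide9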

K-Nearest Neighbour Model

Example: 3-Nearest Neighbors

Three nearest ones to David are John, Rachel and Nellie: No, Yes, Yes

Slide10

K-Nearest Neighbour Model

Example: 3-Nearest Neighbors

Three nearest ones to David are John, Rachel and Nellie: No, Yes, Yes
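
The worked example can be checked with a short script. This is a sketch under two assumptions not stated on the slides: income is entered in thousands (35K becomes 35), and the three raw features are used unscaled. Variable names are illustrative. The printed distances are rounded to two decimals, so a few differ in the last digit from the slide values, which appear to be truncated.

```python
import math
from collections import Counter

# Training table from the example:
# (age, income in K, no. credit cards) -> response
customers = {
    "John":   ((35, 35, 3), "No"),
    "Rachel": ((22, 50, 2), "Yes"),
    "Hannah": ((63, 200, 1), "No"),
    "Tom":    ((59, 170, 1), "No"),
    "Nellie": ((25, 40, 4), "Yes"),
}
david = (37, 50, 2)

# Euclidean distance from every customer to David, nearest first
ranked = sorted(
    (math.dist(features, david), name, response)
    for name, (features, response) in customers.items()
)
for dist, name, response in ranked:
    print(f"{name:<7} {dist:7.2f}  {response}")

# Majority vote among the 3 nearest neighbours
votes = Counter(response for _, _, response in ranked[:3])
prediction = votes.most_common(1)[0][0]
print("Prediction for David:", prediction)
```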

Majority vote: Yes (David is predicted to respond)

Slide11

K-Nearest Neighbour Model

Picking K

Use N-fold cross validation (leave-one-out): pick K to minimize the cross validation error

For each of the N training examples

Find its K nearest neighbours among the other N-1 examples

Make a classification based on these K neighbours

Calculate the classification error

Output the average error over all examples

Use the K that gives the lowest average error over the N training examples
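
One way to write down the quantity being minimized. The notation is introduced here and is not from the slides: \hat{Y}_K^{(-i)} denotes the K-NN prediction for X(i) made from all training examples except the i-th, and \mathbf{1}[\cdot] is 1 when its argument is true and 0 otherwise.

```latex
E(K) = \frac{1}{N} \sum_{i=1}^{N}
       \mathbf{1}\!\left[ \hat{Y}_K^{(-i)}\big(X(i)\big) \neq D(i) \right],
\qquad
K^{*} = \arg\min_{K} E(K)
```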

Slide12

K-Nearest Neighbour Model

Example: For the example we saw earlier, pick the best K from the set {1, 2, 3} to build a K-NN classifier
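
One possible way to work this exercise is to run the leave-one-out procedure from the previous slide on the five labelled customers. The sketch below assumes income in thousands, unscaled features, Euclidean distance, and that a tied vote goes to the nearest neighbour's class; none of these choices is fixed by the slides, and the function names are hypothetical.

```python
import math
from collections import Counter

# The five labelled customers: (age, income in K, no. credit cards), response
data = [
    ((35, 35, 3), "No"),    # John
    ((22, 50, 2), "Yes"),   # Rachel
    ((63, 200, 1), "No"),   # Hannah
    ((59, 170, 1), "No"),   # Tom
    ((25, 40, 4), "Yes"),   # Nellie
]

def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (nearest wins ties)."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_error(data, k):
    """Leave-one-out error: fraction of examples misclassified by the rest."""
    mistakes = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]   # hold out example i
        mistakes += knn_predict(rest, x, k) != label
    return mistakes / len(data)

errors = {k: loo_error(data, k) for k in (1, 2, 3)}
best_k = min(errors, key=errors.get)     # the smallest K wins ties
print(errors, "best K:", best_k)
```

Under these assumptions the sketch reports an error of 0.2 for K = 1 and K = 2 and 1.0 for K = 3, so it selects K = 1. A different tie-breaking rule or feature scaling could change the result.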

Slide13

Further Readings

T. M. Mitchell, Machine Learning, McGraw-Hill International Edition, 1997, Chapter 8

Slide14

Tutorial/Exercise Questions

The K-nearest neighbor classifier has to store all the training data, which creates a high storage requirement. Can you think of ways to reduce the storage requirement without affecting performance? (Hint: search the Internet; you will find many approximation methods.)
