Slide 1
Machine Learning
Lecture 6: K-Nearest Neighbor Classifier
G53MLE Machine Learning, Dr Guoping Qiu

Slide 2
Objects, Feature Vectors, Points
[Figure: sixteen elliptical blobs (objects), each mapped to a feature vector X(i) and plotted as a point in a two-dimensional feature space with axes x1 and x2]

Slide 3
Nearest Neighbours
Two points in an n-dimensional feature space:

X(i) = (x1(i), x2(i), ..., xn(i))
X(j) = (x1(j), x2(j), ..., xn(j))

Slide 4
Nearest Neighbour Algorithm
Given training data (X(1),D(1)), (X(2),D(2)), ..., (X(N),D(N))
Define a distance metric between points in input space. A common measure is the Euclidean distance:

d(X(i), X(j)) = sqrt( (x1(i) - x1(j))^2 + (x2(i) - x2(j))^2 + ... + (xn(i) - xn(j))^2 )

Slide 5
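As an illustration, the Euclidean distance can be written as a short Python function (the function name is my own, not code from the lecture):

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Example: distance between two 2-dimensional feature vectors
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```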
K-Nearest Neighbour Model
Given a test point X, find the K nearest training inputs to X.
Denote these points as
(X(1),D(1)), (X(2),D(2)), ..., (X(K),D(K))

Slide 6
K-Nearest Neighbour Model
Classification: the class assigned to X is
Y = the most common class in the set {D(1), D(2), ..., D(K)}

Slide 7
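Putting the two steps together (find the K nearest training points, then take a majority vote over their labels), a minimal Python sketch might look like this; `knn_classify` is an illustrative name I have chosen, not code from the lecture:

```python
import math
from collections import Counter

def knn_classify(train, x, k):
    """train: list of (feature_vector, label) pairs; x: test point."""
    dist = lambda a, b: math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    # Keep the k training points closest to x
    nearest = sorted(train, key=lambda pair: dist(pair[0], x))[:k]
    # Y = most common class in {D(1), D(2), ..., D(k)}
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "A"), ((1, 0), "A"), ((5, 5), "B"), ((6, 5), "B")]
print(knn_classify(train, (0.5, 0.5), 3))  # A
```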
K-Nearest Neighbour Model
Example: classify whether a customer will respond to a survey question using a 3-Nearest Neighbor classifier.

Customer   Age   Income   No. credit cards   Response
John       35    35K      3                  No
Rachel     22    50K      2                  Yes
Hannah     63    200K     1                  No
Tom        59    170K     1                  No
Nellie     25    40K      4                  Yes
David      37    50K      2                  ?

Slide 8
K-Nearest Neighbour Model
Example: 3-Nearest Neighbors
Euclidean distance from David to each customer, with income measured in units of 1,000:

Customer   Distance to David
John       15.16
Rachel     15
Hannah     152.23
Tom        122
Nellie     15.74

Slide 9
K-Nearest Neighbour Model
Example: 3-Nearest Neighbors
The three nearest to David are Rachel (distance 15), John (15.16) and Nellie (15.74); their responses are No, Yes, Yes.

Slide 10
K-Nearest Neighbour Model
Example: 3-Nearest Neighbors
The majority response among the three nearest neighbours is Yes, so David is predicted to respond: Yes.

Slide 11
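The whole worked example can be reproduced in a few lines of Python; as in the distance table, income is taken in units of 1,000 (the variable names are my own):

```python
import math
from collections import Counter

# (age, income in thousands, number of credit cards) -> survey response
customers = [
    ("John",   (35, 35, 3),  "No"),
    ("Rachel", (22, 50, 2),  "Yes"),
    ("Hannah", (63, 200, 1), "No"),
    ("Tom",    (59, 170, 1), "No"),
    ("Nellie", (25, 40, 4),  "Yes"),
]
david = (37, 50, 2)

dist = lambda a, b: math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
# Rank all customers by distance to David and keep the three nearest
ranked = sorted(customers, key=lambda c: dist(c[1], david))
responses = [resp for _, _, resp in ranked[:3]]
print(responses)                                # ['Yes', 'No', 'Yes']
print(Counter(responses).most_common(1)[0][0])  # Yes
```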
K-Nearest Neighbour Model
Picking K
Use N-fold cross-validation (one example per fold, i.e. leave-one-out) and pick the K that minimizes the cross-validation error:
For each of the N training examples:
- Find its K nearest neighbours among the remaining examples
- Make a classification based on these K neighbours
- Record the classification error
Output the average error over all N examples.
Use the K that gives the lowest average error over the N training examples.

Slide 12
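The procedure above can be sketched as a leave-one-out loop in Python; `pick_k` and `knn_predict` are illustrative names, not code from the lecture:

```python
import math
from collections import Counter

def knn_predict(train, x, k):
    dist = lambda a, b: math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    nearest = sorted(train, key=lambda p: dist(p[0], x))[:k]
    return Counter(lbl for _, lbl in nearest).most_common(1)[0][0]

def pick_k(data, candidates):
    """Pick the K with the lowest average leave-one-out error."""
    def loo_error(k):
        # Classify each example using all the other examples as training data
        mistakes = sum(
            knn_predict(data[:i] + data[i + 1:], x, k) != label
            for i, (x, label) in enumerate(data)
        )
        return mistakes / len(data)
    return min(candidates, key=loo_error)

data = [((0,), "A"), ((1,), "A"), ((10,), "B"), ((11,), "B")]
print(pick_k(data, [1, 3]))  # 1
```

With one example per fold, the inner loop is exactly the "for each of the N training examples" step on the slide.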
K-Nearest Neighbour Model
Example: for the customer data seen earlier, pick the best K from the set {1, 2, 3} to build a K-NN classifier.

Slide 13
Further Readings
T. M. Mitchell, Machine Learning, McGraw-Hill International Edition, 1997
Chapter 8
Slide 14
Tutorial/Exercise Questions
A K-nearest-neighbor classifier has to store all the training data, which creates a high storage requirement. Can you think of ways to reduce the storage requirement without affecting performance? (Hint: search the Internet; you will find many approximation methods.)