Grouping
What is grouping?
K-means
Input: set of data points, k
Randomly pick k points as means
For i in [0, maxiters]:
    Assign each point to nearest center
    Re-estimate each center as mean of points assigned to it
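The loop above can be sketched in NumPy. The array shapes, the optional `init` argument, and the convergence check are illustrative choices, not part of the slide's pseudocode:

```python
# A NumPy sketch of the k-means loop above. `points` is (n, d);
# `init` optionally supplies starting means instead of a random pick.
import numpy as np

def kmeans(points, k, max_iters=100, init=None, seed=0):
    rng = np.random.default_rng(seed)
    # Randomly pick k points as the initial means (unless init is given)
    means = (points[rng.choice(len(points), size=k, replace=False)]
             if init is None else np.asarray(init, dtype=float))
    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iters):
        # Assign each point to its nearest center
        dists = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Re-estimate each center as the mean of the points assigned to it
        new_means = np.array([points[labels == j].mean(axis=0)
                              if np.any(labels == j) else means[j]
                              for j in range(k)])
        if np.allclose(new_means, means):
            break  # means stopped moving
        means = new_means
    return means, labels
```

On two well-separated blobs this converges in a couple of iterations; with random initialization the result can depend on which points are picked as the starting means.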
K-means - the math
An objective function that must be minimized:
min over means μ_j and assignments δ_ij of Σ_i Σ_j δ_ij ||x_i − μ_j||^2, where δ_ij = 1 if point i is assigned to cluster j and 0 otherwise
Every iteration of k-means takes a downward step:
Fixes the means μ and sets the assignments δ to minimize the objective (each point goes to its nearest center)
Fixes the assignments δ and sets the means μ to minimize the objective (each μ_j becomes the mean of its assigned points)
K-means on image pixels
Picture courtesy David Forsyth
One of the clusters from k-means
What is wrong? Pixel position:
Nearby pixels are likely to belong to the same object
Far-away pixels are likely to belong to different objects
How do we incorporate pixel position?
Instead of representing each pixel as (r, g, b),
represent each pixel as (r, g, b, x, y)
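The (r, g, b, x, y) representation can be built as below. The spatial weight `lam` is an illustrative knob (not from the slides) that trades off color similarity against spatial proximity:

```python
# Build (r, g, b, x, y) feature vectors for k-means on image pixels.
# `image` is an H x W x 3 array; `lam` scales the position coordinates.
import numpy as np

def pixel_features(image, lam=1.0):
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.concatenate([
        image.reshape(h * w, 3).astype(float),     # (r, g, b)
        lam * xs.reshape(h * w, 1).astype(float),  # x
        lam * ys.reshape(h * w, 1).astype(float),  # y
    ], axis=1)
    return feats  # shape (H*W, 5)
```

Running k-means on these 5-D vectors instead of raw colors makes far-apart pixels less likely to land in the same cluster.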
The issues with k-means
Captures pixel similarity but:
Doesn't capture continuity
Captures proximity only weakly
Can merge far-away objects together
Requires knowledge of k!
Oversegmentation and superpixels
We don’t know k. What is a safe choice?
Idea: Use large k
Can potentially break big objects, but will hopefully not merge unrelated objects
Later processing can decide which groups to merge
Called superpixels
Regions vs. boundaries
Does Canny always work?
The aperture problem
“Globalisation”
Images as graphs
Each pixel is a node
Edge between “similar pixels”:
Proximity: nearby pixels are more similar
Similarity: pixels with similar color are more similar
Weight of edge = similarity
Segmentation is graph partitioning
Every partition “cuts” some edges
Idea: minimize total weight of edges cut!
Criterion: Min-cut?
Min-cut carves out small isolated parts of the graph
In image segmentation: individual pixels
Normalized cuts
“Cut” = total weight of cut edges
Small cut means the groups don’t “like” each other
But need to normalize w.r.t. how much they like themselves
“Volume” of a subgraph = total weight of edges within the subgraph
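Written out with the volume definition above (one common formulation; Shi and Malik's original paper normalizes by assoc(A, V), the total connection from A to all nodes, rather than the within-subgraph volume):

```latex
\mathrm{cut}(A,B) = \sum_{i \in A,\, j \in B} w(i,j), \qquad
\mathrm{vol}(A) = \sum_{i \in A,\, j \in A} w(i,j)
```

```latex
\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{vol}(A)}
                   + \frac{\mathrm{cut}(A,B)}{\mathrm{vol}(B)}
```

Dividing the cut by each side's volume penalizes partitions that carve off tiny groups, since a tiny group has tiny volume and makes its term blow up.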
Normalized cut
Min-cut vs normalized cut
Both rely on interpreting images as graphs
By itself, min-cut gives small isolated pixels
But can work if we add other constraints
min-cut can be solved in polynomial time
Dual of max-flow
N-cut is NP-hard
But approximations exist!
Random walk
Given that ghosts inhabit set A, how likely are they to stay in A?
Key idea: the partition should be such that the ghosts are likely to stay within one partition
The normalized cut criterion is the same as this
But how do we find this partition?
Graphs and matrices
w(i,j) = weight between i and j (affinity matrix W)
d(i) = degree of i = Σ_j w(i,j)
D = diagonal matrix with d(i) on the diagonal
Both W and D are N x N
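As a quick sketch of these definitions (the 3-node W below is made up for illustration), along with the row-normalized matrix E = D^(-1)W used in the random-walk view:

```python
# Degree vector d, diagonal matrix D, and E = D^(-1) W
# for a small made-up affinity matrix W.
import numpy as np

W = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])

d = W.sum(axis=1)          # d(i) = sum_j w(i,j)
D = np.diag(d)
E = np.linalg.inv(D) @ W   # each row of E sums to 1
```

Because each row of E sums to 1, E is exactly the transition matrix of a random walk on the graph: row i gives the probability of a "ghost" at node i stepping to each neighbor.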
Graphs and matrices
Example: a 10-node graph (nodes 0-9) with its affinity matrix W
E = D^(-1)W
Graphs and matrices
How do we represent a clustering?
A label for N nodes: 1 if part of cluster A, 0 otherwise
An N-dimensional vector!
Example indicator vectors on the 10-node graph:
v1 = [1 1 1 1 1 0 0 0 0 0]
v2 = [0 0 0 0 0 1 1 1 1 1]
v3 = [0 0 1 1 1 1 1 0 0 0]
Graphs and matrices
Applying E = D^(-1)W to the indicator vectors:
E v1 = [1 1 1 1 1 0 0 0 0 0] = v1
E v2 = [0 0 0 0 0 1 1 1 1 1] = v2
E v3 = [0.7 0.8 0.6 0.5 0.6 0.3 0.2 0.5 0.5 0.7] ≠ v3
The indicator vector of a well-separated group is left unchanged by E; a vector that cuts across the groups is not
Graphs and matrices
On a graph with an edge connecting the two groups:
v1 = [1 1 1 1 1 0 0 0 0 0]
E v1 = [1 1 1 1 1 0 0 0.2 0 0]
The probability mass leaking across the cut shows up as fractional values
Graphs and matrices
Define z = D^(1/2) y so that the objective
y^T (D − W) y / (y^T D y)
becomes
z^T D^(-1/2) (D − W) D^(-1/2) z / (z^T z)
Graphs and matrices
L = D^(-1/2) (D − W) D^(-1/2) is called the Normalized Graph Laplacian
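Expanding the definition (standard facts about this matrix, stated here for reference):

```latex
L \;=\; D^{-1/2}(D - W)\,D^{-1/2} \;=\; I - D^{-1/2} W D^{-1/2}
```

L is symmetric and positive semidefinite, so its eigenvalues are real and non-negative. Its smallest eigenvalue is 0, attained by z = D^{1/2}\mathbf{1}, which corresponds to putting every node in one cluster.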
Graphs and matrices
We want to minimize z^T L z subject to z^T z = 1
Trivial solution: all nodes of the graph in one cluster, nothing in the other
To avoid the trivial solution, look for the eigenvector with the second smallest eigenvalue
Find z s.t. L z = λ z, with λ the second smallest eigenvalue
Normalized cuts
Approximate solution to normalized cuts:
Construct matrices W and D
Construct the normalized graph Laplacian L = D^(-1/2)(D − W)D^(-1/2)
Look for the eigenvector z with the second smallest eigenvalue
Compute y = D^(-1/2) z
Threshold y to get clusters
Ideally, sweep the threshold to get the lowest N-cut value
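The recipe above can be sketched with a dense NumPy eigendecomposition (fine for small graphs; the median threshold stands in for the ideal threshold sweep, and real images need sparse solvers):

```python
# Approximate N-cuts bipartition via the normalized graph Laplacian.
# W is a symmetric affinity matrix with no isolated nodes.
import numpy as np

def ncut_bipartition(W):
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = D_inv_sqrt @ (np.diag(d) - W) @ D_inv_sqrt  # normalized Laplacian
    vals, vecs = np.linalg.eigh(L)                  # eigenvalues ascending
    z = vecs[:, 1]                                  # second-smallest eigenvector
    y = D_inv_sqrt @ z                              # undo the change of variables
    return y > np.median(y)  # simple threshold; ideally sweep for lowest N-cut
```

On a graph made of two dense groups joined by a weak edge, the returned boolean mask separates the two groups.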
More than 2 clusters
Given a graph, use N-cuts to get 2 clusters
Each cluster is a graph
Re-run N-cuts on each graph
Normalized cuts
NP-hard, but approximation using eigenvectors of the normalized graph Laplacian
Smallest eigenvector: trivial solution
Second smallest eigenvector: good partition
Other eigenvectors: other partitions
An instance of “spectral clustering”:
Spectrum = set of eigenvalues
Spectral clustering = clustering using eigenvectors of (various versions of) the graph Laplacian
Images as graphs
Each pixel is a node
What is the edge weight between two nodes / pixels?
F(i): intensity / color of pixel i
X(i): position of pixel i
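The weight formula itself is missing from this transcript; a sketch using Gaussian affinities over F(i) and X(i) in the style of Shi and Malik, where sigma_f, sigma_x, and the radius r are illustrative choices:

```python
# w(i,j) = exp(-||F(i)-F(j)||^2 / sigma_f^2) * exp(-||X(i)-X(j)||^2 / sigma_x^2)
# for pixels within radius r of each other, and 0 otherwise.
# Dense O(n^2) construction: only suitable for tiny images.
import numpy as np

def pixel_affinity(image, sigma_f=0.2, sigma_x=4.0, r=5.0):
    h, w = image.shape[:2]
    n = h * w
    F = image.reshape(n, -1).astype(float)   # F(i): intensity / color
    ys, xs = np.mgrid[0:h, 0:w]
    X = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)  # X(i)
    df = ((F[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
    dx = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-df / sigma_f**2) * np.exp(-dx / sigma_x**2)
    W[dx > r**2] = 0.0        # no edge between far-away pixels
    np.fill_diagonal(W, 0.0)  # no self-edges
    return W
```

The radius cutoff is what makes W sparse in practice: each pixel connects only to its spatial neighbors.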
Computational complexity
A 100 x 100 image has 10K pixels
A graph with 10K pixels has a 10K x 10K affinity matrix
Eigenvalue computation of an N x N matrix is O(N^3)
Very, very expensive!
In practice W is sparse (only nearby pixels are connected), so sparse eigensolvers make this tractable
Eigenvectors of images
The eigenvector has as many components as pixels in the image
Another example
2nd, 3rd, and 4th eigenvectors (shown as images)
Recursive N-cuts
2nd eigenvector: first partition
2nd eigenvector of 1st subgraph: recursive partition
N-Cuts resources
http://scikit-learn.org/stable/modules/clustering.html#spectral-clustering
https://people.eecs.berkeley.edu/~malik/papers/SM-ncut.pdf
Images as graphs
Enhancement: add edges between far-away pixels, with weight = 1 − magnitude of the intervening contour