High Density Clusters, June 2017. Idea Shift: Density-Based Clustering vs. Center-Based.

Uploaded by lindy-dunigan on 2019-10-31

Presentation Transcript

High Density Clusters, June 2017

Idea Shift: Density-Based Clustering vs. Center-Based Clustering.

Main Objective: find a clustering of tight-knit groups in G.

Outline
- Clustering Algorithm: Recursive Algorithm based on Sparse Cuts
- Finding "Dense Submatrices"
- Community Finding: Network Flow

Part 1: Recursive Clustering

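Recursive Clustering - Sparse Cuts: For two disjoint sets of nodes S, T, we define $\Phi(S,T) = \frac{\text{number of edges from } S \text{ to } T}{\text{total number of edges incident to } S \text{ in } G}$.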

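Recursive Clustering - Sparse Cuts: For a set S, we define $d(S) = \sum_{i \in S} d(i)$, where $d(i)$ is the degree of vertex $i$. [Figure: an example set S, annotated with the value 3.]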
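The two quantities on these slides can be computed directly from an adjacency list. The sketch below is only illustrative: it assumes Φ(S, T) is the number of edges from S to T divided by the number of edges with at least one endpoint in S, and d(S) is the sum of the degrees of the vertices in S. The example graph (two triangles joined by one edge) and the function names `phi` and `degree_sum` are made up for the illustration and do not come from the slides.

```python
# Hypothetical example graph: triangles {1,2,3} and {4,5,6} joined by the edge (3,4).
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}

def degree_sum(adj, S):
    """d(S): sum of the degrees (in the whole graph) of the vertices in S."""
    return sum(len(adj[v]) for v in S)

def phi(adj, S, T):
    """Phi(S, T): edges from S to T over edges incident to S (assumed definition)."""
    S, T = set(S), set(T)
    cut = sum(1 for u in S for v in adj[u] if v in T)
    inside = sum(1 for u in S for v in adj[u] if v in S) // 2  # each inner edge is seen twice
    leaving = sum(1 for u in S for v in adj[u] if v not in S)
    incident = inside + leaving
    return cut / incident if incident else 0.0

print(degree_sum(adj, {1, 2, 3}))      # 2 + 2 + 3 = 7
print(phi(adj, {1, 2, 3}, {4, 5, 6}))  # 1 cut edge / 4 incident edges = 0.25
```

A small Φ means that S is only weakly attached to T, which is what makes (S, T) a sparse cut.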

Recursive Clustering - Sparse Cuts: [Figure: example graph on vertices 1-12 with a subset S marked.]

[Figure: example graph on vertices 1-9 showing two cases, one with |S| = 3, |W| = 9 and one with |S| = 3, |W| = 6.]

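Recursive Clustering - Sparse Cuts
Clusters(G): list of current clusters.
Initialization: Clusters(G) = {V} (one cluster: the whole graph). Let $\varepsilon > 0$.
Rec_Clustering(G, $\varepsilon$, Clusters(G)):
    for each cluster W in Clusters(G) do
        if there is $S \subset W$ with $d(S) \le \frac{1}{2} d(W)$ and $\Phi(S, W \setminus S) \le \varepsilon$ do
            replace W in Clusters(G) by S and $W \setminus S$
            Rec_Clustering(G, $\varepsilon$, Clusters(G))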
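To make the control flow of the recursive procedure concrete, here is a brute-force sketch for very small graphs. It assumes the split rule is: split a cluster W into S and W \ S whenever d(S) ≤ d(W)/2 and Φ(S, W \ S) ≤ ε, with Φ and d defined as in the previous sketch. Searching for S by enumerating all subsets is my own simplification and is exponential in |W|; the slides do not say how S is found. All names and the example graph are hypothetical.

```python
from itertools import combinations

# Hypothetical example graph: triangles {1,2,3} and {4,5,6} joined by the edge (3,4).
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}

def degree_sum(adj, S):
    return sum(len(adj[v]) for v in S)

def phi(adj, S, T):
    S, T = set(S), set(T)
    cut = sum(1 for u in S for v in adj[u] if v in T)
    inside = sum(1 for u in S for v in adj[u] if v in S) // 2
    leaving = sum(1 for u in S for v in adj[u] if v not in S)
    incident = inside + leaving
    return cut / incident if incident else 0.0

def find_sparse_split(adj, W, eps):
    """Return (S, W - S) for the first subset S passing both tests, else None."""
    members = sorted(W)
    d_W = degree_sum(adj, members)
    for size in range(1, len(members)):
        for S in combinations(members, size):
            rest = set(members) - set(S)
            if 2 * degree_sum(adj, S) <= d_W and phi(adj, S, rest) <= eps:
                return set(S), rest
    return None

def rec_clustering(adj, eps):
    """Keep splitting clusters until no cluster admits a sparse split."""
    clusters = [set(adj)]          # initialization: one cluster, the whole graph
    changed = True
    while changed:
        changed = False
        for i, W in enumerate(clusters):
            split = find_sparse_split(adj, W, eps)
            if split:
                clusters[i:i + 1] = list(split)
                changed = True
                break
    return clusters

print(rec_clustering(adj, 0.3))  # the two triangles are separated
print(rec_clustering(adj, 0.1))  # no cut is sparse enough, one cluster remains
```

The loop is written iteratively rather than recursively, but it performs the same sequence of splits as the recursive formulation.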

Recursive Clustering - Sparse Cuts: Theorem 7.9. At the termination of Recursive Clustering, the total number of edges between vertices in different clusters is at most $O(\varepsilon m \ln n)$, where $m$ is the total number of edges and $n$ the number of vertices.

Part 2: Dense Submatrices

Dense Submatrices - A Different Approach: Let n data points in d-space be represented as an $n \times d$ matrix A (we will assume that A is nonnegative).

Example: the document-term matrix. Let D1 be the statement "I really really like Clustering" and let D2 be the statement "I love Clustering".

      Clustering  really  love  like  I
D1    1           2       0     1     1
D2    1           0       1     0     1
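The document-term matrix in this example can be rebuilt with a few lines of code. This is a minimal sketch: the column order is taken from the slide, while the whitespace tokenization and the name `document_term_matrix` are assumptions made for the illustration.

```python
from collections import Counter

docs = {
    "D1": "I really really like Clustering",
    "D2": "I love Clustering",
}
terms = ["Clustering", "really", "love", "like", "I"]  # column order used on the slide

def document_term_matrix(docs, terms):
    """One row per document; entry j counts how often terms[j] occurs in it."""
    rows = {}
    for name, text in docs.items():
        counts = Counter(text.split())  # naive whitespace tokenization (assumption)
        rows[name] = [counts[t] for t in terms]
    return rows

A = document_term_matrix(docs, terms)
print(A["D1"])  # [1, 2, 0, 1, 1]
print(A["D2"])  # [1, 0, 1, 0, 1]
```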

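[Figure: the document-term matrix drawn as a bipartite graph, with document vertices 1 and 2 on one side and the term vertices Clustering, Really, Love, Like, I on the other; a column subset T is marked.]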

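Dense Submatrices: Say we look at A as a bipartite graph, where one side represents Rows(A) and the other Col(A), and where the edge (i, j) is given weight $a_{ij}$. We want $S \subseteq \mathrm{Rows}(A)$ and $T \subseteq \mathrm{Col}(A)$ such that $A(S,T) = \sum_{i \in S,\, j \in T} a_{ij}$ is large.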

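Dense Submatrices
First Try: maximize $\frac{A(S,T)}{|S|\,|T|}$ (the average size of an entry in the submatrix).
Second Try: $D(S,T) = \frac{A(S,T)}{\sqrt{|S|\,|T|}}$, and let $D(A) = \max_{S,T} D(S,T)$ be the density of A.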
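The difference between the two attempts is easy to see numerically. The sketch below assumes the first try is the average entry A(S,T)/(|S||T|) and the second is D(S,T) = A(S,T)/sqrt(|S||T|), with D(A) the maximum of D(S,T) over all nonempty S and T. It uses the document-term matrix from the earlier example; the brute-force maximization is only feasible for tiny matrices and is my own addition.

```python
from itertools import combinations
from math import sqrt

A = [[1, 2, 0, 1, 1],   # D1
     [1, 0, 1, 0, 1]]   # D2

def weight(A, S, T):
    """A(S, T): sum of the entries in rows S and columns T."""
    return sum(A[i][j] for i in S for j in T)

def average_entry(A, S, T):
    """First try: average entry of the submatrix."""
    return weight(A, S, T) / (len(S) * len(T))

def density(A, S, T):
    """Second try: D(S, T) = A(S, T) / sqrt(|S| |T|)."""
    return weight(A, S, T) / sqrt(len(S) * len(T))

def nonempty_subsets(k):
    return [c for r in range(1, k + 1) for c in combinations(range(k), r)]

def matrix_density(A):
    """D(A): maximum of D(S, T) over all nonempty S, T (exponential brute force)."""
    return max(density(A, S, T)
               for S in nonempty_subsets(len(A))
               for T in nonempty_subsets(len(A[0])))

all_rows, all_cols = range(len(A)), range(len(A[0]))
print(average_entry(A, [0], [1]), average_entry(A, all_rows, all_cols))  # 2.0 0.8
print(density(A, [0], [1]), density(A, all_rows, all_cols))
print(matrix_density(A))
```

The average entry is maximized by the single largest entry, so it never rewards a large coherent block. Under the square-root normalization, the whole matrix (density 8/sqrt(10)) beats the single entry (density 2) in this example.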

Dense Submatrices

      Clustering  Love  Like  I
D1    1           0     1     1
D2    1           1     0     1

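Dense Submatrices: Theorem 7.10. Let A be an $n \times d$ matrix with entries in $[0,1]$. Then $\sigma_1(A) \ge D(A) \ge \frac{\sigma_1(A)}{4 \log n \log d}$. Furthermore, we can find S, T such that $D(S,T) \ge \frac{\sigma_1(A)}{4 \log n \log d}$ using the top singular vector.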
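As a rough illustration of how the top singular vectors can point to a dense submatrix, the sketch below computes them by power iteration and then tries prefix sets of rows and columns ordered by their singular-vector entries. The prefix sweep is a simple heuristic of my own, not the construction used to prove the theorem, and the density formula D(S,T) = A(S,T)/sqrt(|S||T|) is assumed. The matrix is the 0/1 document-term example; all function names are hypothetical.

```python
from math import sqrt

A = [[1, 0, 1, 1],   # D1: Clustering, Love, Like, I
     [1, 1, 0, 1]]   # D2

def top_singular(A, iters=100):
    """Power iteration on A^T A; returns (sigma_1, left vector u, right vector v)."""
    n, d = len(A), len(A[0])
    v = [1.0] * d
    for _ in range(iters):
        u = [sum(A[i][j] * v[j] for j in range(d)) for i in range(n)]   # u = A v
        w = [sum(A[i][j] * u[i] for i in range(n)) for j in range(d)]   # w = A^T u
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(d)) for i in range(n)]
    sigma = sqrt(sum(x * x for x in u))
    return sigma, [x / sigma for x in u], v

def density(A, S, T):
    return sum(A[i][j] for i in S for j in T) / sqrt(len(S) * len(T))

def sweep(A):
    """Try every prefix of rows/columns sorted by singular-vector entry (heuristic)."""
    sigma, u, v = top_singular(A)
    rows = sorted(range(len(A)), key=lambda i: -u[i])
    cols = sorted(range(len(A[0])), key=lambda j: -v[j])
    candidates = [(rows[:a], cols[:b])
                  for a in range(1, len(rows) + 1)
                  for b in range(1, len(cols) + 1)]
    S, T = max(candidates, key=lambda st: density(A, st[0], st[1]))
    return sigma, S, T, density(A, S, T)

sigma, S, T, dens = sweep(A)
print(sigma, sorted(S), sorted(T), dens)
```

For this matrix the largest singular value is sqrt(5) and the sweep returns the whole matrix, whose density 6/sqrt(8) is below sqrt(5), consistent with the general fact that D(S,T) can never exceed the largest singular value.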

Part 3: Community Finding

Dense Submatrices - Special Case: Similarity of the Set. For S a subset of V, what does D(S,S) represent?

     1  2  3  4  5
1    0  0  1  1  0
2    0  0  1  0  1
3    1  1  0  1  1
4    1  0  1  0  0
5    0  1  1  0  0

Let S = {3,4,5}.

Dense Submatrices: [Figure: the same 5 x 5 adjacency matrix next to the graph on vertices 1-5, with the set S drawn in green.]

Community Finding - Similarity of the Set. Goal: find the subgraph with maximum average degree in graph G.

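Community Finding: Let G = (V, E) be a weighted graph. We define $D(S,T) = \frac{\sum_{i \in S,\, j \in T} a_{ij}}{\sqrt{|S|\,|T|}}$, where S, T are two sets of nodes and $a_{ij}$ is the weight of edge (i, j). The density of S will be $D(S,S) = \frac{\sum_{i,j \in S} a_{ij}}{|S|}$. What are we looking for in terms of density?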

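Flow technique - Sub-Problem: Let $\lambda > 0$. Find a subgraph with density at least $\lambda$ (or claim it does not exist!). [Figure: example graph with the set S drawn in green.]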

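Flow technique: [Figure: flow network H with a source s, a column of edge nodes (u,v), (v,w), (w,x), a column of vertex nodes u, v, w, x, and a sink t; each arc from s to an edge node has capacity 1.] What kinds of cuts exist in H?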

Type 1 Cut: [Figure: the network H with a cut across the arcs leaving s.] $C(S,T) = |E|$

Type 2 Cut: [Figure: the network H with a cut across the arcs entering t.] $C(S,T) = \lambda |V|$, where $\lambda$ is the capacity of each arc from a vertex node to t.

Type 3 Cut: [Figure: the network H with a cut that leaves some edge nodes and vertex nodes on the side of s.] $C(S,T) =$ [formula lost in extraction]

Flow technique - Theorem: [statement lost in extraction]

Flow technique - Algorithm:
- Start with an initial value of $\lambda$.
- Build the network and run MaxFlow.
- If we get a Type 3 cut: look for a bigger $\lambda$.
- Else: look for a smaller $\lambda$.
Complexity: [expression lost in extraction]
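The following sketch puts the pieces of the flow technique together for an unweighted graph. Several details are assumptions, since the transcript lost them: arcs from s to edge nodes have capacity 1, arcs from an edge node to its two endpoint nodes have infinite capacity, arcs from vertex nodes to t have capacity λ, and the search over λ is a plain bisection that stops once the interval is shorter than 1/n² (two different ratios |E_S|/|S| differ by at least that much). Density here means |E_S|/|S|, which is half the average degree, so maximizing one maximizes the other. Any minimum cut of value below |E| is treated as the successful case. The max-flow routine is a textbook Edmonds-Karp, and all names are hypothetical.

```python
from collections import deque
from fractions import Fraction

INF = float("inf")

def build_network(edges, vertices, lam):
    """s -> edge node (cap 1), edge node -> endpoints (cap inf), vertex -> t (cap lam)."""
    cap = {"s": {}, "t": {}}
    for e in edges:
        cap["s"][("edge", e)] = 1
        cap[("edge", e)] = {("vertex", e[0]): INF, ("vertex", e[1]): INF}
    for v in vertices:
        cap[("vertex", v)] = {"t": lam}
    return cap

def max_flow_min_cut(cap, s, t):
    """Edmonds-Karp; returns (flow value, nodes on the source side of a minimum cut)."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}
    for u, nbrs in cap.items():
        for v in nbrs:
            res[v].setdefault(u, 0)          # reverse arcs start with capacity 0
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:     # BFS in the residual network
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow, set(parent)         # nodes still reachable from s
        bottleneck, v = INF, t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck

def densest_subgraph(edges, vertices):
    """Bisection on lam; a min cut below |E| means some subgraph has |E_S|/|S| > lam."""
    n, m = len(vertices), len(edges)
    lo, hi = Fraction(0), Fraction(m)
    best = set(vertices)
    while hi - lo >= Fraction(1, n * n):
        lam = (lo + hi) / 2
        value, side = max_flow_min_cut(build_network(edges, vertices, lam), "s", "t")
        if value < m:                        # nontrivial cut: look for a bigger lam
            best = {x[1] for x in side if isinstance(x, tuple) and x[0] == "vertex"}
            lo = lam
        else:                                # only the trivial cut: look for a smaller lam
            hi = lam
    return best

# Hypothetical example: a 4-clique on {1,2,3,4} with a path 4-5-6 attached.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
print(densest_subgraph(edges, [1, 2, 3, 4, 5, 6]))  # the 4-clique
```

Exact `Fraction` arithmetic is used for λ so that the comparison of the cut value with |E| is not affected by floating-point rounding.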

Flow technique - Questions: When do we stop? [Remaining slide text is fragmentary: "> 0 (different stages of algorithm) and whole,"]