PPT-Clustering
Author : kittie-lecroy | Published Date : 2016-06-05
Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman, Stanford University. http://www.mmds.org. Note to other teachers and users of these slides: We
The PPT/PDF document "Clustering" is the property of its rightful owner. Permission is granted to download and print the materials on this website for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
Clustering: Transcript
Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman, Stanford University. http://www.mmds.org. Note to other teachers and users of these slides: We would be delighted if you found our material useful in giving your own lectures. Feel free to use these slides verbatim or to modify them to fit your own needs.

Adapted from Chapter 3 of Lei Tang and Huan Liu's book. Slides prepared by Qiang Yang, UST, Hong Kong. Chapter 3, Community Detection and Mining in Social Media. Lei Tang and Huan Liu, Morgan & Claypool, September, 2010.

Hierarchical Clustering. Produces a set of nested clusters organized as a hierarchical tree. Can be visualized as a dendrogram: a tree-like diagram that records the sequences of merges or splits.

and Physical Interaction Datasets. Manikandan Narayanan, Adrian Vetta, Eric E. Schadt, Jun Zhu. PLoS Computational Biology 2010. Presented by: Tal Saiag. Seminar in Algorithmic Challenges in Analyzing Big Data* in Biology and

Brendan and Yifang. April 21, 2015. Pre-knowledge. We define a set A, and we find the element that minimizes the error. We can think of as a sample of . Where is the point in C closest to X.

extratropical cyclones: their influence on extreme precipitation events in the UK. Suzanne Gray, Ruari Rhodes, Len Shaffrey. Jointly sponsored by University of Reading and Lloyds Banking Group.

Supervised & Unsupervised Learning. Supervised learning: Classification. The number of classes and class labels of data elements in training data is known beforehand. Unsupervised learning: Clustering.

Frank Lin. 10-710 Structured Prediction. School of Computer Science, Carnegie Mellon University. 2011-11-28. Talk Outline: Clustering. Spectral Clustering. Power Iteration Clustering (PIC). PIC with Path Folding.

issue in computing a representative simplicial complex. Mapper does not place any conditions on the clustering algorithm. Thus any domain-specific clustering algorithm can be used. We

What is clustering? Why would we want to cluster? How would you determine clusters? How can you do this efficiently? K-means Clustering. Strengths: Simple iterative method. User provides “K”.

Lecture outline. Distance/Similarity between data objects. Data objects as geometric data points. Clustering problems and algorithms: K-means, K-median, K-center. What is clustering? A grouping of data objects such that the objects

Mark Stamp. K-Means for Malware Classification. Clustering Applications. Chinmayee Annachhatre, Mark Stamp. Quest for the Holy Grail. Holy Grail of malware research is to detect previously unseen malware.

Produces a set of nested clusters organized as a hierarchical tree. Can be visualized as a dendrogram: a tree-like diagram that records the sequences of merges or splits. Strengths of Hierarchical Clustering.

Log2 transformation. Row centering and normalization. Filtering. Log2 Transformation. Log2-transformation makes sure that the noise is independent of the mean and similar differences have the same meaning along the dynamic range of the values. Randomization tests.

Cluster Validity. All clustering algorithms provided with a set of points output a clustering. How to evaluate the “goodness” of the resulting clusters? Tricky because
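
The transcript above describes K-means only as a "simple iterative method" in which the user provides "K". As an illustration, here is a minimal Python sketch of the standard Lloyd-style iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is not taken from any of the slide decks; the function name `kmeans`, the parameters `iters` and `seed`, the random-sample initialisation, and the sample data are all assumptions made for this example.

```python
import random
from math import dist  # Euclidean distance between two points (Python 3.8+)


def kmeans(points, k, iters=100, seed=0):
    """Lloyd-style K-means sketch. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initialise centroids as k points sampled from the data.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        if new_labels == labels:  # assignments unchanged: converged
            break
        labels = new_labels
    return centroids, labels


# Hypothetical sample data: two well-separated groups of three points each.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
          (100.0, 0.0), (101.0, 0.0), (102.0, 0.0)]
centroids, labels = kmeans(points, k=2)
print(sorted(centroids))  # [(1.0, 0.0), (101.0, 0.0)]
```

Which group receives label 0 and which receives label 1 depends on the initial sample, so only the grouping itself is meaningful, not the label values.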
