PPT-Clustering Mining of Massive Datasets
Author : aaron | Published Date : 2018-10-31
Jure Leskovec, Anand Rajaraman, Jeff Ullman. Stanford University. http://www.mmds.org
The PPT/PDF document "Clustering Mining of Massive Datasets" is the property of its rightful owner. Permission is granted to download and print the materials on this website for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
Clustering Mining of Massive Datasets: Transcript
Jure Leskovec, Anand Rajaraman, Jeff Ullman. Stanford University. http://www.mmds.org. Note to other teachers and users of these slides: We would be delighted if you found our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs.
Itemset Mining & Association Rules. Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman. Stanford University. http://www.mmds.org.
Outline. Validating clustering results. Randomization tests. Cluster Validity. All clustering algorithms provided with a set of points output a clustering. How to evaluate the “goodness” of the resulting clusters?
MapReduce. Shannon Quinn. Today: Naïve Bayes with huge feature sets, i.e. ones that don’t fit in memory. Pros and cons of possible approaches. Traditional “DB” (actually, key-value store).
MMDS Secs. 3.2-3.4. Slides adapted from: J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org. October 2014. Task: Finding Similar Documents. Goal:
SVD & CUR. Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman. Stanford University. http://www.mmds.org.
issue in computing a representative simplicial complex. Mapper does not place any conditions on the clustering algorithm. Thus any domain-specific clustering algorithm can be used.
2). Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman. Stanford University. http://www.mmds.org.
Overlapping Communities. Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman. Stanford University. http://www.mmds.org.
Lecture outline. Distance/Similarity between data objects. Data objects as geometric data points. Clustering problems and algorithms: K-means, K-median, K-center. What is clustering? A grouping of data objects such that the objects
Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman. Stanford University. http://www.mmds.org.
Decision Trees on MapReduce. CS246: Mining Massive Datasets. Jure Leskovec, Stanford University. http://cs246.stanford.edu. Decision Tree Learning: Given one attribute (e.g., lifespan), try to predict the value of new people’s lifespans by means of some of the other available attributes.
Ranking Nodes on the Graph. Web pages are not equally “important”: www.joe-schmoe.com vs. www.stanford.edu. Since there is large diversity in the connectivity of the web graph we can
Collections through Contextual Focal Points. Kai Xu, Rui Ma, Hao Zhang, Chenyang Zhu, Ariel Shamir, Daniel Cohen-Or, Hui Huang. Shenzhen VisuCA Key Lab /
Log2 transformation. Row centering and normalization. Filtering. Log2 Transformation: the log2-transformation makes sure that the noise is independent of the mean and similar differences have the same meaning along the dynamic range of the values.