Clustering and Dimensionality Reduction
Brendan and Yifang
April 21, 2015
Pre-knowledge
We define a set of candidate points $C = \{c_1, \dots, c_k\}$ and look for the choice that minimizes the error (risk)
$$R(C) = \mathbb{E}\,\|X - \Pi_C(X)\|^2,$$
where $\Pi_C(X)$ is the point in $C$ closest to $X$. We can think of the empirical risk
$$\hat R(C) = \frac{1}{n}\sum_{i=1}^{n} \|X_i - \Pi_C(X_i)\|^2$$
as a sample version of $R(C)$.
Clustering methods
K-means clustering
Hierarchical clustering
Agglomerative clustering
Divisive clustering
Level set clustering
Modal clustering
K-partition clustering
In a k-partition problem, our goal is to find $k$ points
$$C = \{c_1, \dots, c_k\}.$$
We define the risk
$$R(C) = \mathbb{E}\,\min_{1 \le j \le k} \|X - c_j\|^2.$$
So $C$ are the cluster centers. We partition the space into $k$ sets, where
$$\Pi_j = \{x : \|x - c_j\| \le \|x - c_s\| \text{ for all } s \ne j\}.$$
K-partition clustering, cont’d
Given the data set $X_1, \dots, X_n$, our goal is to find
$$\hat C = \operatorname*{argmin}_{C} \hat R_n(C),$$
where
$$\hat R_n(C) = \frac{1}{n}\sum_{i=1}^{n} \min_{1 \le j \le k} \|X_i - c_j\|^2.$$
K-means clustering
1. Choose $k$ centers $c_1, \dots, c_k$ at random from the data.
2. Form the clusters $C_1, \dots, C_k$, where $C_j = \{i : c_j \text{ is the closest center to } X_i\}$.
3. Let $n_j$ denote the number of points in partition $j$, and recompute each center as the mean of its partition: $c_j \leftarrow \frac{1}{n_j}\sum_{i \in C_j} X_i$.
4. Repeat Steps 2 and 3 until convergence.
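As a concrete illustration, here is a minimal NumPy sketch of these steps (the function name, the random seed, and the stop-when-the-centers-stop-moving convergence test are our own choices, not from the book):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal k-means sketch: X is an (n, d) array, k the number of clusters."""
    rng = np.random.default_rng(seed)
    # Step 1: choose k centers at random from the data.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 2: assign each point to its closest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each center as the mean of its partition.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        # Step 4: stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```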
Circular Data vs. Spherical Data
Question: Why is K-means clustering good for spherical data? (Grace)
Question: What is the relationship between K-means and Naïve Bayes?
Answer: They have the following in common:
1. Both of them estimate a probability density function.
2. Naïve Bayes assigns the closest category/label to the target point; K-means assigns the closest centroid to the target point.
They are different in these aspects:
1. Naïve Bayes is a supervised algorithm; K-means is an unsupervised method.
2. K-means is an optimization task solved by an iterative process; Naïve Bayes is not.
3. K-means is like multiple runs of Naïve Bayes in which the labels are adaptively adjusted on each run.
Question: Why does K-means not work well for Figure 35.6? Why does spectral clustering help with it? (Grace)
Answer: Spectral clustering maps data points in $\mathbb{R}^d$ to data points in $\mathbb{R}^k$. Circle-shaped groups of points in $\mathbb{R}^d$ become spherically shaped in $\mathbb{R}^k$. But spectral clustering involves a matrix decomposition and is rather time-consuming.
Agglomerative clustering
Requires a pairwise distance among clusters.
There are three commonly employed distances:
Single linkage
Complete linkage (max linkage)
Average linkage
The algorithm:
1. Start with each point in a separate cluster.
2. Merge the two closest clusters.
3. Go back to Step 2 until a single cluster remains.
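A brief usage sketch with SciPy's hierarchical clustering routines rather than a hand-written merge loop (the toy data and the choice of cutting the dendrogram at 3 clusters are arbitrary, for illustration only):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                  # toy data

# 'single', 'complete', and 'average' correspond to the three linkages above.
Z = linkage(X, method='single')

# Cut the dendrogram so that at most 3 clusters remain.
labels = fcluster(Z, t=3, criterion='maxclust')
```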
An Example, Single Linkage
Question: Is Figure 35.6 meant to illustrate when one type of linkage is better than another? (Brad)
An Example, Complete Linkage
Divisive clustering
Starts with one large cluster and then recursively divides the larger clusters into smaller clusters, using any feasible clustering algorithm.
A divisive algorithm example:
1. Build a minimum spanning tree.
2. Create a new clustering by removing the link corresponding to the largest distance.
3. Go back to Step 2.
Level set clustering
For a fixed non-negative number $\lambda$, define the level set
$$L(\lambda) = \{x : p(x) > \lambda\}.$$
We decompose $L(\lambda)$ into a collection of bounded, connected, disjoint sets:
$$L(\lambda) = C_1 \cup \dots \cup C_k.$$
Level Set Clustering, Cont’d
Estimate the density function $p$: use a kernel density estimator (KDE), $\hat p$.
Decide $\lambda$: fix a small number $\lambda > 0$.
Decide the clusters: take the connected groups of sample points in the estimated level set $\{X_i : \hat p(X_i) > \lambda\}$ (the Cuevas-Fraiman algorithm below does this).
Cuevas-Fraiman Algorithm
Set $j = 0$ and $I = \{1, \dots, n\}$.
1. Choose a point from $\{X_i : i \in I\}$ and call this point $X_1$. Find the nearest point to $X_1$ and call this point $X_2$. Let $r_1 = \|X_1 - X_2\|$.
2. If $r_1 > 2\varepsilon$: set $j \leftarrow j + 1$, remove $i$ from $I$, and go to Step 1.
3. If $r_1 \le 2\varepsilon$: let $X_3$ be the point closest to the set $\{X_1, X_2\}$ and let $r_2 = \min\{\|X_3 - X_1\|, \|X_3 - X_2\|\}$.
4. If $r_2 > 2\varepsilon$: set $j \leftarrow j + 1$ and go back to Step 1. Otherwise, set $r_k = \min\{\|X_{k+1} - X_i\| : i = 1, \dots, k\}$ and continue adding the closest remaining point until $r_k > 2\varepsilon$. Then set $j \leftarrow j + 1$ and go back to Step 1.
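A minimal sketch of the idea, assuming Euclidean distance and that X already contains only the points whose estimated density exceeds $\lambda$; the function name and the straightforward $O(n^2)$ implementation are our own choices:

```python
import numpy as np

def cuevas_fraiman(X, eps):
    """Group the (already thresholded) level-set points X into connected clusters.

    A point joins the current cluster while its distance to the cluster is <= 2*eps;
    otherwise the cluster is closed and a new one is started.
    """
    remaining = list(range(len(X)))
    labels = np.full(len(X), -1)
    j = 0
    while remaining:
        cluster = [remaining.pop(0)]          # start a new cluster from any remaining point
        while remaining:
            # distance from every remaining point to the current cluster
            d = np.array([min(np.linalg.norm(X[i] - X[c]) for c in cluster)
                          for i in remaining])
            nearest = d.argmin()
            if d[nearest] > 2 * eps:          # nearest point is too far: close the cluster
                break
            cluster.append(remaining.pop(nearest))   # otherwise absorb it and keep growing
        labels[cluster] = j
        j += 1
    return labels
```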
An Example
Question: Can you give an example to illustrate level set clustering? (Tavish)
(Figure: twelve numbered points, 1-12, used to step through the level set clustering procedure.)
Modal Clustering
Let $m_1, m_2, \dots$ denote the modes of the density $p$. A point $x$ belongs to cluster $T_j$ if and only if the steepest ascent path beginning at $x$ leads to $m_j$. Finally, the data are clustered to their closest modes.
However, $p$ may not have a finite number of modes, so a refinement is introduced: $p_h$, a smoothed-out version of $p$ obtained using a Gaussian kernel, is used in place of $p$.
Mean shift algorithm
1. Choose a number of points $x_1, \dots, x_N$ (for example, the data points themselves). Set $t = 0$.
2. Let $t = t + 1$. For $j = 1, \dots, N$ set
$$x_j^{(t)} = \frac{\sum_{i=1}^{n} X_i\, K\!\left((x_j^{(t-1)} - X_i)/h\right)}{\sum_{i=1}^{n} K\!\left((x_j^{(t-1)} - X_i)/h\right)},$$
i.e., move each point to the kernel-weighted average of the data around it.
3. Repeat until convergence. Each $x_j$ ends up at a mode, and points sharing a mode form a cluster.
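A minimal sketch with a Gaussian kernel, starting the trajectories from the data points themselves; the bandwidth, tolerance, and iteration cap are our own illustrative choices:

```python
import numpy as np

def mean_shift(X, h, n_iter=200, tol=1e-6):
    """Move every point to the kernel-weighted mean of the data around it until it stops
    moving; points that end up at (numerically) the same location share a mode."""
    points = X.astype(float).copy()
    for _ in range(n_iter):
        diffs = points[:, None, :] - X[None, :, :]                    # (N, n, d)
        w = np.exp(-0.5 * (np.linalg.norm(diffs, axis=2) / h) ** 2)   # Gaussian weights
        new_points = (w[:, :, None] * X[None, :, :]).sum(axis=1) / w.sum(axis=1, keepdims=True)
        if np.max(np.abs(new_points - points)) < tol:
            break
        points = new_points
    return points   # cluster by grouping points that converged to the same mode
```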
Question: Can you point out the differences and similarities between the different clustering algorithms? (Tavish)
Can you compare the pros and cons of the clustering algorithms, and what situations suit each of them? (Sicong)
What is the relationship between the clustering algorithms? What assumptions do they make? (Yuankai)
Answer: K-means
Pros:
1. Simple and very intuitive; applicable to almost any scenario and any dataset.
2. Fast algorithm.
Cons:
1. Does not work for density-varying data.
Contour matters
K-means Cons:
2. Does not work well when data groups have special contours.
K-means Cons:
3. Does not work well on outliers.
4. Requires K to be specified in advance.
Hierarchical clustering
Pros:
1. Its result contains clusterings at every level of granularity; any number of clusters can be obtained by cutting the dendrogram at the corresponding level.
2. It provides a dendrogram, which can be visualized as a hierarchical tree.
3. Does not require a specified K.
Cons:
1. Slower than K-means.
2. Hard to decide where to cut off the dendrogram.
Level set clustering
Pros:
1. Works well when data groups have special contours, e.g., circles.
2. Handles outliers well, because we estimate a density function.
3. Handles varying density well.
Cons:
1. Even slower than hierarchical clustering: KDE is $O(n^2)$, and the Cuevas-Fraiman algorithm is also $O(n^2)$.
Question: Does K-means clustering guarantee convergence? (Jiyun)
Answer: Yes. Its time complexity upper bound is $O(n^4)$.
Question: In the Cuevas-Fraiman algorithm, does the choice of the start vertex matter? (Jiyun)
Answer: The choice of start vertex does not matter.
Question: Does the choice of $x_j$ in the mean shift algorithm matter?
Answer: No. The $x_j$ converge to the modes during the iterative process; the initial values do not matter.
Dimension Reduction
Motivation
Recall: our overall goal is to find low-dimensional structure.
We want to find a representation expressible with $k$ dimensions that minimizes some quantification of the associated projection error, i.e., a risk of the form $\mathbb{E}\,\|X - \pi(X)\|^2$.
In clustering, we found sets of points such that similar points are grouped together.
In dimension reduction, we focus more on the space (i.e., can we transform the space such that a projection preserves the data's properties while shedding excess dimensionality).
Question – Dimension Reduction Benefits
Dimensionality reduction aims at reducing the number of random variables in the data before processing. However, this seems counterintuitive, as it can remove distinct features from the data set and lead to poor results in succeeding steps. So, how does it help? - Tavish
The implicit assumption is that our data contain more features than are useful or necessary (i.e., features that are highly correlated or purely noise).
This is common in big data, and common when data is naively recorded.
Reducing the number of dimensions produces a more compact representation and helps with the curse of dimensionality.
Some methods (i.e., manifold-based ones) avoid loss.
Principal Component Analysis (PCA)
Simple dimension reduction technique
Intuition: project the data onto a $k$-dimensional linear subspace.
Question – Linear Subspace
In Principal Component Analysis, the data are projected onto linear subspaces. Could you explain a bit about what a linear subspace is? - Yuankai
A linear subspace is a vector space that is a subset of a higher-dimensional vector space.
Example – Linear Subspace
Example – Subspace Projection
PCA Objective
Let $\ell$ denote a linear subspace and $\mathcal{L}_k$ denote the set of all $k$-dimension linear subspaces.
The $k$th principal subspace (our goal, given $k$) is
$$\ell_k = \operatorname*{argmin}_{\ell \in \mathcal{L}_k} \mathbb{E}\,\|X - \pi_\ell(X)\|^2,$$
where $\pi_\ell(X)$ is the projection of $X$ onto $\ell$.
The dimension-reduced data point $Y_i$ is the projection of $X_i$ onto $\ell_k$.
We can restate the risk as
$$R(\ell) = \mathbb{E}\,\|X - \pi_\ell(X)\|^2.$$
Recall: this has the same form as the quantization risk from the pre-knowledge slide, with the subspace playing the role of the set $C$.
PCA Algorithm
1. Compute the sample covariance matrix, $\hat\Sigma = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar X)(X_i - \bar X)^T$.
2. Compute the eigenvalues, $\lambda_1 \ge \dots \ge \lambda_d$, and eigenvectors, $e_1, \dots, e_d$, of the sample covariance matrix.
3. Choose a dimension, $k$.
4. Define the dimension-reduced data, $Y_i = \big((X_i - \bar X)^T e_1, \dots, (X_i - \bar X)^T e_k\big)$.
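A minimal NumPy sketch of the four steps; centering the data before projecting and returning the fraction of variance retained (useful when choosing $k$, see the next slides) are our own additions:

```python
import numpy as np

def pca(X, k):
    """Project the centered data onto the top-k eigenvectors of the sample covariance."""
    Xc = X - X.mean(axis=0)                        # center the data
    cov = np.cov(Xc, rowvar=False)                 # step 1: sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)         # step 2: eigen-decomposition (ascending)
    order = np.argsort(eigvals)[::-1]              # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Y = Xc @ eigvecs[:, :k]                        # steps 3-4: keep the first k components
    explained = eigvals[:k].sum() / eigvals.sum()  # fraction of variance retained
    return Y, explained
```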
Question – Choosing a Good Dimension
In Principal Component Analysis, the algorithm needs to project the data onto a lower-dimensional space. Can you elaborate on how to make this dimensionality reduction effective and efficient? More specifically, in step 3 of the PCA algorithm on page 709, how do we choose a good dimension? Or does it matter? – Sicong
Larger values of $k$ reduce the risk (because we're removing fewer dimensions), so we have a trade-off.
A good value of $k$ should be as small as possible while meeting some error threshold.
In practice, it should also reflect your computational limitations.
PCA – Choosing $k$
The book recommends fixing a threshold $\beta \in (0, 1)$ and choosing the smallest dimension whose leading eigenvalues account for at least that fraction of the total variance:
$$k = \min\Big\{ j : \frac{\lambda_1 + \dots + \lambda_j}{\lambda_1 + \dots + \lambda_d} \ge \beta \Big\}.$$
Example – PCA: d = 2, k = 1
Multidimensional Scaling
Instead of the mean projection distance, we can instead try to preserve pairwise distances.
Given a mapping of points, $Z_i = f(X_i)$, we can define a pairwise distance loss function. Let
$$L(f) = \sum_{i < j} \big( \|X_i - X_j\| - \|Z_i - Z_j\| \big)^2.$$
This gives us an alternative to the risk. We call the objective of finding the map $f$ that minimizes $L(f)$ "multidimensional scaling".
Insight: you can find the minimizing map by constructing a span out of the first $k$ principal components.
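One standard way to realize this objective is classical (metric) multidimensional scaling, sketched below from a matrix of pairwise Euclidean distances; the double-centering construction is the usual classical-MDS recipe rather than something spelled out on the slide:

```python
import numpy as np

def classical_mds(D, k):
    """Embed points into k dimensions given an (n, n) matrix D of pairwise distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]             # largest eigenvalues first
    eigvals, eigvecs = eigvals[order][:k], eigvecs[:, order][:, :k]
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))   # (n, k) embedding
```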
Example – Multidimensional Scaling
Kernel PCA
Suppose our data representation is less naïve, i.e. we're given a "feature map", $\Phi : \mathbb{R}^d \to \mathcal{F}$.
We'd still like to be able to perform PCA in the new feature space using pairwise distances.
Recall the covariance matrix in PCA was $\hat\Sigma = \frac{1}{n}\sum_{i} X_i X_i^T$ (for centered data). In the feature space this becomes
$$\hat\Sigma_\Phi = \frac{1}{n}\sum_{i=1}^{n} \Phi(X_i)\,\Phi(X_i)^T.$$
The Kernel Trick
Using the empirical covariance $\hat\Sigma_\Phi$ to minimize the risk requires a diagonalization that is very expensive for large values of the feature-space dimension.
We can instead calculate the Gram matrix, $K_{ij} = \langle \Phi(X_i), \Phi(X_j) \rangle = \kappa(X_i, X_j)$.
This lets us substitute kernel evaluations for the actual feature vectors.
Requires a "Mercer kernel", i.e. a positive semi-definite kernel.
Kernel PCA Algorithm
1. Center the kernel with $\tilde K = K - \mathbf{1}_n K - K \mathbf{1}_n + \mathbf{1}_n K \mathbf{1}_n$, where $\mathbf{1}_n$ is the square $n \times n$ matrix of $1/n$ entries.
2. Compute $\tilde K$.
3. Diagonalize $\tilde K$, obtaining eigenvalues $\lambda_j$ and eigenvectors $\alpha_j$.
4. Normalize the eigenvectors $\alpha_j$ such that $\lambda_j \langle \alpha_j, \alpha_j \rangle = 1$.
5. Compute the projection of a test point $x$ onto an eigenvector $\alpha_j$ with
$$y_j(x) = \sum_{i=1}^{n} \alpha_j^{(i)}\, \kappa(x, X_i).$$
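A minimal sketch of the five steps; the Gaussian example kernel, its bandwidth, and the small eigenvalue floor are our own illustrative choices:

```python
import numpy as np

def kernel_pca(X, k, kernel):
    """Kernel PCA sketch: `kernel(a, b)` should be a Mercer kernel."""
    n = len(X)
    K = np.array([[kernel(a, b) for b in X] for a in X])     # Gram matrix
    one = np.ones((n, n)) / n
    K_c = K - one @ K - K @ one + one @ K @ one              # step 1: center the kernel
    eigvals, eigvecs = np.linalg.eigh(K_c)                   # steps 2-3: diagonalize
    order = np.argsort(eigvals)[::-1][:k]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    alphas = eigvecs / np.sqrt(np.maximum(eigvals, 1e-12))   # step 4: lambda * <a, a> = 1
    return K_c @ alphas                                      # step 5, applied to the training points

# Example Mercer kernel (Gaussian), with an arbitrary bandwidth:
gaussian = lambda a, b: np.exp(-np.sum((a - b) ** 2) / 2.0)
```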
Local Linear Embedding (LLE)
Same goal as PCA (i.e. reduce the dimensionality while minimizing the projection error).
Also tries to preserve the local (i.e. small-scale) structure of the data.
Assumes that the data reside on a low-dimensional manifold embedded in a high-dimensional space.
Generates a nonlinear mapping of the data to low dimensions that preserves all of the local geometric features.
Requires a parameter that determines how many nearest neighbors are used (i.e. the scale of "local").
Produces a weighted sparse graph representation of the data.
Question - Manifolds
I wanted to know what exactly "manifold" refers to. – Brad
"A manifold is a topological space that is locally Euclidean" – Wolfram MathWorld.
I.e. the Earth appears flat on a human scale, but we know it's roughly spherical. Maps are useful because they preserve all the surface features despite being a projection.
Example – Manifolds
LLE Algorithm
1. Compute the $k$ nearest neighbors for each point (expensive to brute force, but KNN has good approximation algorithms).
2. Compute the local reconstruction weights $W$ (i.e. each point is represented as a linear combination of its neighbors) by minimizing
$$\sum_{i=1}^{n} \Big\| X_i - \sum_{j} W_{ij} X_j \Big\|^2, \quad \text{subject to } \sum_{j} W_{ij} = 1 \text{ and } W_{ij} = 0 \text{ when } X_j \text{ is not a neighbor of } X_i.$$
3. Compute the outputs $Y_i$ by computing the first eigenvectors with nonzero eigenvalues (the smallest such eigenvalues) of the matrix $M = (I - W)^T (I - W)$.
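Rather than re-implementing the weight and eigenvector computations, here is a usage sketch with scikit-learn's implementation; the neighbor count and target dimension are arbitrary illustrative values:

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

X = np.random.default_rng(0).normal(size=(200, 3))    # toy "high-dimensional" data

# n_neighbors is the "scale of local" parameter; n_components is the target dimension.
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2)
Y = lle.fit_transform(X)                               # (200, 2) embedding
```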
LLE Objective
We want to minimize the error in the reconstructed points,
$$\min_{Y} \sum_{i=1}^{n} \Big\| Y_i - \sum_{j} W_{ij} Y_j \Big\|^2.$$
Step 3 of the algorithm is equivalent to this minimization.
Recall: the weights $W$ were chosen so that each $X_i$ is well reconstructed from its neighbors; the same weights are now asked to reconstruct the low-dimensional outputs $Y_i$.
Example – LLE Toy Examples
Isomap
Similar to LLE in its preservation of the original structure.
Provides a "manifold" representation of the higher-dimensional data.
Assesses object similarity differently (distance, as a metric, is computed using graph path length).
Constructs the low-dimensional mapping differently (uses metric multidimensional scaling).
Isomap Algorithm
1. Compute the KNN for each point and record a nearest-neighbor graph $G$ with vertices $X_1, \dots, X_n$.
2. Compute the graph distances $d_G(i, j)$ using Dijkstra's algorithm.
3. Embed the points into low dimensions using metric multidimensional scaling.
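A minimal sketch of the three steps, reusing the classical_mds function from the multidimensional scaling sketch above; it assumes the nearest-neighbor graph is connected, and the scikit-learn/SciPy helpers are our own choice of tooling:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap(X, n_neighbors, k):
    """KNN graph -> graph (geodesic) distances -> classical MDS."""
    G = kneighbors_graph(X, n_neighbors, mode='distance')    # step 1: sparse weighted KNN graph
    D = shortest_path(G, method='D', directed=False)         # step 2: Dijkstra on the graph
    return classical_mds(D, k)                                # step 3: metric MDS (defined earlier)
```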
Laplacian Eigenmaps
Another similar manifold method; analogous to kernel PCA(?).
Given kernel-generated weights, $W_{ij}$, the graph Laplacian is given by
$$L = D - W, \quad \text{where } D \text{ is diagonal and } D_{ii} = \sum_{j} W_{ij}.$$
Calculate the first $k$ eigenvectors of $L$ (those with the smallest nonzero eigenvalues), $v_1, \dots, v_k$.
They give an embedding, $X_i \mapsto \big(v_1(i), \dots, v_k(i)\big)$.
Intuition: each eigenvector minimizes the weighted graph norm $\sum_{i,j} W_{ij}\,\big(v(i) - v(j)\big)^2$.
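A minimal sketch using the unnormalized Laplacian with Gaussian (heat-kernel) weights; the bandwidth and the use of the plain rather than generalized eigenproblem are our own simplifications:

```python
import numpy as np

def laplacian_eigenmaps(X, k, h=1.0):
    """Embed the points X using the first k nontrivial eigenvectors of the graph Laplacian."""
    sq_d = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq_d / (2 * h ** 2))             # kernel-generated weights
    np.fill_diagonal(W, 0.0)
    D = np.diag(W.sum(axis=1))                   # degree matrix
    L = D - W                                    # graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]                   # skip the constant eigenvector (eigenvalue 0)
```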
Estimating the Manifold Dimension
Recall: these manifold methods all assume an embedded low-dimensional manifold.
We haven't addressed how to find it if it isn't given; it can be estimated from the provided data.
Consider a small ball, $B(x, \varepsilon)$, around some point $x$ on the manifold.
Assume the density is approximately constant on $B(x, \varepsilon)$.
The number of data points that fall in the ball then grows roughly like $\varepsilon^d$, where $d$ is the manifold dimension.
Let $B_j(x)$ be the smallest $d$-dimensional Euclidean ball around $x$ containing $j$ points.
Estimating the Manifold Dimension Cont.
Given a sample $X_1, \dots, X_n$, an estimate $\hat d$ of the dimension can be read off from how the number of points inside $B(x, \varepsilon)$ scales with the radius: since the count behaves like $\varepsilon^d$, the slope of log-count against log-radius estimates $d$.
Using KNN instead, the radius of $B_j(x)$, i.e. the distance from $x$ to its $j$th nearest neighbor, plays the role of $\varepsilon$.
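A rough sketch consistent with this scaling argument, estimating $d$ from the slope of log-count versus log-radius around a point; this illustrates the idea and is not necessarily the book's exact estimator:

```python
import numpy as np

def estimate_dimension(X, x, radii):
    """Estimate the intrinsic dimension near x: counts inside B(x, eps) grow roughly
    like eps**d, so the slope of log(count) against log(eps) estimates d."""
    radii = np.asarray(radii, dtype=float)
    dists = np.linalg.norm(X - x, axis=1)
    counts = np.array([np.sum(dists <= r) for r in radii])
    mask = counts > 0
    slope, _ = np.polyfit(np.log(radii[mask]), np.log(counts[mask]), 1)
    return slope
```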
Principal Curves and Manifolds
We can, more generally, replace our linear subspaces with non-linear ones.
This non-parametric generalization of the principal components is called "principal manifolds".
More formally, if we let $\mathcal{F}$ be a set of functions from $[0, 1]^k$ to $\mathbb{R}^d$, the principal manifold is the function $f \in \mathcal{F}$ that minimizes
$$R(f) = \mathbb{E}\Big( \min_{z \in [0,1]^k} \|X - f(z)\|^2 \Big).$$
Principal Curves and Manifolds Cont.
For computability, $\mathcal{F}$ should be governed by some kernel (i.e. quadratic functions, Gaussian, etc.): $f$ should be expressible in the form of a kernel expansion, i.e. for a Gaussian kernel, a weighted sum of Gaussian bumps.
Define a latent set of variables, $z_1, \dots, z_n$, where $z_i$ is the parameter value whose image $f(z_i)$ is closest to $X_i$.
Then the minimizer is expressed in terms of these latent variables, and fitting alternates between updating $f$ and updating the $z_i$.
Random Projection
Making a random projection is often good enough.
Key insight: when $n$ points in $d$ dimensions are projected randomly down onto $k$ dimensions, pairwise point distances are well preserved.
More formally (the Johnson-Lindenstrauss lemma), let $k$ be on the order of $\log n / \varepsilon^2$, for some $\varepsilon \in (0, 1)$. Then for a random projection $L$, with high probability,
$$(1 - \varepsilon)\,\|X_i - X_j\|^2 \le \|L X_i - L X_j\|^2 \le (1 + \varepsilon)\,\|X_i - X_j\|^2 \quad \text{for all } i, j.$$
Question – Making a Random Projection
Section 35.10 points out that the power of random projection is very surprising (for instance, the Johnson-Lindenstrauss Lemma). Can you talk in more detail about how to make a random projection in class? – Sicong
The algorithm itself is very simple:
1. Pick a $k$-dimensional subspace at random.
2. Project all $n$ points orthogonally onto the subspace.
3. Calculate the maximum divergence in pairwise distances.
4. Repeat until the maximum divergence is acceptable.
Computing the maximum divergence requires comparing all pairs, which is quadratic in $n$, and the procedure only needs to be repeated a small number of times in expectation before an acceptable projection is found.
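A minimal sketch of one round of this procedure, using a Gaussian random matrix scaled by $1/\sqrt{k}$ (so squared distances are preserved in expectation) as the random subspace; the helper names are our own:

```python
import numpy as np

def random_projection(X, k, seed=0):
    """Project the rows of X onto a random k-dimensional subspace."""
    rng = np.random.default_rng(seed)
    R = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)   # random projection matrix
    return X @ R

def max_pairwise_divergence(X, Y):
    """Largest relative change in pairwise squared distance under the projection."""
    dX = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    dY = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=2)
    mask = dX > 0
    return np.max(np.abs(dY[mask] - dX[mask]) / dX[mask])
```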
Question – Distance Randomization
Why is it important that the factor by which the distance between pairs of points changes during a random projection is limited? I can see how it's a good thing, but what is the significance of the randomness? – Brad
Randomness makes the projection trivial to compute, both algorithmically and computationally.
For some applications, preserving pairwise distances is sufficient, e.g. multidimensional scaling.