CS 472 - Clustering
Unsupervised Learning and Clustering
- In unsupervised learning you are given a data set with no output classifications (labels).
- Clustering is an important type of unsupervised learning; PCA was another type.
- The goal in clustering is to find "natural" clusters (classes) into which the data can be divided – a particular breakdown into clusters is a clustering (aka grouping, partition).
- How many clusters should there be (k)? Either user-defined, discovered by trial and error, or automatically derived.
  - Example: taxonomy of the species – one correct answer?
- Generalization: after clustering, when given a novel instance, we just assign it to the most similar cluster.
Clustering
How do we decide which instances should be in which cluster?
- Typically put data which is "similar" into the same cluster.
  - Similarity is measured with some distance metric.
- Also try to maximize between-class dissimilarity.
- Seek a balance of within-class similarity and between-class dissimilarity.
- Similarity metrics:
  - Euclidean distance is most common for real-valued instances.
  - Can use (1, 0) distance for nominal and unknown values, as with k-NN.
  - Can create arbitrary distance metrics based on the task.
  - Important to normalize the input data.
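A minimal sketch of these two ideas (min-max normalization and Euclidean distance), assuming NumPy is available; the helper names are mine, not from the slides:

```python
import numpy as np

def normalize(X):
    """Scale each feature (column) to [0, 1]; constant features map to 0."""
    X = np.asarray(X, dtype=float)
    rng = X.max(axis=0) - X.min(axis=0)
    rng[rng == 0] = 1.0  # avoid divide-by-zero on constant features
    return (X - X.min(axis=0)) / rng

def euclidean(a, b):
    """Euclidean distance between two real-valued instances."""
    return np.sqrt(np.sum((np.asarray(a, float) - np.asarray(b, float)) ** 2))
```

Normalizing first matters because otherwise a feature with a large numeric range dominates the distance.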
Outlier Handling
- Outliers are noise, or correct but unusual data.
- Approaches to handle them:
  - Become their own cluster.
    - Problematic, e.g. when k is pre-defined (how about k = 2 above?).
    - If k = 3 above then it could be its own cluster; rarely used, but at least it doesn't mess up the other clusters.
    - Could remove clusters with one or few elements as a post-processing step.
  - Absorb into the closest cluster.
    - Can significantly adjust the cluster radius, and cause it to absorb other close clusters, etc. – see above case.
  - Remove with a pre-processing step.
    - Detection is non-trivial – when is it really an outlier?
Distances Between Clusters
It is easy to measure the distance between instances (elements, points), but how about the distance of an instance to a cluster, or the distance between two clusters?
- Can represent a cluster with:
  - Centroid – the cluster mean. Then just measure distance to the centroid.
  - Medoid – an actual instance which is most typical of the cluster (e.g. the medoid is the point with the smallest average distance to the other points in the cluster).
- Other common distances between two clusters A and B:
  - Single link – smallest distance between any 2 points in A and B.
  - Complete link – largest distance between any 2 points in A and B.
  - Average link – average distance between points in A and points in B.
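These cluster representatives and linkage distances can be sketched directly (assuming NumPy; function names are mine):

```python
import numpy as np

def centroid(cluster):
    """Cluster mean."""
    return np.asarray(cluster, dtype=float).mean(axis=0)

def medoid(cluster):
    """Actual instance with the smallest average distance to the others."""
    C = np.asarray(cluster, dtype=float)
    d = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=-1)
    return C[d.mean(axis=1).argmin()]

def _pairwise(A, B):
    A, B = np.asarray(A, float), np.asarray(B, float)
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)

def single_link(A, B):    # smallest distance between any 2 points in A and B
    return _pairwise(A, B).min()

def complete_link(A, B):  # largest distance between any 2 points in A and B
    return _pairwise(A, B).max()

def average_link(A, B):   # average distance between points in A and points in B
    return _pairwise(A, B).mean()
```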
Hierarchical and Partitional Clustering
Two most common high-level approaches.
Hierarchical clustering is broken into two approaches:
- Agglomerative: Each instance is initially its own cluster. The most similar instances/clusters are then progressively combined until all instances are in one cluster. Each level of the hierarchy is a different set/grouping of clusters.
- Divisive: Start with all instances as one cluster and progressively divide until all instances are their own cluster. You can then decide what level of granularity you want to output.
With partitional clustering the algorithm creates one clustering of the data (with multiple clusters), typically by minimizing some objective function.
- Note that you could run the partitional algorithm again in a recursive fashion on any or all of the new clusters if you want to build a hierarchy.
Hierarchical Agglomerative Clustering (HAC)
- Input is an n × n adjacency matrix giving the distance between each pair of instances.
- Initialize each instance to be its own cluster.
- Repeat until there is just one cluster containing all instances:
  - Merge the two "closest" remaining clusters into one cluster.
- HAC algorithms vary based on:
  - "Closeness" definition; single, complete, or average link are common.
  - Which clusters to merge if there are distance ties.
  - Just do one merge at each iteration, or do all merges that have a similarity value within a threshold which increases at each iteration.
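A naive sketch of this loop (the function and variable names are mine), taking the n × n distance matrix as input and recording the merge history:

```python
import numpy as np

def hac(D, linkage="single"):
    """Naive HAC over an n x n distance matrix D.
    Returns the merge history as (cluster_i, cluster_j, distance) tuples."""
    D = np.asarray(D, dtype=float)
    clusters = [[i] for i in range(len(D))]   # each instance starts as its own cluster
    agg = {"single": min, "complete": max,
           "average": lambda ds: sum(ds) / len(ds)}[linkage]
    merges = []
    while len(clusters) > 1:
        best = None
        # find the two "closest" remaining clusters under the chosen linkage
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = agg([D[a][b] for a in clusters[i] for b in clusters[j]])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i][:], clusters[j][:], d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges
```

This is the O(n³)-ish naive form; as the summary slide notes, a priority queue over the distance matrix can do better.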
Dendrogram Representation
Standard HAC:
- Input is an adjacency matrix.
- Output can be a dendrogram which visually shows clusters and merge distances.
(Dendrogram figure with leaves A, B, E, C, D.)
HAC Summary
Complexity – relatively expensive algorithm:
- n² space for the adjacency matrix.
- mn² time for the execution, where m is the number of algorithm iterations, since we have to compute new distances at each iteration. m is usually ≈ n, making the total time n³ (can be n²·log n with a priority queue for the distance matrix, etc.).
- All k (≈ n) clusterings are returned in one run. No restart needed for different k values.
Link variants:
- Single link (nearest neighbor) can lead to long chained clusters where some points are quite far from each other.
- Complete link (farthest neighbor) finds more compact clusters.
- Average link is used less because the average has to be re-computed each time.
Divisive – starts with all the data in one cluster:
- One approach is to compute the MST (minimum spanning tree – n² time since it's a fully connected graph) and then divide the cluster at the tree edge with the largest distance. Similar time complexity as HAC, but different clusterings are obtained.
- Could be more efficient than HAC if we want just a few clusters.
Linkage Methods
Ward linkage measures the variance of clusters. The distance between two clusters, A and B, is how much the total sum of squares would increase if we merged them.
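The slide's definition can be sketched directly (assuming NumPy; helper names are mine): the Ward distance is the SSE of the merged cluster minus the SSE of each cluster on its own.

```python
import numpy as np

def sse(C):
    """Sum of squared distances from the instances to their own centroid."""
    C = np.asarray(C, dtype=float)
    return np.sum((C - C.mean(axis=0)) ** 2)

def ward_distance(A, B):
    """Increase in total within-cluster sum of squares if A and B merge."""
    merged = np.vstack([np.asarray(A, float), np.asarray(B, float)])
    return sse(merged) - sse(A) - sse(B)
```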
HAC *Challenge Question*
For the data set below show 2 iterations (from 4 clusters until 2 clusters remain) for HAC complete link.
- Use Manhattan distance.
- Show the dendrogram, including properly labeled distances on the vertical axis of the dendrogram.

Pattern  x    y
a        .8   .7
b        0    0
c        1    1
d        4    4
HAC Homework
For the data set below show all iterations (from 5 clusters until 1 cluster remains) for HAC single link. Show work.
- Use Manhattan distance.
- In case of ties go with the cluster containing the least alphabetical instance.
- Show the dendrogram, including properly labeled distances on the vertical axis of the dendrogram.

Pattern  x    y
a        .8   .7
b        -.1  .2
c        .9   .8
d        0    .2
e        .2   .1
Which cluster level to choose?
Depends on goals.
- May know beforehand how many clusters you want – or at least a range (e.g. 2–10).
- Could analyze the dendrogram and data after the full clustering to decide which sub-clustering level is most appropriate for the task at hand.
  - Could use automated cluster validity metrics to help.
- Could use stopping criteria during clustering.
Cluster Validity Metrics - Compactness
- One good goal is compactness – members of a cluster are all similar and close together.
- One measure of the compactness of a cluster is the SSE of the cluster instances compared to the cluster centroid:
  compactness(C) = Σ_{x ∈ X_C} (x − c)²
  where c is the centroid of a cluster C, made up of instances X_C. Lower is better.
- Thus, the overall compactness of a particular clustering is just the sum of the compactness of the individual clusters.
- Gives us a numeric way to compare different clusterings by seeking clusterings which minimize the compactness metric.
- However, for this metric, what clustering is always best?
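The overall compactness of a clustering can be sketched as follows (assuming NumPy; the function name is mine):

```python
import numpy as np

def compactness(clusters):
    """Sum over all clusters of the SSE to each cluster's centroid.
    Lower is better. `clusters` is a list of lists of instances."""
    total = 0.0
    for C in clusters:
        C = np.asarray(C, dtype=float)
        total += np.sum((C - C.mean(axis=0)) ** 2)
    return total
```

Note the catch the slide's closing question points at: this metric alone is trivially minimized (to 0) by putting every instance in its own singleton cluster.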
Cluster Validity Metrics - Separability
- Another good goal is separability – members of one cluster are sufficiently different from members of another cluster (cluster dissimilarity).
- One measure of the separability of two clusters is their squared distance: the bigger the distance, the better.
  dist_ij = (c_i − c_j)², where c_i and c_j are the two cluster centroids.
- For a clustering, which cluster distances should we compare?
- For each cluster we add in the distance to its closest neighbor cluster.
- We would like to find clusterings where separability is maximized.
- However, separability is usually maximized when there are very few clusters.
  - The squared distance amplifies larger distances.
Silhouette
- Want techniques that find a balance between intra-cluster similarity and inter-cluster dissimilarity.
- Silhouette is one good, popular approach.
- Start with a clustering, produced by any clustering algorithm, which has k unique clusters.
- a(i) = the average dissimilarity of instance i to all other instances in the cluster to which i is assigned – want it small.
  - Dissimilarity could be Euclidean distance, etc.
- b(i) = the smallest (comparing each different cluster) average dissimilarity of instance i to all instances in that cluster – want it large.
  - b(i) is smallest for the best different cluster that i could be assigned to – the best cluster that you would move i to if needed.
Silhouette
s(i) = (b(i) − a(i)) / max(a(i), b(i))
Equivalently, when a(i) < b(i), s(i) = 1 − a(i)/b(i); e.g. with a(i) = 4 and b(i) = 7, s(i) = 1 − 4/7 = 3/7.
Silhouette
- s(i) is close to 1 when the "within" dissimilarity is much smaller than the smallest "between" dissimilarity.
- s(i) is 0 when i is right on the border between two clusters.
- s(i) is negative when i really belongs in another cluster.
- By definition, s(i) = 0 if i is the only node in its cluster.
- The quality of a single cluster can be measured by the average silhouette score of its members (close to 1 is best).
- The quality of a total clustering can be measured by the average silhouette score of all the instances.
- To find the best clustering, compare total silhouette scores across clusterings with different k values and choose the highest.
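The a(i), b(i), and s(i) definitions above can be sketched as one function (assuming NumPy and Euclidean dissimilarity; the function name is mine):

```python
import numpy as np

def silhouette_scores(X, labels):
    """s(i) for each instance; s(i) = 0 for singleton clusters.
    Assumes at least two distinct cluster labels."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    s = np.zeros(len(X))
    for i in range(len(X)):
        same = (labels == labels[i])
        if same.sum() == 1:
            continue  # only node in its cluster: s(i) = 0 by definition
        # a(i): average dissimilarity to the other members of i's own cluster
        a = D[i][same].sum() / (same.sum() - 1)  # self-distance 0 excluded
        # b(i): smallest average dissimilarity to any *other* cluster
        b = min(D[i][labels == k].mean()
                for k in set(labels.tolist()) if k != labels[i])
        s[i] = (b - a) / max(a, b)
    return s
```

The total clustering score is then just `silhouette_scores(X, labels).mean()`, and the O(n²) cost of b(i) shows up in the full pairwise matrix D.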
Silhouette Homework
Assume a clustering with {a, b} in cluster 1 and {c, d, e} in cluster 2. What would the silhouette score be for a) each instance, b) each cluster, and c) the entire clustering? d) Sketch the silhouette visualization for this clustering. Use Manhattan distance for your distance calculations.

Pattern  x    y
a        .8   .7
b        .9   .8
c        .6   .6
d        0    .2
e        .2   .1
Visualizing Silhouette
Silhouette
Best case graph for silhouette?
- Clusters are wide – scores close to 1.
- Not many small-silhouette instances.
- Depending on your goals:
  - Clusters are similar in size.
  - Cluster size and/or number are close to what you want.
Silhouette
- Can just use the total silhouette average to decide the best clustering, but it is best to do silhouette analysis with a visualization tool and use the score along with other aspects of the clustering:
  - Cluster sizes
  - Number of clusters
  - Shape of clusters
  - Etc.
- Note that when the task dimensionality is > 3 (typical, and no longer visualizable for us), the silhouette graph is still easy to visualize.
- O(n²) complexity due to the b(i) computation.
- There are other cluster metrics out there.
- These metrics are rough guidelines and should be taken with a grain of salt.
k-means
- Perhaps the most well-known clustering algorithm.
- Partitioning algorithm: must choose a k beforehand.
  - Thus, typically try a spread of different k's (e.g. 2–10) and then compare results to see which made the best clustering.
  - Could use cluster validity metrics (e.g. silhouette) to help in the decision.
Algorithm:
1. Randomly choose k instances from the data set to be the initial k centroids.
2. Repeat until no (or negligible) changes occur:
   a. Group each instance with its closest centroid.
   b. Recalculate each centroid based on its new cluster.
Time complexity is O(mkn), where m is the number of iterations, and space is O(n) – both much better than HAC's time and space (n³ and n²).
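The steps above can be sketched as follows (assuming NumPy; the function name and the fixed iteration cap are mine):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Basic k-means: random initial centroids drawn from the data,
    then alternate assignment and centroid recalculation until stable."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each instance to its closest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # recalculate each centroid from its new cluster (keep empty ones put)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break  # no (or negligible) change: converged
        centroids = new
    return centroids, labels
```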
K-means Example
k-means Continued
- A type of EM (Expectation-Maximization) algorithm; gradient descent.
- Can struggle with local minima, unlucky random initial centroids, and outliers.
  - k-medoids finds medoid (median) centers rather than average centers and is thus less affected by outliers.
- Local minima, empty clusters: can just re-run with different initial centroids.
  - Could compare different solutions for a specific k value by seeing which clusterings minimize the overall SSE to the cluster centers (i.e. compactness), or use silhouette, etc.
  - And test solutions with different k values using silhouette or another metric.
- Can do further refinement of HAC results by using any k centroids from HAC as starting centroids for k-means.
k-means Homework
For the data below, show the centroid values and which instances are closest to each centroid after each centroid calculation, for two iterations of k-means using Manhattan distance.
- By 2 iterations I mean 2 centroid changes after the initial centroids.
- Assume k = 2 and that the first two instances are the initial centroids.

Pattern  x    y
a        .9   .8
b        .2   .2
c        .7   .6
d        -.1  -.6
e        .5   .5
Clustering Project
Last individual project.
Neural Network Clustering
- Single-layer network.
- A bit like a chopped-off RBF, where prototypes become adaptive output nodes.
- Arbitrary number of output nodes (cluster prototypes) – user defined.
- Locations of output nodes (prototypes) can be initialized randomly.
  - Could set them at the locations of random instances, etc.
- Each node computes its distance to the current instance.
- Competitive learning style – winner takes all: the closest node decides the cluster during execution.
  - The closest node is also the node which usually adjusts during learning.
  - The node adjusts slightly (by a learning rate) towards the current example.

(Figure: instances in x–y space with two prototype nodes ×1 and ×2.)
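One winner-take-all update step can be sketched as follows (assuming NumPy; the function name is mine): only the closest prototype moves, slightly, toward the current instance.

```python
import numpy as np

def competitive_step(prototypes, x, lr=0.1):
    """Winner-take-all update: find the closest prototype to instance x
    and move it a fraction lr of the way toward x."""
    prototypes = np.asarray(prototypes, dtype=float)
    x = np.asarray(x, dtype=float)
    winner = np.linalg.norm(prototypes - x, axis=1).argmin()
    prototypes[winner] += lr * (x - prototypes[winner])
    return winner, prototypes
```

Repeating this step over the data set pulls each prototype toward the center of the instances it wins, i.e. toward a cluster center.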
Neural Network Clustering
What would happen in this situation?
- Could start with more nodes than probably needed and drop those that end up representing none or few instances.
- Could start them all in one spot – however…
- Could dynamically add/delete nodes:
  - Local vigilance threshold
  - Global vs. local vigilance
  - Outliers
Example Clusterings with Vigilance
Self-Organizing Maps
- Output nodes which are close to each other represent similar classes – biological plausibility.
- Neighbors of the winning node also update in the same direction as the winner (scaled by a learning rate).
- Self-organizes into a topological class map (e.g. vowel sounds).
- Can interpolate; the k value is less critical; different 2- or 3-dimensional topologies are possible.
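One SOM update step on a 2-D grid can be sketched as follows (assuming NumPy; the function name and the Gaussian neighborhood are mine, chosen as one common way to scale the neighbors' updates):

```python
import numpy as np

def som_step(weights, x, lr=0.1, radius=1.0):
    """One SOM update on a grid of weight vectors, shape (rows, cols, dim).
    The winner and its grid neighbors all move toward x; neighbors are
    scaled by a Gaussian of their grid distance from the winner."""
    rows, cols, _ = weights.shape
    d = np.linalg.norm(weights - x, axis=-1)
    wr, wc = np.unravel_index(d.argmin(), d.shape)  # winning node on the grid
    for r in range(rows):
        for c in range(cols):
            grid_d2 = (r - wr) ** 2 + (c - wc) ** 2
            h = np.exp(-grid_d2 / (2 * radius ** 2))  # neighborhood strength
            weights[r, c] += lr * h * (x - weights[r, c])
    return (wr, wc), weights
```

Because grid neighbors are dragged in the same direction as the winner, nearby output nodes end up representing similar regions of the input space, which is exactly the topological map property above.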
Other Unsupervised Models
- Vector quantization – discretize into codebooks.
- k-medoids.
- Conceptual clustering (symbolic AI) – Cobweb, Classit, etc.
- Incremental vs. batch.
- Density mixtures.
- Interactive clustering.
- Special models for large databases – n² space?, disk I/O:
  - Sampling – bring in enough data to fill memory and then cluster.
  - Once initial prototypes are found, can iteratively bring in more data to adjust/fine-tune the prototypes as desired.
  - Linear algorithms.
Association Analysis – Link Analysis
- Used to discover relationships/rules in large databases.
- Relationships are represented as association rules.
- Unsupervised learning; can give significant business advantages, and is also good for many other large-data areas: astronomy, etc.
- One example is market basket analysis, which seeks to understand more about what items are bought together.
  - This can then lead to improved approaches for advertising, product placement, etc.
- Example association rule: {Cereal} ⇒ {Milk}

Transaction ID and Info     Items Bought
1 and (who, when, etc.)     {Ice cream, milk, eggs, cereal}
2                           {Ice cream}
3                           {milk, cereal, sugar}
4                           {eggs, yogurt, sugar}
5                           {Ice cream, milk, cereal}
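The standard support and confidence measures for a rule like {cereal} ⇒ {milk} can be sketched over the transaction table above (function names are mine):

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(lhs, rhs, transactions):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

# The five transactions from the table above
transactions = [
    {"ice cream", "milk", "eggs", "cereal"},
    {"ice cream"},
    {"milk", "cereal", "sugar"},
    {"eggs", "yogurt", "sugar"},
    {"ice cream", "milk", "cereal"},
]
```

On this data, cereal appears in transactions 1, 3, and 5, and milk appears in all three of those, so {cereal} ⇒ {milk} has support 3/5 and confidence 1.0.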
Summary
- Clustering can be used as a discretization technique on continuous data for many other models which favor nominal or discretized data.
  - Including supervised learning models (decision trees, naïve Bayes, etc.).
- With so much (unlabeled) data out there, opportunities to do unsupervised learning are growing.
- Semi-supervised learning is becoming very important.
  - Use unlabeled data to augment the more limited labeled data to improve the accuracy of a supervised learner.
- Deep learning – unsupervised training of early layers is an important approach in some deep learning models.
Semi-Supervised Learning Examples
Combine labeled and unlabeled data with assumptions about typical data to find better solutions than just using the labeled data.