PPT-Parallel Clustering of High-Dimensional Social Media Data S

Author : pamella-moone | Published Date : 2017-06-19

Xiaoming Gao, Emilio Ferrara, Judy Qiu. School of Informatics and Computing, Indiana University. Outline: Background and motivation. Sequential social media stream clustering


Parallel Clustering of High-Dimensional Social Media Data S: Transcript

Xiaoming Gao, Emilio Ferrara, Judy Qiu. School of Informatics and Computing, Indiana University. Outline: Background and motivation. Sequential social media stream clustering algorithm. Parallel algorithm.

Unlike sequential algorithms, parallel algorithms cannot be analyzed very well in isolation. One of our primary measures of goodness of a parallel system will be its scalability. Scalability is the ability of a parallel system to take advantage of incr

Brendan and Yifang. April 21, 2015. Pre-knowledge. We define a set A, and we find the element that minimizes the error. We can think of as a sample of . Where is the point in C closest to X.

Supervised & Unsupervised Learning. Supervised learning: Classification. The number of classes and class labels of data elements in training data is known beforehand. Unsupervised learning: Clustering.

Peter Andras. School of Computing and Mathematics, Keele University. p.andras@keele.ac.uk. Overview. High-dimensional functions and low-dimensional manifolds. Manifold mapping. Function approximation over low-dimensional projections.

Lecture outline. Distance/Similarity between data objects. Data objects as geometric data points. Clustering problems and algorithms: K-means, K-median, K-center. What is clustering? A grouping of data objects such that the objects

IoT and Streaming Data. IC2E Internet of Things Panel. Judy Qiu, Indiana University. Event Processing Programming Models. Query Based. Complex Event processing. SQL like languages. Programming APIs.

Suresh Merugu, IITR. Overview. Definition of Clustering. Existing Clustering Methods. Clustering Examples. Classification. Classification Examples. Cluster: A collection of data objects. Similar to one another within the same cluster.

Fuzzy k-means. Self-organizing maps. Evaluation of clustering results. Figures and equations from Data Clustering by Gan et al. Center-based clustering. Have objective functions which define how good a solution is;

issue in computing a representative simplicial complex. Mapper does not place any conditions on the clustering algorithm. Thus any domain-specific clustering algorithm can be used. We

Unsupervised learning. Seeks to organize data into "reasonable" groups. Often based on some similarity (or distance) measure defined over data elements. Quantitative characterization may include

Mark Stamp. K-Means for Malware Classification. Clustering Applications. Chinmayee Annachhatre, Mark Stamp. Quest for the Holy Grail. Holy Grail of malware research is to detect previously unseen malware.

Produces a set of nested clusters organized as a hierarchical tree. Can be visualized as a dendrogram. A tree-like diagram that records the sequences of merges or splits. Strengths of Hierarchical Clustering.
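Several transcript fragments mention K-means and center-based clustering with an objective function that scores a solution. The sketch below illustrates that general idea only; it is not taken from any of the slides. It assumes plain Lloyd-style iterations on small in-memory tuples, and the names `kmeans`, `objective`, and `squared_distance` are hypothetical helpers introduced for this example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the closest center for every point."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd-style k-means sketch (assumption: data fits in memory)."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct input points chosen at random.
    centers = rng.sample(points, k)
    for _ in range(iterations):
        assignments = assign(points, centers)
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Final assignment so labels match the returned centers.
    return centers, assign(points, centers)


def objective(points, centers, assignments):
    """Center-based objective: sum of squared distances to assigned centers."""
    return sum(squared_distance(p, centers[a]) for p, a in zip(points, assignments))


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = kmeans(data, k=2)
    print(centers, labels, objective(data, centers, labels))
```

A lower objective value means points sit closer to their assigned centers, which is the sense in which a center-based method rates one solution as better than another.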
