Almir Olivette Artero, Maria Cristina Ferreira de Oliveira, Haim Levkowitz. Information Visualization 2004. Abstract: The idea is inspired by traditional image processing techniques such as grayscale manipulation.
Slide1
Uncovering Clusters in Crowded Parallel Coordinates Visualizations
Almir Olivette Artero, Maria Cristina Ferreira de Oliveira, Haim Levkowitz
Information Visualization 2004
Slide2
Abstract
The idea is inspired by traditional image processing techniques such as grayscale manipulation.
The goal is to reduce visual clutter and allow the analyst to observe relevant patterns in the parallel coordinates.
Slide3
Introduction
The strong overlapping of graphical markers hampers the user’s ability to identify patterns in the data when the number of records and the dimensionality of the data set are high.
It is important to avoid displaying irrelevant information and to enhance the presentation of the useful information.
Slide4
Introduction
The approach tackles this problem with a strategy that computes frequency and density information and uses them in parallel coordinates visualizations to filter the information presented to the user.
Slide5
Frequency Information
The frequency function for an n-dimensional variable x is defined as:
where h is the bin size, σ is the number of records in the same bin as x, and m is the total number of records.
Slide6
Frequency Information
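A form consistent with the symbols h, σ and m defined for the frequency function is the standard multivariate histogram estimator. This is an assumption based on those symbols, not a formula taken from the slides:

```latex
\hat{f}(x) = \frac{\sigma}{m \, h^{n}}
```

Here σ counts the records that fall in the bin containing x, so the estimate grows with the number of records sharing that bin and shrinks as the bin volume h^n grows.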
A two-dimensional matrix is generated to store the frequency of each pair of attribute values, which is then used to draw the polygonal lines for the records in the data set.
For a data set with n attributes, n-1 frequency matrices are generated, one for each pair of adjacent attributes.
Slide7
Frequency Information
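The construction of the n-1 frequency matrices can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `frequency_matrices`, the `bins` parameter, and the min-max binning of each attribute are all hypothetical choices.

```python
import numpy as np

def frequency_matrices(data, bins):
    """Return n-1 integer matrices, one per pair of adjacent attributes.

    Entry [a, b] of matrix j counts the records whose attribute j falls
    in bin a and whose attribute j+1 falls in bin b.
    """
    data = np.asarray(data, dtype=float)
    m, n = data.shape
    lo = data.min(axis=0)
    span = data.max(axis=0) - lo
    span[span == 0] = 1.0  # constant attributes: avoid division by zero
    # Assumed binning: scale each attribute to [0, bins) and clamp the maximum.
    idx = np.minimum(((data - lo) / span * bins).astype(int), bins - 1)
    mats = np.zeros((n - 1, bins, bins), dtype=np.int64)
    for j in range(n - 1):
        # np.add.at accumulates correctly when the same cell occurs repeatedly.
        np.add.at(mats[j], (idx[:, j], idx[:, j + 1]), 1)
    return mats
```

Each matrix sums to the number of records m, since every record contributes exactly one count per adjacent attribute pair.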
Each non-zero matrix element generates a line segment in the visualization, and its value determines the pixel intensity used to draw that line segment.
Each line segment is drawn with the Bresenham algorithm.
Slide8
Interactive Parallel Coordinates Frequency and Density plots
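Drawing one segment per non-zero frequency matrix element with the Bresenham algorithm can be sketched as follows. This is a minimal sketch, not the authors' code: `bresenham` and `accumulate_segments` are hypothetical names, one pixel row per bin is assumed, and summing frequencies where segments overlap is an assumption about how intensities combine.

```python
import numpy as np

def bresenham(x0, y0, x1, y1):
    """Yield the integer pixels of the segment (x0, y0)-(x1, y1).

    Integer-only Bresenham variant that handles all octants.
    """
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        yield x0, y0
        if x0 == x1 and y0 == y1:
            return
        e2 = 2 * err
        if e2 >= dy:   # step along x
            err += dy
            x0 += sx
        if e2 <= dx:   # step along y
            err += dx
            y0 += sy

def accumulate_segments(image, freq, x_left, x_right):
    """Add freq[a, b] to every pixel on the segment (x_left, a)-(x_right, b)."""
    for a, b in zip(*np.nonzero(freq)):
        for x, y in bresenham(x_left, int(a), x_right, int(b)):
            image[y, x] += freq[a, b]
    return image
```

In this sketch `image` is indexed as `image[row, column]`, with `x_left` and `x_right` the pixel columns of two adjacent axes.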
The intensity of the pixel with coordinates (q, p) is given by:
A square wave smoothing filter is applied to each pixel:
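One common reading of a square wave smoothing filter is a box (mean) filter, in which every pixel is replaced by the unweighted average of its neighborhood. The sketch below assumes that reading; the name `box_smooth`, the `radius` parameter, and the edge-replication padding are illustrative choices, not details from the slides.

```python
import numpy as np

def box_smooth(image, radius=1):
    """Replace each pixel by the mean of its (2*radius+1)^2 neighborhood.

    Border pixels are handled by replicating the edge values (an assumption).
    """
    img = np.asarray(image, dtype=float)
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img)
    # Sum the k*k shifted copies of the image, then divide by the window size.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)
```

A constant image is left unchanged by this filter, while isolated bright pixels are spread over their neighbors, which softens the aliasing of individual line segments.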
Slide9
Interactive Parallel Coordinates Frequency and Density plots
S is a scaling factor.
Slide10
Density Information
The density function for an n-dimensional variable x is defined as:
where d_i is the i-th record of the data set, K is the kernel function, and a further parameter defines the smoothing factor, or bandwidth.
Slide11
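The density definition above matches the general shape of a kernel density estimate: a sum of kernel contributions from every record d_i, controlled by a bandwidth. The sketch below illustrates that idea under stated assumptions: a Gaussian kernel is used for K, the bandwidth is called `h`, and the function name `kernel_density` is hypothetical. None of these choices is confirmed by the slides.

```python
import numpy as np

def kernel_density(x, records, h):
    """Gaussian kernel density estimate at point x with bandwidth h.

    records has shape (m, n): m records d_i, each with n attributes.
    """
    records = np.asarray(records, dtype=float)
    x = np.asarray(x, dtype=float)
    m, n = records.shape
    u = (x - records) / h  # scaled offsets to every record, shape (m, n)
    # Assumed kernel: standard n-dimensional Gaussian.
    k = np.exp(-0.5 * np.sum(u * u, axis=1)) / (2 * np.pi) ** (n / 2)
    return k.sum() / (m * h ** n)
```

A larger `h` spreads each record's contribution over a wider region and yields a smoother estimate; a smaller `h` keeps the estimate concentrated around the individual records.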
Visualizations of the Pollen data: (a) Frequency plot, (b) Density plot
Slide12
Interactive high-dimensional clustering with IPC plot (slides 12-16, figures only)
Slide17
Performance
Running times in seconds for the proposed algorithm with different values of m and n.
Slide18
Conclusions
The new plots support interactive data exploration of large and high-dimensional data sets, allowing users to remove noise and highlight areas with a high concentration of data.
The proposed algorithms use only integer arithmetic to compute the frequency matrices.