Slide1
M-Zoom: Fast Dense-Block Detection in Tensors with Quality Guarantees
Kijung Shin, Bryan Hooi, Christos Faloutsos
Carnegie Mellon University
Slide2
Motivation: Review Fraud
[Figure labels: Alice; Alice’s, Bob’s, Carol’s]
Slide3
Fraud Forms Dense Blocks
[Figure: reviews between Accounts and Restaurants, shown again as an adjacency matrix (axes: Accounts, Restaurants)]
Slide4
Problem: Natural Dense Blocks
The adjacency matrix contains two kinds of dense blocks:
suspicious dense blocks formed by fraudsters
natural dense blocks (core, community, etc.)
Question: How can we distinguish them?
[Figure: adjacency matrix (axes: Accounts, Restaurants)]
Slide5
Solution: Tensor Modeling
Natural dense blocks are sparse on the time axis (formed gradually).
Suspicious dense blocks are also dense on the time axis (due to synchronous behavior).
Suspicious dense blocks are therefore denser than natural dense blocks in the tensor model.
[Figure: 3-way tensor with axes Accounts, Restaurants, and Timestamp; blocks labeled Sparse and Dense]
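The effect of adding a time axis can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the review records are invented, the helper names (`as_matrix`, `as_tensor`, `block_density`) are hypothetical, and density is taken as mass divided by volume for simplicity.

```python
from collections import Counter

# Hypothetical review records: (account, restaurant, hour). All values are made up.
natural = [("a1", "r1", 1), ("a2", "r1", 7), ("a3", "r1", 13)]  # formed gradually
fraud = [("f1", "r9", 3), ("f2", "r9", 3), ("f3", "r9", 3)]     # synchronized


def as_matrix(records):
    """2-way model: count reviews per (account, restaurant)."""
    return Counter((acc, rest) for acc, rest, _ in records)


def as_tensor(records):
    """3-way model: count reviews per (account, restaurant, hour)."""
    return Counter(records)


def block_density(entries):
    """Mass / volume of the smallest block that covers all given entries."""
    n_modes = len(next(iter(entries)))
    volume = 1
    for mode in range(n_modes):
        volume *= len({key[mode] for key in entries})
    return sum(entries.values()) / volume


# In the matrix view both groups look equally dense.
print(block_density(as_matrix(natural)), block_density(as_matrix(fraud)))
# In the tensor view the gradually formed group is spread over three hours.
print(block_density(as_tensor(natural)), block_density(as_tensor(fraud)))
```

In this toy data the two groups are indistinguishable in the matrix, while in the tensor only the synchronized group keeps its density, because the gradually formed group needs three time slices to cover the same three reviews.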
Slide6
Solution: Tensor Modeling (Cont.)
Any side information can be used instead of, or in addition to, time in review datasets (e.g., IP address, keywords, number of stars).
Using multiple kinds of side information leads to high-order tensors.
Slide7
Anomaly/Fraud in Tensors
Dense blocks signal anomalies/fraud in many tensor datasets:
TCP Dumps: Src IP X Dst IP X Timestamp
Time-evolving Social Network: Src User X Dst User X Timestamp
Wikipedia Revision History: User X Page X Timestamp
Slide8
Research Question
Question: Given a large-scale high-order tensor, how can we find dense blocks in it?
Slide9
Road Map
Introduction
Proposed Method: M-Zoom
  Terminologies and Problem Definition <<
  Algorithm
Experiments
Conclusion
Slide10
Terminologies
Assume a block (subtensor) in a 3-way tensor.
Mass: sum of entries in the block
Slide11
Density Measures
Density measures:
Traditional Density (maximized by a single entry with maximum value)
Arithmetic Avg. Degree
Geometric Avg. Degree
Suspiciousness (Jiang et al. 2015)
Note that our method is not limited to specific density measures.
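For concreteness, the four measures can be sketched as small functions. This is an illustration under stated assumptions, not the slide's own notation: `mass` is the sum of the block's entries, `dims` lists the number of slices the block has in each mode, and the suspiciousness formula is written in the KL-divergence-style form commonly attributed to Jiang et al. (2015), which should be checked against that paper. All function names are hypothetical.

```python
import math


def traditional_density(mass, dims):
    """Mass divided by volume (the product of the block's mode lengths)."""
    return mass / math.prod(dims)


def arithmetic_avg_degree(mass, dims):
    """Mass divided by the arithmetic mean of the block's mode lengths."""
    return mass / (sum(dims) / len(dims))


def geometric_avg_degree(mass, dims):
    """Mass divided by the geometric mean of the block's mode lengths."""
    return mass / math.prod(dims) ** (1 / len(dims))


def suspiciousness(mass, dims, total_mass, total_dims):
    """Assumed form: m(log(m/M) - 1) + M*r - m*log(r), with r the volume ratio."""
    ratio = math.prod(d / t for d, t in zip(dims, total_dims))
    return (mass * (math.log(mass / total_mass) - 1)
            + total_mass * ratio
            - mass * math.log(ratio))


# A 2 x 2 x 3 block of ones versus a single entry with value 6.
print(traditional_density(12, (2, 2, 3)), traditional_density(6, (1, 1, 1)))
print(arithmetic_avg_degree(12, (2, 2, 3)), arithmetic_avg_degree(6, (1, 1, 1)))
```

The first printed pair shows why traditional density is a poor target: a single large entry beats the 12-entry block, whereas under the arithmetic average degree the single entry scores higher only because its value (6) is large relative to the block's entries of 1.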
Slide12
Problem Definition
Given: (1) a tensor, (2) a density measure, (3) the number of blocks we aim to find
Find: that many distinct dense blocks maximizing the density measure
Slide13
Requirements
Our goal is to design an approximation algorithm that is:
Scalable: runs in near-linear time
Accurate: provides an accuracy guarantee
Flexible: works well with various density metrics
Effective: produces meaningful results in practice
M-Zoom, our proposed method, satisfies all the requirements.
Slide14
Road Map
Introduction
Proposed Method: M-Zoom
  Terminologies and Problem Definition
  Algorithm <<
Experiments
Conclusion
Slide15
Single Dense Block Detection
Greedy search method
Starts from the entire tensor
[Figure: example tensor with its entries]
Slide16
Single Dense Block Detection (cont.)
Remove a slice to maximize density
[Figure: example tensor after one slice is removed]
Slide17
Single Dense Block Detection (cont.)
Remove a slice to maximize density
[Figure: example tensor after another slice is removed; value shown: 3.3]
Slide18
Single Dense Block Detection (cont.)
Remove a slice to maximize density
[Figure: example tensor with the remaining entries]
Slide19
Single Dense Block Detection (cont.)
Until all slices are removed
Slide20
Single Dense Block Detection (cont.)
Output: return the densest block found so far
[Figure: the densest block encountered during the removals]
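The greedy search walked through on these slides can be sketched as follows. This is a simplified illustration, not the authors' implementation: the tensor is assumed to be a dict from index tuples to non-negative values, the density is assumed to be the arithmetic average degree, candidates are limited to the minimum-mass slice of each mode, and the entry scan on each removal is naive rather than near-linear. The function name `find_dense_block` is hypothetical.

```python
def find_dense_block(tensor, n_modes):
    """Greedy slice removal; returns (best density, index sets of the best block)."""
    alive = [{key[m] for key in tensor} for m in range(n_modes)]
    slice_mass = [dict.fromkeys(alive[m], 0) for m in range(n_modes)]
    for key, val in tensor.items():
        for m in range(n_modes):
            slice_mass[m][key[m]] += val
    entries = dict(tensor)          # entries still inside the current block
    mass = sum(entries.values())

    def density(block_mass, sizes):
        total = sum(sizes)
        return block_mass / (total / n_modes) if total else 0.0

    best_density = density(mass, [len(a) for a in alive])
    best_block = [set(a) for a in alive]

    while any(alive):
        # One candidate per mode: the slice with minimum mass in that mode.
        candidates = []
        for m in range(n_modes):
            if not alive[m]:
                continue
            idx = min(sorted(alive[m]), key=lambda i: slice_mass[m][i])
            sizes = [len(a) - (j == m) for j, a in enumerate(alive)]
            candidates.append((density(mass - slice_mass[m][idx], sizes), m, idx))
        # Remove the candidate slice whose removal leaves the densest block.
        new_density, m, idx = max(candidates, key=lambda c: c[0])
        alive[m].remove(idx)
        for key in [k for k in entries if k[m] == idx]:
            val = entries.pop(key)
            mass -= val
            for j in range(n_modes):
                slice_mass[j][key[j]] -= val
        # Remember the densest block seen along the way.
        if new_density > best_density:
            best_density = new_density
            best_block = [set(a) for a in alive]
    return best_density, best_block


# Made-up 3-way example: a 2 x 2 x 2 block of fives plus three stray entries.
example = {(i, j, k): 5 for i in (0, 1) for j in (0, 1) for k in (0, 1)}
example.update({(2, 2, 2): 1, (3, 0, 3): 1, (0, 3, 2): 1})
density_found, block_found = find_dense_block(example, 3)
print(density_found, block_found)
```

On this toy input the stray slices are peeled away first, and the returned block is the planted 2 x 2 x 2 block with arithmetic average degree 20.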
Slide21
Speeding Up Process
Theorem 1 [Remove Minimum Mass First]: Among slices in the same mode, removing the slice with minimum mass is always best.
[Figure: example comparison 12 > 9 > 2]
Slide22
Accuracy Guarantee
Theorem 2 [Approximation Guarantee]: $\rho(\mathcal{B}, \mathcal{R}) \ge \frac{1}{N}\,\rho(\mathcal{B}^{*}, \mathcal{R})$, where $\mathcal{B}$ is the M-Zoom result, $\mathcal{R}$ the input tensor, $N$ its order, and $\mathcal{B}^{*}$ the densest block.
Properties of M-Zoom:
Scalable: runs in near-linear time
Accurate: provides an accuracy guarantee
Flexible: works well with various density metrics
Effective: produces meaningful results in practice
Slide23
Handling Multiple Blocks
Remove found blocks before finding others.
[Figure: three Find & Remove steps followed by a Restore step]
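The find-and-remove loop can be sketched as below. This is one reading of the figure, stated as an assumption: each block is searched for in a residual tensor, its entries are removed from the residual, and the reported block is restored with the values from the original tensor. The names `find_blocks` and `heaviest_entry_block` are hypothetical, and the latter is only a trivial stand-in for a real single-block detector.

```python
def find_blocks(tensor, k, find_block):
    """Find up to k blocks: search the residual, remove the block, restore values."""
    residual = dict(tensor)
    blocks = []
    for _ in range(k):
        if not residual:
            break
        index_sets = find_block(residual)  # one set of indices per mode

        def inside(key):
            return all(key[m] in s for m, s in enumerate(index_sets))

        # Remove: drop the block's entries so the next search finds something new.
        residual = {key: v for key, v in residual.items() if not inside(key)}
        # Restore: report the block with its entries from the ORIGINAL tensor.
        blocks.append({key: v for key, v in tensor.items() if inside(key)})
    return blocks


def heaviest_entry_block(residual):
    """Stand-in detector (assumption): the 1 x ... x 1 block at the largest entry."""
    key = max(sorted(residual), key=residual.get)
    return [{i} for i in key]


toy = {(0, 0, 0): 9, (1, 1, 1): 7, (2, 2, 2): 1}
print(find_blocks(toy, 2, heaviest_entry_block))
```

Passing the detector as a parameter keeps the loop independent of how a single block is found; restoring from the original tensor matters when a later block overlaps entries that an earlier block already removed from the residual.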
Slide24
Road Map
Introduction
Proposed Method: M-Zoom
  Terminologies and Problem Definition
  Algorithm
Experiments <<
Conclusion
Slide25
Exp1. Scalability Test
Goal: Measure scalability w.r.t. each input factor
[Figure: actual running time versus #Non-Zeros, Order, Dimensionality (#Slices), and #Blocks to Find, each compared with a linear increase (slope = 1)]
M-Zoom scales almost linearly!
Slide26
Exp1. Scalability Test (cont.)
Properties of M-Zoom:
Scalable: runs in near-linear time
Accurate: provides an accuracy guarantee
Flexible: works well with various density metrics
Effective: produces meaningful results in practice
Slide27
Exp2. Speed-Accuracy Test
Goal: compare speed and accuracy of dense-block detection methods with various density measures
Methods Compared:
M-Zoom: Proposed Method
CPD: Tensor Decomposition
CrossSpot (Jiang et al. 2015): Local Search Method
Slide28
Exp2. Speed-Accuracy Test (cont.)
Data: Music Review (User X Music X Time X Rate)
[Figure: speed-accuracy plots for Arithmetic Average Degree, Geometric Average Degree, and Suspiciousness; methods: M-Zoom, CPD, CrossSpot (random seed), CrossSpot (CPD seed)]
Slide29
Exp2. Speed-Accuracy Test (cont.)
Data: Wikipedia Revision History (User X Page X Time)
[Figure: speed-accuracy plots for Arithmetic Average Degree, Geometric Average Degree, and Suspiciousness; methods: M-Zoom, CPD, CrossSpot (random seed), CrossSpot (CPD seed)]
Slide30
Exp2. Speed-Accuracy Test (cont.)
M-Zoom is up to 100× faster than its competitors.
M-Zoom shows accuracy comparable to that of its competitors regardless of the density measure.
Properties of M-Zoom:
Scalable: runs in near-linear time
Accurate: provides an accuracy guarantee
Flexible: works well with various density metrics
Effective: produces meaningful results in practice
Slide31
Exp3. Discovery in Practice
In Korean Wikipedia revision history (User X Page X Timestamp)
First three blocks found by M-Zoom:
11 users revised 10 pages 2,305 times within 16 hours
Slide32
Exp3. Discovery in Practice (cont.)
In Korean Wikipedia revision history (User X Page X Timestamp), M-Zoom detects edit wars.
First three blocks found by M-Zoom:
11 users revised 10 pages 2,305 times within 16 hours
Slide33
Exp3. Discovery in Practice (cont.)
In English Wikipedia revision history (User X Page X Timestamp)
First three blocks found by M-Zoom:
8 accounts revised 12 pages 2.5 million times
Slide34
Exp3. Discovery in Practice (cont.)
In English Wikipedia revision history (User X Page X Timestamp), M-Zoom detects bot activities.
First three blocks found by M-Zoom:
8 accounts revised 12 pages 2.5 million times
Slide35
Exp3. Discovery in Practice (cont.)
In TCP Dumps (7-way tensor)
[Table: first three blocks found by M-Zoom]
Slide36
Exp3. Discovery in Practice (cont.)
In TCP Dumps (7-way tensor), M-Zoom detects network attacks with near-perfect accuracy (AUC=0.98): the TCP connections forming the densest blocks are network attacks.
[Table: first three blocks found by M-Zoom]
Slide37
Exp3. Discovery in Practice (cont.)
Properties of M-Zoom:
Scalable: runs in near-linear time
Accurate: provides an accuracy guarantee
Flexible: works well with various density metrics
Effective: produces meaningful results in practice
Slide38
Road Map
Introduction
Proposed Method: M-Zoom
  Terminologies and Problem Definition
  Algorithm
Experiments
Conclusion <<
Slide39
Conclusion
M-Zoom (Multi-dimensional Zoom): Dense-Block Detection in tensors
Scalable [Figure: running time versus #Non-Zeros]
Accurate [Approximation Guarantee]
Flexible
Effective
Slide40
Thank you!
Source code and datasets used in the paper are available at https://github.com/kijungs/mzoom