Slide1
Brian Baingana, Gonzalo Mateos and Georgios B. Giannakis
Dynamic Structural Equation Models for Tracking Cascades over Social Networks
Acknowledgments: NSF ECCS Grant No. 1202135 and NSF AST Grant No. 1247885
December 17, 2013
Slide2
Context and motivation
Contagions: popular news stories, infectious diseases, buying patterns
Propagate in cascades over social networks
Network topologies: unobservable, dynamic, sparse
Topology inference vital: viral advertising, healthcare policy
Goal: track unobservable time-varying network topology from cascade traces
B. Baingana, G. Mateos, and G. B. Giannakis, ``Dynamic structural equation models for social network topology inference,'' IEEE J. of Selected Topics in Signal Processing, 2013 (arXiv:1309.6683 [cs.SI])
Slide3
Contributions in context
Contributions
Dynamic SEM for tracking slowly-varying sparse networks
Accounting for external influences – Identifiability [Bazerque-Baingana-GG’13]
ADMM-based topology inference algorithm
Related work
Static, undirected networks, e.g., [Meinshausen-Buhlmann’06], [Friedman et al’07]
MLE-based dynamic network inference [Rodriguez-Leskovec’13]
Time-invariant sparse SEM for gene network inference [Cai-Bazerque-GG’13]
Structural equation models (SEM): [Goldberger’72]
Statistical framework for modeling causal interactions (endo/exogenous effects)
Used in economics, psychometrics, social sciences, genetics… [Pearl’09]
J. Pearl, Causality: Models, Reasoning, and Inference, 2nd Ed., Cambridge Univ. Press, 2009
Slide4
Cascades over dynamic networks
Example: N = 16 websites, C = 2 news events, T = 2 days
Unknown (asymmetric) adjacency matrices
N-node directed, dynamic network, C cascades observed over T time intervals
Figure panels: Event #1, Event #2
Cascade infection times depend on:
Causal interactions among nodes (topological influences)
Susceptibility to infection (non-topological influences)
Slide5
Model and problem statement
Captures (directed) topological and external influences
Problem statement:
Data: infection time of node i by contagion c during interval t
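As a hedged sketch, the per-node model and its matrix form can be written as below. The notation is assumed, following the cited arXiv:1309.6683 paper as best recalled: $y_{ic}^t$ is the infection time of node $i$ by contagion $c$ during interval $t$, $a_{ij}^t$ the weight of the edge from node $j$ to node $i$, $b_{ii}^t$ the weight on the external influence $x_{ic}$, and $e_{ic}^t$ the un-modeled dynamics.

```latex
\begin{align*}
y_{ic}^t &= \sum_{j \neq i} a_{ij}^t \, y_{jc}^t + b_{ii}^t \, x_{ic} + e_{ic}^t,
  \qquad i = 1, \dots, N, \quad c = 1, \dots, C \\
% Matrix form: A^t has zero diagonal (no self-loops), B^t is diagonal.
\mathbf{Y}^t &= \mathbf{A}^t \mathbf{Y}^t + \mathbf{B}^t \mathbf{X} + \mathbf{E}^t
\end{align*}
```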
Model terms: external influence, un-modeled dynamics
Dynamic SEM
Slide6
Exponentially-weighted LS criterion
Structural spatio-temporal properties
Slowly time-varying topology
Sparse edge connectivity
Sparsity-promoting exponentially-weighted least-squares (LS) estimator (P1)
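A minimal sketch of how such a criterion could be evaluated, assuming the cost takes the form 0.5 * sum over tau of beta^(t - tau) * ||Y_tau - A Y_tau - B X||_F^2 + lam * ||A||_1. The function name `ewls_cost` and the parameter names `beta` (forgetting factor) and `lam` (sparsity weight) are hypothetical, and the structural constraints (zero diagonal for A, diagonal B) are left to the caller.

```python
import numpy as np

def ewls_cost(A, B, Ys, X, beta, lam):
    """Exponentially-weighted LS cost with an l1 penalty on A (assumed form).

    Ys is a list of N x C arrays for intervals 1..t; X is an N x C array.
    """
    t = len(Ys)
    fit = 0.0
    for tau, Y in enumerate(Ys, start=1):
        resid = Y - A @ Y - B @ X             # SEM residual at interval tau
        fit += beta ** (t - tau) * np.sum(resid ** 2)  # older data weighted less
    return 0.5 * fit + lam * np.sum(np.abs(A))         # l1 norm promotes sparse A

# Tiny usage example with hypothetical numbers.
A = np.array([[0.0, 0.5], [0.0, 0.0]])
B = np.eye(2)
X = np.ones((2, 1))
Ys = [np.zeros((2, 1)), np.ones((2, 1))]
cost = ewls_cost(A, B, Ys, X, beta=0.5, lam=0.1)
```

With `beta` equal to one every interval counts equally; smaller values discount older intervals geometrically.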
Edge sparsity encouraged by ℓ1-norm regularization
Tracking dynamic topologies possible if the forgetting factor is below one
Slide7
Topology-tracking algorithm
Alternating-direction method of multipliers (ADMM), e.g., [Bertsekas-Tsitsiklis’89]
Each time interval:
Acquire new data
Recursively update data sample (cross-)correlations
Solve (P2) using ADMM
Attractive features
Provably convergent, closed-form updates (unconstrained LS and soft-thresholding)
Fixed computational cost and memory storage requirement per time interval
Slide8
ADMM iterations
Sequential data terms can be updated recursively
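A sketch of what such recursions might look like, assuming the data terms are the exponentially-weighted sample correlation S_t = sum over tau of beta^(t - tau) * Y_tau Y_tau^T and the weighted data sum Ybar_t = sum over tau of beta^(t - tau) * Y_tau. The names `update_correlations`, `S`, and `Ybar` are hypothetical and are not taken from the slides.

```python
import numpy as np

def update_correlations(S_prev, Ybar_prev, Y_new, beta):
    """One recursive step for the exponentially-weighted data terms.

    S_t    = beta * S_{t-1}    + Y_t @ Y_t.T
    Ybar_t = beta * Ybar_{t-1} + Y_t
    """
    S = beta * S_prev + Y_new @ Y_new.T
    Ybar = beta * Ybar_prev + Y_new
    return S, Ybar

# Usage: stream hypothetical N x C data blocks one interval at a time.
rng = np.random.default_rng(0)
N, C, beta = 3, 4, 0.9
blocks = [rng.standard_normal((N, C)) for _ in range(5)]
S, Ybar = np.zeros((N, N)), np.zeros((N, C))
for Y in blocks:
    S, Ybar = update_correlations(S, Ybar, Y, beta)
```

Each step touches only the newest data block, so cost and memory per interval stay fixed regardless of how many intervals have elapsed.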
denotes row i of
Slide9
Simulation setup
Kronecker graph [Leskovec et al’10]: N = 64, seed graph
cascades
Non-zero edge weights varied for
Uniform random selection from
Non-smooth edge weight variation
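A minimal sketch of drawing a 64-node stochastic Kronecker graph. The 4 x 4 seed probabilities below are purely hypothetical placeholders, since the seed graph used in the simulations is not given here, and zeroing the diagonal is an added assumption to match a model without self-loops.

```python
import numpy as np

def kronecker_power(seed, k):
    """k-fold Kronecker power of a square seed matrix."""
    out = seed.copy()
    for _ in range(k - 1):
        out = np.kron(out, seed)
    return out

def sample_kronecker_adjacency(seed_probs, k, rng):
    """Sample a directed 0/1 adjacency matrix with edge probabilities
    given by the k-fold Kronecker power of seed_probs."""
    P = kronecker_power(seed_probs, k)
    A = (rng.random(P.shape) < P).astype(int)
    np.fill_diagonal(A, 0)  # assumption: no self-loops
    return A

# Hypothetical 4 x 4 seed; 4**3 = 64 nodes.
seed_probs = np.array([[0.9, 0.5, 0.5, 0.1],
                       [0.5, 0.7, 0.2, 0.3],
                       [0.5, 0.2, 0.7, 0.3],
                       [0.1, 0.3, 0.3, 0.6]])
rng = np.random.default_rng(0)
A0 = sample_kronecker_adjacency(seed_probs, 3, rng)
```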
Slide10
Simulation results
Algorithm parameters
Initialization
Error performance
Slide11
The rise of Kim Jong-un
Web mentions of “Kim Jong-un” tracked from March’11 to Feb.’12
N = 360 websites, C = 466 cascades, T = 45 weeks
Figure panels: t = 10 weeks, t = 40 weeks
Data: SNAP’s “Web and blog datasets” http://snap.stanford.edu/infopath/data.html
Kim Jong-un – Supreme leader of N. Korea
Increased media frenzy following Kim Jong-un’s ascent to power in 2011
Slide12
LinkedIn goes public
Tracking phrase “Reid Hoffman” between March’11 and Feb.’12
N = 125 websites, C = 85 cascades, T = 41 weeks
Figure panels: t = 5 weeks, t = 30 weeks
Figure label: US sites
Data: SNAP’s “Web and blog datasets” http://snap.stanford.edu/infopath/data.html
Datasets include other interesting “memes”: “Amy Winehouse”, “Syria”, “Wikileaks”, …
Slide13
Conclusions
Dynamic SEM for modeling node infection times due to cascades
Topological influences and external sources of information diffusion
Accounts for edge sparsity typical of social networks
ADMM algorithm for tracking slowly-varying network topologies
Corroborating tests with synthetic and real cascades of online social media
Key events manifested as network connectivity changes
Ongoing and future research
Identifiability of sparse and dynamic SEMs
Statistical model consistency tied to
Large-scale MapReduce/GraphLab implementations
Kernel extensions for network topology forecasting
Thank You!
Slide14
ADMM closed-form updates
Update with equality constraints
Update by soft-thresholding operator
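The two kinds of closed-form step named here (an unconstrained LS solve and a soft-thresholding step) can be illustrated on a generic ℓ1-regularized LS problem. This is a minimal sketch of the standard ADMM splitting for the lasso, not the slides' exact updates: it omits the SEM-specific constraints (zero diagonal for the adjacency matrix, diagonal external-influence matrix), and the names `admm_lasso`, `rho`, and `n_iter` are hypothetical.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Elementwise soft-thresholding: sign(v) * max(|v| - kappa, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(X, y, lam, rho=1.0, n_iter=200):
    """Minimize 0.5 * ||y - X w||^2 + lam * ||z||_1 subject to w = z."""
    n = X.shape[1]
    w = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)                      # scaled dual variable
    G = X.T @ X + rho * np.eye(n)        # system matrix, fixed across iterations
    Xty = X.T @ y
    for _ in range(n_iter):
        w = np.linalg.solve(G, Xty + rho * (z - u))  # unconstrained LS update
        z = soft_threshold(w + u, lam / rho)         # soft-thresholding update
        u = u + w - z                                # dual update on w = z
    return z

# With an identity design the minimizer is soft_threshold(y, lam).
y_demo = np.array([3.0, -0.5, 1.5])
z_demo = admm_lasso(np.eye(3), y_demo, lam=1.0)
```

The sparse iterate `z` is returned rather than `w`, since it contains exact zeros after thresholding.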