Slide1
Approximately counting triangles in sublinear time
Talya Eden, Tel Aviv University
Amit Levi, University of Waterloo
Dana Ron, Tel Aviv University
C. Seshadhri, UC Santa Cruz
Slide2
Counting Triangles
Basic graph-theoretic algorithmic question that arises in various applications (e.g., bioinformatics and social networks).
Has been studied quite extensively in the past:
- Algorithms for exact counting:
  - $O(m^{3/2})$: [Itai&Rodeh], [Chiba&Nishizeki] ($m$ is the number of edges)
  - $O(m^{1.41})$: [Alon,Yuster&Zwick] (based on matrix multiplication)
- Algorithms for approximate counting: many algorithms in a variety of models (including streaming), e.g., [Schank&Wagner], [Tsourakakis], [Avron], [Kolountzakis,Miller,Peng,Tsourakakis], [Chu&Cheng], [Suri&Vassilvitskii], [Arifuzzaman,Khan,Marathe], [Seshadhri,Kolda,Pinar], [Tangwongsan,Pavan,Tirthapura], ...
- All previous algorithms (exact/approximate) read the entire graph
Slide3
Counting Triangles in Sublinear Time
- Problem considered by [Gonen,R,Shavit], whose main focus was on counting the number of s-stars
- They considered algorithms that had access to degree queries (what is $d(v)$ for vertex $v$) and neighbor queries (what is the $i$-th neighbor of vertex $v$)
- Showed that in general there is no sublinear algorithm for approximately counting the number of triangles (in contrast to s-stars)
- Simple LB construction:
[Figure: two graphs, one with no triangles, the other with a number of triangles linear in n (and m)]
- Natural question: Is there a sublinear algorithm if we also allow vertex-pair queries (is there an edge between $u$ and $v$)?
- We answer the question affirmatively
Slide4
Our Results
- Given query access (degree, neighbor, vertex-pair) to graph $G$ with $n$ vertices, $m$ edges, $t$ triangles and parameter $\epsilon \in (0,1]$, our algorithm returns $\hat{t}$ s.t. with high constant probability $(1-\epsilon)t \le \hat{t} \le (1+\epsilon)t$
- Expected query complexity: $O(n/t^{1/3} + m^{3/2}/t) \cdot \mathrm{poly}(\log n, 1/\epsilon)$
- More precisely: $O(n/t^{1/3} + \min\{m, m^{3/2}/t\}) \cdot \mathrm{poly}(\log n, 1/\epsilon)$
- Also give matching lower bound (up to $\mathrm{polylog}(n)$ factors and for constant $\epsilon$)
Slide5
Related Works (Sublinear algs)
- Approximating the average degree (number of edges): [Feige], [Goldreich,R]
- Approximating the number of stars: [Gonen,R,Shavit]
- Other sublinear algorithms for approximating graph parameters:
  - MST: [Chazelle,Rubinfeld,Trevisan], [Czumaj&Sohler], [Czumaj,Ergun,Fortnow,Magen,Newman,Rubinfeld,Sohler]
  - Min VC: [Parnas&R], [Nguyen&Onak], [Marko&R], [Yoshida,Yamamoto,Ito], [Onak,R,Rosen,Rubinfeld]
  - Max Match: [Nguyen&Onak], [Yoshida,Yamamoto,Ito]
- Testing Triangle-Freeness: [Alon,Fischer,Krivelevich,Szegedy], [Alon], [Alon,Kaufman,Krivelevich,R]
Slide6
Towards an algorithm I
Start with the following assumptions (removed later):
- Can sample a uniform edge
- Can query $t(e)$: the number of triangles that edge $e$ participates in
Also assume that we know $m$ (an estimate suffices, use [Feige]) and that we know a constant-factor estimate of $t$ (can remove by search).
Given these assumptions can get a $(1 \pm \epsilon)$ estimate of $t$:
1. Select $q$ edges uniformly at random. Denote sample by $Y$
2. Query $t(e)$ for each $e$ in $Y$
3. Return $\left(\sum_{e \in Y} t(e) / (3q)\right) \cdot m$
Analysis:
- Since $\sum_e t(e) = 3t$, $\mathrm{Exp}_e[t(e)] = 3t/m$, so $\mathrm{Exp}\left[\sum_{e \in Y} t(e)/(3q)\right] = t/m$
- To get h.c.p.: suffices to take $q = O((m/t) \cdot \max_e\{t(e)\})$ (for constant $\epsilon$)
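The three steps above can be sketched in Python as follows. This is only an illustration under the slide's assumptions: the function names, the dictionary-based oracle `t_of_e`, and the example graph are all hypothetical, and the oracle is computed by brute force here (which of course is not sublinear).

```python
import random

def triangles_per_edge(adj):
    """Hypothetical oracle: t(e) = number of triangles that contain edge e."""
    return {
        (u, v): len(adj[u] & adj[v])
        for u in adj for v in adj[u] if u < v
    }

def estimate_with_oracle(edges, t_of_e, q, rng):
    """Sample q uniform edges (with replacement), query t(e) for each,
    and return (sum of t(e) / (3q)) * m."""
    m = len(edges)
    sample = [rng.choice(edges) for _ in range(q)]
    return sum(t_of_e[e] for e in sample) / (3 * q) * m

# Example: the complete graph K_6, which has C(6, 3) = 20 triangles.
n = 6
adj = {v: {u for u in range(n) if u != v} for v in range(n)}
t_of_e = triangles_per_edge(adj)
edges = sorted(t_of_e)
estimate = estimate_with_oracle(edges, t_of_e, q=50, rng=random.Random(0))
```

In K_6 every edge lies in exactly 4 triangles, so the estimator has no variance and `estimate` equals 20 up to floating-point rounding. On graphs where a few edges carry most of the triangles, the same code needs many more samples.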
Difficulty: $\max_e\{t(e)\}$ may be large
Slide7
Towards an algorithm II: Bounding t(e)
- Modify $t(e)$ so that $e = (u,v)$ is only assigned triangles $(u,v,w)$ s.t. $d(w) > d(u), d(v)$ (break ties by id)
- Observe: each triangle is assigned to a single edge: $\sum_e t(e) = t$
- Claim: $t(e) = O(m^{1/2})$
- Proof: Let $u$ be the lower-degree endpoint of $e$. If $d(u) \le m^{1/2}$, then immediate. Otherwise ($d(u) > m^{1/2}$), every assigned $w$ has degree above $m^{1/2}$, and the number of neighbors $w$ of $u$ with degree at least $m^{1/2}$ is $O(m^{1/2})$ (or else get more than $m$ edges)
- If have oracle access to (modified definition of) $t(e)$ and can sample edges uniformly, get an algorithm with query complexity $O((m/t) \cdot \max_e\{t(e)\}) = O(m^{3/2}/t)$
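A small brute-force check of the assignment rule may help. The sketch below is an assumption-laden illustration, not the authors' code: `assigned_triangles` is a hypothetical helper, vertices are ranked by the pair (degree, id) to break ties, and the constant 2 in the bound comes from counting vertices of degree above $m^{1/2}$ (there are fewer than $2m^{1/2}$ of them).

```python
import math

def assigned_triangles(adj):
    """Modified t(e): triangle {u, v, w} is charged to edge (u, v) only if w
    outranks both endpoints, ranking vertices by (degree, id)."""
    rank = {v: (len(adj[v]), v) for v in adj}
    return {
        (u, v): sum(1 for w in adj[u] & adj[v]
                    if rank[w] > max(rank[u], rank[v]))
        for u in adj for v in adj[u] if u < v
    }

# Example: the complete graph K_5 has m = 10 edges and t = 10 triangles.
n = 5
adj = {v: {u for u in range(n) if u != v} for v in range(n)}
t_of_e = assigned_triangles(adj)
m = len(t_of_e)
total = sum(t_of_e.values())      # each triangle is counted once
largest = max(t_of_e.values())    # compare against 2 * sqrt(m)
bound = 2 * math.sqrt(m)
```

Here `total` comes out to 10, matching $\sum_e t(e) = t$, and `largest` stays below `bound`.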
[Figure: triangle on vertices u, v, w]
Slide8
Towards an algorithm III: Removing oracle assumption (for t(e))
Procedure replacing oracle for $t(e)$ given edge $e=(u,v)$:
1. Consider the lower-degree endpoint of $e=(u,v)$; wlog, it's $u$
2. Select a neighbor $w$ of $u$ uniformly at random
3. Query the pair $(w,v)$
4. If $(w,v) \in E$ and $d(w) > d(u), d(v)$, set $\chi(e) = d(u)$; otherwise, $\chi(e) = 0$
[Figure: edge (u, v) with a sampled neighbor w of u; the pair (w, v) is queried]
Analysis (for fixed $e$):
- $\mathrm{Exp}[\chi(e)] = \Pr[\text{hit triangle assigned to } e] \cdot d(u) = (t(e)/d(u)) \cdot d(u) = t(e)$
- If $d(u) \le m^{1/2}$ then $\chi(e) \le m^{1/2}$
- Otherwise, to reduce variance "internal to procedure", let $\chi(e)$ be the average value over $d(u)/m^{1/2}$ repetitions of the above
Resulting algorithm for estimating $t$:
1. Select $q = O(m^{3/2}/t)$ edges uniformly at random. Denote sample by $Y$
2. Run procedure on each $e$ in $Y$ to get $\chi(e)$
3. Return $(m/q) \sum_{e \in Y} \chi(e)$
Expected query complexity: $O(m^{3/2}/t)$
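The procedure and the resulting estimator can be sketched as below. Everything named here (`Graph`, `chi`, `estimate_triangles`) is hypothetical, the graph class simply simulates degree, neighbor and vertex-pair queries on an in-memory edge list, and uniform edge sampling is still assumed to be available, as on the slide.

```python
import math
import random

class Graph:
    """In-memory stand-in for the three query types (an assumption of this sketch)."""
    def __init__(self, edges):
        self.edges = list(edges)          # assumed simple: no loops, no duplicates
        self.m = len(self.edges)
        self.nbr_set = {}
        for u, v in self.edges:
            self.nbr_set.setdefault(u, set()).add(v)
            self.nbr_set.setdefault(v, set()).add(u)
        self.nbr_list = {v: sorted(s) for v, s in self.nbr_set.items()}

    def degree(self, v):                  # degree query
        return len(self.nbr_list[v])

    def neighbor(self, v, i):             # neighbor query (0-based index)
        return self.nbr_list[v][i]

    def has_edge(self, u, v):             # vertex-pair query
        return v in self.nbr_set[u]

def chi(g, e, rng):
    """Random variable whose expectation is the modified t(e)."""
    u, v = sorted(e, key=g.degree)        # u is the lower-degree endpoint
    du = g.degree(u)
    rank = lambda x: (g.degree(x), x)     # ties broken by id
    reps = max(1, math.ceil(du / math.sqrt(g.m)))
    total = 0
    for _ in range(reps):
        w = g.neighbor(u, rng.randrange(du))
        if g.has_edge(w, v) and rank(w) > max(rank(u), rank(v)):
            total += du
    return total / reps                   # average over the repetitions

def estimate_triangles(g, q, rng):
    """Return (m / q) * sum of chi(e) over q uniformly sampled edges."""
    sample = [rng.choice(g.edges) for _ in range(q)]
    return g.m / q * sum(chi(g, e, rng) for e in sample)

# Example: K_5 has 10 triangles; the estimate fluctuates around that value.
k5 = Graph([(u, v) for u in range(5) for v in range(u + 1, 5)])
estimate = estimate_triangles(k5, q=2000, rng=random.Random(0))
```

The repetition count `reps` mirrors the variance-reduction step: low-degree endpoints get one trial, high-degree endpoints get about $d(u)/m^{1/2}$ trials.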
Slide9
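Towards an algorithm IV: Removing assumption on unif edge selection
Idea: Select a subset $S$ of vertices uniformly at random, and consider the set of incident ("ordered") edges $E(S) = \{(u,v) : u \in S, v \in \Gamma(u)\}$. If we query the degree of all vertices in $S$, we can sample an edge uniformly in $E(S)$.
Algorithm:
1. Select $s = O(n/t^{1/3})$ vertices uniformly at random. Denote sample by $S$
2. Select $q = O(m^{3/2}/t)$ edges uniformly at random in $E(S)$. Denote sample by $Y$
3. Run procedure on each $e$ in $Y$ to get $\chi(e)$
4. Return $(n/2s) \cdot (|E(S)|/q) \cdot \sum_{e \in Y} \chi(e)$
[Figure: sampled vertex set S with a vertex u in S and its incident edges]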
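One way to realize "sample an edge uniformly in E(S)" with degree and neighbor queries is to pick a uniform index into the concatenated neighbor lists of the sampled vertices. The sketch below assumes this approach; the function name, its callback parameters, and sampling S with replacement are choices made for illustration only.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def sample_edges_from_vertex_sample(degree, neighbor, n, s, q, rng):
    """Pick s uniform vertices S (with replacement), then q uniform ordered
    edges (u, v) with u in S, using only degree and neighbor queries."""
    S = [rng.randrange(n) for _ in range(s)]
    prefix = list(accumulate(degree(u) for u in S))  # cumulative degrees
    size_ES = prefix[-1] if prefix else 0            # |E(S)|
    if size_ES == 0:
        return S, 0, []
    Y = []
    for _ in range(q):
        r = rng.randrange(size_ES)       # uniform position in E(S)
        j = bisect_right(prefix, r)      # sampled vertex whose block holds r
        u = S[j]
        offset = r - (prefix[j - 1] if j else 0)
        Y.append((u, neighbor(u, offset)))
    return S, size_ES, Y

# Example on a star: center 0 with leaves 1..4.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
S, size_ES, Y = sample_edges_from_vertex_sample(
    degree=lambda v: len(adj[v]),
    neighbor=lambda v, i: adj[v][i],
    n=len(adj), s=3, q=10, rng=random.Random(1),
)
```

The cost is s degree queries plus one neighbor query per sampled edge, which is where the $n/t^{1/3}$ term of the query complexity enters.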
Can show that by modifying $t(e)$ and the procedure that computes $\chi(e)$, get an algorithm that computes a $(1 \pm \epsilon)$ estimate of $t$ by performing $O(n/t^{1/3} + \min\{m^{3/2}/t, m\})$ queries in expectation.
$\mathrm{Exp}\left[(n/2s) \cdot (|E(S)|/q) \cdot \sum_{e \in Y} \chi(e)\right] = (n/2s) \cdot ((s \cdot d_{avg})/q) \cdot q \cdot (t/m) = t$ (almost..)
Slide10
Towards an algorithm IV: Removing assumption on unif edge selection
Algorithm (almost):
1. Select $s = O(n/t^{1/3})$ vertices uniformly at random. Denote sample by $S$
2. Select $q = O(m^{3/2}/t)$ edges uniformly at random in $E(S)$. Denote sample by $Y$
3. Run procedure on each $e$ in $Y$ to get $\chi(e)$
4. Return $(n/2s) \cdot (|E(S)|/q) \cdot \sum_{e \in Y} \chi(e)$
What's missing?
- By slightly generalizing what we have already shown, whp, $(|E(S)|/q) \sum_{e \in Y} \chi(e)$ is a good approximation of $\sum_{e \in E(S)} t(e)$
- Write $\sum_{e \in E(S)} t(e)$ as $\sum_{v \in S} t(v)$, where $t(v) = \sum_{e \in E(v)} t(e)$
- Would like to show that $(n/s) \sum_{v \in S} t(v)$ is close to $\sum_{v \in V} t(v) = 2t$
- We show this for a variant of $t(v)$ ($t(e)$), which requires modifying the procedure for $\chi(e)$
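The target value $2t$ can be checked in expectation with a short calculation. This is a sketch under two assumptions that the slides do not spell out: $S$ consists of $s$ independent uniform vertices, and both orientations of an edge carry the same value $t(e)$, with $\sum_e t(e) = t$ over unordered edges.

```latex
\begin{align*}
\mathrm{Exp}_S\!\left[\frac{n}{s}\sum_{v \in S} t(v)\right]
  &= \frac{n}{s}\cdot s \cdot \frac{1}{n}\sum_{v \in V} t(v)
   = \sum_{v \in V} t(v) \\
  &= \sum_{v \in V}\ \sum_{e \in E(v)} t(e)
   = 2\sum_{e \in E} t(e)
   = 2t .
\end{align*}
```

The factor 2 appears because each unordered edge $\{u,v\}$ shows up once in $E(u)$ and once in $E(v)$. The calculation only gives the expectation; concentration is the part that needs the modified $t(v)$.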
Slide11
Lower bound idea(s)
Recall: $\Omega(n/t^{1/3} + \min\{m^{3/2}/t, m\})$
LB of $\Omega(n/t^{1/3})$ is a simple "hitting" lower bound. With fewer than $n/t^{1/3}$ queries, cannot distinguish between:
- An empty graph: no triangles
- A graph containing a clique on over $t^{1/3}$ vertices, with the remaining $n - t^{1/3}$ vertices forming an independent set: $\Theta(t)$ triangles
Slide12
Lower bound idea(s) continued
LB of $\Omega(m^{3/2}/t)$ (for $t \ge m^{1/2}$)
- Basic structure: complete bipartite graph with both sides of size $m^{1/2}$ (remaining vertices form an independent set). No triangles.
- Consider adding edges between vertices on the lhs of the bipartite graph. Each edge gives $m^{1/2}$ triangles. (For example: for $t = \Theta(m)$, add a (random) perfect matching.)
- Small difficulty: degrees of lhs vertices "give it away". Take care of this by removing bipartite edges and adding matching edges on the rhs.
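The basic structure is easy to check by brute force on a small instance. The sketch below is a simplified illustration with hypothetical helper names: it builds the complete bipartite graph and adds matching edges on the left side, but it omits the degree-fixing step (removing bipartite edges and adding matching edges on the right).

```python
from itertools import combinations

def bipartite_with_left_matching(a, k):
    """Complete bipartite graph with sides of size a, plus k disjoint edges
    inside the left side (requires 2 * k <= a)."""
    left = list(range(a))
    right = list(range(a, 2 * a))
    adj = {v: set() for v in left + right}
    for u in left:
        for v in right:
            adj[u].add(v)
            adj[v].add(u)
    for i in range(k):
        u, v = left[2 * i], left[2 * i + 1]
        adj[u].add(v)
        adj[v].add(u)
    return adj

def count_triangles(adj):
    """Brute-force triangle count, fine for tiny examples only."""
    return sum(
        1
        for u, v, w in combinations(sorted(adj), 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )

a = 6                                   # plays the role of m^{1/2}
no_added = count_triangles(bipartite_with_left_matching(a, 0))
with_matching = count_triangles(bipartite_with_left_matching(a, 3))
```

With no added edges the count is 0, and each of the 3 added edges closes a triangle with every right-side vertex, so the second count is 3 * 6 = 18.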
Intuition for LB: Let $k$ be the number of added edges, so that $k = t/m^{1/2}$. The probability of "hitting" an added edge (or a removed edge) is $k/m = t/m^{3/2}$.
Slide13
Summary
- Present algorithm computing $\hat{t}$ s.t. with high constant probability $(1-\epsilon)t \le \hat{t} \le (1+\epsilon)t$
- Expected query complexity: $O(n/t^{1/3} + \min\{m, m^{3/2}/t\}) \cdot \mathrm{poly}(\log n, 1/\epsilon)$
- Main ideas:
  - Assign triangles to edges so that each edge $e$ is assigned $t(e) = O(m^{1/2})$ triangles (if had oracle to $t(e)$ and could sample edges uniformly, would be done)
  - Give simple procedure for computing r.v. $\chi(e)$ s.t. $\mathrm{Exp}[\chi(e)] = t(e)$ (if could sample edges uniformly, would be done)
  - Replace uniform sampling of edges from the entire graph by uniformly sampling edges incident to a uniformly sampled subset of vertices
- Matching lower bound (up to $\mathrm{polylog}(n)$ factors and for constant $\epsilon$)
Slide14
Thanks