Approximately counting triangles in sublinear time

Uploaded On 2016-12-06



Presentation Transcript

Slide1

Approximately counting triangles in sublinear time

Talya Eden, Tel Aviv University
Amit Levi, University of Waterloo
Dana Ron, Tel Aviv University
C. Seshadhri, UC Santa Cruz

Slide2

Counting Triangles

A basic graph-theoretic algorithmic question that arises in various applications (e.g., bioinformatics and social networks).

Has been studied quite extensively in the past:
- Algorithms for exact counting:
  - $O(m^{3/2})$: [Itai & Rodeh], [Chiba & Nishizeki] ($m$ is the number of edges)
  - $O(m^{1.41})$: [Alon, Yuster & Zwick] (based on matrix multiplication)
- Algorithms for approximate counting:
  - Many algorithms in a variety of models (including streaming), e.g., [Schank & Wagner], [Tsourakakis], [Avron], [Kolountzakis, Miller, Peng, Tsourakakis], [Chu & Cheng], [Suri & Vassilvitskii], [Arifuzzaman, Khan, Marathe], [Seshadhri, Kolda, Pinar], [Tangwongsan, Pavan, Tirthapura], ...
- All previous algorithms (exact/approximate) read the entire graph.

Slide3

Counting Triangles in Sublinear Time

- Problem considered by [Gonen, R, Shavit], whose main focus was on counting the number of $s$-stars.
- They considered algorithms that had access to degree queries (what is $d(v)$ for vertex $v$?) and neighbor queries (what is the $i$-th neighbor of vertex $v$?).
- They showed that, in general, there is no sublinear algorithm for approximately counting the number of triangles (in contrast to $s$-stars).
- Simple lower-bound construction: [figure: two graphs, one with no triangles and one in which the number of triangles is linear in $n$ (and $m$)]
- Natural question: is there a sublinear algorithm if we also allow vertex-pair queries (is there an edge between $u$ and $v$)?
- We answer this question affirmatively.

Slide4

Our Results

Given query access (degree, neighbor, vertex-pair) to a graph $G$ with $n$ vertices, $m$ edges, and $t$ triangles, and a parameter $\epsilon \in (0,1]$, our algorithm returns $\hat{t}$ such that, with high constant probability,

$(1-\epsilon)\,t \le \hat{t} \le (1+\epsilon)\,t$

- Expected query complexity: $O(n/t^{1/3} + m^{3/2}/t) \cdot \mathrm{poly}(\log n, 1/\epsilon)$
- More precisely: $O(n/t^{1/3} + \min\{m, m^{3/2}/t\}) \cdot \mathrm{poly}(\log n, 1/\epsilon)$
- We also give a matching lower bound (up to $\mathrm{polylog}(n)$ factors and for constant $\epsilon$).

Slide5

Related Work (Sublinear Algorithms)

- Approximating the average degree (number of edges): [Feige], [Goldreich, R]
- Approximating the number of stars: [Gonen, R, Shavit]
- Other sublinear algorithms for approximating graph parameters:
  - MST: [Chazelle, Rubinfeld, Trevisan], [Czumaj & Sohler], [Czumaj, Ergun, Fortnow, Magen, Newman, Rubinfeld, Sohler]
  - Min VC: [Parnas & R], [Nguyen & Onak], [Marko & R], [Yoshida, Yamamoto, Ito], [Onak, R, Rosen, Rubinfeld]
  - Max Match: [Nguyen & Onak], [Yoshida, Yamamoto, Ito]
- Testing triangle-freeness: [Alon, Fischer, Krivelevich, Szegedy], [Alon], [Alon, Kaufman, Krivelevich, R]

Slide6

Towards an algorithm I

Start with the following assumptions (removed later):
- Can sample a uniform edge.
- Can query $t(e)$: the number of triangles that edge $e$ participates in.

Also assume that we know $m$ (an estimate suffices; use [Feige]) and that we know a constant-factor estimate of $t$ (can be removed by search).

Given these assumptions, can get a $(1\pm\epsilon)$ estimate of $t$:
1. Select $q$ edges uniformly at random. Denote the sample by $Y$.
2. Query $t(e)$ for each $e \in Y$.
3. Return $\big(\sum_{e \in Y} t(e) / (3q)\big) \cdot m$.

Analysis: Since $\sum_e t(e) = 3t$, we have $\mathrm{Exp}_e[t(e)] = 3t/m$, so $\mathrm{Exp}\big[\sum_{e \in Y} t(e)/(3q)\big] = t/m$.

To get high constant probability: it suffices to take $q = O((m/t) \cdot \max_e\{t(e)\})$ (for constant $\epsilon$).
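The three steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the helper names (`triangles_per_edge`, `estimate_triangles`) and the toy graph are assumptions made for the example, and the oracle for $t(e)$ is simulated by precomputing all values, which a sublinear algorithm could not afford.

```python
import random

def triangles_per_edge(adj):
    """Simulated oracle: map each edge (u, v), u < v, to the number of
    triangles it participates in (the number of common neighbors)."""
    t_e = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                t_e[(u, v)] = len(adj[u] & adj[v])
    return t_e

def estimate_triangles(edges, t_oracle, q, rng):
    """Sample q uniform edges, query t(e), and return m * sum(t(e)) / (3q)."""
    m = len(edges)
    sample = [rng.choice(edges) for _ in range(q)]
    return m * sum(t_oracle[e] for e in sample) / (3 * q)

# Toy graph (hypothetical): the complete graph K_6, which has 20 triangles.
n = 6
adj = {u: {v for v in range(n) if v != u} for u in range(n)}
t_e = triangles_per_edge(adj)
edges = list(t_e)
estimate = estimate_triangles(edges, t_e, q=50, rng=random.Random(0))
```

In $K_6$ every edge lies in exactly 4 triangles, so the estimate equals 20 for any sample. On graphs where $t(e)$ varies widely between edges, the estimate fluctuates from sample to sample, and more samples are needed.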

Difficulty: $\max_e\{t(e)\}$ may be large.

Slide7

Towards an algorithm II: Bounding $t(e)$

Modify $t(e)$ so that $e = (u,v)$ is only assigned triangles $(u,v,w)$ such that $d(w) > d(u), d(v)$ (break ties by id).

Observe: each triangle is assigned to a single edge, so $\sum_e t(e) = t$.

Claim: $t(e) = O(m^{1/2})$.

Proof: If $d(u) \le m^{1/2}$, then immediate. Otherwise ($d(u) > m^{1/2}$), the number of neighbors $w$ of $u$ with degree at least $m^{1/2}$ is $O(m^{1/2})$ (or else we get more than $m$ edges).

If we have oracle access to (the modified definition of) $t(e)$ and can sample edges uniformly, we get an algorithm with query complexity $O((m/t) \cdot \max_e\{t(e)\}) = O(m^{3/2}/t)$.
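The assignment rule can be checked on a small example. The sketch below is illustrative only: the function name `assigned_triangle_counts` and the "fan" graph are assumptions, and the constant 2 in the final check comes from the standard counting argument (fewer than $2m^{1/2}$ vertices can have degree above $m^{1/2}$), not from the slides.

```python
import math
from itertools import combinations

def assigned_triangle_counts(adj):
    """Count, for each edge (u, v), the triangles (u, v, w) in which w has
    the highest degree of the three (ties broken by vertex id)."""
    rank = lambda x: (len(adj[x]), x)
    counts = {}
    for u in adj:
        for v in adj[u]:
            if u < v:
                counts[(u, v)] = sum(
                    1 for w in adj[u] & adj[v]
                    if rank(w) > rank(u) and rank(w) > rank(v)
                )
    return counts

# Toy "fan" graph (hypothetical): hub 0 joined to the path 1-2-3-4-5.
edges = [(0, i) for i in range(1, 6)] + [(i, i + 1) for i in range(1, 5)]
adj = {v: set() for v in range(6)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

counts = assigned_triangle_counts(adj)
m = len(edges)
# Brute-force triangle count, used only to check the assignment.
t = sum(1 for a, b, c in combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b])
max_assigned = max(counts.values())
```

Here every triangle contains the hub, so each one is assigned to its path edge. The hub edges, which participate in up to two triangles each, are assigned none.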

[figure: a triangle on vertices $u$, $v$, $w$]

Slide8

Towards an algorithm III: Removing oracle assumption (for $t(e)$)

Procedure replacing the oracle for $t(e)$, given edge $e = (u,v)$:
1. Consider the lower-degree endpoint of $e = (u,v)$; w.l.o.g., it is $u$.
2. Select a neighbor $w$ of $u$ uniformly at random.
3. Query the pair $(w,v)$.
4. If $(w,v) \in E$ and $d(w) > d(u), d(v)$, set $\tilde{t}(e) = d(u)$; otherwise, set $\tilde{t}(e) = 0$.

[figure: edge $(u,v)$ with a sampled neighbor $w$ of $u$; the pair $(w,v)$ is queried]

Analysis (for fixed $e$):
- $\mathrm{Exp}[\tilde{t}(e)] = \Pr[\text{hit triangle assigned to } e] \cdot d(u) = (t(e)/d(u)) \cdot d(u) = t(e)$.
- If $d(u) \le m^{1/2}$, then $\tilde{t}(e) \le m^{1/2}$.
- Otherwise, to reduce the variance "internal to the procedure", let $\tilde{t}(e)$ be the average value over $d(u)/m^{1/2}$ repetitions of the above.

Resulting algorithm for estimating $t$:
1. Select $q = O(m^{3/2}/t)$ edges uniformly at random. Denote the sample by $Y$.
2. Run the procedure on each $e \in Y$ to get $\tilde{t}(e)$.
3. Return $(m/q) \sum_{e \in Y} \tilde{t}(e)$.

Expected query complexity: $O(m^{3/2}/t)$.
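A minimal Python sketch of the procedure and the resulting estimator follows. The function names, the toy graph, and the choice to round the repetition count up to an integer are assumptions for illustration. Uniform edge sampling is still taken for granted here, as on the slide.

```python
import math
import random

def procedure(adj, m, e, rng):
    """Randomized stand-in for the t(e) oracle; its expectation is the
    number of triangles assigned to e."""
    rank = lambda x: (len(adj[x]), x)       # degree, ties broken by id
    u, v = sorted(e, key=rank)              # u is the lower-degree endpoint
    d_u = len(adj[u])
    # Average over about d(u)/sqrt(m) repetitions; this is 1 when d(u) <= sqrt(m).
    reps = max(1, math.ceil(d_u / math.sqrt(m)))
    neighbors = sorted(adj[u])
    total = 0
    for _ in range(reps):
        w = rng.choice(neighbors)               # neighbor query on u
        if w in adj[v] and rank(w) > rank(v):   # pair query and degree check
            total += d_u
    return total / reps

def estimate_triangles(adj, q, rng):
    """Return (m/q) times the sum of the procedure's outputs on q uniform edges."""
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    m = len(edges)
    return (m / q) * sum(procedure(adj, m, rng.choice(edges), rng) for _ in range(q))

# Toy "fan" graph (hypothetical): hub 0 joined to the path 1-2-3-4-5; it has 4 triangles.
adj = {v: set() for v in range(6)}
for u, v in [(0, i) for i in range(1, 6)] + [(i, i + 1) for i in range(1, 5)]:
    adj[u].add(v)
    adj[v].add(u)

estimate = estimate_triangles(adj, q=20000, rng=random.Random(1))
```

The sample size here is far larger than the graph and is chosen only so that the estimate is visibly close to the true count of 4. It does not reflect the $O(m^{3/2}/t)$ bound.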

Slide9

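Towards an algorithm IV: Removing assumption on uniform edge selection

Idea: Select a subset $S$ of vertices uniformly at random, and consider the set of incident ("ordered") edges $E(S) = \{(u,v) : u \in S, v \in \Gamma(u)\}$.

If we query the degrees of all vertices in $S$, we can sample an edge uniformly in $E(S)$.

Algorithm:
1. Select $s = O(n/t^{1/3})$ vertices uniformly at random. Denote the sample by $S$.
2. Select $q = O(m^{3/2}/t)$ edges uniformly at random in $E(S)$. Denote the sample by $Y$.
3. Run the procedure on each $e \in Y$ to get $\tilde{t}(e)$.
4. Return $(n/2s)\,(|E(S)|/q) \sum_{e \in Y} \tilde{t}(e)$.

[figure: sampled vertex set $S$ with a vertex $u \in S$ and its incident edges]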
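The vertex-sampling step can be sketched as follows. To isolate that step, this illustration replaces the randomized procedure with exact assigned counts on ordered edges (an assumption made for brevity), samples $S$ with replacement (also an assumption), and uses hypothetical names throughout.

```python
import random

def assigned_counts(adj):
    """Exact assigned t(e) for every ordered edge (u, v); stands in for the
    randomized procedure in this sketch."""
    rank = lambda x: (len(adj[x]), x)
    return {
        (u, v): sum(1 for w in adj[u] & adj[v] if rank(w) > max(rank(u), rank(v)))
        for u in adj for v in adj[u]
    }

def estimate_via_vertex_sample(adj, s, q, t_of, rng):
    """Return (n/2s) * (|E(S)|/q) * sum of t(e) over q edges drawn from E(S)."""
    n = len(adj)
    vertices = sorted(adj)
    S = [rng.choice(vertices) for _ in range(s)]   # uniform vertices, with replacement
    degrees = [len(adj[u]) for u in S]             # degree queries
    size_ES = sum(degrees)                         # |E(S)|
    if size_ES == 0:
        return 0.0
    total = 0
    for _ in range(q):
        u = rng.choices(S, weights=degrees)[0]     # source chosen with prob. d(u)/|E(S)|
        v = rng.choice(sorted(adj[u]))             # uniform neighbor of u
        total += t_of[(u, v)]
    return (n / (2 * s)) * (size_ES / q) * total

# Toy "fan" graph (hypothetical): hub 0 joined to the path 1-2-3-4-5; it has 4 triangles.
adj = {v: set() for v in range(6)}
for u, v in [(0, i) for i in range(1, 6)] + [(i, i + 1) for i in range(1, 5)]:
    adj[u].add(v)
    adj[v].add(u)

t_of = assigned_counts(adj)
rng = random.Random(2)
trials = [estimate_via_vertex_sample(adj, s=3, q=20, t_of=t_of, rng=rng) for _ in range(2000)]
mean_estimate = sum(trials) / len(trials)
```

Averaging many independent runs only illustrates that the estimator is centered on $t$. Individual runs can be far off, and controlling that deviation is what the choice of $s$ and the modified definition of $t(e)$ address.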

Can show that, by modifying $t(e)$ and the procedure that computes $\tilde{t}(e)$, we get an algorithm that computes a $(1\pm\epsilon)$ estimate of $t$ by performing $O(n/t^{1/3} + \min\{m^{3/2}/t, m\})$ queries in expectation.

$\mathrm{Exp}\big[(n/2s)\,(|E(S)|/q) \sum_{e \in Y} \tilde{t}(e)\big] = (n/2s)\,((s \cdot d_{avg})/q) \cdot q \cdot (t/m) = t$ (almost...)

Slide10

Towards an algorithm IV: Removing assumption on uniform edge selection

Algorithm (almost):
1. Select $s = O(n/t^{1/3})$ vertices uniformly at random. Denote the sample by $S$.
2. Select $q = O(m^{3/2}/t)$ edges uniformly at random in $E(S)$. Denote the sample by $Y$.
3. Run the procedure on each $e \in Y$ to get $\tilde{t}(e)$.
4. Return $(n/2s)\,(|E(S)|/q) \sum_{e \in Y} \tilde{t}(e)$.

What's missing?
- By slightly generalizing what we have already shown, with high probability $(|E(S)|/q) \sum_{e \in Y} \tilde{t}(e)$ is a good approximation of $\sum_{e \in E(S)} t(e)$.
- If we write $\sum_{e \in E(S)} t(e)$ as $\sum_{v \in S} t(v)$, where $t(v) = \sum_{e \in E(v)} t(e)$, then we would like to show that $(n/s) \sum_{v \in S} t(v)$ is close to $\sum_{v \in V} t(v) = 2t$.
- We show this for a variant of $t(v)$ (and of $t(e)$), which requires modifying the procedure for $\tilde{t}(e)$.
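As a sanity check on the target quantity, the expectation works out exactly when $S$ consists of $s$ vertices drawn uniformly at random. This is a sketch of the easy half only; the slides are concerned with closeness, which needs a bound on how large individual $t(v)$ can be.

$$\mathrm{Exp}\Big[\frac{n}{s} \sum_{v \in S} t(v)\Big] = \frac{n}{s} \cdot s \cdot \frac{1}{n} \sum_{v \in V} t(v) = \sum_{v \in V} t(v) = 2t$$

The factor 2 arises because each edge appears in both orientations among the ordered edges, so each assigned triangle is counted twice.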

Slide11

Lower bound idea(s)

Recall: $\Omega(n/t^{1/3} + \min\{m^{3/2}/t, m\})$

The lower bound of $\Omega(n/t^{1/3})$ is a simple "hitting" lower bound: with fewer than $n/t^{1/3}$ queries, one cannot distinguish between:
- An empty graph: no triangles.
- A graph containing a clique over $t^{1/3}$ vertices and an independent set on the remaining $n - t^{1/3}$ vertices: $\Theta(t)$ triangles.

Slide12

Lower bound idea(s) continued

Lower bound of $\Omega(m^{3/2}/t)$ (for $t \ge m^{1/2}$):
- Basic structure: a complete bipartite graph with both sides of size $m^{1/2}$ (the remaining vertices form an independent set). No triangles.
- Consider adding edges between vertices on the left-hand side of the bipartite graph. Each edge gives $m^{1/2}$ triangles. (For example, for $t = \Theta(m)$, add a (random) perfect matching.)
- Small difficulty: the degrees of the left-hand-side vertices "give it away". Take care of this by removing bipartite edges and adding matching edges on the right-hand side.
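The basic structure can be checked on a small instance. The sketch below is an illustration under stated assumptions: the names are hypothetical, the side size `r` plays the role of $m^{1/2}$, and the degree-fixing step (removing bipartite edges and adding right-hand-side matching edges) is omitted.

```python
from itertools import combinations

def count_triangles(adj):
    """Brute-force triangle count, adequate for small illustrative graphs."""
    return sum(1 for a, b, c in combinations(sorted(adj), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])

def bipartite_with_lhs_edges(r, k):
    """Complete bipartite graph K_{r,r} plus k disjoint edges inside the
    left-hand side (requires 2k <= r)."""
    left = [("L", i) for i in range(r)]
    right = [("R", i) for i in range(r)]
    adj = {v: set() for v in left + right}
    for a in left:
        for b in right:
            adj[a].add(b)
            adj[b].add(a)
    for i in range(k):
        a, b = left[2 * i], left[2 * i + 1]
        adj[a].add(b)
        adj[b].add(a)
    return adj

r = 6
no_added = count_triangles(bipartite_with_lhs_edges(r, 0))
with_matching = count_triangles(bipartite_with_lhs_edges(r, 3))
```

With `r = 6`, the bipartite graph alone has no triangles, and each of the 3 added left-hand-side edges closes one triangle per right-hand-side vertex, for 18 in total.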

Intuition for the lower bound: Let $k$ be the number of added edges, so that $k = t/m^{1/2}$. The probability of "hitting" an added edge (or a removed edge) is $k/m = t/m^{3/2}$.

Slide13

Summary

Present an algorithm computing $\hat{t}$ such that, with high constant probability, $(1-\epsilon)\,t \le \hat{t} \le (1+\epsilon)\,t$.

Expected query complexity: $O(n/t^{1/3} + \min\{m, m^{3/2}/t\}) \cdot \mathrm{poly}(\log n, 1/\epsilon)$

Main ideas:
- Assign triangles to edges so that each edge $e$ is assigned $t(e) = O(m^{1/2})$ triangles (if we had an oracle for $t(e)$ and could sample edges uniformly, we would be done).
- Give a simple procedure for computing a random variable $\tilde{t}(e)$ such that $\mathrm{Exp}[\tilde{t}(e)] = t(e)$ (if we could sample edges uniformly, we would be done).
- Replace uniform sampling of edges from the entire graph by uniformly sampling edges incident to a uniformly sampled subset of vertices.

Matching lower bound (up to $\mathrm{polylog}(n)$ factors and for constant $\epsilon$).

Slide14

Thanks