Statistical properties of network community structure

Uploaded On 2016-07-04





Presentation Transcript

Slide1

Statistical properties of network community structure

Jure Leskovec, CMU

Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research

Slide2

Network communities

Communities: sets of nodes with lots of connections inside and few to outside (the rest of the network)

Assumption: networks are (hierarchically) composed of communities (modules)

Communities, clusters, groups, modules

Our question:

Are large networks really like this?

Slide3

Community score (quality)

How community-like is a set of nodes?

Need a natural, intuitive measure

Conductance (normalized cut)

Φ(S) = # edges cut / # edges inside

Small Φ(S) corresponds to more community-like sets of nodes

[Figure: node sets S and S’]
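The score defined on this slide can be computed directly from an edge list. The following is a minimal sketch, not the authors' code: the function name `community_score`, the edge-list representation, and the choice to return infinity for a set with no internal edges are all assumptions made for illustration.

```python
def community_score(edges, nodes):
    """Score from the slide: (# edges cut) / (# edges inside) for node set S.

    edges: iterable of (u, v) pairs of an undirected graph (assumed format).
    nodes: the candidate community S.
    Returns infinity when S has no internal edges (an assumption; the slide
    does not cover this case).
    """
    s = set(nodes)
    # An edge is cut when exactly one endpoint lies in S.
    cut = sum(1 for u, v in edges if (u in s) != (v in s))
    # An edge is inside when both endpoints lie in S.
    inside = sum(1 for u, v in edges if u in s and v in s)
    return float("inf") if inside == 0 else cut / inside


# Hypothetical example graph: two triangles joined by the single edge (2, 3).
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(community_score(two_triangles, {0, 1, 2}))  # 1 cut edge, 3 inside edges
print(community_score(two_triangles, {0, 1}))     # 2 cut edges, 1 inside edge
```

In this toy graph one full triangle scores 1/3, while a pair of nodes taken from a triangle scores 2, so the triangle is the more community-like set.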

Slide4

Community score (quality)

Score: Φ(S) = # edges cut / # edges inside

What is the “best” community of 5 nodes?

Slide5

Community score (quality)

Score: Φ(S) = # edges cut / # edges inside

Bad community: Φ = 5/6 = 0.83

What is the “best” community of 5 nodes?

Slide6

Community score (quality)

Score: Φ(S) = # edges cut / # edges inside

Bad community: Φ = 5/7 = 0.7

Better community: Φ = 2/5 = 0.4

What is the “best” community of 5 nodes?

Slide7

Community score (quality)

Score: Φ(S) = # edges cut / # edges inside

Bad community: Φ = 5/7 = 0.7

Better community: Φ = 2/5 = 0.4

Best community: Φ = 2/8 = 0.25

What is the “best” community of 5 nodes?

Slide8

Network Community Profile Plot

We define: Network community profile (NCP) plot

Plot the score of the best community of size k

Search over all subsets of size k and find the best: Φ(k=5) = 0.25

NCP plot is intractable to compute

Use an approximation algorithm
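To make the definition concrete, the exhaustive search can be written out for a toy graph. This is only a sketch of the brute-force definition, which is exactly what becomes intractable on large networks; it is not the approximation algorithm used in the talk. The names `score` and `ncp_brute_force`, the edge-list format, and the example graph are assumptions.

```python
from itertools import combinations


def score(edges, s):
    """(# edges cut) / (# edges inside) for node set s; inf if no inside edges."""
    cut = sum(1 for u, v in edges if (u in s) != (v in s))
    inside = sum(1 for u, v in edges if u in s and v in s)
    return float("inf") if inside == 0 else cut / inside


def ncp_brute_force(edges):
    """Map each size k to the best (lowest) score over all node subsets of size k.

    The number of subsets grows exponentially with the number of nodes,
    so this is only usable on very small graphs.
    """
    nodes = sorted({x for edge in edges for x in edge})
    profile = {}
    for k in range(1, len(nodes)):
        profile[k] = min(score(edges, set(c)) for c in combinations(nodes, k))
    return profile


# Hypothetical example graph: two triangles joined by the single edge (2, 3).
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
for k, phi in ncp_brute_force(two_triangles).items():
    print(k, phi)
```

For this graph the profile reaches its minimum at k = 3, where one whole triangle is cut off by a single edge.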

Slide9

NCP plot: Small Social Network

Dolphin social network

Two communities of dolphins

[Figure panels: Network, NCP plot]

Slide10

NCP plot: Zachary’s karate club

Zachary’s university karate club social network

During the study, the club split into two

The split (squares vs. circles) corresponds to cut B

[Figure panels: Network, NCP plot]

Slide11

NCP plot: Network Science

Collaborations between scientists working in network science

[Figure panels: Network, NCP plot]

Slide12

Geometric and Hierarchical graphs

[Figure panels: Hierarchical network, Geometric (grid-like) network]

Small social networks, geometric networks and hierarchical networks all have a downward NCP plot

Slide13

Our work: Large networks

Previously, researchers examined the community structure of small networks (~100 nodes)

We examined more than 70 different large social and information networks

Large real-world networks look very different!

Slide14

Example of our findings

Typical example:

General relativity collaboration network (4,158 nodes, 13,422 edges)

Slide15

NCP: LiveJournal (N=5M, E=42M)

[Figure: NCP plot; x-axis: community size, y-axis: community score]

Better and better communities

Best communities get worse and worse

Best community has 100 nodes

Slide16

Explanation: Downward part

Whiskers are responsible for the downward slope of the NCP plot

A whisker is a set of nodes connected to the rest of the network by a single edge

[Figure: NCP plot with the largest whisker marked]
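The whisker definition above can be turned into a small search: remove each edge in turn and check whether the graph falls apart. The sketch below rests on several assumptions that are not on the slide: the graph is connected, the smaller side of a disconnecting edge is taken to be the whisker, and only maximal whiskers are reported. The names `reachable` and `find_whiskers` are illustrative, and the quadratic running time suits toy graphs only.

```python
from collections import deque


def reachable(adj, start, banned):
    """Nodes reachable from start without crossing the banned edge."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if (u, v) == banned or (v, u) == banned:
                continue
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def find_whiskers(edges):
    """Maximal node sets attached to the rest of a connected graph by one edge."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    all_nodes = set(adj)
    candidates = []
    for u, v in edges:
        side = reachable(adj, u, (u, v))
        if v in side:
            continue  # removing (u, v) does not disconnect the graph
        other = all_nodes - side
        # Assumption: the smaller side is the whisker, the larger the core.
        candidates.append(min(side, other, key=len))
    # Keep only whiskers that are not strictly contained in a larger one.
    return [s for s in candidates if not any(s < t for t in candidates)]


# Hypothetical example: a 4-clique core with a 3-node whisker hanging off node 3.
graph = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (4, 6)]
print(find_whiskers(graph))
```

Here the nodes 4, 5 and 6 form the only maximal whisker; the single-node pieces {5} and {6} are dropped because they sit inside it.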

Slide17

Explanation: Upward part

Each new edge inside the community costs more

Each node has twice as many children

[Figure: NCP plot, with communities of increasing size:]

Φ = 1/3 = 0.33

Φ = 2/4 = 0.5

Φ = 8/6 ≈ 1.3

Φ = 64/14 ≈ 4.6

Slide18

Suggested network structure

Network structure: core-periphery (jellyfish, octopus)

Whiskers are responsible for good communities

Denser and denser core of the network

The core contains 60% of the nodes and 80% of the edges

Slide19

Caveat: Bag of whiskers

What if we allow cuts that give disconnected communities?

Cut all whiskers

Compose communities out of whiskers

How good a “community” do we get?
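One way to see what such a composed "community" scores: each whisker contributes exactly one cut edge, so a bag of r whiskers has score r divided by the total number of edges inside them. The sketch below searches exhaustively for the best bag with a given total number of nodes. The function name `best_bag`, the `(num_nodes, num_inside_edges)` representation, and the example numbers are hypothetical, and the exhaustive search is a stand-in rather than the procedure used in the talk.

```python
from itertools import combinations


def best_bag(whiskers, k):
    """Best-scoring union of whiskers with exactly k nodes in total.

    whiskers: list of (num_nodes, num_inside_edges) pairs; each whisker is
    assumed to hang off the network by exactly one edge.
    Returns (score, bag), or None if no combination has k nodes.
    """
    best = None
    for r in range(1, len(whiskers) + 1):
        for bag in combinations(whiskers, r):
            if sum(n for n, _ in bag) != k:
                continue
            inside = sum(m for _, m in bag)
            if inside == 0:
                continue
            # r whiskers are cut off by r edges in total.
            bag_score = r / inside
            if best is None or bag_score < best[0]:
                best = (bag_score, bag)
    return best


# Hypothetical whiskers: (number of nodes, number of inside edges).
example = [(5, 6), (5, 5), (3, 3), (2, 1)]
print(best_bag(example, 10))
```

With these made-up numbers, the two 5-node whiskers together give 2 cut edges over 11 inside edges, which beats the three-whisker bags that also add up to 10 nodes.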

Slide20

Communities made of whiskers

[Figure: NCP plot; x-axis: community size, y-axis: community score; curves: connected communities, bag of whiskers]

We get better community scores when composing disconnected sets of whiskers

Slide21

Comparison to a rewired network

Take a real network G

Rewire edges for a long time

We obtain a random graph with the same degree distribution as the real network G
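A common way to rewire while keeping every node's degree is the double edge swap: pick two edges (a, b) and (c, d) and replace them with (a, d) and (c, b). The slide does not say which procedure was used, so the sketch below is an assumption; the function name `rewire`, its parameters, and the attempt cap are illustrative choices.

```python
import random
from collections import Counter


def rewire(edges, num_swaps, seed=0):
    """Degree-preserving rewiring of a simple undirected graph by edge swaps."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    done = 0
    attempts = 0
    max_attempts = 100 * num_swaps  # give up eventually on tiny graphs
    while done < num_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # consider both orientations of the second edge
        if a == d or c == b:
            continue  # the swap would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue  # the swap would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list[i] = (a, d)
        edge_list[j] = (c, b)
        done += 1
    return edge_list


def degrees(edges):
    """Degree of every node in an edge list."""
    return Counter(x for edge in edges for x in edge)


# Hypothetical example: a 10-node ring with two chords.
ring = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
shuffled = rewire(ring, num_swaps=50)
print(degrees(shuffled) == degrees(ring))
```

Each accepted swap removes one edge at each of the four endpoints and adds one back, which is why the degree of every node is unchanged.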

Slide22

Comparison to a rewired network

Rewired network: random network with the same degree distribution

Slide23

What is a good model?

What is a good model that explains such network structure?

None of the existing models work

[Figure: NCP plots of existing models. Pref. attachment: flat; Small World: down and flat; Geometric Pref. Attachment: flat and down]

Slide24

Forest Fire model works

Forest Fire: connections spread like a fire

New node joins the network

Selects a seed node

Connects to some of its neighbors

Continue recursively

As a community grows, it blends into the core of the network
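The steps above can be sketched as a small generator. This is a simplified, undirected reading of the slide rather than the authors' exact model: the function name `forest_fire`, the single `burn_prob` parameter, the geometric draw for how many neighbors catch fire, and the uniform choice of seed node are all assumptions.

```python
import random


def forest_fire(num_nodes, burn_prob=0.35, seed=0):
    """Grow an undirected graph, one node at a time, by a spreading 'fire'.

    Returns an adjacency dict {node: set of neighbors}.
    """
    rng = random.Random(seed)
    adj = {0: set()}
    for new in range(1, num_nodes):
        # The new node selects a seed node uniformly at random (assumption).
        seed_node = rng.randrange(new)
        burned = {seed_node}
        frontier = [seed_node]
        while frontier:
            node = frontier.pop()
            candidates = sorted(v for v in adj[node] if v not in burned)
            rng.shuffle(candidates)
            # Number of neighbors that catch fire: geometric in burn_prob.
            count = 0
            while rng.random() < burn_prob:
                count += 1
            for v in candidates[:count]:
                burned.add(v)
                frontier.append(v)  # continue recursively from v
        # The new node connects to every node the fire reached.
        adj[new] = set(burned)
        for v in burned:
            adj[v].add(new)
    return adj


network = forest_fire(200)
print(len(network), sum(len(nbrs) for nbrs in network.values()) // 2)
```

A higher `burn_prob` lets the fire reach further, so new nodes attach to more of the existing network and the graph becomes denser.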

Slide25

Forest Fire NCP plot

[Figure: NCP plot of the Forest Fire model, with curves for the rewired network and the bag of whiskers]

Slide26

Conclusion and connections

Whiskers:

Largest whisker has ~100 nodes

Independent of network size

Dunbar number: a person can maintain social relationships with at most 150 people

Bond vs. identity communities

Core:

Core has little structure (hard to cut)

Still more structure than the random network

Slide27

Conclusion and connections

NCP plot is a way to analyze network community structure

Our results agree with previous work on small networks (that are commonly used for testing community finding algorithms)

But large networks are different

Large networks: whiskers + core structure

Small, well-isolated communities blend into the core of the network as they grow