
Microsoft Instant Messenger Communication Network - PowerPoint Presentation

Uploaded On 2017-09-27



Presentation Transcript

Slide1

Microsoft Instant Messenger Communication Network
How does the world communicate?

Jure Leskovec (jure@cs.cmu.edu)
Machine Learning Department
http://www.cs.cmu.edu/~jure

Joint work with: Eric Horvitz, Microsoft Research

Slide2

Networks: Why?

Today: large on-line systems leave detailed records of social activity
  On-line communities: MySpace, Facebook
  Email, blogging, instant messaging
  On-line publication repositories: arXiv, MedLine
Emerging behavior (need lots of data):
  Actions of individual nodes are independent, but global patterns and regularities emerge

Slide3

The Largest Social Network
What is the largest social network in the world (that we can relatively easily obtain)?

For the first time we had a chance to look at the complete (anonymized) communication of the whole planet (using the Microsoft MSN instant messenger network)

Slide4

Instant Messaging

[Figure: contact (buddy) list and messaging window]

Slide5

Instant Messaging as a Network

[Figure labels: Buddy, Conversation]

Slide6

IM – Phenomena at planetary scale
Observe social phenomena at planetary scale:
  How does communication change with user demographics (distance, age, sex)?
  How does geography affect communication?
  What is the structure of the communication network?

Slide7

Communication data
The record of communication
  Presence data: user status events (login, status change)
  Communication data: who talks to whom
  Demographics data: user age, sex, location

Slide8

Data description: Presence
Events:
  Login, Logout
  Is this the first ever login
  Add/Remove/Block buddy
  Add unregistered buddy (invite new user)
  Change of status (busy, away, BRB, idle, …)
For each event:
  User Id
  Time

Slide9

Data description: Communication
For every conversation (session) we have a list of users who participated in the conversation
There can be multiple people per conversation
For each conversation and each user:
  User Id
  Time Joined
  Time Left
  Number of Messages Sent
  Number of Messages Received

Slide10

Data description: Demographics
For every user (self reported):
  Age
  Gender
  Location (Country, ZIP)
  Language
  IP address (we can do reverse geo IP lookup)

Slide11

Data collection
Log size: 150 GB/day
Just copying over the network takes 8 to 10 h
Parsing and processing takes another 4 to 6 h
After parsing and compressing: ~45 GB/day
Collected data for 30 days of June 2006:
  Total: 1.3 TB of compressed data
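
The volumes quoted on this slide can be cross-checked with a few lines of arithmetic. This is only a sanity-check sketch: the constants are the slide's figures, while the decimal convention (1 TB = 1000 GB) and the helper name `total_compressed_tb` are assumptions made for illustration.

```python
# Back-of-the-envelope check of the log volumes quoted on the slide.
RAW_GB_PER_DAY = 150         # raw log size per day (from the slide)
COMPRESSED_GB_PER_DAY = 45   # size after parsing and compressing (from the slide)
DAYS = 30                    # collection window in June 2006 (from the slide)


def total_compressed_tb(gb_per_day: float, days: int) -> float:
    """Total compressed volume in terabytes, assuming 1 TB = 1000 GB."""
    return gb_per_day * days / 1000.0


compression_ratio = RAW_GB_PER_DAY / COMPRESSED_GB_PER_DAY
total_tb = total_compressed_tb(COMPRESSED_GB_PER_DAY, DAYS)
print(f"compression ratio ~{compression_ratio:.1f}x, total ~{total_tb:.2f} TB")
```

Under these assumptions the 30-day total comes out close to the figure quoted on the slide.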

Slide12

Network: Conversations

[Figure label: Conversation]

Slide13

Data statistics
Activity over June 2006 (30 days)
  245 million users logged in
  180 million users engaged in conversations
  17.5 million new accounts activated
  More than 30 billion conversations

Slide14

Data statistics per day
Activity on June 1, 2006
  1 billion conversations
  93 million users logged in
  65 million different users talked (exchanged messages)
  1.5 million invitations for new accounts sent

Slide15

User characteristics: age

Slide16

Age pyramid: MSN vs. the world

Slide17

Conversation: Who talks to whom?
Cross gender edges:
  300 million male-male and 235 million female-female edges
  640 million female-male edges

Slide18

Number of people per conversation

Max number of people simultaneously talking is 20, but a conversation can have more people

Slide19

Conversation duration
Most conversations are short

Slide20

Conversations: number of messages

Sessions between fewer people run out of steam

Slide21

Time between conversations
Individuals are highly diverse
What is the probability of logging into the system after t minutes?
Power-law with exponent 1.5
Task queuing model [Barabasi]
My email, Darwin's and Einstein's letters follow the same pattern
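
To make the power-law claim concrete, here is a minimal sketch on synthetic data. It does not use the MSN logs: it draws waiting times from a continuous power law with exponent 1.5 by inverse-transform sampling and recovers the exponent with the standard maximum-likelihood estimator. The function names, the cutoff `x_min = 1.0`, and the sample size are assumptions chosen for illustration.

```python
import math
import random


def sample_power_law(alpha: float, x_min: float, n: int, rng: random.Random) -> list:
    """Draw n samples from p(x) ~ x^(-alpha), x >= x_min, by inverse-transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def estimate_exponent(samples: list, x_min: float) -> float:
    """Maximum-likelihood estimate of alpha for a continuous power law."""
    log_sum = sum(math.log(x / x_min) for x in samples)
    return 1.0 + len(samples) / log_sum


rng = random.Random(0)  # fixed seed so the sketch is reproducible
gaps = sample_power_law(alpha=1.5, x_min=1.0, n=100_000, rng=rng)
alpha_hat = estimate_exponent(gaps, x_min=1.0)
print(f"estimated exponent: {alpha_hat:.3f}")
```

On real inter-event times the cutoff `x_min` would itself have to be chosen or estimated, which this sketch does not attempt.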

Slide22

Age: Number of conversations

[Figure: plot by user self-reported age; color scale from Low to High]

Slide23

Age: Total conversation duration

[Figure: plot by user self-reported age; color scale from Low to High]

Slide24

Age: Messages per conversation

[Figure: plot by user self-reported age; color scale from Low to High]

Slide25

Age: Messages per unit time

[Figure: plot by user self-reported age; color scale from Low to High]

Slide26

Who talks to whom: Number of conversations

Slide27

Who talks to whom: Conversation duration

Slide28

Geography and communication
Count the number of users logging in from a particular location on the earth

Slide29

How is Europe talking

Logins from Europe

Slide30

Users per geo location

Blue circles have more than 1 million logins.

Slide31

Users per capita

Fraction of population using MSN:
  Iceland: 35%
  Spain: 28%
  Netherlands, Canada, Sweden, Norway: 26%
  France, UK: 18%
  USA, Brazil: 8%

Slide32

Communication heat map
For each conversation between geo points (A,B) we increase the intensity on the line between A and B
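
The accumulation step described above can be sketched as follows. This is not the authors' rendering code: it assumes the geo points have already been projected to integer grid cells, and the function name `add_conversation` and the fixed-step line sampling are illustrative choices.

```python
def add_conversation(grid, a, b, steps=100):
    """Add 1 to every grid cell touched by the straight segment from a to b.

    a and b are (row, col) cells; mapping geo coordinates to cells is assumed
    to have happened beforehand.
    """
    (r0, c0), (r1, c1) = a, b
    visited = set()
    for i in range(steps + 1):
        t = i / steps
        cell = (round(r0 + t * (r1 - r0)), round(c0 + t * (c1 - c0)))
        if cell not in visited:  # count each cell once per conversation
            visited.add(cell)
            grid[cell[0]][cell[1]] += 1


grid = [[0] * 5 for _ in range(5)]
add_conversation(grid, (0, 0), (4, 4))  # one conversation along the diagonal
add_conversation(grid, (0, 4), (4, 0))  # one along the anti-diagonal
for row in grid:
    print(row)
```

Cells where many lines cross accumulate the highest intensity, which is what makes heavily used communication corridors stand out on the map.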

Slide33

Homophily (Slovenian proverb "gliha v kup štriha", roughly "birds of a feather flock together")

Age vs. Age

[Figure panels: Correlation; Probability]

Slide34

Per country statistics
On a particular typical day…

Country    # of logins   # of users   # of messages   Messages per user
USA        38,319,363    13,261,337   412,729,278     31.12
Brazil     20,582,613    7,864,424    467,972,522     59.50
France     19,163,131    6,475,858    518,931,785     80.13
Unknown    18,444,352    6,872,347    191,167,085     27.81
Spain      16,868,549    6,140,895    503,759,240     82.03
UK         16,659,009    5,724,826    487,018,470     85.07
Canada     14,558,692    5,021,185    160,249,686     31.91
China      14,225,163    5,314,463    101,003,729     19.00
Turkey     13,619,789    4,696,555    353,540,475     75.27
Mexico     10,756,989    4,359,932    209,195,100     47.98

Note that global usage and market share statistics are higher if we accumulate data over longer time periods.

Slide35

Per typical user per country
On a typical day, an MSN user from a country …

Country       Logins on a particular day   Users on a particular day   Messages sent   Messages per user
Slovenia      364,988                      130,884                     15,919,892      121.63
Malta         122,846                      41,829                      4,993,316       119.37
Hungary       1,214,268                    427,320                     47,623,604      111.45
Bosnia        105,584                      35,689                      3,254,170       91.18
Reunion       100,335                      33,399                      3,041,635       91.07
Gibraltar     19,096                       6,452                       581,195         90.08
UK            16,659,009                   5,724,826                   487,018,470     85.07
Macedonia     126,729                      43,754                      3,669,977       83.88
Netherlands   7,399,160                    2,696,669                   221,300,210     82.06
Spain         16,868,549                   6,140,895                   503,759,240     82.03

Note that global usage and market share numbers are higher if we accumulate data over longer time periods.

Slide36

What about Slovenia (per capita)?

Statistic                  Number       Rank (per capita)
Conversations inside       19,868,886   22
Conversations to outside   7,868,483    48
Total conversations        27,737,369   29
Avg. time inside           309.49       147
Avg. time outside          314.39       80
Avg. time inside (pct.)    0.4960
Messages sent inside       9.78         32
Messages sent outside      9.46         19
Messages inside (pct.)     0.5083

Slide37

Who is Slovenia talking to?

Rank   Target Country   Pairs of people   Number of conversations   Avg. time per conv.   Avg. # of messages
1      Slovenia         13,41,250         19,868,886                309.4                 9.78
2      USA              61,794            922,527                   303.4                 9.14
3      Spain            27,650            310,357                   289.4                 7.97
4      UK               14,709            204,335                   325.4                 9.02
5      Germany          9,047             129,551                   350.3                 10.20
6      Bosnia           9,956             114,509                   385.9                 14.62
7      Yugoslavia       8,194             104,270                   381.7                 12.55
8      Italy            8,612             100,698                   358.8                 9.89
9      Croatia          6,838             84,362                    359.0                 11.00
10     Turkey           10,763            77,651                    292.4                 8.08
11     Albania          9,517             76,440                    320.7                 10.88
12     Sweden           5,083             69,019                    306.9                 8.34
13     Netherlands      5,061             68,287                    315.9                 8.87
14     Canada           5,003             60,617                    301.8                 7.38

Slide38

Instant Messaging as a Network

[Figure label: Buddy]

Slide39

IM Communication Network
Buddy graph:
  240 million people (people that logged in during June ’06)
  9.1 billion edges (friendship links)
Communication graph:
  There is an edge if the users exchanged at least one message in June 2006
  180 million people
  1.3 billion edges
  30 billion conversations

Slide40

Buddy network: Number of buddies

Buddy graph: 240 million nodes, 9.1 billion edges (~40 buddies per user)

Slide41

Communication Network: Degree
Number of people a user talks to in a month
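
A minimal sketch of how such a degree count could be derived from conversation records. The input format (one list of participant ids per conversation) and the function name `communication_degrees` are hypothetical, and the sketch makes a simplifying assumption: any two users who share a conversation are treated as having exchanged a message.

```python
from collections import defaultdict
from itertools import combinations


def communication_degrees(conversations):
    """Map each user to the number of distinct users they shared a conversation with."""
    neighbors = defaultdict(set)
    for participants in conversations:
        # Every unordered pair of distinct participants becomes one undirected edge.
        for u, v in combinations(set(participants), 2):
            neighbors[u].add(v)
            neighbors[v].add(u)
    return {user: len(peers) for user, peers in neighbors.items()}


# Toy month of activity with made-up user ids.
conversations = [["ana", "bob"], ["ana", "cid", "dee"], ["bob", "ana"]]
degrees = communication_degrees(conversations)
print(degrees)
```

Repeated conversations between the same pair add nothing to the degree, because neighbors are stored in sets.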

Slide42

Network: Small-world

6 degrees of separation [Milgram ’60s]
Average distance 5.5
90% of nodes can be reached in < 8 hops
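
The per-hop node counts tabulated on this slide can be turned into a cumulative reach curve. The sketch below assumes the table lists how many nodes are first reached at each hop count from a starting node; the helper name `fraction_within` is illustrative.

```python
# Nodes first reached at each hop count (hops 1..25), copied from the slide's table.
NODES_PER_HOP = [
    10, 78, 396, 8648, 3299252, 28395849, 79059497, 52995778,
    10321008, 1955007, 518410, 149945, 44616, 13740, 4476, 1542,
    536, 167, 71, 29, 16, 10, 3, 2, 3,
]


def fraction_within(nodes_per_hop, max_hops):
    """Fraction of all reached nodes that lie within max_hops hops."""
    return sum(nodes_per_hop[:max_hops]) / sum(nodes_per_hop)


within_7 = fraction_within(NODES_PER_HOP, 7)
within_8 = fraction_within(NODES_PER_HOP, 8)
print(f"within 7 hops: {within_7:.1%}, within 8 hops: {within_8:.1%}")
```

The same cumulative sums give the effective diameter for any chosen coverage threshold.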

Hops   Nodes
1      10
2      78
3      396
4      8648
5      3299252
6      28395849
7      79059497
8      52995778
9      10321008
10     1955007
11     518410
12     149945
13     44616
14     13740
15     4476
16     1542
17     536
18     167
19     71
20     29
21     16
22     10
23     3
24     2
25     3

Slide43

Network: Searchability

Milgram’s experiment showed:
  (1) short paths exist in networks
  (2) humans are able to find them
Assume the following setting:
  Nodes are scattered on a plane
  Given a starting node u, we want to reach a target node v
  Algorithm: always navigate to the neighbor that is geographically closest to the target node v
Surprise: geo-routing finds the short paths (for an appropriate distance measure)
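
The greedy rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used in the study: distance is plain Euclidean distance on the plane, the function name `greedy_geo_route` and the toy graph are made up, and the walk gives up when no neighbor is closer to the target than the current node (a stopping rule added here so that the loop always terminates).

```python
import math


def greedy_geo_route(positions, neighbors, source, target, max_steps=1000):
    """Repeatedly step to the neighbor closest to the target; None if the walk gets stuck."""
    path = [source]
    current = source
    for _ in range(max_steps):
        if current == target:
            return path
        here = math.dist(positions[current], positions[target])
        best = min(
            neighbors[current],
            key=lambda n: math.dist(positions[n], positions[target]),
            default=None,
        )
        # Assumed stopping rule: fail when no neighbor makes progress.
        if best is None or math.dist(positions[best], positions[target]) >= here:
            return None
        path.append(best)
        current = best
    return None


# Toy example: five nodes on a plane with hypothetical coordinates.
positions = {"u": (0, 0), "a": (1, 0), "b": (2, 1), "v": (3, 1), "c": (0, 2)}
neighbors = {"u": ["a", "c"], "a": ["u", "b"], "b": ["a", "v"], "c": ["u"], "v": ["b"]}
route = greedy_geo_route(positions, neighbors, "u", "v")
print(route)
```

Only local information is used at each step: a node needs to know its own neighbors' positions and the target's position, never the whole graph.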

[Figure: starting node u and target node v]

Slide44

Communication network: Clustering

How many triangles are closed?
Clustering normally decays as k^-1
Communication network is highly clustered: k^-0.37

[Figure labels: High clustering, Low clustering]
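
A minimal sketch of the quantity behind this slide: the local clustering coefficient (the fraction of a node's neighbor pairs that are themselves connected), averaged over nodes of equal degree k. The adjacency-dict representation and the function names are assumptions made for illustration; fitting the decay exponent is not shown.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of the node's neighbors that are directly connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # no neighbor pairs, so no triangles can be closed
    closed = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return closed / (k * (k - 1) / 2)


def clustering_by_degree(adj):
    """Average local clustering coefficient of nodes, grouped by degree k."""
    totals, counts = {}, {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        totals[k] = totals.get(k, 0.0) + local_clustering(adj, node)
        counts[k] = counts.get(k, 0) + 1
    return {k: totals[k] / counts[k] for k in totals}


# Toy undirected graph: a triangle a-b-c plus a pendant node d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
c_of_k = clustering_by_degree(adj)
print(c_of_k)
```

Plotting the resulting averages against k on log-log axes is how a decay such as k^-1 or k^-0.37 would be read off.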

Slide45

Communication Network Connectivity

Slide46

k-Cores decomposition
What is the structure of the core of the network?
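
For readers unfamiliar with k-cores, here is a minimal sketch of the usual peeling procedure: repeatedly remove a node of minimum remaining degree, and record the largest minimum degree seen so far as that node's core number. The slides do not say which implementation was used; the function name `core_numbers` is illustrative, and this simple version is quadratic, so it is only meant for small examples.

```python
def core_numbers(adj):
    """Return each node's core number by repeatedly peeling a minimum-degree node."""
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.get)
        k = max(k, degree[node])  # core numbers never decrease during peeling
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Toy undirected graph: a triangle a-b-c plus a pendant node d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
core = core_numbers(adj)
print(core)
```

The k-core is then the set of nodes whose core number is at least k; nodes with small core numbers form the periphery.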

Slide47

k-Cores: core of the network
People with k<20 are the periphery
Core is composed of 79 people, each having 68 edges among them

Slide48

Network robustness
We delete nodes (in some order) and observe how the network falls apart:
  Number of edges deleted
  Size of largest connected component

Slide49

Robustness: Nodes vs. Edges
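
The node-deletion experiment described on the previous slide can be sketched as follows. This is a small illustration on a toy graph, not the procedure run on the MSN network: the function names are made up, the deletion order is supplied by the caller, and the largest component is recomputed from scratch after every deletion, which would be far too slow at the scale discussed here.

```python
def largest_component_size(adj, removed):
    """Size of the largest connected component, ignoring the removed nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        best = max(best, size)
    return best


def robustness_curve(adj, order):
    """Largest-component size after each successive node deletion in the given order."""
    removed, sizes = set(), []
    for node in order:
        removed.add(node)
        sizes.append(largest_component_size(adj, removed))
    return sizes


# Toy star graph: deleting the hub first shatters it, deleting leaves first does not.
adj = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
hub_first = robustness_curve(adj, ["hub", "a", "b", "c"])
leaves_first = robustness_curve(adj, ["a", "b", "c", "hub"])
print(hub_first, leaves_first)
```

Comparing curves for different deletion orders (for example random versus highest-degree first) is what shows how robust a network is.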

Slide50

Robustness: Connectivity

Slide51

Conclusion
A first look at a planetary-scale social network
The largest social network analyzed
Strong presence of homophily: people that communicate share attributes
Well connected: in only a few hops one can reach most of the network
Very robust: many (random) people can be removed and the network is still connected

Slide52

References
Leskovec and Horvitz: Worldwide Buzz: Planetary-Scale Views on an Instant-Messaging Network, 2007
http://www.cs.cmu.edu/~jure