Slide1
Landmark-Based User Location Inference in Social Media
Yuto Yamaguchi†, Toshiyuki Amagasa†, and Hiroyuki Kitagawa†
†University of Tsukuba
13/10/08
COSN 2013 - Yuto Yamaguchi
Slide2
Location-Related Information
Eating seafood !!!
I’m at Logan airport
Profile
Residence: Tokyo, Japan
COSN @ northeastern
Slide3
Applications
Various Research Using Home Locations
Outbreak Modeling [Poul+, ICWSM’12]
Real-World Event Detection [Sakaki+, WWW’12]
Analyzing Disasters [Mandel+, LSM’12]
Other Useful Applications
Location-aware Recommender [Levandoski+, ICDE’12]
Marketing, Ads
Disaster Warning
Slide4
Our Problem
Location profiles are not available for …
76% of Twitter users [Cheng et al., CIKM’10]
94% of Facebook users [Backstrom et al., WWW’10]
This reduces the opportunities to use location information
User Home Location Inference
Slide5
User Home Location Inference
Content-Based Approaches
[Cheng et al., CIKM’10]
[Kinsella et al., SMUC’11]
[Chandra et al., SocialCom’11]
Graph-Based Approaches (our focus)
[Backstrom et al., WWW’10]
[Sadilek et al., WSDM’12]
[Jurgens, ICWSM’13]
Slide6
Graph-Based Approach (1/2)
Basic Idea
[Figure: an inference target user "?" and friends whose home locations are Boston, Boston, Boston, Chicago, New York, and Boston]
Slide7
Graph-Based Approach (2/2)
Closeness Assumption
Friends: spatially close
Not friends: spatially distant
Are friends really close?
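One way to probe that question on labeled data is to measure how far apart friend pairs actually live. The sketch below is an illustration, not the authors' procedure: the names `haversine_km` and `fraction_distant`, the 100 km default, and the toy coordinates are all assumptions introduced here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def fraction_distant(home, friend_pairs, threshold_km=100.0):
    """Share of friend pairs (both with known homes) living more than threshold_km apart."""
    dists = [
        haversine_km(home[u], home[v])
        for u, v in friend_pairs
        if u in home and v in home
    ]
    if not dists:
        return 0.0
    return sum(d > threshold_km for d in dists) / len(dists)


# Hypothetical users: two near Boston, one in Chicago, one in New York.
home = {
    "a": (42.36, -71.06),
    "b": (42.37, -71.11),
    "c": (41.88, -87.63),
    "d": (40.71, -74.01),
}
pairs = [("a", "b"), ("a", "c"), ("a", "d")]
print(fraction_distant(home, pairs))
```

A large value of this fraction on real data would suggest that friendship alone is a weak signal for home location.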
60% are 100 km distant
Slide8
Concentration Assumption
[Figure labels: Boston, Boston, ?, LANDMARK, Unknown, NY, Chicago]
Slide9
Landmarks
Slide10
Requirements
Small Dispersion
Large Centrality
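The slides do not define dispersion or centrality, so the sketch below uses stand-ins chosen only for illustration: centrality as the number of followers with a known home location, and dispersion as the mean distance of those homes from their centroid. The function names, the thresholds, and the planar (x, y) coordinates in km are likewise assumptions.

```python
import math


def centrality(follower_homes):
    """Assumed proxy: number of followers with a known home location."""
    return len(follower_homes)


def dispersion(follower_homes):
    """Assumed proxy: mean Euclidean distance from each follower home to the centroid."""
    n = len(follower_homes)
    cx = sum(x for x, _ in follower_homes) / n
    cy = sum(y for _, y in follower_homes) / n
    return sum(math.hypot(x - cx, y - cy) for x, y in follower_homes) / n


def is_landmark(follower_homes, max_dispersion, min_centrality):
    """A user qualifies when followers are both numerous and tightly clustered."""
    if not follower_homes:
        return False
    return (
        dispersion(follower_homes) <= max_dispersion
        and centrality(follower_homes) >= min_centrality
    )


# Hypothetical follower homes in planar km coordinates.
local_account = [(0, 0), (2, 0), (0, 2), (2, 2)]
celebrity = [(0, 0), (1000, 0), (0, 1000), (1000, 1000)]

print(is_landmark(local_account, max_dispersion=10, min_centrality=3))
print(is_landmark(celebrity, max_dispersion=10, min_centrality=3))
```

Both requirements matter in this sketch: the celebrity account fails on dispersion, and an account with only two clustered followers would fail on centrality.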
Slide11
Examples in Twitter
Slide12
Landmarks Mapping
[Figure: map of user locations. Red: all users. Blue: landmarks.]
Slide13
Proposed Method
Slide14
Overview
Probabilistic Model
Modeling
Each user has his/her location distribution
Location inference = selecting the location with the largest probability density
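The selection rule stated above amounts to an argmax over a candidate location set. A minimal sketch follows; `infer_location` and the toy density table are hypothetical, and any callable that scores a location could stand in for the model's density.

```python
def infer_location(location_set, density):
    """Return the candidate with the largest density, together with that density."""
    best = max(location_set, key=density)
    return best, density(best)


# Toy density values for one user; in practice these would come from the model.
toy_density = {"Boston": 0.6, "New York": 0.3, "Chicago": 0.1}

location, peak = infer_location(list(toy_density), toy_density.get)
print(location, peak)
```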
[Figure labels: location set, LANDMARK MIXTURE MODEL]
Slide15
Dominance Distribution
Spatial distribution of followers’ home locations
Modeled as Gaussian
Landmarks have small covariances
many followers at the center
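As a rough illustration of modeling followers' home locations as a Gaussian, the sketch below fits a mean and a 2x2 covariance by maximum likelihood and evaluates the density. It treats (latitude, longitude) as planar coordinates, which is a simplification, and the function names and sample points are assumptions rather than anything given in the slides.

```python
import math


def fit_gaussian(points):
    """Maximum-likelihood mean and covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def gaussian_pdf(point, mean, cov):
    """Density of a bivariate Gaussian with covariance [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


# Hypothetical follower homes: a tight cluster versus a nationwide spread.
tight = [(42.3, -71.1), (42.4, -71.0), (42.3, -71.0), (42.4, -71.1)]
wide = [(34.0, -118.2), (40.7, -74.0), (41.9, -87.6), (29.8, -95.4)]

for homes in (tight, wide):
    mean, cov = fit_gaussian(homes)
    print(mean, gaussian_pdf(mean, mean, cov))
```

The tight cluster yields a much higher peak density at its center, which matches the slide's point that landmarks have small covariances.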
[Figure: dominance distribution over latitude and longitude; labels: many followers, few followers]
Slide16
Landmark Mixture Model (LMM)
[Figure: the inference target user follows one landmark and two non-landmarks; each followee has a dominance distribution and a mixture weight]
Large weight for landmark
Slide17
Mixture Weights
Proportional to centrality
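To make the weighting concrete, the sketch below combines each followee's dominance distribution into one mixture density, with weights obtained by normalizing centralities. It is a simplified stand-in, not the authors' implementation: it assumes isotropic Gaussians with a single `sigma` per followee, and the dictionary keys, function names, and numbers are hypothetical.

```python
import math


def isotropic_pdf(point, mean, sigma):
    """Density of a 2-D Gaussian with covariance sigma^2 * I (a simplifying assumption)."""
    d2 = (point[0] - mean[0]) ** 2 + (point[1] - mean[1]) ** 2
    return math.exp(-d2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


def mixture_weights(centralities):
    """Weights proportional to centrality, normalized to sum to 1."""
    total = sum(centralities)
    return [c / total for c in centralities]


def lmm_density(point, followees):
    """Weighted sum of the followees' dominance distributions evaluated at point."""
    weights = mixture_weights([f["centrality"] for f in followees])
    return sum(
        w * isotropic_pdf(point, f["mean"], f["sigma"])
        for w, f in zip(weights, followees)
    )


# Hypothetical followees of one target user.
followees = [
    {"mean": (42.36, -71.06), "sigma": 0.5, "centrality": 900},  # landmark-like
    {"mean": (39.0, -95.0), "sigma": 15.0, "centrality": 100},   # non-landmark-like
]
print(mixture_weights([f["centrality"] for f in followees]))
print(lmm_density((42.36, -71.06), followees))
print(lmm_density((39.0, -95.0), followees))
```

In this toy setup the density near the landmark's center dominates, because the landmark has both the larger weight and the sharper distribution.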
Landmark: large mixture weight
Non-landmark: small mixture weight
Slide18
Confidence Constraint
If the distribution does not have a clear peak, we should not infer the location of that user
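One plausible reading of "clear peak" is that the largest density must reach a threshold, written p0 here to match the symbol on the experiments slide. The sketch below encodes that reading; the function name, the dictionary input, and the sample values are assumptions.

```python
def infer_with_confidence(densities, p0):
    """Return the highest-density location, or None when the peak is below p0.

    densities maps each candidate location to its density for one user.
    """
    if not densities:
        return None
    best = max(densities, key=densities.get)
    return best if densities[best] >= p0 else None


peaked = {"Boston": 0.80, "New York": 0.10, "Chicago": 0.10}
flat = {"Boston": 0.30, "New York": 0.28, "Chicago": 0.27}

print(infer_with_confidence(peaked, p0=0.5))
print(infer_with_confidence(flat, p0=0.5))
```

Returning None for the flat case is what trades recall for precision: fewer users receive an inference, but the ones who do have a dominant candidate.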
High precision but low recall
Slide19
Centrality Constraint
We can reduce the cost by ignoring non-landmarks
Low cost but low recall
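A minimal sketch of this filter, assuming each followee carries a numeric centrality and that "non-landmark" means centrality below a threshold c0 (the symbol used on the experiments slide). The function name, record layout, and values are hypothetical.

```python
def apply_centrality_constraint(followees, c0):
    """Keep only followees whose centrality is at least c0."""
    return [f for f in followees if f["centrality"] >= c0]


followees = [
    {"name": "local_news", "centrality": 900},
    {"name": "friend_a", "centrality": 12},
    {"name": "friend_b", "centrality": 3},
]

kept = apply_centrality_constraint(followees, c0=100)
print([f["name"] for f in kept])

# A user whose followees are all filtered out has nothing left to infer from,
# which is where the loss in recall comes from.
print(apply_centrality_constraint(followees[1:], c0=100))
```

Fewer mixture components per user means less computation, at the price of leaving some users without any usable followee.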
[Figure: the inference target user follows one landmark and two non-landmarks]
Slide20
Experiments
Slide21
Dataset
Twitter dataset provided by [Li et al., KDD’12]
3M users in the U.S.
285M follow edges
Geocode their location profiles for ground truth
465K labeled users (15%)
Test set
46K users (10% of labeled users)
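The slide does not say how the test users were chosen. The sketch below shows one common way to hold out 10% of labeled users with a fixed seed; the function name, the seed, and the random-shuffle strategy are assumptions.

```python
import random


def split_labeled_users(labeled_ids, test_fraction=0.10, seed=0):
    """Shuffle labeled user IDs reproducibly and hold out a fraction as the test set."""
    ids = sorted(labeled_ids)  # sort first so the result does not depend on input order
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


train_ids, test_ids = split_labeled_users(range(1000))
print(len(train_ids), len(test_ids))
```

At the scale quoted on the slide, 10% of 465K labeled users is about 46.5K, in line with the reported 46K.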
Slide22
Performance Comparison
Compared three methods
LMM: our method
UDI: [Li+, KDD’12]
Naïve: Spatial median
Slide23
Effect of Confidence Constraint
[Figure: effect of varying the threshold p0]
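The sketch below shows how sweeping p0 could move precision and recall on a toy test set. The metric definitions are assumptions, since the slides do not give them: precision is taken as correct inferences over inferred users, and recall as correct inferences over all test users. The function name and data are hypothetical.

```python
def precision_recall(predictions, truth, p0):
    """Precision and recall when only users with peak density >= p0 are inferred.

    predictions maps user -> (predicted_location, peak_density);
    truth maps user -> true_location.
    """
    inferred = {u: loc for u, (loc, peak) in predictions.items() if peak >= p0}
    correct = sum(1 for u, loc in inferred.items() if truth.get(u) == loc)
    precision = correct / len(inferred) if inferred else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


predictions = {
    "u1": ("Boston", 0.9),
    "u2": ("Chicago", 0.6),
    "u3": ("New York", 0.3),
    "u4": ("Boston", 0.2),
}
truth = {"u1": "Boston", "u2": "Chicago", "u3": "Boston", "u4": "Boston"}

for p0 in (0.0, 0.5):
    print(p0, precision_recall(predictions, truth, p0))
```

Raising p0 drops the low-confidence users, so precision rises while recall falls in this toy example.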
We can adjust the trade-off between precision and recall
Slide24
Effect of Centrality Constraint
[Figure: effect of varying the threshold c0]
We can adjust the trade-off between cost and recall
Slide25
Conclusion
Introduced the concentration assumption instead of the widely used closeness assumption
There exist landmarks
Proposed the landmark mixture model
Outperforms the state-of-the-art method
Confidence / Centrality constraint
Future work
Other applications of landmarks
Recommending landmarks or their tweets