Slide1
Automated Personality Classification
A. KARTELJ and V. FILIPOVIC
School of Mathematics, University of Belgrade, Serbia
and
V. MILUTINOVIC
School of Electrical Engineering, University of Belgrade, Serbia
Slide2
Agenda
Problem overview
Classification of the existing solutions
Presentation of the existing solutions
Comparison of the solutions
Work in progress: Bayesian Structure Learning for the APC
Future work: Video Based APC
Conclusions
MULTI 2012
3.10.2012
Slide3
Problem Overview
Slide4
The Big 5 Model
Slide5
The Steps in Our Research
1. Survey paper (under review at ACM CSUR)
2. Research paper: A new APC model based on Bayesian structure learning (in progress)
3. Real-purpose application of the APC model from step 2
4. Go to step 3
Slide6
Elements of APC
Corpus: Essay, weblog, email, news group, Twitter counts...
Personality measurement: Questionnaire (internet and written). We are searching for an alternative!
Model: Stylistic analysis, linguistic features, machine learning techniques
Slide7
Applications
Slide8
Mining People’s Characteristics
Slide9
Classification of Solutions
C1 criterion separates solutions by type of conversation (1 = self-reflexive, N = continuous)
C2 criterion separates solutions by approach (TD = top-down, DD = data-driven, or HY = hybrid)
Slide10
Linguistic Styles: Language Use as an Individual Difference
Pennebaker and King [1999]
Slide11
LIWC and MRC Features
Feature | Type | Example
Anger words | LIWC | Hate, kill
Metaphysical issues | LIWC | God, heaven, coffin
Physical state / function | LIWC | Ache, breast, sleep
Inclusive words | LIWC | With, and, include
Social processes | LIWC | Talk, us, friend
Family members | LIWC | Mom, brother, cousin
Past tense verbs | LIWC | Walked, were, had
References to friends | LIWC | Pal, buddy, coworker
Imagery of words | MRC | Low: future, peace – High: table, car
Syllables per word | MRC | Low: a – High: uncompromisingly
Concreteness | MRC | Low: patience, candor – High: ship
Frequency of use | MRC | Low: duly, nudity – High: he, the
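As a rough illustration of how dictionary-based features of this kind could be computed, the sketch below counts category hits per token. It assumes a tiny hypothetical lexicon built only from the example words listed above; the real LIWC dictionary is much larger and is not reproduced here, and the names CATEGORIES and category_rates are invented for this example.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: only the example words from the feature list,
# NOT the actual LIWC dictionary.
CATEGORIES = {
    "anger": {"hate", "kill"},
    "metaphysical": {"god", "heaven", "coffin"},
    "inclusive": {"with", "and", "include"},
    "social": {"talk", "us", "friend"},
    "family": {"mom", "brother", "cousin"},
    "friends": {"pal", "buddy", "coworker"},
}


def category_rates(text):
    """Return, for each category, the fraction of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        for name, words in CATEGORIES.items():
            if tok in words:
                counts[name] += 1
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {name: counts[name] / total for name in CATEGORIES}


rates = category_rates("My mom and my brother talk with us")
print(rates)
```

Normalizing by the token count keeps the rates comparable across texts of different length, which matters when corpora range from short posts to long essays.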
Slide12
What Are They Blogging About? Personality, Topic and Motivation in Blogs
Gill et al. [2009]
Slide13
Taking Care of the Linguistic Features of Extraversion
Gill and Oberlander [2002]
Slide14
Personality Based Latent Friendship Mining
Wang et al. [2009]
Slide15
A Comparative Evaluation of Personality Estimation Algorithms for the TWIN Recommender System
Roshchina et al. [2011]
Slide16
Predicting Personality with Social Media
Golbeck et al. [2011]
Slide17
Our Twitter Profiles, Our Selves: Predicting Personality with Twitter
Quercia et al. [2011]
Slide18
Paper | Input | Corpus | Features | Algorithm | Software | Citations | I | S | A | R
[Pennebaker and King 1999] | text | essays | LIWC | correlations | n/a | 455 | H | H | H | M
[Mairesse et al. 2007] | text, speech | essays | LIWC, MRC | C4.5, NB, SMO, M5’ | Weka | 99 | M | M | H | M
[Gill et al. 2009] | text | weblogs (14.8words) | LIWC | linear regression | n/a | 26 | H | H | M | M
[Yarkoni 2010] | text | weblogs (100K words) | LIWC | correlations | n/a | 21 | H | M | M | M
[Gill and Oberlander 2002] | text | emails (105 students) | bigrams | bigram analysis | n/a | 49 | L | M | M | L
[Nowson et al. 2005] | text | weblogs (410K words) | word list | correlations | n/a | 48 | L | H | H | L
[Oberlander 2006] | text | weblogs (410K words) | N-grams | NB, SMO | Weka | 53 | H | M | H | M
[Wang et al. 2009] | text | weblogs (200 pairs) | lexical freq., TFIDF | logistic regression | Minitab | 1 | H | M | M | M
[Iacobelli et al. 2011] | text | weblogs (3000) | LIWC, bigrams | SVM, SMO, NB.. | Weka | 1 | H | H | M | H
[Argamon et al. 2005] | text | essays | word list, conj. | SMO | Weka | 38 | H | M | M | M
[Argamon et al. 2007] | text | essays | word list, conj. | SMO | Weka, ATMan | 45 | H | M | M | M
[Mairesse and Walker 2006] | text, conv. extracts | 96 persons (≈100K words) | LIWC, MRC, utterance… | RankBoost | n/a | 22 | M | M | H | M
[Rigby and Hassan 2007] | text | mail. lists (140K emails) | LIWC | C4.5 | Weka, SPSS | 30 | M | H | M | L
[Roshchina et al. 2011] | text | TripAdvisor reviews | LIWC, MRC | Linear, M5, SVM | Weka | 2 | H | M | L | M
[Quercia et al. 2011] | meta | 335 Twitter users | Twitter counts | M5’ rules | Weka | 5 | M | H | M | M
[Golbeck et al. 2011] | text, meta | 279 FB users | 5 classes (161 in total) | M5’ rules, Gaussian processes | Weka | 12 | H | M | M | M
[Celli 2012] | text | 1065 posts | 22 ling. features | majority-based classification | n/a | 1 | M | M | M | M
Slide19
Naive Bayes Classifier
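The classifier named on this slide can be sketched in a few lines. What follows is a minimal multinomial Naive Bayes with Laplace smoothing, not the model used in any of the surveyed papers; the toy documents, the "high"/"low" labels, and the function names train_nb and predict_nb are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Estimate log-priors and Laplace-smoothed log-likelihoods from tokenized docs."""
    vocab = {w for d in docs for w in d}
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for d, y in zip(docs, labels):
        word_counts[y].update(d)
    model = {}
    for y, n in class_docs.items():
        total = sum(word_counts[y].values())
        model[y] = {
            "prior": math.log(n / len(docs)),
            # Add-one smoothing so unseen (word, class) pairs get nonzero probability.
            "loglik": {
                w: math.log((word_counts[y][w] + 1) / (total + len(vocab)))
                for w in vocab
            },
        }
    return model


def predict_nb(model, doc):
    """Pick the class maximizing log P(c) + sum of log P(w | c); unknown words are skipped."""
    def score(y):
        m = model[y]
        return m["prior"] + sum(m["loglik"][w] for w in doc if w in m["loglik"])
    return max(model, key=score)


# Toy, invented training data: word lists labeled with a made-up extraversion level.
docs = [
    ["party", "friends", "talk"],
    ["friends", "fun", "party"],
    ["book", "quiet", "home"],
    ["quiet", "read", "book"],
]
labels = ["high", "high", "low", "low"]
model = train_nb(docs, labels)
print(predict_nb(model, ["party", "talk"]))
```

The "naive" part is the assumption that words are conditionally independent given the class, which is why the per-word log-likelihoods can simply be summed.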
Slide20
Naive Bayes and Bayesian Network
Slide21
Bayesian Network for the APC
Slide22
Bayesian Network Structure Learning
Obtain a corpus (training set T)
Fit an appropriate network structure to T by:
  ILP formulation + solver (CPLEX, Gurobi…) on smaller instances
  Applying a metaheuristic on larger instances
Validate the quality of the metaheuristic approach
Compare the obtained APC accuracy with other approaches
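The structure-fitting step above can be illustrated with a small score-based search. The slides do not say which scoring function or metaheuristic is used, so this sketch makes its own assumptions: discrete variables, a BIC score, and plain greedy hill climbing as a simple stand-in for the metaheuristic. The ILP route is not shown, and the names local_bic, has_cycle, and hill_climb are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def local_bic(data, child, parents):
    """BIC contribution of one node given its parent set (discrete data)."""
    parents = sorted(parents)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * math.log(c / parent_counts[cfg]) for (cfg, _), c in joint.items())
    child_states = len({row[child] for row in data})
    # Simplification: parameters are counted over observed parent configurations only.
    n_params = (child_states - 1) * len(parent_counts)
    return loglik - 0.5 * math.log(len(data)) * n_params


def has_cycle(parents_of):
    """Detect a directed cycle by depth-first search along parent links."""
    state = {}  # node -> 1 (in progress) or 2 (finished)

    def visit(v):
        if state.get(v) == 1:
            return True
        if state.get(v) == 2:
            return False
        state[v] = 1
        if any(visit(p) for p in parents_of[v]):
            return True
        state[v] = 2
        return False

    return any(visit(v) for v in parents_of)


def hill_climb(data, n_vars):
    """Repeatedly apply the single edge addition or removal that most improves BIC."""
    parents_of = {v: set() for v in range(n_vars)}
    score = {v: local_bic(data, v, parents_of[v]) for v in range(n_vars)}
    while True:
        best_gain, best_move = 1e-9, None
        for u, v in permutations(range(n_vars), 2):
            candidate = parents_of[v] ^ {u}  # toggle the edge u -> v
            if u in candidate and has_cycle({**parents_of, v: candidate}):
                continue  # adding this edge would break acyclicity
            gain = local_bic(data, v, candidate) - score[v]
            if gain > best_gain:
                best_gain, best_move = gain, (v, candidate)
        if best_move is None:
            return parents_of
        v, candidate = best_move
        parents_of[v] = candidate
        score[v] = local_bic(data, v, candidate)


# Synthetic data: variable 1 copies variable 0, variable 2 is independent of both.
data = [(a, a, b) for a in (0, 1) for b in (0, 1)] * 10
print(hill_climb(data, 3))
```

Because BIC decomposes into per-node terms, toggling one edge only requires rescoring the affected child, which is what keeps local search affordable when the number of features is large.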
Slide23
Other Ideas
Games with a purpose (GWAP)
Clustering personality characteristics
Slide24
Packing everything together: Video Based APC
Slide25
Conclusions
Classification of the existing solutions (Survey paper)
Filling the gaps in the classification tree
Introducing Bayesian Structure Learning for the APC
Utilizing metaheuristics in dealing with high dimensionality
APC potential: social networks, recommender, and expert systems
Slide26
THANK YOU!
Aleksandar Kartelj kartelj@matf.bg.ac.rs
Vladimir Filipovic vladaf@matf.bg.ac.rs
Veljko Milutinovic vm@etf.bg.ac.rs