Slide1
Supervisors Dr. Verena Rieser & Prof. Rob Pooley
Sentiment Analysis of Arabic Social Networks
Presented by Eshrag Refaee
Slide2
Outline
The concept of sentiment analysis
Arabic as a morphologically rich language
Aims of the research
Sentiment analysis in English and Arabic literature
Twitter corpus: collection and annotation
Empirical work
Results and evaluation
Future work
Slide3
Sentiment analysis
Definition: Analysing and understanding people’s sentiments, evaluations, opinions, attitudes, and emotions from written text.
Research on SA emerged in the early 2000s (Liu, 2012).
SA is one of the most active research areas in NLP.
Slide4
Applications
In addition to its significance as a major sub-field of Natural Language Processing (NLP) research, subjectivity and sentiment analysis (SSA) has several potential applications:
Commercial applications, e.g. measuring the success of a product
Social applications
Political applications
Economic applications
Slide5
Sentiment analysis of social networks
The growing importance of sentiment analysis coincides with the growth of social media such as reviews, forum discussions, and micro-blogs.
A social network like Twitter, with more than 500 million active users (ALEXA, 2012), provides a global arena for users to share views, attitudes, and preferences, and to discuss points of agreement and/or conflict.
In March 2012, Twitter became available in Arabic (Twitter Blog, 2012).
Slide6
About Arabic
Arabic is the language of an aggregate population of over 300 million people, the first language of the 22 member countries of the Arab League, and an official language in three others (Habash, 2010).
Slide7
About Arabic
The Arabic language can be classified into three major varieties:
Classical Arabic (CA)
Modern Standard Arabic (MSA)
Dialectal Arabic (DA)
Social networks use DA and MSA side by side (Al-Sabbagh and Girju, 2012).
Slide8
Aims
Address the shortage of NLP resources for studying SA in the Arabic micro-blog genre by constructing a corpus of Arabic tweets, a subset of which is annotated for sentiment analysis.
Use the corpus to build and test models of sentiment analysis.
Employ freely available Arabic NLP tools to annotate language-specific features, including part-of-speech tags and morphological analysis.
Evaluate the quality of these features by measuring their contribution to the SA classification task.
Slide9
Aims of this research
Construct a corpus of Arabic tweets for sentiment analysis.
Build and test classification models for automatic sentiment analysis.
Explore distant supervision approaches to build efficient models for the changing Twitter stream.
Slide10
Sentiment analysis of English text
Feature-set columns (per-study marks not preserved): word tokens, semantic feat., stylistic feat., n-grams, morph, unique, domain, POS, user: PER/ORG, statistical feat.
Publication | Classification schemes | Results | Targeted language
Yu, H., & Hatzivassiloglou, V. (2003) | NB | Acc. 91 | English (newswire articles, question-answering)
Abbasi et al. (2008) | SVM, 10-fold CV, 2-stage classification | Best acc. 91.70 | English and Arabic forums, movie reviews
Osherenko (2008) | SVM | Precision 44%, recall 42% | English (759 sentences)
Wilson et al. (2009) | BoosTexter, TiMBL, Ripper, SVM | (1) Perfect neutral classification (manual): BL 78.7, SVM 81.6; (2) Auto neutral detection: SVM 64.; Neutral-polar: SVM 75.3 | English (question-answering opinion corpus)
Bifet and Frank (2010) | Multinomial NB, SGD | Best acc. 86.11 (NB), 86.26 (SGD) | English tweets (automatic annotation using emoticons)
Pak and Paroubek (2010) | NB, SVM | 60% F | English tweets
Purver and Battersby (2012) | SVM, 10-fold CV, six-class emotion detection | 77.5% F for happiness on manual test set | English tweets, distant learning (automatic annotation using emoticons), noisy labels
Slide11
Sentiment analysis of Arabic text
Feature-set columns (per-study marks not preserved): word tokens, semantic feat., stylistic feat., n-grams, morph, unique, domain, POS, user: PER/ORG, statistical feat.
Publication | Classification schemes | Results | Targeted language
Abbasi et al. (2008) | SVM, 10-fold CV, 2-stage classification | Best acc. 91.70 | English and Arabic forums, movie reviews
Farra et al. (2010) | SVM, J48, 10-fold CV | Acc.: grammatical 89.3 / semantic 80 | Arabic movie reviews (44)
Abdul-Mageed et al. (2011) | SVM, 2-stage classification (-neutral); manual polarity MSA lexicon | Stem+morph+ADJ: 90.93 F (5-fold CV); 95.52 F (with the best config.) | Modern Standard Arabic
El-Halees (2011) | Max entropy, k-nearest, NB, SVM | Best acc. 84.34 | Arabic forum posts (1143)
Itani et al. (2012) | Naïve Bayes | Best acc. 85.6 | Arabic (Facebook posts)
Mourad and Darwish (2013) | NB and SVM, 2-stage (sentiment: only positive vs. negative), 10-fold CV | Best acc. on tweets: SUBJ 64.1, SENTI 72.5 | Arabic tweets (2,300, manual annotation)
Slide12
Approach and methodology
Slide13
Building training set 1: defining the annotation scheme
Label | Definition | Example
Polar | Positive or negative emotion, evaluation, or attitude. | السياحة في اليمن جمال لا يصدق (Tourism in Yemen, unbelievable beauty)
Positive | Clear positive indicator | كم انت عظيم يا بشار الاسد (How great you are, Bashar Al-Asad)
Negative | Clear negative indicator | حنا للأسف نستخدم ايفون (Unfortunately, we use the iPhone)
Neutral | Simple factual statement/news | وفاة جديدة بإتش7إن9 بالصين (A new reported death case with H7N9 in China)
Neutral | Open questions with no emotions indicated | كيف انقطعت الإنترنت عن سوريا؟ (How was the Internet disconnected from Syria?)
Neutral | Undeterminable indicators/neither positive nor negative | لمساواة في قمع الحريات الشخصية عدل (Equality in suppressing personal freedoms is justice)
Slide14
Building training set 2: Agreement study
We conducted an inter-annotator agreement study on a subset of 677 of the annotated tweets. We use Cohen’s Kappa (Cohen, 1960), which measures the degree of agreement among the assigned labels, correcting for agreement by chance:
$$\kappa = \frac{\Pr(a) - \Pr(e)}{1 - \Pr(e)}$$
where Pr(a) is the observed agreement among annotators, and Pr(e) is the probability of agreement by chance among annotators. The overall observed agreement is 84.79% and the resulting weighted Kappa reached 0.756, which indicates reliable annotations.
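As a rough illustration of the agreement measure described above, the following Python sketch computes Cohen's Kappa as (Pr(a) - Pr(e)) / (1 - Pr(e)) for two annotators. It is the plain, unweighted form, whereas the slide reports a weighted Kappa whose weighting scheme is not given. The function name `cohen_kappa` and the toy labels are assumptions made for illustration; they are not taken from the corpus or from the authors' tooling.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's Kappa for two annotators' label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Pr(a): fraction of items on which both annotators chose the same label.
    pr_a = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Pr(e): chance agreement, from each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    pr_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if pr_e == 1.0:
        # Degenerate case (both used a single identical label); returning 1.0
        # is a choice made for this sketch, not a standard definition.
        return 1.0
    return (pr_a - pr_e) / (1 - pr_e)


# Toy labels, invented for illustration only (not corpus data).
annotator_1 = ["pos", "pos", "neg", "neu", "neg", "pos", "neu", "neg"]
annotator_2 = ["pos", "neg", "neg", "neu", "neg", "pos", "pos", "neg"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

On the toy labels the annotators agree on 6 of 8 items, so the chance-corrected score is noticeably lower than the raw agreement of 0.75. The same gap appears between the 84.79% observed agreement and the 0.756 Kappa reported in the study.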
Slide15
Our Arabic Twitter corpus
Refaee, E. and Rieser, V. (2014). An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis. Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014), Reykjavik, Iceland.
Corpus freely available from the LREC repository.
Slide16
Approach and methodology
Slide17
Building training set: feature extraction and feature vector construction
Classifier/learner
Class of a new document
Slide18
Experimental settings
a. Machine learners
We use the implementations of the following algorithms provided by the WEKA data mining package, version 3.7.9 (Witten and Frank, 2005):
Naïve Bayes (NB)
Decision trees (J48)
NB is a simple probabilistic classifier that assumes the features are independent.
J48, WEKA's implementation of the C4.5 algorithm, generates a decision tree that is used for classification.
Slide19
Experimental settings
a. Machine learners
We use the implementations of the following algorithms provided by the WEKA data mining package, version 3.7.9 (Witten and Frank, 2005):
Support Vector Machines (SVM), trained with Sequential Minimal Optimization (SMO) (Platt, 1999)
ZeroR (baseline scheme)
SVM aims to identify the optimal hyperplane that linearly separates the data instances with the maximum margin.
Slide20
Experimental settings
b. Evaluation metrics
The results are evaluated with respect to two statistical measures:
F-measure (F), the harmonic mean of precision and recall:
$$F = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
where precision is the fraction of retrieved instances that are relevant, and recall is the fraction of relevant instances that are retrieved.
Accuracy, the percentage of correctly classified instances:
$$\text{accuracy} = \frac{\text{correctly classified instances}}{\text{all instances}} \times 100$$
For all experiments, each machine learner was run 100 times per data set (10 repetitions of 10-fold cross-validation).
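To make the two measures concrete, here is a minimal Python sketch that computes precision, recall, and F-measure for one target class, plus overall accuracy as a percentage. The function names and the toy gold/predicted labels are assumptions for illustration only. The experiments themselves used WEKA, and whether the reported F-scores are per-class or averaged across classes is not stated on the slide.

```python
def precision_recall_f(gold, predicted, target):
    """Precision, recall and F-measure for a single target class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == target and p == target for g, p in pairs)  # true positives
    fp = sum(g != target and p == target for g, p in pairs)  # false positives
    fn = sum(g == target and p != target for g, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


def accuracy(gold, predicted):
    """Percentage of instances whose predicted label matches the gold label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct / len(gold)


# Toy labels, invented for illustration only.
gold = ["pos", "pos", "neg", "neg", "pos", "neg"]
pred = ["pos", "neg", "neg", "pos", "pos", "neg"]
p, r, f = precision_recall_f(gold, pred, "pos")
acc = accuracy(gold, pred)
print(p, r, f, acc)
```

In this toy case two of the three instances predicted as "pos" are correct, and two of the three gold "pos" instances are found, so precision, recall, and F all equal 2/3.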
Slide21
Results and evaluation
2-level classification: subjective vs. objective
Feature set | Baseline | SVM
Tokens | 55.25 | 94.55
Morph feat. | 55.25 | 95.64
Semantic feat. | 55.25 | 96.02
Stylistic feat. | 55.25 | 96.05
Slide22
Results and evaluation
2-level classification: positive vs. negative
Feature set | Baseline | SVM
Tokens | 50.16 | 88.21
Morph feat. | 50.16 | 89.55
Semantic feat. | 50.16 | 91.69
Stylistic feat. | 50.16 | 92.1
Slide23
Results and evaluation
Single-level classification: positive vs. negative vs. neutral
Feature set | Baseline | SVM
Tokens | 55.25 | 92.29
Morph feat. | 55.25 | 92.47
Semantic feat. | 55.25 | 93.22
Stylistic feat. | 55.25 | 93.46
Slide24
Current direction of research
Applying semi-supervised learning to automatically annotate the rest of our Twitter corpus.
Investigating distant learning approaches to build a large training set for model optimisation.
Building a high-quality polarity lexicon to be used for automatically identifying the overall sentiment orientation of a given text.
Exploring culture-related features that can detect cultural references in user-generated text.
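One common form of distant learning, listed in the literature overview for earlier English tweet studies, is automatic annotation using emoticons. The Python sketch below shows the general idea under stated assumptions: the emoticon lists, the function names, and the placeholder tweets are all hypothetical, and the slides do not say which signals the authors plan to use for Arabic.

```python
# Hypothetical emoticon lists; a real study would choose these carefully.
POSITIVE_EMOTICONS = (":)", ":-)", ":D")
NEGATIVE_EMOTICONS = (":(", ":-(")


def emoticon_label(tweet):
    """Return a noisy label from emoticons, or None if there is no clear signal."""
    has_pos = any(e in tweet for e in POSITIVE_EMOTICONS)
    has_neg = any(e in tweet for e in NEGATIVE_EMOTICONS)
    if has_pos == has_neg:
        return None  # no emoticon at all, or conflicting emoticons
    return "positive" if has_pos else "negative"


def build_noisy_training_set(tweets):
    """Keep tweets with a clear emoticon signal and strip the emoticons out."""
    labelled = []
    for tweet in tweets:
        label = emoticon_label(tweet)
        if label is None:
            continue
        text = tweet
        # Remove the emoticons so a classifier cannot simply memorise them.
        for emoticon in POSITIVE_EMOTICONS + NEGATIVE_EMOTICONS:
            text = text.replace(emoticon, "")
        labelled.append((" ".join(text.split()), label))
    return labelled


# Placeholder tweets, invented for illustration only.
sample = [
    "great match today :)",
    "stuck in traffic again :(",
    "meeting moved to 3pm",
    "not sure how I feel :) :(",
]
training_set = build_noisy_training_set(sample)
print(training_set)
```

Labels obtained this way are noisy, as the overview table notes for Purver and Battersby (2012), so a manually annotated test set is still needed to evaluate models trained on them.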
Slide25
Thanks @eshragR