Finding Correlations Between Geographical Twitter Sentiment and Stock Prices
Undergraduate Researchers: Juweek Adolphe, Ressi Miranda
Graduate Student Mentor: Zhaoyu Li
Faculty Advisor: Dr. Yi Shang

Research Project
Determine whether one demographic’s Twitter sentiment correlates more strongly with a company’s stock price than another’s.

Correlate

Previous Work
Sources: Sentidex.com

Tools
Sentiment Analysis
  Lexicon-based approach: find the sentiment of individual words to get the total sentiment of a sentence
Tweepy
  Streaming API
  Filtered by topic, language
Matplotlib
  Graphs

Methodology: Area
Sector: Food & Restaurants
Standard & Poor’s 500
Companies: McDonald’s and Starbucks
Key searches: Ticker Symbol, Keywords, Company Products
Key Words Sample:
  $MCD, Big Mac, McDonalds, Happy Meal
  $SBUX, Starbucks, Caramel Macchiato

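A minimal sketch of how a tweet could be assigned to a company using the sample key words above. The matching rule (case-insensitive substring search with apostrophes dropped) and the names COMPANY_KEYWORDS and match_companies are assumptions for illustration, not the project's actual filter.

```python
# Keyword lists copied from the sample above; the matching rule is an assumption.
COMPANY_KEYWORDS = {
    "McDonald's": ["$mcd", "big mac", "mcdonalds", "happy meal"],
    "Starbucks": ["$sbux", "starbucks", "caramel macchiato"],
}


def match_companies(tweet_text):
    """Return the companies whose key words appear in the tweet text."""
    # Lowercase and drop apostrophes so "McDonald's" matches "mcdonalds".
    text = tweet_text.lower().replace("'", "").replace("\u2019", "")
    return [
        company
        for company, keywords in COMPANY_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


print(match_companies("Grabbing a Big Mac before class"))
print(match_companies("caramel macchiato and a happy meal, best day"))
```

A tweet that mentions products from both companies is returned under both names; a tweet with no key word returns an empty list.
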
Making a Dataset
Other dataset didn’t work
Streamed Tweets for 5 days
Filtered by keywords, English
Information Extracted:
  company-related tweet
  time
  self-reported location
  username
  followers count

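The extracted fields above can be sketched as a small record builder. This assumes each streamed tweet arrives as a dict laid out like a Twitter API v1.1 tweet object (created_at, text, and a nested user object with location, screen_name, followers_count); the function name extract_fields, the company argument, and the sample payload are made up for illustration.

```python
def extract_fields(tweet, company):
    """Keep only the fields listed above from one tweet payload (a dict)."""
    user = tweet.get("user") or {}
    return {
        "company": company,                      # which company the tweet matched
        "text": tweet.get("text"),               # the company-related tweet
        "time": tweet.get("created_at"),
        "location": user.get("location"),        # self-reported, may be None
        "username": user.get("screen_name"),
        "followers_count": user.get("followers_count"),
    }


# Made-up example payload, not real data.
sample_tweet = {
    "created_at": "Mon Jul 14 15:04:00 +0000 2014",
    "text": "happy meal time",
    "user": {
        "location": "Chicago, IL",
        "screen_name": "example_user",
        "followers_count": 120,
    },
}

record = extract_fields(sample_tweet, "McDonald's")
print(record)
```

Missing fields come back as None instead of raising, which matters because the self-reported location is often empty.
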
Stock Market Data
Google Finance
Stock Price by the minute

Processing Data
Normalize Tweets
  Lowercased
  Removed non-alphanumerical characters (@, $, #, etc.)
Sentiment Analysis
  Lexicon-based approach
  Used SentiWordNet (http://sentiwordnet.isti.cnr.it/)

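The normalization step above could look roughly like this. It is a sketch under assumptions: non-alphanumeric characters are replaced by spaces (so words stay separated), only ASCII letters and digits are kept, and the name normalize_tweet is hypothetical.

```python
import re


def normalize_tweet(text):
    """Lowercase a tweet and strip non-alphanumeric characters."""
    lowered = text.lower()
    # Keep ASCII letters, digits, and whitespace; @, $, #, punctuation become spaces.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", lowered)
    # Collapse the runs of whitespace left behind by the substitution.
    return " ".join(cleaned.split())


print(normalize_tweet("Loving my #HappyMeal from @McDonalds! $MCD"))
```

Replacing symbols with spaces means an apostrophe splits a word ("mcdonald's" becomes "mcdonald s"); dropping the character instead is an equally reasonable choice.
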
Lexicon Based Approach Explained
Tweet Example: “going to mcdonald's with mah friends today and i need to know what toy i should get with my happy meal”
Word: know
Sense                          Positive Score   Negative Score
know, recognize, acknowledge   0                0
know, cognize                  0                0
know                           0.125            0
know                           0                0
know                           0.125            0
know, live, experience         0                0
know                           0.25             0
know                           0.25             0
know                           0.375            0
know                           0.625            0
Scores taken from SentiWordNet

Lexicon Based Approach Explained
Tweet Example: “going to mcdonald's with mah friends today and i need to know what toy i should get with my happy meal”
Word: know
Sense                          Positive Score   Negative Score
know, recognize, acknowledge   0                0
know, cognize                  0                0
know                           0.125            0
know                           0                0
know                           0.125            0
know, live, experience         0                0
know                           0.25             0
know                           0.25             0
know                           0.375            0
know                           0.625            0
Average                        0.175            0
Scores taken from SentiWordNet

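The per-word step above is a plain mean over all senses of the word. A minimal sketch using the ten scores listed for "know"; the helper name average is hypothetical, and looking the scores up in SentiWordNet is left out.

```python
def average(scores):
    """Mean of a word's sense scores; a word with no senses scores 0."""
    return sum(scores) / len(scores) if scores else 0.0


# Positive and negative scores for the ten senses of "know", as listed above.
know_positive = [0, 0, 0.125, 0, 0.125, 0, 0.25, 0.25, 0.375, 0.625]
know_negative = [0] * 10

know_pos_avg = average(know_positive)
know_neg_avg = average(know_negative)
print(know_pos_avg, know_neg_avg)
```

The ten positive scores sum to 1.75, so the positive mean is 0.175, the value given for "know" in the per-word averages.
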
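Pos and Neg scores per sense, by word
Word: going
  Senses: going; going
  Pos: 0, 0
  Neg: 0, 0.5
Word: friends
  Pos: 0
  Neg: 0
Word: today
  Senses: today; today; today, nowadays, now; today
  Pos: 0, 0.125, 0.25, 0
  Neg: 0, 0, 0, 0
Word: need
  Senses: need, want, require; need, involve, demand, postulate; need, motive; need; need, demand
  Pos: 0.125, 0, 0, 0.375, 0.125
  Neg: 0.125, 0.25, 0, 0.25, 0.125
Word: know
  Senses: know, recognize, acknowledge; know, cognize; know; know; know; know, live, experience; know; know; know; know
  Pos: 0, 0, 0.125, 0, 0.125, 0, 0.25, 0.25, 0.375, 0.625
  Neg: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
Word: toy
  Senses: toy; toy, play, fiddle, diddle; toy, play, flirt, dally; toy_dog; toy, miniature; toy, plaything; toy; toy
  Pos: 0, 0.25, 0, 0, 0, 0, 0, 0
  Neg: 0, 0, 0, 0, 0, 0, 0.125, 0.125
Word: get
  Senses: get; get, caused, simulate; get, dive, aim; get; get, fix, pay_back; get, catch, capture; get, catch; get, fetch, convey, bring; get, catch, arrest; get; get, draw; get, catch; get; get_under_ones_skin; get, come, arrive; get; get, get_off; get, have, experience; get, receive; get, catch; get, catch; get, acquire; get, make, have; get
  Pos: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.125, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0.125, 0, 0
  Neg: 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.125, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.125, 0, 0, 0, 0
Word: happy
  Senses: happy; happy; happy; happy, glad
  Pos: 0.125, 0.75, 0.875, 0.5
  Neg: 0, 0, 0, 0
Word: meal
  Senses: meal; meal, repast; meal
  Pos: 0, 0, 0
  Neg: 0, 0, 0
Scores taken from SentiWordNet
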
Word              Positive Average   Negative Average
going             0.1625             0
friends           0                  0
today             0.09375            0
need              0.125              0.75
know              0.175              0
toy               0.03125            0.03125
get               0.03125            0.0104166
happy             0.5625             0
meal              0                  0
Total Sentiment   1.18125            0.7916666

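The totals above are the column sums of the per-word averages. A short sketch that reproduces them; the decision rule in label (larger total wins, ties are neutral) is an assumption, and the names WORD_AVERAGES, total_sentiment, and label are hypothetical.

```python
# (positive average, negative average) per word, copied from the averages above.
WORD_AVERAGES = {
    "going": (0.1625, 0.0),
    "friends": (0.0, 0.0),
    "today": (0.09375, 0.0),
    "need": (0.125, 0.75),
    "know": (0.175, 0.0),
    "toy": (0.03125, 0.03125),
    "get": (0.03125, 0.0104166),
    "happy": (0.5625, 0.0),
    "meal": (0.0, 0.0),
}


def total_sentiment(word_averages):
    """Sum the positive and negative averages over all words in the tweet."""
    positive = sum(pos for pos, _ in word_averages.values())
    negative = sum(neg for _, neg in word_averages.values())
    return positive, negative


def label(positive, negative):
    """Assumed rule: the larger total decides the tweet's sentiment."""
    if positive > negative:
        return "Positive"
    if negative > positive:
        return "Negative"
    return "Neutral"


positive_total, negative_total = total_sentiment(WORD_AVERAGES)
print(positive_total, negative_total, label(positive_total, negative_total))
```

The sums come out to about 1.18125 positive and 0.7916666 negative, so this rule labels the example tweet positive.
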
Tweet Example: “going to mcdonald's with mah friends today and i need to know what toy i should get with my happy meal”
Positive!

Geographical Location
Filter by US cities
Choose the top represented cities
Assumed self-reported location is valid
Used Google Maps API to process tweets

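As a rough illustration of the city filter above, the sketch below matches a self-reported location string against a few city names. It swaps in plain substring matching for the Google Maps API lookup, so it is a stand-in and not the project's method; the alias lists and the names CITY_ALIASES and match_city are hypothetical. The city names are taken from the Twitter Top Cities list in this deck.

```python
# Canonical city -> lowercase aliases to search for (aliases are made up).
CITY_ALIASES = {
    "New York, NY": ["new york", "nyc"],
    "Washington DC": ["washington dc", "washington, dc", "washington d.c."],
    "Los Angeles, CA": ["los angeles"],
    "Chicago, IL": ["chicago"],
    "Dallas, TX": ["dallas"],
}


def match_city(self_reported_location):
    """Return the first city whose alias appears in the location, else None."""
    if not self_reported_location:
        return None
    location = self_reported_location.lower()
    for city, aliases in CITY_ALIASES.items():
        if any(alias in location for alias in aliases):
            return city
    return None


print(match_city("Chicago, IL"))
print(match_city("somewhere over the rainbow"))
```

Substring matching takes the self-reported text at face value, the same validity assumption stated above; a geocoding service would resolve many spellings this sketch misses.
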
Work Flow

Locations Found
Our Twitter Sample
Cities are highly represented**
Does our Twitter Sample have a high representation of the top cities?
Twitter Top Cities*    Top Cities (GDP)
New York, NY           New York, NY
Washington DC          Los Angeles, CA
Los Angeles, CA        Chicago, IL
Chicago, IL            Houston, TX
Dallas, TX             Washington DC
*Wikipedia.org

Results

Results

Challenges
Limited time frame
Geographic locations
Different numbers of tweets and stock prices per minute

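One way to handle the mismatch in per-minute counts noted above is to average tweet sentiment within each minute and then correlate only the minutes that also have a stock price. The sketch below does this with a Pearson correlation; the slides do not say which correlation measure was used, so Pearson is an assumption, and all names and sample values are made up.

```python
from collections import defaultdict
from math import sqrt


def mean_sentiment_per_minute(tweets):
    """tweets: iterable of (minute, sentiment). Returns {minute: mean sentiment}."""
    buckets = defaultdict(list)
    for minute, sentiment in tweets:
        buckets[minute].append(sentiment)
    return {minute: sum(v) / len(v) for minute, v in buckets.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def correlate(tweets, prices):
    """Correlate per-minute mean sentiment with price over shared minutes."""
    sentiment = mean_sentiment_per_minute(tweets)
    shared = sorted(set(sentiment) & set(prices))
    return pearson([sentiment[m] for m in shared], [prices[m] for m in shared])


# Made-up sample: two tweets at 09:30, none at 09:33.
sample_tweets = [("09:30", 0.2), ("09:30", 0.4), ("09:31", 0.5), ("09:32", 0.9)]
sample_prices = {"09:30": 100.0, "09:31": 101.0, "09:32": 103.0, "09:33": 104.0}
r = correlate(sample_tweets, sample_prices)
print(r)
```

Minutes without tweets (09:33 here) are dropped instead of being filled with a zero sentiment, which would otherwise bias the correlation.
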
Future Work
Larger Twitter Sample
Predicting Stock Price
Correlate the number of followers to stock price

References
Cities by GDP
* "List of U.S. Metropolitan Areas by GDP." Wikipedia. Wikimedia Foundation, 22 July 2014. Web. 31 July 2014.
** Mislove, Alan, et al. "Understanding the Demographics of Twitter Users." ICWSM 11 (2011): 5th.

Thank you!
Faculty Advisor: Dr. Yi Shang
Graduate Student: Zhaoyu Li
REU Group & Mentors for their help and support!
University of Missouri
National Science Foundation*
*Award Abstract #1359125 REU: Research in Consumer Networking Technologies

Questions?