Uploaded On 2015-12-07

Presentation Transcript

Comments like this one inspired the statistical work we present here. In this paper, we present a study of underprovision on Reddit. Examining both page view data and duplicate submissions, we arrive at the conclusion that widespread underprovision of votes is likely happening on the site. Notably, Reddit overlooked 52% of the most popular links the first time they were submitted. This suggests that many potentially popular links (i.e., ones the Reddit community would value) are ignored, jeopardizing Reddit's core purpose.

We conclude by discussing possible reasons behind the underprovision of votes we observe on Reddit. In particular, we discuss candidate answers to the following question: Which design elements, if any, invite the underprovision we see? While we can offer no conclusive answers with this short paper, we hope to invite lines of work that can investigate this deeper research question.

METHOD

In this paper, we present two types of statistical evidence addressing underprovision on Reddit: page view data and an analysis of duplicate submissions.

Page Views

To gain precision around issues of underprovision, we first performed an analysis of page view data we collected from Reddit. We wanted to understand how many people actually look at the new links flowing into the site. As noted earlier, Reddit organizes its design around the most popular content; you have to seek out the queue of brand new submissions. The ratio of new submission page views to popular content page views would give us a first-order view of where attention flows on Reddit. While sites like Alexa [2] can provide aggregate traffic for a site, fine-grained page view data among subsections of a site is hard to come by unless you have access to site logs.

However, we employed a workaround by focusing on a particular subcommunity within Reddit: the pics subreddit. Reddit's second-largest subcommunity by membership [24], pics lets users share pictures from around the internet. pics has more than 2.1M subscribers, accounting for roughly 3.6% of Reddit's total subscriptions, and its contributions very often make their way to the site's most popular page. While the image content varies widely, redditors share the majority of their images using the image-sharing site imgur.com (a statistic we derived from our data). While Reddit does not provide page view data, imgur.com shows page views to users. In this paper, we compare the page views that new pics images receive with how many the most popular pics images receive. While imperfect (it only measures a segment of Reddit), this result will give us insight into the site's distribution of attention.

Between April 25, 2011 and May 11, 2011, we crawled the most popular pics images and the pics new queue every 10 seconds, recording every imgur.com link. Our dataset consists of page view statistics for the 14,864 images submitted to pics and the 648 most popular images during these 17 days. These particular 17 days were not a significant aspect of the study; rather, they allowed us to study Reddit over time while collecting enough data to draw conclusions.

Other referring sites. One potential confound with the approach just described is that people can visit an imgur picture by referral from any site on the web, or simply by typing its URL directly into their browser. Without access to imgur's server logs (and their associated HTTP referrer data), it is impossible to say how much of an image's traffic originates from Reddit. However, it is important to consider that a Redditor created imgur expressly for the Reddit community to share images, calling it “My Gift to Reddit” [18]. While imgur has branched out to other online communities since then, in a recent interview given to The Wall Street Journal, imgur's founder stated that Reddit referrals still dominate imgur's traffic [9]. Therefore, while we cannot achieve the kind of precision we would normally like, we see it as reasonable to examine page views of imgur images linked from Reddit for a first-order estimate of where attention flows on the site. We argue that despite noise at the individual image level (i.e., it would be hard to answer “Did this image get popular because of Reddit?”), this noise washes out between groups at the scale of tens of thousands of data points.

Duplicate Submissions

If enough people monitor the Reddit new queue, then truly good content should rarely go overlooked. (This is a variant of Linus's Law: “Given enough eyeballs, all bugs are shallow” [22].) In the second phase of our work, we crawled Reddit every ten seconds between April 25 and May 11, acquiring over a gigabyte of text. This time, however, we crawled the Reddit main page and its overall new queue. Our dataset consists of the 172,030 links submitted to Reddit and the 9,370 links that appeared on the most popular page during these 17 days. We will search this dataset for links which ultimately became popular, but were submitted earlier by someone else. These earlier links, good by definition because they ultimately became very popular, went overlooked by the community.

Ground truth data. This method allows us to construct ground truth data: links that ultimately became popular (even if they were overlooked the first few times) are precisely what Reddit values. However, our approach disregards links that could have been popular if only they had attracted Reddit's attention. Future work might consider overcoming this limitation by sending new Reddit links en masse to Mechanical Turk, for example. For the time being, we argue that the approach presented here will provide the research community with a conservative estimate of the proportion of valuable, yet overlooked content.

RESULTS

The median page views for an image put on the pics new queue (but one which did not end up becoming popular) is 557. The median page views for a popular image, on the other hand, is 148,911. Put another way, an image which ends up on the most popular page receives greater than two orders of magnitude more views than one that does not, Wilcoxon W = 9,008,436, p < 10^-15. As is typical for quantitative social data, these p-values hold less meaning than the magnitude of the differences between groups. Figure 1 illustrates this finding graphically, showing the log-scale distributions of both groups induced via Gaussian kernel density estimates.
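As a rough illustration of the group comparison reported in the results, the following Python sketch computes group medians, the gap between them in orders of magnitude, and a rank-sum statistic. It is not the authors' analysis code. The function names `rank_sum_w` and `orders_of_magnitude` and the toy view counts are assumptions made for illustration, and W is computed under one common convention (the first group's rank sum minus its smallest possible value), which may differ from the convention behind the reported value.

```python
from math import log10
from statistics import median


def rank_sum_w(group_a, group_b):
    """Rank-sum statistic for group_a under one common convention.

    Equals the number of (a, b) pairs with a > b, counting ties as one half.
    """
    pooled = sorted(list(group_a) + list(group_b))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Tied values at 1-based positions i+1 .. j share their average rank.
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    n_a = len(group_a)
    return sum(ranks[x] for x in group_a) - n_a * (n_a + 1) / 2


def orders_of_magnitude(larger, smaller):
    """Base-10 logarithm of the ratio between two positive quantities."""
    return log10(larger / smaller)


# Made-up toy view counts for illustration only, NOT the paper's data.
new_views = [300, 557, 800, 1200]
popular_views = [90_000, 120_000, 150_000]

w = rank_sum_w(popular_views, new_views)
toy_gap = orders_of_magnitude(median(popular_views), median(new_views))

# The two medians reported in the text (148,911 and 557).
reported_gap = orders_of_magnitude(148_911, 557)
```

With the two reported medians, `reported_gap` comes out at roughly 2.4, in line with the statement that popular images receive more than two orders of magnitude more views.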
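The duplicate-submission search described in the Method section could be sketched along the following lines. This is a minimal illustration under stated assumptions, not the procedure actually used: it assumes duplicates are identified by exact URL match, and the `Submission` record, its fields, and the `overlooked_share` function are hypothetical names introduced here. The toy records are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    url: str           # link target (assumed to be already normalized)
    submitted_at: int  # submission time, e.g. a Unix timestamp
    popular: bool      # True if this submission reached the most popular page


def overlooked_share(submissions):
    """Share of eventually popular links whose first submission was overlooked."""
    by_url = {}
    for sub in submissions:
        by_url.setdefault(sub.url, []).append(sub)

    popular_links = 0
    overlooked = 0
    for subs in by_url.values():
        if not any(s.popular for s in subs):
            continue  # never became popular, so outside the ground truth set
        popular_links += 1
        first = min(subs, key=lambda s: s.submitted_at)
        if not first.popular:
            overlooked += 1
    return overlooked / popular_links if popular_links else 0.0


# Invented toy records for illustration only, NOT the paper's data.
toy = [
    Submission("http://example.com/a", 1, False),
    Submission("http://example.com/a", 5, True),   # popular on second try
    Submission("http://example.com/b", 2, True),   # popular on first try
    Submission("http://example.com/c", 3, False),  # never popular: excluded
    Submission("http://example.com/d", 4, False),
    Submission("http://example.com/d", 6, False),
    Submission("http://example.com/d", 9, True),   # popular on third try
]

share = overlooked_share(toy)
```

Links that never became popular are skipped entirely, which mirrors the conservative nature of the estimate: content that would have been valued but never surfaced cannot be counted.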
REFERENCES

1. L. Adamic and B. Huberman. Zipf's law and the Internet. Glottometrics, 3:143–150, 2002.
2. Alexa. http://alexa.com. Retrieved June 1, 2012.
3. A. Barabási and R. Albert. Emergence of Scaling in Random Networks. Science, 286(5439):509–512, 1999.
4. Y. Benkler. The Wealth of Networks: How Social Production Transforms Markets and Freedom. Yale University Press, 2006.
5. F. Berkes, D. Feeny, B. McCay, and J. Acheson. The Benefits of the Commons. Nature, 340(6229):91–93, 1989.
6. R. Cao, A. Cuevas, and W. Gonzalez Manteiga. A Comparative Study of Several Smoothing Methods in Density Estimation. Computational Statistics & Data Analysis, 17(2):153–176, 1994.
7. A. Dieberger, P. Dourish, K. Höök, P. Resnick, and A. Wexelblat. Social navigation: techniques for building more usable systems. Interactions, 7(6):36–45, 2000.
8. P. Dourish and M. Chalmers. Running Out of Space: Models of Information Navigation. In Proc. CHI, 1994.
9. L. Gannes. Interview: Imgur's Path to a Billion Image Views Per Day. http://dthin.gs/JgKPqL. Retrieved August 27, 2012.
10. E. Gilbert and K. Karahalios. Understanding Deja Reviewers. In Proc. CSCW, pages 225–228, 2010.
11. Hacker News. http://news.ycombinator.com. Retrieved June 1, 2012.
12. G. Hardin. The Tragedy of the Commons. Science, 162(3859):1243–1248, 1968.
13. T. Khopkar, X. Li, and P. Resnick. Self-selection, Slipping, Salvaging, Slacking, and Stoning: The Impacts of Negative Feedback at eBay. In Proc. EC, pages 223–231, 2005.
14. A. Kim. Community building on the web. Peachpit Press, 2000.
15. J. Laherrere and D. Sornette. Stretched exponential distributions in nature and economy: “fat tails” with characteristic scales. The European Physical Journal B - Condensed Matter and Complex Systems, 2(4):525–539, 1998.
16. C. Lampe and P. Resnick. Slash(dot) and Burn: Distributed Moderation in a Large Online Conversation Space. In Proc. CHI, pages 543–550, 2004.
17. K. Lerman and A. Galstyan. Analysis of Social Voting Patterns on Digg. In Proc. WOSN, pages 7–12, 2008.
18. My Gift to Reddit. http://redd.it/7zlyd. Retrieved August 27, 2012.
19. M. Olson. The Logic of Collective Action: Public Goods and the Theory of Groups. Harvard University Press, 1974.
20. E. Ostrom. Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge University Press, 1999.
21. R. Priedhorsky, J. Chen, S. Lam, K. Panciera, L. Terveen, and J. Riedl. Creating, destroying, and restoring value in Wikipedia. In Proc. GROUP, pages 259–268, 2007.
22. E. Raymond. The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental Revolutionary. O'Reilly & Associates, Inc., 2001.
23. Reddit. http://www.reddit.com/help/faq#Whatisreddit. Retrieved June 1, 2012.
24. Redditlist. http://redditlist.com. Retrieved June 1, 2012.
25. P. Resnick and R. Zeckhauser. Trust Among Strangers in Internet Transactions: Empirical Analysis of eBay's Reputation System. Advances in Applied Microeconomics, 11:127–157, 2002.
26. M. Smith and P. Kollock. Communities in Cyberspace. Psychology Press, 1999.