Presentation on theme: "Exposing Inconsistent Web Search Results with Bobble X"— Presentation transcript
might prevent them from seeing. Moreover, personalization frequently occurs without the user's involvement or even explicit agreement, so users may not even be aware that their search results have been tailored according to their profile and preferences.

The goal of our work is to expose and characterize inconsistencies that result from personalization. In particular, we seek to quantify the extent to which search personalization algorithms return results that are inconsistent with those that would be returned to other users, and expose any differences to the user in real time. We present Bobble, a Chrome Web browser extension that allows users to see how the search results that Google returns to them differ from the results that are returned to other users. Bobble captures a user's search query and reissues it from a subset of over 300 world-wide vantage points, including both dedicated PlanetLab measurement nodes and the hosts of other consenting Bobble users. In contrast to research tools that have been developed to measure search personalization offline [5], we intend users to use Bobble while they browse the Web, providing them critical insight into how their online experience is being potentially distorted by personalization.

To understand the nature of the inconsistencies uncovered by Bobble, we study more than 75,000 real search queries issued by hundreds of Bobble users over nine months. We quantify the extent to which personalization affects search results and determine how users' Google search results vary based on factors ranging from their geographic locations to their past search histories. Our study focuses exclusively on Google search, one of the more widely used search engines, but we expect that similar phenomena exist for other popular search engines. We find that 98% of Google Web searches return at least one set of inconsistent search results, typically from a vantage point in a different geographic region than the user, even though Bobble performs these searches without exposing any information that links to the searchers' Google profiles.

In sum, our study provides the first large-scale glimpse into the nature of inconsistent results that arise from search personalization and opens many avenues for future research. We quantify how geography and search history may influence search results, but others have noted that many other factors (e.g., device type, time of day) may also affect the results that a user sees for a given search term [5]. Bobble has been deployed and publicly available for 21 months; users and researchers can extend it to measure how other factors might induce inconsistencies in search results.

2 Related Work

Researchers have previously studied means to personalize Web search results. Dou et al. performed a large-scale evaluation and analysis of five personalized search algorithms using a twelve-day MSN query log [2]. They find that profile-based personalization algorithms are sometimes unstable. Teevan et al. conduct a user study to investigate the value of personalized Web search [11]. In contrast, we are less interested in the distinction between different personalization methods, and focus instead on the effects of a single search personalization algorithm. We aim to quantify the effects of different personalization factors on search inconsistency. In a contemporaneous study, Hannak et al. measure the personalization of Google search. The bulk of their effort focuses on understanding the features leading to person-
OS        with same browser   with Chrome agent   p-value
Windows   11/1,000            16/1,000            0.1725
Linux     23/1,000            21/1,000            0.7517
Mac       15/1,000            15/1,000            1.0
Table 1: The number of terms that generate inconsistent sets of search results when searching 1,000 distinct terms from Chrome browsers/agents on different OSes.

the Bobble server for pending search terms (Step 3) and reissue them locally as search queries to Google without signing in to a Google account or revealing a trackable browser cookie to Google (Step 4). Each agent pushes the results it receives from Google to the Bobble server.

To establish a baseline for comparing inconsistencies in search results, we would ideally like to also reissue the user's query locally from a separate browser session that is not signed in to Google and does not pass session cookies to Google. We call these anonymous queries organic, as they are as free as possible from user-specific influences (in contrast to queries that are issued when a user is logged in or passing browser cookies to Google). Unfortunately, collecting true organic results is challenging due to the technical and usability obstacles surrounding logging the user out in order to issue such a query from an extension running within the same Web browser. Instead, Bobble collects organic search results by issuing a duplicate query from a nearby Chrome browser agent. (Section 3.2 presents a detailed discussion of the effects of using a nearby agent to stand in for the user's browser.)

3.2 Validation

To evaluate whether Bobble accurately reports results that regular users would actually receive, we first validate that Bobble's Chrome browser agent correctly emulates major version releases of Chrome browsers; specifically, that the results returned to a Bobble agent reflect those that would be returned to an actual query issued by a user in her Web browser. Second, we measure the effects of collecting organic search results indirectly by issuing queries from nearby agents as opposed to inside the user's browser.

Do Bobble agents emulate browser behavior? We begin by ensuring that the Google search results collected using the Chrome browser agent do not differ statistically from the results obtained when the query is issued from the Google home page viewed with the Chrome browser itself. We randomly select 1,000 unique search terms from the daily top-20 Google trending search terms between August 2011 and December 2011 and search each of these terms three times from machines running Linux, Windows, and Mac operating systems. On each machine, we run a Chrome browser agent and two Google Chrome browsers with the same release version. We use the Selenium Chrome driver [9] to automate the two Chrome browsers and one browser agent to perform the same Google search simultaneously.

One might expect that simultaneously issued queries from identical Web browsers would return identical sets of results, since the queries do not involve any search history and are issued from the same location at essentially the same time. While this expectation generally rings true, it is not always the case. Table 1 shows the number of
Fig. 2: CDF plot: the distribution of the number of search queries.
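As a rough illustration of the browser-versus-agent comparison in Table 1, the sketch below applies a pooled two-proportion z-test to the per-OS counts of inconsistent terms. The source does not say which statistical test produced the p-value column, so the choice of test and the name two_proportion_p_value are assumptions made here for illustration, and the computed values need not match the table.

```python
import math


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value of a pooled two-proportion z-test.

    x1/n1 and x2/n2 are the observed proportions in the two groups.
    NOTE: this test is an assumption; the source does not name its test.
    """
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (x1 / n1 - x2 / n2) / se
    # Two-sided tail probability of a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


# Counts from Table 1: (same browser, Chrome agent), each out of 1,000 terms.
TABLE_1 = {"Windows": (11, 16), "Linux": (23, 21), "Mac": (15, 15)}

p_values = {
    os_name: two_proportion_p_value(browser, 1000, agent, 1000)
    for os_name, (browser, agent) in TABLE_1.items()
}

for os_name, p in p_values.items():
    verdict = "no significant difference" if p > 0.05 else "significant difference"
    print(f"{os_name}: p = {p:.4f} ({verdict} at the 0.05 level)")
```

Under this assumed test, a large p-value for an OS means the agent's inconsistency count is statistically indistinguishable from that of a second real browser, which is the property the validation is checking.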
Fig. 3: The distribution of the number of search queries when sending queries to google.com and a Google IP address, respectively.

search results. This result indicates that organic search results of most Google search queries are tailored on the basis of the location where these searches are performed, even though Google users neither sign in to their accounts nor expose their browser cookies to Google personalized search services. In the following section, we design a careful experiment to explore whether the observed search inconsistency results from location-based personalization rather than data diversity across different Google data centers.

To quantify the effect of geographic location on search inconsistency, we classified the inconsistent search results in three ways:

- At least one search result appears in the top-three search results of other PlanetLab nodes but not at all in a Google user's organic search result set. We find that 23,394 out of 76,307 search queries (30.66%) give rise to this situation.
- At least one search result appears in the top-10 (but not top-3) search results of other PlanetLab nodes, but does not appear in a Google user's organic search result set; 65,939 out of 76,307 search queries (86.41%) fit this situation.
- At least one search result appears in the Google user's organic search result set but does not appear in search results of other PlanetLab nodes; 1,434 out of 76,307 search queries (1.88%) fit this situation.

Considering the fact that the top-10 Google search results receive about 90% of clicks and the top-3 Google search results usually receive the most attention [10], the inconsistency that arises due to location likely has significant implications for a user's experience.

5.2 Distributed index inconsistencies

To validate that the observed search inconsistency is in fact derived from location-based personalization rather than data diversity across different data centers, we conduct an experiment. In particular, we modify Bobble to attempt to isolate the inconsistency contributed by location-based personalization from that contributed by inconsistencies in
Fig. 4: % of search results changed at each rank.
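The three-way classification of location-induced inconsistencies described above can be sketched as a comparison between a user's organic result list and the ranked lists returned at other vantage points. This is a minimal sketch, not Bobble's actual code: the function name classify_inconsistency, the representation of results as ranked lists of URLs, and the reading of the third category as "absent from every other vantage point" are all assumptions.

```python
def classify_inconsistency(organic, vantage_results):
    """Flag the three kinds of location-induced inconsistency for one query.

    organic: ranked list of result URLs from the user's organic query.
    vantage_results: list of ranked URL lists, one per other vantage point.
    Both the input shapes and the category names are hypothetical.
    """
    organic_set = set(organic)
    seen_elsewhere = set()
    missing_from_organic_top3 = False
    missing_from_organic_top10 = False

    for results in vantage_results:
        seen_elsewhere.update(results)
        for rank, url in enumerate(results[:10], start=1):
            if url in organic_set:
                continue
            if rank <= 3:
                # In a remote top-3 but nowhere in the organic set.
                missing_from_organic_top3 = True
            else:
                # In remote ranks 4-10 but nowhere in the organic set.
                missing_from_organic_top10 = True

    # Assumption: "does not appear in search results of other nodes" means
    # the URL is absent from every other vantage point's result list.
    only_in_organic = any(url not in seen_elsewhere for url in organic)

    return {
        "missing_from_organic_top3": missing_from_organic_top3,
        "missing_from_organic_top10": missing_from_organic_top10,
        "only_in_organic": only_in_organic,
    }


example = classify_inconsistency(
    organic=["a", "b", "c", "d"],
    vantage_results=[["a", "x", "c", "d"], ["a", "b", "c", "d", "y"]],
)
print(example)
```

The categories are not mutually exclusive, which is consistent with the reported percentages summing to more than 100%: a single query can trigger several flags at once.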
Signed-in dataset       Signed-out dataset
Location    Profile     Location    Profile
97.64%      64.19%      97.80%      58.77%
Table 2: How location and user profile contribute to search inconsistency. Location has more effect on inconsistency than search history does.

et al. (see Figure 5 in previous work [5]). One possible reason for this discrepancy is the difference in the measurement method. Previous work recruited different Google users to search the same set of keywords, where the keywords were chosen such that they were deemed to not be related to user profiles. In contrast, we perform our study in a more natural setting because it measures the influence of the profile-based personalization using each user's own search queries. Because a user's past queries are typically relevant to personalization that may occur in the future, we observe that profile-based personalization has more influence on Google users' search results.

In addition to inconsistencies in the search result sets, we also discovered the following inconsistencies:

- For signed-in users, 22,405 out of 66,138 search queries (33.88%) have at least one search result that shows in the profile-based personalized search result set but not in the organic search result set.
- For anonymous users, 3,148 out of 10,169 search queries (30.96%) have at least one search result that shows in the profile-based personalized search result set but not in the organic search result set.
- For signed-in users, 7,352 out of 66,138 search queries (11.12%) have at least one search result that shows in the top 3 of the organic search result set but not in the profile-based personalized search result set.
- For anonymous users, 1,484 out of 10,169 search queries (14.59%) have at least one search result that shows in the top 3 of the organic search result set but not in the profile-based personalized search result set.

Table 2 also shows that the Google search inconsistencies resulting from signed-in users' profiles are stronger than those resulting from signed-out users' profiles. Finally, we also observe that location-based factors introduce more inconsistencies than profile-based factors do.

7 Conclusion

We have designed, implemented, and deployed Bobble, a distributed system that tracks and monitors the inconsistency of search results for user search queries. Using Bobble, we collect user search terms and results and measure the search inconsistency that arises from both geographic location and search history. We find that the geographic