
Optimizing Search Engines using Clickthrough Data Thorsten Joachims Cornell Univ

ABSTRACT This paper presents an approach to automatically optimizing the retrieval quality of search engines using clickthrough data. Intuitively, a good information retrieval system should present relevant documents high in the ranking, wit [...]



1. Kernel Machines
   http://svm.first.gmd.de/
2. Support Vector Machine
   http://jbolivar.freeservers.com/
3. SVM-Light Support Vector Machine
   http://ais.gmd.de/thorsten/svm_light/
4. An Introduction to Support Vector Machines
   http://www.support-vector.net/
5. Support Vector Machine and Kernel Methods References
   http://svm.research.bell-labs.com/SVMrefs.html
6. Archives of SUPPORT-VECTOR-MACHINES@JISCMAIL.AC.UK
   http://www.jiscmail.ac.uk/lists/SUPPORT-VECTOR-MACHINES.html
7. Lucent Technologies: SVM demo applet
   http://svm.research.bell-labs.com/SVT/SVMsvt.html
8. Royal Holloway Support Vector Machine
   http://svm.dcs.rhbnc.ac.uk/
9. Support Vector Machine - The Software
   http://www.support-vector.net/software.html
10. Lagrangian Support Vector Machine Home Page
    http://www.cs.wisc.edu/dmi/lsvm

Figure 1: Ranking presented for the query "support vector machine". Marked in bold are the links the user clicked on.

[...] of the search engine. In particular, compared to explicit user feedback, it does not add any overhead for the user. The query q and the returned ranking r can easily be recorded whenever the resulting ranking is displayed to the user. For recording the clicks, a simple proxy system can keep a log file. For the experiments in this paper, the following system was used.

Each query is assigned a unique ID which is stored in the query-log along with the query words and the presented ranking. The links on the results-page presented to the user do not lead directly to the suggested document, but point to a proxy server. These links encode the query-ID and the URL of the suggested document. When the user clicks on the link, the proxy-server records the URL and the query-ID in the click-log. The proxy then uses the HTTP Location command to forward the user to the target URL. This process can be made transparent to the user and does not influence system performance.

This shows that clickthrough data can be recorded easily and at little cost. Let's now address the key question of how it can be analyzed in a principled and efficient way.

2.2 What Kind of Information does Clickthrough Data Convey?

There are strong dependencies between the three parts of (q, r, c). The presented ranking r depends on the query q as determined by the retrieval function implemented in the search engine. Furthermore, the set c of clicked-on links depends on both the query q and the presented ranking r. First, a user is more likely to click on a link, if it is relevant to q [16]. While this dependency is desirable and interesting for analysis, the dependency of the clicks on the presented ranking r muddies the water. In particular, a user is less likely to click on a link low in the ranking, independent of how relevant it is. In the extreme, the probability that the user clicks on a link at rank 10,000 is virtually zero even if it is the document most relevant to the query. No user will scroll down the ranking far enough to observe this link. Therefore, in order to get interpretable and meaningful results from clickthrough data, it is necessary to consider and model the dependencies of c on q and r appropriately.

retrieval function    bxx            tfc            hand-tuned
avg. clickrank        6.26 ± 1.14    6.18 ± 1.33    6.04 ± 0.92

Table 1: Average clickrank for three retrieval functions ("bxx", "tfc" [23], and a "hand-tuned" strategy that uses different weights according to HTML tags) implemented in LASER. Rows correspond to the retrieval method used by LASER at query time; columns hold values from subsequent evaluation with other methods. Figures reported are means and two standard errors. The data for this table is taken from [5].

Before defining such a model, let's first consider an interpretation of clickthrough data that is not appropriate. A click on a particular link cannot be seen as an absolute relevance judgment. Consider the empirical data in Table 1. The data is taken from [5] and was recorded for the search engine LASER covering the WWW of the CMU School of Computer Science. The table shows the average rank of the clicks per query (e.g. 3.67 in the example in Figure 1). Each table cell contains the average clickrank for three retrieval strategies averaged over 1400 queries. The average clickrank is almost equal for all methods. However, according to subjective judgments, the three retrieval functions are substantially different in their ranking quality.

The lack of difference in the observed average clickrank can be explained as follows. Since users typically scan only the first l (e.g. l ≈ 10 [24]) links of the ranking, clicking on a link cannot be interpreted as a relevance judgment on an absolute scale. Maybe a document ranked much lower in the list was much more relevant, but the user never saw it. It appears that users click on the (relatively) most promising links in the top l, independent of their absolute relevance. How can these relative preference judgments be captured [...]

[...] is maximal. Note that (6) is (proportional to) a risk functional [25] with $-\tau$ as the loss function. While the goal of learning is now defined, the question remains whether it is possible to design learning methods that optimize (6)?

4. AN SVM ALGORITHM FOR LEARNING OF RANKING FUNCTIONS

Most work on machine learning in information retrieval does not consider the formulation above, but simplifies the task to a binary classification problem with the two classes "relevant" and "non-relevant". Such a simplification has several drawbacks. For example, due to a strong majority of "non-relevant" documents, a learner will typically achieve the maximum predictive classification accuracy, if it always responds "non-relevant", independent of where the relevant documents are ranked. But even more importantly, Section 2.2 showed that such absolute relevance judgments cannot be extracted from clickthrough data, so that they are simply not available. Therefore, the following algorithm directly addresses (6), taking an empirical risk minimization approach [25]. Given an independently and identically distributed training sample S of size n containing queries q with their target rankings $r^*$

$$(q_1, r^*_1), (q_2, r^*_2), \ldots, (q_n, r^*_n), \tag{7}$$

the learner L will select a ranking function f from a family of ranking functions F that maximizes the empirical $\tau$

$$\tau_S(f) = \frac{1}{n} \sum_{i=1}^{n} \tau(r_{f(q_i)}, r^*_i) \tag{8}$$

on the training sample. Note that this setup is analogous to e.g. classification by minimizing training error, just that the target is not a class label, but a binary ordering relation.

4.1 The Ranking SVM Algorithm

Is it possible to design an algorithm and a family of ranking functions F so that (a) finding the function $f \in F$ maximizing (8) is efficient, and (b) that this function generalizes well beyond the training data? Consider the class of linear ranking functions

$$(d_i, d_j) \in f_{\vec{w}}(q) \iff \vec{w}\,\Phi(q, d_i) > \vec{w}\,\Phi(q, d_j). \tag{9}$$

$\vec{w}$ is a weight vector that is adjusted by learning. $\Phi(q, d)$ is a mapping onto features that describe the match between query q and document d like in the description-oriented retrieval approach of Fuhr et al. [10][11]. Such features are, for example, the number of words that query and document share, the number of words they share inside certain HTML tags (e.g. TITLE, H1, H2, ...), or the page-rank of d [22] (see also Section 5.2). Figure 2 illustrates how the weight vector $\vec{w}$ determines the ordering of four points in a two-dimensional example. For any weight vector $\vec{w}$, the points are ordered by their projection onto $\vec{w}$ (or, equivalently, by their signed distance to a hyperplane with normal vector $\vec{w}$). This means that for $\vec{w}_1$ the points are ordered (1, 2, 3, 4), while $\vec{w}_2$ implies the ordering (2, 3, 1, 4).

Figure 2: Example of how two weight vectors $\vec{w}_1$ and $\vec{w}_2$ rank four points.

Instead of maximizing (8) directly, it is equivalent to minimize the number Q of discordant pairs in Equation (2). For the class of linear ranking functions (9), this is equivalent to finding the weight vector so that the maximum number of the following inequalities is fulfilled.

$$\forall (d_i, d_j) \in r^*_1 : \vec{w}\,\Phi(q_1, d_i) > \vec{w}\,\Phi(q_1, d_j) \tag{10}$$
$$\ldots$$
$$\forall (d_i, d_j) \in r^*_n : \vec{w}\,\Phi(q_n, d_i) > \vec{w}\,\Phi(q_n, d_j) \tag{11}$$

Unfortunately, a direct generalization of the result in [13] shows that this problem is NP-hard. However, just like in classification SVMs [7], it is possible to approximate the solution by introducing (non-negative) slack variables $\xi_{i,j,k}$ and minimizing the upper bound $\sum \xi_{i,j,k}$. Adding SVM regularization for margin maximization to the objective leads to the following optimization problem, which is similar to the ordinal regression approach in [12].

Optimization Problem 1. (Ranking SVM)

$$\text{minimize:} \quad V(\vec{w}, \vec{\xi}) = \frac{1}{2}\,\vec{w} \cdot \vec{w} + C \sum \xi_{i,j,k} \tag{12}$$
$$\text{subject to:} \quad \forall (d_i, d_j) \in r^*_1 : \vec{w}\,\Phi(q_1, d_i) \ge \vec{w}\,\Phi(q_1, d_j) + 1 - \xi_{i,j,1}$$
$$\ldots \tag{13}$$
$$\forall (d_i, d_j) \in r^*_n : \vec{w}\,\Phi(q_n, d_i) \ge \vec{w}\,\Phi(q_n, d_j) + 1 - \xi_{i,j,n}$$
$$\forall i\, \forall j\, \forall k : \xi_{i,j,k} \ge 0 \tag{14}$$

C is a parameter that allows trading off margin size against training error. Geometrically, the margin is the distance between the closest two projections within all target rankings. This is illustrated in Figure 2.

Optimization Problem 1 is convex and has no local optima. By rearranging the constraints (13) as

$$\vec{w}\,\bigl(\Phi(q_k, d_i) - \Phi(q_k, d_j)\bigr) \ge 1 - \xi_{i,j,k}, \tag{15}$$

it becomes apparent that the optimization problem is equivalent to that of a classification SVM on pairwise difference vectors $\Phi(q_k, d_i) - \Phi(q_k, d_j)$. Due to this similarity, it can be solved using decomposition algorithms similar to those used for SVM classification. In the following, an adaptation of the SVMlight algorithm [14] is used for training (footnote 1). It can be shown that the learned retrieval function $f_{\vec{w}^*}$ can always be represented as a linear combination of the feature [...]

Footnote 1: Available at http://svmlight.joachims.org

Ranking A:
1. Kernel Machines
   http://svm.first.gmd.de/
2. SVM-Light Support Vector Machine
   http://ais.gmd.de/thorsten/svm_light/
3. Support Vector Machine and Kernel ... References
   http://svm.....com/SVMrefs.html
4. Lucent Technologies: SVM demo applet
   http://svm.....com/SVT/SVMsvt.html
5. Royal Holloway Support Vector Machine
   http://svm.dcs.rhbnc.ac.uk/
6. Support Vector Machine - The Software
   http://www.support-vector.net/software.html
7. Support Vector Machine - Tutorial
   http://www.support-vector.net/tutorial.html
8. Support Vector Machine
   http://jbolivar.freeservers.com/
Ranking B:
1. Kernel Machines
   http://svm.first.gmd.de/
2. Support Vector Machine
   http://jbolivar.freeservers.com/
3. An Introduction to Support Vector Machines
   http://www.support-vector.net/
4. Archives of SUPPORT-VECTOR-MACHINES ...
   http://www.jiscmail.ac.uk/lists/SUPPORT...
5. SVM-Light Support Vector Machine
   http://ais.gmd.de/thorsten/svm_light/
6. Support Vector Machine - The Software
   http://www.support-vector.net/software.html
7. Lagrangian Support Vector Machine Home Page
   http://www.cs.wisc.edu/dmi/lsvm
8. A Support ... - Bennett, Blue (ResearchIndex)
   http://citeseer.../bennett97support.html

Combined Results:
1. Kernel Machines
   http://svm.first.gmd.de/
2. Support Vector Machine
   http://jbolivar.freeservers.com/
3. SVM-Light Support Vector Machine
   http://ais.gmd.de/thorsten/svm_light/
4. An Introduction to Support Vector Machines
   http://www.support-vector.net/
5. Support Vector Machine and Kernel Methods References
   http://svm.research.bell-labs.com/SVMrefs.html
6. Archives of SUPPORT-VECTOR-MACHINES@JISCMAIL.AC.UK
   http://www.jiscmail.ac.uk/lists/SUPPORT-VECTOR-MACHINES.html
7. Lucent Technologies: SVM demo applet
   http://svm.research.bell-labs.com/SVT/SVMsvt.html
8. Royal Holloway Support Vector Machine
   http://svm.dcs.rhbnc.ac.uk/
9. Support Vector Machine - The Software
   http://www.support-vector.net/software.html
10. Lagrangian Support Vector Machine Home Page
    http://www.cs.wisc.edu/dmi/lsvm

Figure 3: Example for query "support vector machine". The two upper boxes show the rankings returned by retrieval functions A and B. The lower box contains the combined ranking presented to the user. The links the user clicked on are marked in bold.

[...] and B are presented in the same way) it is proven and empirically verified in [16] that the conclusions drawn from this method lead to the same result as an evaluation with explicit manual relevance judgments for large s.

5.2 Offline Experiment

This experiment verifies that the Ranking SVM can indeed learn regularities using partial feedback from clickthrough data. To generate a first training set, I used the Striver search engine for all of my own queries during October, 2001. Striver displayed the results of Google and MSNSearch using the combination method from the previous section. All clickthrough triplets were recorded. This resulted in 112 queries with a non-empty set of clicks. This data provides the basis for the following offline experiment.

To learn a retrieval function using the Ranking SVM, it is necessary to design a suitable feature mapping $\Phi(q, d)$ describing the match between a query q and a document d. The following features are used in the experiment. However, this set of features is likely to be far from optimal. While the attributes reflect some of my intuition about what could be important for learning a good ranking, I included only those features that were easy to implement. Furthermore, I did not do any feature selection or similar tuning, so that an appropriate design of features promises much room for improvement. The implemented features are the following:

1. Rank in other search engines (38 features total):
   rank_X: 100 minus rank in X ∈ {Google, MSN-Search, Altavista, Hotbot, Excite} divided by 100 (minimum 0)
   top1_X: ranked #1 in X ∈ {Google, MSN-Search, Altavista, Hotbot, Excite} (binary {0, 1})
   top10_X: ranked in top 10 in X ∈ {Google, MSN-Search, Altavista, Hotbot, Excite} (binary {0, 1})
   top50_X: ranked in top 50 in X ∈ {Google, MSN-Search, Altavista, Hotbot, Excite} (binary {0, 1})
   top1count_X: ranked #1 in X of the 5 search engines
   top10count_X: ranked in top 10 in X of the 5 search engines
   top50count_X: ranked in top 50 in X of the 5 search engines
2. Query/Content Match (3 features total):
   query_url_cosine: cosine between URL-words and query (range [0, 1])
   query_abstract_cosine: cosine between title-words and query (range [0, 1])
   domain_name_in_query: query contains domain-name from URL (binary {0, 1})
3. Popularity-Attributes (20,000 features total):
   url_length: length of URL in characters divided by 30
   country_X: country code X of URL (binary attribute {0, 1} for each country code)
   [...]

weight    feature
 0.60     query_abstract_cosine
 0.48     top10_google
 0.24     query_url_cosine
 0.24     top1count_1
 0.24     top10_msnsearch
 0.22     host_citeseer
 0.21     domain_nec
 0.19     top10count_3
 0.17     top1_google
 0.17     country_de
 ...
 0.16     abstract_contains_home
 0.16     top1_hotbot
 ...
 0.14     domain_name_in_query
 ...
-0.13     domain_tu-bs
-0.15     country_fi
-0.16     top50count_4
-0.17     url_length
-0.32     top10count_0
-0.38     top1count_0

Table 3: Features with largest and smallest weights as learned from the training data in the online experiment.

5.4 Analysis of the Learned Function

The previous result shows that the learned function improves retrieval. But what does the learned function look like? Is it reasonable and intuitive? Since the Ranking SVM learns a linear function, one can analyze the function by studying the learned weights. Table 3 displays the weights of some features, in particular, those with the highest absolute weights. Roughly speaking, a high positive (negative) weight indicates that documents with these features should be higher (lower) in the ranking.

The weights in Table 3 are reasonable for this group of users. Since many queries were for scientific material, it appears natural that URLs from the domain "citeseer" (and the alias "nec") received positive weight. The most influential weights are for the cosine match between query and abstract, whether the URL is in the top 10 from Google, and for the cosine match between query and the words in the URL. A document receives large negative weights, if it is not ranked top 1 by any search engine, if it is not in the top 10 of any search engine (note that the second implies the first), and if the URL is long. All these weights are reasonable and make sense intuitively.

6. DISCUSSION AND RELATED WORK

The experimental results show that the Ranking SVM can successfully learn an improved retrieval function from clickthrough data. Without any explicit feedback or manual parameter tuning, it has automatically adapted to the particular preferences of a group of 20 users. This improvement is not only a verification that the Ranking SVM can learn using partial ranking feedback, but also an argument for personalizing retrieval functions. Unlike conventional search engines that have to "fit" their retrieval function to large and therefore heterogeneous groups of users due to the cost of manual tuning, machine learning techniques can improve retrieval substantially by tailoring the retrieval function to small and homogeneous groups (or even individuals) without prohibitive costs.

While previous work on learning retrieval functions exists (e.g. [10]), most methods require explicit relevance judgments. Most closely related is the approach of Bartell et al. [2]. They present a mixture-of-experts algorithm for linearly combining ranking experts by maximizing a different rank correlation criterion. However, in their setup they rely on explicit relevance judgments. A similar algorithm for combining rankings was proposed by Cohen et al. [6]. They show empirically and theoretically that their algorithm finds a combination that performs close to the best of the basic experts. The boosting algorithm of Freund et al. [9] is an approach to combining many weak ranking rules into a strong ranking function. While they also (approximately) minimize the number of inversions, they do not explicitly consider a distribution over queries and target rankings. However, their algorithm can probably be adapted to the setting considered in this paper.

Algorithmically most closely related is the SVM approach to ordinal regression by Herbrich et al. [12]. But, again, they consider a different sampling model. In ordinal regression all objects interact and they are ranked on the same scale. For the ranking problem in information retrieval, rankings need to be consistent only within a query, but not between queries. This makes the ranking problem less constrained. For example, in the ranking problem two documents $d_i$ and $d_j$ can end up at very different ranks for two different queries $q_k$ and $q_l$ even if they have exactly the same feature vector (i.e. $\Phi(q_k, d_i) = \Phi(q_l, d_j)$). An elegant perceptron-like algorithm for ordinal regression was recently proposed by Crammer and Singer [8]. An interesting question is whether such an online algorithm can also be used to solve the optimization problem connected to the Ranking SVM.

Some attempts have been made to use implicit feedback by observing clicking behavior in retrieval systems [5] and browsing assistants [17][20]. However, the semantics of the learning process and its results are unclear as demonstrated in Section 2.2. The commercial search engine "Direct Hit" makes use of clickthrough data. The precise mechanism, however, is unpublished. While for a different problem, an interesting use of clickthrough data was proposed in [3]. They use clickthrough data for identifying related queries and URLs.

What are the computational demands of training the Ranking SVM on clickthrough data? Since SVMlight [15] solves the dual optimization problem, it depends only on inner products between feature vectors $\Phi(q, d)$. If these feature vectors are sparse as above, SVMlight can handle millions of features efficiently. Most influential on the training time is the number of constraints in Optimization Problem 2. However, when using clickthrough data, the number of constraints scales only linearly with the number of queries, if the number of clicks per query is upper bounded. In other applications, SVMlight has already shown that it can solve problems with several millions of constraints using a regular desktop computer. However, scaling to the order of magnitude found in major search engines is an interesting open problem.
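Each constraint counted in the cost discussion above corresponds to one pairwise difference vector from the rearranged constraint (15) in Section 4.1. The following is a minimal sketch of that reduction in Python, not the SVMlight adaptation used in the paper: it minimizes the same regularized hinge objective as Optimization Problem 1 by plain subgradient descent on the primal. The function names, the learning rate, the epoch count, and the two-dimensional toy feature vectors are all illustrative assumptions.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pairwise_differences(queries):
    """Build one difference vector Phi(q, d_i) - Phi(q, d_j) per preference.

    queries: list of (features, preferences) pairs, one per query, where
    features maps a document id to its feature vector and preferences is a
    list of (better, worse) document ids. Pairs are only formed within a
    query, never across queries.
    """
    diffs = []
    for features, preferences in queries:
        for better, worse in preferences:
            diffs.append([a - b for a, b in zip(features[better], features[worse])])
    return diffs


def train_ranking_svm(diffs, C=1.0, epochs=300, lr=0.01, seed=0):
    """Subgradient descent on 0.5 * w.w + C * sum(max(0, 1 - w.x)).

    Each x is a difference vector; the hinge term plays the role of the
    slack variable for that constraint. Hyperparameters are assumptions.
    """
    rng = random.Random(seed)
    w = [0.0] * len(diffs[0])
    order = list(range(len(diffs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for idx in order:
            x = diffs[idx]
            # Regularizer gradient, spread evenly over the constraints.
            grad = [wi / len(diffs) for wi in w]
            if dot(w, x) < 1.0:  # margin violated: hinge term is active
                grad = [g - C * xi for g, xi in zip(grad, x)]
            w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def rank(w, features):
    """Order document ids by decreasing score w . Phi(q, d)."""
    return sorted(features, key=lambda d: dot(w, features[d]), reverse=True)


# Hypothetical toy data: two queries, two features per document.
Q1 = {"a": [1.0, 0.2], "b": [0.5, 0.9], "c": [0.1, 0.5]}
Q2 = {"x": [0.8, 0.1], "y": [0.3, 0.3]}
QUERIES = [
    (Q1, [("a", "b"), ("a", "c"), ("b", "c")]),
    (Q2, [("x", "y")]),
]

DIFFS = pairwise_differences(QUERIES)
W = train_ranking_svm(DIFFS)
print(len(DIFFS), rank(W, Q1), rank(W, Q2))
```

In this toy setup the number of difference vectors equals the number of preference pairs, which is why bounding the clicks per query keeps the constraint count linear in the number of queries.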
[15] T. Joachims. Learning to Classify Text Using Support Vector Machines - Methods, Theory, and Algorithms. Kluwer, 2002.
[16] T. Joachims. Unbiased evaluation of retrieval quality using clickthrough data. Technical report, Cornell University, Department of Computer Science, 2002. http://www.joachims.org.
[17] T. Joachims, D. Freitag, and T. Mitchell. WebWatcher: a tour guide for the world wide web. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), volume 1, pages 770-777. Morgan Kaufmann, 1997.
[18] J. Kemeny and L. Snell. Mathematical Models in the Social Sciences. Ginn & Co, 1962.
[19] M. Kendall. Rank Correlation Methods. Hafner, 1955.
[20] H. Lieberman. Letizia: An agent that assists Web browsing. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (IJCAI '95), Montreal, Canada, 1995. Morgan Kaufmann.
[21] A. Mood, F. Graybill, and D. Boes. Introduction to the Theory of Statistics. McGraw-Hill, 3 edition, 1974.
[22] L. Page and S. Brin. Pagerank, an eigenvector based ranking approach for hypertext. In 21st Annual ACM/SIGIR International Conference on Research and Development in Information Retrieval, 1998.
[23] G. Salton and C. Buckley. Term weighting approaches in automatic text retrieval. Information Processing and Management, 24(5):513-523, 1988.
[24] C. Silverstein, M. Henzinger, H. Marais, and M. Moricz. Analysis of a very large altavista query log. Technical Report SRC 1998-014, Digital Systems Research Center, 1998.
[25] V. Vapnik. Statistical Learning Theory. Wiley, Chichester, GB, 1998.
[26] Y. Yao. Measuring retrieval effectiveness based on user preference of documents. Journal of the American Society for Information Science, 46(2):133-145, 1995.

APPENDIX

Theorem 1. Let $r_{rel}$ be the ranking placing all relevant documents ahead of all non-relevant documents and let $r_{sys}$ be the learned ranking. If Q is the number of discordant pairs between $r_{rel}$ and $r_{sys}$, then the average precision is at least

$$\mathrm{AvgPrec}(r_{sys}, r_{rel}) \ge \frac{1}{R} \left[ Q + \binom{R+1}{2} \right]^{-1} \left( \sum_{i=1}^{R} \sqrt{i} \right)^{2}$$

if there are R relevant documents.

Proof. If $p_1, \ldots, p_R$ are the ranks of the relevant documents in $r_{sys}$ sorted in increasing order, then Average Precision can be computed as

$$\mathrm{AvgPrec}(r_{sys}, r_{rel}) = \frac{1}{R} \sum_{i=1}^{R} \frac{i}{p_i}. \tag{24}$$

What is the minimum value of $\mathrm{AvgPrec}(r_{sys}, r_{rel})$, given that the number of discordant pairs is fixed? It is easy to see that the sum of the ranks $p_1 + \ldots + p_R$ is related to the number of discordant pairs Q as follows.

$$p_1 + \ldots + p_R = Q + \binom{R+1}{2} \tag{25}$$

It is now possible to write the lower bound as the following integer optimization problem. It computes the worst possible Average Precision for a fixed value of Q.

$$\text{minimize:} \quad P(p_1, \ldots, p_R) = \frac{1}{R} \sum_{i=1}^{R} \frac{i}{p_i} \tag{26}$$
$$\text{subject to:} \quad p_1 + \ldots + p_R = Q + \binom{R+1}{2} \tag{27}$$
$$1 \le p_1 < \ldots < p_R \tag{28}$$
$$p_1, \ldots, p_R \ \text{integer} \tag{29}$$

Relaxing the problem by removing the last two sets of constraints can only decrease the minimum, so that the solution without the constraints is still a lower bound. The remaining problem is convex and can be solved using Lagrange multipliers. The Lagrangian is

$$L(p_1, \ldots, p_R, \beta) = \frac{1}{R} \sum_{i=1}^{R} \frac{i}{p_i} + \beta \left[ \sum_{i=1}^{R} p_i - Q - \binom{R+1}{2} \right]. \tag{30}$$

At the minimum of the optimization problem, the Lagrangian is known to have partial derivatives equal to zero. Starting with the partial derivatives for the $p_i$

$$\frac{\partial L(p_1, \ldots, p_R, \beta)}{\partial p_i} = -i\,R^{-1} p_i^{-2} + \beta := 0, \tag{31}$$

solving for $p_i$, and substituting back into the Lagrangian leads to

$$L(p_1, \ldots, p_R, \beta) = 2 R^{-\frac{1}{2}} \beta^{\frac{1}{2}} \sum_{i=1}^{R} i^{\frac{1}{2}} - \beta \left[ Q + \binom{R+1}{2} \right]. \tag{32}$$

Now taking the derivative with respect to $\beta$

$$\frac{\partial L(p_1, \ldots, p_R, \beta)}{\partial \beta} = R^{-\frac{1}{2}} \beta^{-\frac{1}{2}} \sum_{i=1}^{R} i^{\frac{1}{2}} - Q - \binom{R+1}{2} := 0, \tag{33}$$

solving for $\beta$, and again substituting into the Lagrangian leads to the desired solution.
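The statement of Theorem 1 can be checked numerically on small cases. The sketch below, written in Python with hypothetical function names, enumerates every placement of R relevant documents among a small number of ranks, counts the discordant pairs directly, and confirms both the rank-sum identity (25) and the lower bound on Average Precision. It is an illustration under these assumptions, not part of the original proof.

```python
import math
from itertools import combinations


def average_precision(relevant_ranks):
    """Average Precision from the 1-based ranks of the relevant documents."""
    p = sorted(relevant_ranks)
    return sum((i + 1) / rank for i, rank in enumerate(p)) / len(p)


def discordant_pairs(relevant_ranks, n_docs):
    """Count pairs where a non-relevant document is ranked ahead of a relevant one."""
    relevant = set(relevant_ranks)
    non_relevant = [r for r in range(1, n_docs + 1) if r not in relevant]
    return sum(1 for p in relevant for q in non_relevant if q < p)


def avgprec_lower_bound(q, r):
    """Lower bound of Theorem 1: (sum of sqrt(i))^2 / (R * (Q + C(R+1, 2)))."""
    root_sum = sum(math.sqrt(i) for i in range(1, r + 1))
    return root_sum ** 2 / (r * (q + r * (r + 1) // 2))


def check_all(n_docs, r):
    """Return True if (25) and the bound hold for every placement of r relevant docs."""
    for ranks in combinations(range(1, n_docs + 1), r):
        q = discordant_pairs(ranks, n_docs)
        if sum(ranks) != q + r * (r + 1) // 2:  # identity (25)
            return False
        if average_precision(ranks) < avgprec_lower_bound(q, r) - 1e-12:
            return False
    return True


print(check_all(8, 3))  # prints True
```

The bound is not tight for integer ranks, since the relaxed optimum places the relevant documents at ranks proportional to the square roots of 1, ..., R.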