Welch, Junghoo Cho, and Walter Chang
mjwelch@cs.ucla.edu, cho@cs.ucla.edu, wachang@adobe.com

Abstract

With the proliferation of online distribution methods for videos, content owners require easier and more effective methods for monetization through advertis
UCLA Computer Science Department Technical Report #100025

The remainder of this paper is organized as follows. We discuss related work in Section 2 and provide an overview of our goals and approach in Section 3. Section 4 describes our system for extracting keywords from content-based text sources, and in Section 5 we present methods for expanding those terms to address vocabulary mismatch problems between the source keywords and those chosen by advertisers. We evaluate the keywords generated by these methods for various video types and sources of text in Section 6, and discuss our observations, conclusions, and future work in Section 7.

2 Related Work

Sponsored search, or advertising displayed alongside the search results of a user-supplied keyword query, typically involves a complex combination of advertisers bidding on keywords, review of advertisements for relevance, and an auction process to place ads alongside search results. See [2] for an overview of sponsored search. In display or content-match advertising, however, explicit keywords for the content are not provided. In online advertising it is important to display ads relevant to a page's content [24]. Without user-supplied keywords, researchers have investigated numerous keyword identification techniques and approaches to match advertisements with the content of Web pages.

Ontologies or taxonomies have been used in combination with feature identification for semantic approaches to matching advertisements with content [5, 7]. Ontologies are often domain-specific and tedious to construct, and the individual text elements from scripts or dialog are often terse, making the use of classification techniques or ontologies more error prone. Yih, Goodman, and Carvalho [25] establish several features of Web pages and query logs, such as frequency, textual characteristics (e.g. capitalization), and structural cues for identifying advertising keywords. Ribeiro-Neto et al. [18] propose strategies for matching the text of a Web page with text-based advertisements in a known ad inventory. They address the vocabulary impedance problem by representing a page with concepts from its nearest (most similar) neighbors. Ravi et al. [17] propose a two-phase generative model for identifying relevant advertising keywords for a given Web page. They use a popular machine translation method to learn a probabilistic set of keyword mappings from a training corpus of Web pages associated with ads and advertiser-chosen keywords. Term weights are assigned based on HTML features. A bigram language model trained on search queries is used to help rank the generated candidate keywords. Finding related but "less obvious" (and therefore less expensive) keywords from an advertiser's point of view [11, 1] has been addressed as well.

Our work differs from these problems in several ways. Our text sources are plain text which lack explicit structural cues such as HTML markup. We therefore must resort to statistical methods for ranking and selecting keywords from the source text. Several of the above techniques also require tagged training data, language and domain-specific ontologies, or pre-constructed pools of available advertisements. Our methods are language-independent and unsupervised, not requiring any training data.

Related term identification is a well researched problem in the information retrieval domain, where tasks such as query rewriting or expansion are widely studied. Voorhees [23] described the use of lexical relationships contained in WordNet for query expansion. Buckley et al. [6] noted that related terms will typically co-occur non-randomly in documents relevant to a query. More recently, Sahami and Heilman [19] use a similar notion to compute the semantic similarity of short text snippets using Web search results as an opaque context. Our work follows along these lines by using Web search results to discover related co-occurring terms. We also use the implicit semantic relationships captured in the hyperlinked structure of Wikipedia to identify related terms.

Hauptmann summarizes many "lessons learned" regarding speech recognition accuracy and the effects of word error rate on information retrieval precision [9]. In particular their research shows that the best systems achieve word error rates around 0.15 under ideal conditions (such as in-studio anchors for broadcast news), and that retrieval performance degrades relatively gracefully with respect to perfect text transcripts until word error rates approach 0.40. While their work on retrieval is orthogonal to our focus of advertising keyword selection, we consider these findings as we compare our results between perfect text transcripts (closed captioning tracks) and speech transcripts.

Keyword identification for multimedia often utilizes, in part, attributes extracted from images as part of a larger feature space for machine learning. Velivelli and Huang [22] predict tags for videos based on image features and speech transcripts. Using a collection of speech transcripts, they perform a PLSI-based clustering to form k topic themes. Each cluster i is used to generate a unigram language model, and a scene is assumed to be a mixture of these models and an underlying base model. Tags are then predicted using a combination of shot features and keyword co-occurrence based on a constructed training set.

Figure 1: Script Processing Workflow

this stage, our goal is to increase the likelihood of matching an advertiser's keywords while minimizing decline in relevancy of the ads when matches do occur. This related term mining process is described in Section 5. In both steps, keywords are identified and ranked without consulting an inventory of ads or advertiser supplied keywords.

4 Processing Source Text

In the first stage of processing, we analyze the format and complexities of video-based text sources, such as scripts, and describe methods of text analysis based on traditional statistical analysis and generative models. In this work we consider three sources of text data for a video:

Movie Script - a script or screenplay is a document that outlines all of the visual, audio, behavioral, and spoken elements required to tell a story. Since film production is a highly collaborative medium, the director, cast, editors, and production crew will use various forms of the script to interpret the underlying story during the production filming process. Numerous individuals are involved in the making of a film, therefore a script must conform to specific standards and conventions that all involved parties understand and thus will use a specific format with respect to the layout, margins, notation, and other production conventions. This document is intended to structure all of the script elements used in a screenplay.

Closed Captioning (CC) track - a document which contains a series of timecodes and text of the spoken dialog. Each timecode indicates when and for what duration the corresponding text appears on screen. Closed captioning tracks lack additional cues, such as visual information or indicators of the current speaker.

Speech-To-Text (STT) - is a process by which audio data containing dialog or narrative content is automatically converted to a text transcription. The output typically consists of a series of words, each with an associated timecode and duration. The source audio may be of poor quality or contain non-speech sounds such as music or sound effect artifacts, which generally contribute to transcription errors. Transcription quality is typically measured by the overall Word Error Rate (WER). A frequent goal of STT systems is to reduce the impact of a high WER, though error rates on heterogeneous content are typically quite high.

Figure 1 outlines the processing workflow for a complete movie script, which includes non-speech elements such as scene headings and action descriptions. We will describe each of these steps next. Note that the workflow is largely the same for closed captioning tracks and speech transcripts, which can be formatted to "look" like a screenplay. In those cases, script-specific processing steps are simply omitted.

4.1 Script Parsing

Television and movie scripts are frequently written in plain text and follow a conventional "screenplay" format which allows human readers to easily differentiate and infer the proper semantics for different script elements, such as dialog or scene headings. For example, scene headings are typically written on a single line in all capital letters, beginning with INT or EXT to denote whether the setting is interior or exterior, and ending with an indicator of time of day such as MORNING or NIGHT. Figure 2 shows a brief snippet of a typical script.

Understanding the semantics of a text element is helpful when processing it. For example, character names appear frequently in a script prior to each of their lines of dialog, though we generally find them to be a poor choice for advertising keywords. We add a machine-readable hierarchical structure and semantics to each text segment of a script using a finite state machine based parser derived from conventional screenplay writing rules. This is depicted as step (1) in Figure 1.

Figure 2: Example Script Snippet

Movie script documents are converted into a structured and tagged representation where all script elements (scene headings, action descriptions, dialog lines, etc.) are systematically extracted, tagged, and recorded as objects into a specialized document object model (DOM) for subsequent processing. All objects within the DOM (e.g., entire sentences tagged by their corresponding type and script section) are then processed using both statistical methods to identify keywords of interest, and a natural language processing (NLP) engine that identifies and tags the noun items identified in each sentence. These extracted and tagged noun elements are then combined with time-alignment information and recorded into a metadata repository. We describe this alignment process next.

4.2 Speech-to-Text (STT) Processing

STT transcripts contain timecode information that plays an important role in associating script keywords to specific points in time in the video content. In this section of the workflow, a video or audio file that contains spoken dialog that corresponds to the dialog sections of the input script is read and processed using a Speech-to-Text engine that generates a transcription of the spoken dialog, shown as (2) in Figure 1. For this process, we also perform an important optimization. Automatic speech recognition engines typically incorporate a known vocabulary and probabilistic models of speech (often based on word N-grams). When the dialog data is available from a script, we construct a custom language model to bias the transcription engine towards the expected vocabulary and word sequences, which helps to increase the transcription accuracy.

4.3 Script and STT Transcript Alignment

At this stage, we have tagged and structured script data (without any time information) from step (1), and a noisy, relatively inaccurate STT transcript with very precise timecode information from step (2). To make use of the keywords and concepts generated by the later processing steps, the script data must be time-aligned with the STT data. This is accomplished in step (3) by using the Levenshtein Word Edit Distance [12] algorithm to find the best word alignment between script dialog and the STT transcript. The result of this phase of processing is a time-aligned source script that can associate script action and dialog keywords with precise points in time within the video content. This data is stored into a metadata repository.

4.4 Statistical Generation of Keyword Terms

In the final step for a source text (script, CC, or transcript), the time-coded text elements from the metadata repository are used to build a suffix word N-gram tree that is pruned by N-gram term frequency to discover the most dominant terms, based in large part on the work of Chim and Deng [8]. This is shown as (5) in Figure 1. Before N-gram term generation, we performed a one-time process of selecting a stop word vocabulary specific to the domain of movie scripts. Using frequency statistics computed from a large corpus of scripts, we manually identified a set of stop words from the most frequently occurring terms. During N-gram term generation, the following steps are followed:

1. Corpus stop words are removed from the source text.
2. An N-gram term tree with sequences up to length N = 4 is created by collecting and counting N-gram occurrences from the script.
3. The resulting suffix tree is then pruned by traversing the tree to collect and rank the M most frequent terms. In our experiments, we select the most frequent M = 20 keywords.

4.5 Generative Models For Noisy Data

The statistical N-gram methods work well when keywords and phrases are repeated multiple times. While this is often the case for longer or well-formed text input, short or noisy text often results in the majority of (non-stopword) keywords only being mentioned once. With this type of input, statistical models are unable to decipher which keywords are most important. To better handle short or noisy text input, we use a keyword selection method based on generative topic modeling. In this model, we assume that a video comprises a small number of hidden topics, which can be represented as keyword probabilities, and that a video's text is generated from some distribution over those topics. The highly probable keywords in those topics are likely to be most representative of the video content. We use Latent Dirichlet Allocation (LDA) [4] to learn the topics and corresponding topic-keyword probability distribution from the input text. We then combine these topics to form a ranked keyword list.

4.5.1 Generating Topics

To discover the underlying topics in a video, we segment the input text into sentences and perform topic modeling with LDA. The resulting topic-term distribution is a K x V matrix, where K is the number of topics, V is the size of the input vocabulary, and entry [i][j] is the probability of keyword j in topic i. We form an ordered list of keywords $k_i$ for each topic i, sorted by their probability in row i of the matrix. This results in K ranked lists of keywords, one per topic, which must then be merged into a single list to select the top M. While simply selecting the top M/K keywords from each topic is one option, we describe a more general solution for merging multiple ranked lists when we discuss our approach to finding related keywords. This method is described in Section 5.3.

4.6 Statistical-Generative Hybrid Method

The LDA model learns keyword probabilities for terms which are separated by whitespace. When possible, however, it is preferable to identify multi-term keywords for advertising. For example, the phrase "relational database" is more specific than either of the individual words "relational" or "database", and thus has higher value to advertisers. To help identify these multi-token keywords in short or noisy text sources, we use a hybrid of statistical and generative techniques. We first process the source text using the N-gram method to identify any significant multi-token keywords. We then edit the source text by removing the whitespace between the terms of these multi-token phrases so they appear as a single token. This modified source text is then processed using the generative model.

4.7 Filtering the Keywords

We apply two filters, when possible, to remove frequently occurring words which are often not useful in the context of matching advertisements. From all input sources, keywords matching a list of English profanity are removed. We also find that main character names are often amongst the top keywords, but generally do not retrieve relevant advertisements. When given a complete script, we remove character names from the keyword list using a dictionary constructed during the parsing and tagging stage. For closed captioning and speech transcripts, however, these names are unknown and thus may still appear in the top keywords. This is more common for closed captioning than speech transcripts, however, as proper names are less likely to be correctly transcribed by the STT engine.

At this point the most dominant (possibly multi-term) keywords which occur in the source text, along with associated timecode information, have been identified and can be suggested as advertising keywords relevant to a particular time point of a video. As Ribeiro-Neto et al. [18] describe, however, the keywords chosen directly from a source and the keywords bid on by advertisers may suffer a vocabulary impedance problem. In the next section we describe novel term mining techniques which can provide a richer, more complete set of relevant advertising keywords.

Figure 3: Related Terms From Search Results

After these filtering steps, we construct a vector space model M for this small corpus of "documents" relevant to T. Based on the popular TF-IDF [20] term weighting, we compute the corpus frequency (CF) and inverse-document-frequency (IDF) weight for each term in M, and rank the keywords according to their CF*IDF score. This step is shown in the workflow as (3) in Figure 3, producing the final list of ranked related keywords from search results.

5.2 Mining with Wikipedia

The second data source we analyze for related terms is Wikipedia, an extensive knowledge base with over 3.1 million English articles available at the time of this writing. Whereas in the Web corpus we focused on search results in response to a query, with Wikipedia we direct our attention to hyperlinks. Within the text of a Wikipedia article, numerous inter-wiki links point to other Wikipedia pages, which allows us to model Wikipedia as a directed graph $G = \{V, E\}$. We construct the Wikipedia graph where nodes V represent pages in the main article namespace, and edges E denote inter-wiki links between those pages.

When building the graph, two article types in the main namespace are processed specially. For ambiguous terms such as "coach", a disambiguation page in Wikipedia lists the available articles for different senses of the term. These pages serve primarily as navigational aids for users, rather than conveying a semantic relationship between terms, and we therefore exclude them from the graph. The second category of pages we process specially is redirection pages, which provide a translation for alternate or misspelled words, inconsistent capitalization, acronyms, and so on, into a canonical form. In our Wikipedia graph, an article and all of the pages which redirect to it are merged into a single node.

We use the link structure of the graph to both identify and rank candidate related terms. These steps are described in detail in the following sections.

5.2.1 Identifying Candidate Related Terms

Without clearly defined directed links between individual terms in the Web corpus, the approaches using Web search results described above depend on the assumptions that documents retrieved by the search engine are relevant to the input terms, and that other tags or keywords for those pages are potentially related. That is, we rely on co-occurrence based measures to identify which terms are most likely related. With Wikipedia, however, we have an explicit link structure between articles which can be used as an indicator of relatedness. We require the relatedness between two article nodes a and b to be a symmetric relationship: a is related to b if and only if b is related to a.

Translating this requirement to the Wikipedia graph is relatively straightforward. To identify candidate related terms for term T, we first locate the Wikipedia page with T as the title (see footnote 3). Given the node t for T, we identify any nodes in the graph which form a direct cycle with t as candidate related terms. That is, we select the subset of nodes $N \subseteq V$ such that:

$$\forall n \in V,\; n \in N \implies \{t, n\} \in E \wedge \{n, t\} \in E \tag{1}$$

Figure 4 shows a simple example, where for term t, terms n1 and n2 are candidate related terms, but X and Y are not.

Footnote 3: We can relax this requirement and search the text of Wikipedia articles to identify the top page or pages for any particular input term, albeit at a likely reduction in quality of the generated related terms.

5.2.2 Ranking Candidate Terms

A given set of candidate related terms may be quite large. We now look at how to rank the candidate terms. To be a good suggestion as an advertising keyword, a term should be relatively popular. While we could measure popularity through external sources, such as query log frequency, we chose to utilize the graph structure of Wikipedia. We approximate the relative importance of terms by computing

SR(camera)           WP(camera)                 Combined             SR(advertising)     WP(advertising)              Combined
digital camera       photography                digital camera       product             internet                     internet
lens                 pornography                photography          marketing           newspaper                    product
canon                visual arts                canon                advertiser          video game                   marketing
nikon                photograph                 nikon                business            american football            newspaper
zoom                 digital camera             pornography          campaign            magazine                     advertiser
film camera          photojournalism            lens                 advertising agency  world wide web               magazine
digital slr          photographic film          digital photography  internet            marketing                    advertising agency
megapixels           aperture                   photograph           consumer            mtv                          public relations
digital photography  canon                      aperture             job                 blog                         google
compact              photographic lens          shutter speed        newspaper           public broadcasting service  billboard
camcorder            aerial photography         visual arts          agency              mass media                   video game
slr camera           holography                 exposure             public relations    google                       publicity
lense                single-lens reflex camera  viewfinder           company             brand                        product placement
digital slr camera   focal length               movie camera         service             broadcasting                 graphic design
olympus              nikon                      camera phone         budget              music video                  promotion

Table 2: Example Related Terms by Method

6.1 Evaluation Design

We identified the top 20 keywords from each available text source using both the statistical and hybrid approaches described in Section 4. For each of these keywords we use the related term mining techniques of Section 5 to identify the top 10 related terms. These keywords were then evaluated with a user survey. For the topic modeling phase of the hybrid technique, we set the number of topics K = 5, with the two LDA parameters set to 0.3 and 0.1.

Users were shown a video clip, typically around 3 minutes in length, and a set of keywords. To keep the size of the keyword set manageable, we show 5 of the top 20 keywords for each method from each available text source, and 1 of the top 10 related terms for each of those keywords, all chosen and ordered at random. Users were asked to make a binary assessment on the relevance of each displayed keyword. For the news and educational videos and the amateur clips available on YouTube, users are shown the complete video. For full length films, users are shown the theatrical trailer and asked to make judgements based on the trailer and their prior knowledge of the movie. Over 23 people participated in the survey (personally identifiable information was not required), with a minimum of 9 and an average of 13 users evaluating each video.

6.2 Evaluation Metrics

We evaluate the keywords generated by our methods using four metrics. The average relevancy of the keywords displayed to users we call the precision. Multiple users viewing the same set of keywords may not completely agree on which keywords are relevant. We therefore compute the potential of a source, which measures the fraction of the keywords judged relevant by at least one user. More formally, we define the precision and potential of text source S as:

$$\mathrm{Precision}(S) = \frac{1}{i} \sum_i \frac{|K_i(S) \cap R_i|}{|K_i(S)|}$$

$$\mathrm{Potential}(S) = \frac{|R(S)|}{|K(S)|}$$

$R_i$ is the set of keywords judged relevant in evaluation i and $K_i(S)$ are the keywords displayed to the user for evaluation i which come from source S. $K(S)$ are the keywords from source S displayed in at least one evaluation, and $R(S)$ are the keywords from source S judged relevant by at least one user, defined as:

$$R(S) = \bigcup_i K_i(S) \cap R_i$$

$$K(S) = \bigcup_i K_i(S)$$

The other metrics we define are appeal and popularity, which serve as indicators of how pertinent the keywords are to advertisers. Appeal estimates the likelihood that a keyword deemed relevant to the content will also be meaningful to an advertiser. Popularity measures the average number of advertisers interested in a relevant keyword. We define the appeal and popularity of a source S as:

$$\mathrm{Appeal}(S) = \frac{|R(S) \cap A|}{|R(S)|}$$

Video Type         Precision              Potential
                   Statistical  Hybrid    Statistical  Hybrid
Studio Films       0.268        0.252     0.479        0.480
News/Educational   0.442        0.473     0.548        0.717
User Generated     0.268        0.368     0.390        0.473

Table 4: Precision and Potential for STT

Video Type         WER     Statistical  Hybrid
Studio Films       0.857   0.723        0.690
News/Educational   0.406   0.731        0.961

Table 5: Relative Precision and Word Error Rate

Hauptmann's work indicates that speech-to-text word error rates under 0.4 result in retrieval performance comparable to a perfect transcript [9]. At the 0.4 threshold, relative retrieval precision is approximately 80%. We compute the average word error rate for studio films and news/educational videos (using the default "general" language models for our STT engine), and compare the relative precision of STT with respect to closed captioning for the statistical and hybrid methods, shown in Table 5. User generated videos are not included because no "correct" transcripts are available for the content.

As expected, the average word error rates for news and educational videos are substantially lower, though still around 0.4. For this type of content, the relative precision of STT is 96% of the closed captioning. For the higher word error rate of films we can still achieve over 70% average relative precision. These results further support use of the statistical selection methods on longer text inputs and the generative methods on shorter text, and suggest that speech transcripts alone may be sufficient to find meaningful advertising keywords for videos such as news broadcasts.

6.5 Precision and Potential of Related Terms

We next look at the precision and potential of the related terms. Table 6 shows the precision and potential scores for the top 10 related terms from both the statistical (S-Related) and hybrid (H-Related) methods. These results are mostly consistent with Table 3, with the most precise input source (closed captioning) producing the most relevant related keywords. For each method and source, the precision and potential of the source keywords are higher than those of the related terms.

In our experiments we randomly selected from the top N = 10 related terms for each source keyword. We now investigate how the average precision of the related terms is affected as we vary this range for $1 \le N \le 10$. Figure 5 plots the precision of the related keywords for each text source using the statistical selection method. For closed captioning the top 2 related terms give the highest precision, which is lower than the precision of the source terms but significantly higher (p = 0.003) than choosing from the top 10. Both script and speech transcript inputs show an increase in precision when selecting from the top 3-6 terms. While the precision is again lower than the source keywords, there is noticeable improvement between selecting from the top N = 6 and N = 10 for both script (p = 0.06) and STT (p = 0.03) input. This result suggests that the number of related terms to consider to achieve the maximum overall precision depends on the input text type, with higher precision input like closed captioning achieving its best precision with a smaller number of related terms than scripts or speech transcripts. Results for the hybrid selection method exhibit similar behavior.

Another factor to consider when evaluating the precision of the related keywords is the relevancy of the source term being expanded. An irrelevant source term is less likely to result in relevant related

Source   Precision               Potential
         S-Related  H-Related    S-Related  H-Related
Script   0.254      0.215        0.253      0.222
CC       0.260      0.221        0.262      0.221
STT      0.208      0.186        0.200      0.191

Table 6: Precision and Potential of Related Terms

Source   Statistical  S-Related  Hybrid  H-Related
Script   0.726        0.788      0.607   0.792
CC       0.578        0.785      0.543   0.796
STT      0.681        0.827      0.594   0.820

Table 7: Appeal of Keywords by Source

Source   Statistical  S-Related  Hybrid  H-Related
Script   3.59         3.96       3.00    4.18
CC       2.11         3.81       2.00    3.77
STT      2.54         4.39       2.56    4.30

Table 8: Popularity of Keywords by Source

For both appeal and popularity we notice that, while closed captioning was generally considered the most precise source of keywords, it also produces the least meaningful keywords for advertisers. This may be a result of character names appearing in the closed captioning keywords, which we noted earlier are filtered out from script input text and are less likely to retrieve relevant ads.

Finally, we look closer at the popularity of keywords for speech transcripts. Table 9 compares the popularity for source and related keywords for various video types. In all cases the related keywords have higher popularity than the source keywords by a statistically significant margin. It also shows that news and educational content contains less popular keywords for advertisers.

6.7 Precision-Popularity Tradeoffs

The results above demonstrate that, when relevant, related keywords are significantly more attractive to advertisers than source keywords. The overall precision of the related terms, however, is lower than that of the source terms. We explore the inherent tradeoff between keyword relevance and popularity by computing a precision-weighted popularity metric:

$$\mathrm{PWP}(S) = \frac{\sum_{k \in K(S)} A_k \cdot P(S, k)}{|K(S)|} \tag{4}$$

where $P(S, k)$ is the precision of keyword k from source S, defined as:

$$P(S, k) = \frac{\sum_i |\{k\} \cap R_i(S)|}{\sum_i |\{k\} \cap K_i(S)|}$$

Table 10 shows the precision-weighted popularity for the statistical method for each text source using the top 5 related keywords from each source keyword. The results suggest that for script input, the minor improvement in popularity of related keywords (shown in Table 8) may not offset the decrease in precision. For speech transcript input, however, there appears to be some benefit from related terms. We examine STT input further in Table 11, which shows that overall, even with the drop in precision, related keywords are beneficial to advertisers for news and user generated videos when only speech transcripts are available. Although the related keywords for studio film speech transcripts have higher popularity than source keywords (Table 9), the relative increase is noticeably lower than for CC or STT, and the resulting precision-weighted popularity does not offer improvement.

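As a rough illustration of how the precision-weighted popularity of Equation (4) could be computed, the following minimal Python sketch works from a list of survey evaluations. The data layout (one pair of sets per evaluation), the function names, and the toy keywords are assumptions made for this example and are not taken from the report. The sketch also assumes that $A_k$ is the number of advertisers interested in keyword k, which this excerpt does not define explicitly.

```python
from collections import defaultdict


def keyword_precision(evaluations):
    """P(S, k): how often k was judged relevant, divided by how often it was shown.

    `evaluations` is a list of (shown, relevant) pairs of sets, one pair per
    evaluation i, standing in for K_i(S) and R_i (hypothetical layout).
    """
    shown_count = defaultdict(int)
    relevant_count = defaultdict(int)
    for shown, relevant in evaluations:
        for k in shown:
            shown_count[k] += 1
            if k in relevant:
                relevant_count[k] += 1
    return {k: relevant_count[k] / shown_count[k] for k in shown_count}


def precision_weighted_popularity(evaluations, advertisers):
    """PWP(S): mean over all displayed keywords K(S) of A_k * P(S, k).

    `advertisers` maps a keyword to A_k (assumed here to be an advertiser
    count); keywords missing from the map are treated as having A_k = 0.
    """
    p = keyword_precision(evaluations)
    return sum(advertisers.get(k, 0) * p[k] for k in p) / len(p)


# Toy data, invented for illustration only.
evaluations = [
    ({"camera", "lens", "tripod"}, {"camera", "lens"}),
    ({"camera", "zoom"}, {"zoom"}),
]
advertisers = {"camera": 4, "lens": 2, "zoom": 1}

print(keyword_precision(evaluations)["camera"])                 # 0.5
print(precision_weighted_popularity(evaluations, advertisers))  # 1.25
```

In the toy data, "camera" is shown twice and judged relevant once, so its precision is 0.5 and it contributes 4 x 0.5 = 2 to the numerator. A popular keyword that users rarely judge relevant is therefore discounted, which is the tradeoff the metric is meant to capture.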
Source             Statistical  S-Related  Hybrid  H-Related
Studio Films       2.97         4.35       2.67    4.39
News/Educational   1.69         4.11       2.21    3.50
User Generated     1.89         4.83       2.63    4.75

Table 9: Popularity for Speech Transcripts

[4] D. M. Blei, A. Y. Ng, M. I. Jordan, and J. Lafferty. Latent dirichlet allocation. Journal of Machine Learning Research, 3, 2003.
[5] A. Broder, M. Fontoura, V. Josifovski, and L. Riedel. A semantic approach to contextual advertising. In SIGIR '07, pages 559-566, 2007.
[6] C. Buckley, G. Salton, J. Allan, and A. Singhal. Automatic query expansion using SMART: TREC 3. In Text REtrieval Conference, 1994.
[7] Y. Chen, G.-R. Xue, and Y. Yu. Advertising keyword suggestion based on concept hierarchy. In WSDM '08, pages 251-260, 2008.
[8] H. Chim and X. Deng. A new suffix tree similarity measure for document clustering. In WWW '07, pages 121-130, 2007.
[9] A. Hauptmann. Lessons for the future from a decade of informedia video analysis research. In CIVR '05, pages 1-10, 2005.
[10] G. Jeh and J. Widom. Simrank: a measure of structural-context similarity. In KDD '02, pages 538-543, 2002.
[11] A. Joshi and R. Motwani. Keyword generation for search engine advertising. In ICDMW '06, pages 490-496, 2006.
[12] V. Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. In Soviet Physics Doklady, 1966.
[13] H. Ma, H. Yang, I. King, and M. R. Lyu. Learning latent semantic relations from clickthrough data for query suggestion. In CIKM '08, pages 709-718, 2008.
[14] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, New York, NY, USA, 2008.
[15] E. Moxley, T. Mei, X.-S. Hua, W.-Y. Ma, and B. Manjunath. Automatic video annotation through search and mining. In ICME '08, pages 685-688, 2008.
[16] L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998.
[17] S. Ravi, A. Broder, E. Gabrilovich, V. Josifovski, S. Pandey, and B. Pang. Automatic generation of bid phrases for online advertising. In WSDM '10, pages 341-350, 2010.
[18] B. Ribeiro-Neto, M. Cristo, P. B. Golgher, and E. Silva de Moura. Impedance coupling in content-targeted advertising. In SIGIR '05, pages 496-503, 2005.
[19] M. Sahami and T. D. Heilman. A web-based kernel function for measuring the similarity of short text snippets. In WWW '06, pages 377-386, 2006.
[20] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. In Information Processing and Management, pages 513-523, 1988.
[21] S. Siersdorfer, J. San Pedro, and M. Sanderson. Automatic video tagging using content redundancy. In SIGIR '09, pages 395-402, 2009.
[22] A. Velivelli and T. S. Huang. Automatic video annotation by mining speech transcripts. In CVPRW '06, page 115, 2006.
[23] E. M. Voorhees. Query expansion using lexical-semantic relations. In SIGIR '94, pages 61-69, 1994.
[24] C. Wang, P. Zhang, R. Choi, and M. D. Eredita. Understanding consumers attitude toward advertising. In Eighth Americas Conf. on Information System, pages 1143-1148, 2002.
[25] W. T. Yih, J. Goodman, and V. R. Carvalho. Finding advertising keywords on web pages. In WWW '06, pages 213-222, 2006.
[26] S. Zanetti, L. Zelnik-Manor, and P. Perona. A walk through the web's video clips. CVPRW '08, pages 1-8, 2008.