UCLA Computer Science Department Technical Report #100025

Michael J. Welch, Junghoo Cho, and Walter Chang
mjwelch@cs.ucla.edu, cho@cs.ucla.edu, wachang@adobe.com

Abstract. With the proliferation of online distribution methods for videos, content owners require easier and more effective methods for monetization through advertising ...


The remainder of this paper is organized as follows. We discuss related work in Section 2 and provide an overview of our goals and approach in Section 3. Section 4 describes our system for extracting keywords from content-based text sources, and in Section 5 we present methods for expanding those terms to address vocabulary mismatch problems between the source keywords and those chosen by advertisers. We evaluate the keywords generated by these methods for various video types and sources of text in Section 6, and discuss our observations, conclusions, and future work in Section 7.

2 Related Work

Sponsored search, or advertising displayed alongside the search results of a user-supplied keyword query, typically involves a complex combination of advertisers bidding on keywords, review of advertisements for relevance, and an auction process to place ads alongside search results. See [2] for an overview of sponsored search. In display or content-match advertising, however, explicit keywords for the content are not provided. In online advertising it is important to display ads relevant to a page's content [24]. Without user-supplied keywords, researchers have investigated numerous keyword identification techniques and approaches to match advertisements with the content of Web pages. Ontologies or taxonomies have been used in combination with feature identification for semantic approaches to matching advertisements with content [5, 7]. Ontologies are often domain-specific and tedious to construct, and the individual text elements from scripts or dialog are often terse, making the use of classification techniques or ontologies more error prone. Yih, Goodman, and Carvalho [25] establish several features of Web pages and query logs, such as frequency, textual characteristics (e.g., capitalization), and structural cues for identifying advertising keywords. Ribeiro-Neto et al. [18] propose strategies for matching the text of a Web page with text-based advertisements in a known ad inventory. They address the vocabulary impedance problem by representing a page with concepts from its nearest (most similar) neighbors. Ravi et al. [17] propose a two-phase generative model for identifying relevant advertising keywords for a given Web page. They use a popular machine translation method to learn a probabilistic set of keyword mappings from a training corpus of Web pages associated with ads and advertiser-chosen keywords. Term weights are assigned based on HTML features. A bigram language model trained on search queries is used to help rank the generated candidate keywords. Finding related but "less obvious" (and therefore less expensive) keywords from an advertiser's point of view [11, 1] has been addressed as well.

Our work differs from these efforts in several ways. Our text sources are plain text, which lacks explicit structural cues such as HTML markup. We therefore must resort to statistical methods for ranking and selecting keywords from the source text. Several of the above techniques also require tagged training data, language- and domain-specific ontologies, or pre-constructed pools of available advertisements. Our methods are language-independent and unsupervised, not requiring any training data.

Related term identification is a well researched problem in the information retrieval domain, where tasks such as query rewriting or expansion are widely studied. Voorhees [23] described the use of lexical relationships contained in WordNet for query expansion. Buckley et al. [6] noted that related terms will typically co-occur non-randomly in documents relevant to a query. More recently, Sahami and Heilman [19] use a similar notion to compute the semantic similarity of short text snippets using Web search results as an opaque context. Our work follows along these lines by using Web search results to discover related co-occurring terms. We also use the implicit semantic relationships captured in the hyperlinked structure of Wikipedia to identify related terms.

Hauptmann summarizes many "lessons learned" regarding speech recognition accuracy and the effects of word error rate on information retrieval precision [9]. In particular, their research shows that the best systems achieve word error rates around 0.15 under ideal conditions (such as in-studio anchors for broadcast news), and that retrieval performance degrades relatively gracefully with respect to perfect text transcripts until word error rates approach 0.40. While their work on retrieval is orthogonal to our focus of advertising keyword selection, we consider these findings as we compare our results between perfect text transcripts (closed captioning tracks) and speech transcripts.

Keyword identification for multimedia often utilizes, in part, attributes extracted from images as part of a larger feature space for machine learning. Velivelli and Huang [22] predict tags for videos based on image features and speech transcripts. Using a collection of speech transcripts, they perform a PLSI-based clustering to form k topic themes. Each cluster is used to generate a unigram language model, and a scene is assumed to be a mixture of these models and an underlying base model. Tags are then predicted using a combination of shot features and keyword co-occurrence based on a constructed training set.

In this stage, our goal is to increase the likelihood of matching an advertiser's keywords while minimizing decline in relevancy of the ads when matches do occur. This related term mining process is described in Section 5. In both steps, keywords are identified and ranked without consulting an inventory of ads or advertiser-supplied keywords.

Figure 1: Script Processing Workflow

4 Processing Source Text

In the first stage of processing, we analyze the format and complexities of video-based text sources, such as scripts, and describe methods of text analysis based on traditional statistical analysis and generative models. In this work we consider three sources of text data for a video:

Movie Script - a script or screenplay is a document that outlines all of the visual, audio, behavioral, and spoken elements required to tell a story. Since film production is a highly collaborative medium, the director, cast, editors, and production crew will use various forms of the script to interpret the underlying story during the production filming process. Because numerous individuals are involved in the making of a film, a script must conform to specific standards and conventions that all involved parties understand, and thus will use a specific format with respect to the layout, margins, notation, and other production conventions. This document is intended to structure all of the script elements used in a screenplay.

Closed Captioning (CC) track - a document which contains a series of time codes and the text of the spoken dialog. Each time code indicates when and for what duration the corresponding text appears on screen. Closed captioning tracks lack additional cues, such as visual information or indicators of the current speaker.

Speech-To-Text (STT) - a process by which audio data containing dialog or narrative content is automatically converted to a text transcription. The output typically consists of a series of words, each with an associated time code and duration. The source audio may be of poor quality or contain non-speech sounds such as music or sound effect artifacts, which generally contribute to transcription errors. Transcription quality is typically measured by the overall Word Error Rate (WER). A frequent goal of STT systems is to reduce the impact of a high WER, though error rates on heterogeneous content are typically quite high.

Figure 1 outlines the processing workflow for a complete movie script, which includes non-speech elements such as scene headings and action descriptions. We will describe each of these steps next. Note that the workflow is largely the same for closed captioning tracks and speech transcripts, which can be formatted to "look" like a screenplay. In those cases, script-specific processing steps are simply omitted.

4.1 Script Parsing

Television and movie scripts are frequently written in plain text and follow a conventional "screenplay" format which allows human readers to easily differentiate and infer the proper semantics for different script elements, such as dialog or scene headings. For example, scene headings are typically written on a single line in all capital letters, beginning with INT or EXT to denote whether the setting is interior or exterior, and ending with an indicator of time of day such as MORNING or NIGHT. Figure 2 shows a brief snippet of a typical script.

Figure 2: Example Script Snippet

Understanding the semantics of a text element is helpful when processing it. For example, character names appear frequently in a script prior to each of their lines of dialog, though we generally find them to be a poor choice for advertising keywords. We add a machine-readable hierarchical structure and semantics to each text segment of a script using a finite state machine based parser derived from conventional screenplay writing rules. This is depicted as step (1) in Figure 1. Movie script documents are converted into a structured and tagged representation where all script elements (scene headings, action descriptions, dialog lines, etc.) are systematically extracted, tagged, and recorded as objects into a specialized document object model (DOM) for subsequent processing. All objects within the DOM (e.g., entire sentences tagged by their corresponding type and script section) are then processed using both statistical methods to identify keywords of interest, and a natural language processing (NLP) engine that identifies and tags the noun items identified in each sentence. These extracted and tagged noun elements are then combined with time-alignment information and recorded into a metadata repository. We describe this alignment process next.
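To make the screenplay conventions above concrete, the sketch below tags script lines with coarse element types using a small state machine over per-line patterns. It is only an illustration of the idea, not the parser used in this work: the regular expressions, the element names, and the assumption that a blank line closes a dialog block are simplifications introduced for the example.

    import re

    # Illustrative element tags; the actual DOM schema used in this work is richer.
    SCENE, CHARACTER, DIALOG, ACTION = "scene_heading", "character", "dialog", "action"

    # Scene headings: all caps, start with INT/EXT, end with a time-of-day marker.
    SCENE_RE = re.compile(r"^(INT|EXT)[.\s].*\b(DAY|NIGHT|MORNING|EVENING)\s*$")
    # Character cues: a short all-caps line preceding a dialog block.
    CHARACTER_RE = re.compile(r"^[A-Z][A-Z0-9 .'-]+$")

    def tag_script_lines(lines):
        """Assign a coarse semantic tag to each non-blank screenplay line."""
        tagged, in_dialog = [], False
        for raw in lines:
            line = raw.strip()
            if not line:                      # a blank line ends a dialog block
                in_dialog = False
                continue
            if SCENE_RE.match(line):
                tagged.append((SCENE, line))
                in_dialog = False
            elif CHARACTER_RE.match(line) and not in_dialog:
                tagged.append((CHARACTER, line))
                in_dialog = True              # following lines are this character's dialog
            elif in_dialog:
                tagged.append((DIALOG, line))
            else:
                tagged.append((ACTION, line))
        return tagged

The tagged segments could then be loaded into a DOM-like structure and handed to an NLP step for noun extraction, mirroring the pipeline described above.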
4.2 Speech-to-Text (STT) Processing

STT transcripts contain time code information that plays an important role in associating script keywords to specific points in time in the video content. In this section of the workflow, a video or audio file that contains spoken dialog corresponding to the dialog sections of the input script is read and processed using a Speech-to-Text engine that generates a transcription of the spoken dialog, shown as (2) in Figure 1. For this process, we also perform an important optimization. Automatic speech recognition engines typically incorporate a known vocabulary and probabilistic models of speech (often based on word N-grams). When the dialog data is available from a script, we construct a custom language model to bias the transcription engine towards the expected vocabulary and word sequences, which helps to increase the transcription accuracy.

4.3 Script and STT Transcript Alignment

At this stage, we have tagged and structured script data (without any time information) from step (1), and a noisy, relatively inaccurate STT transcript with very precise time code information from step (2). To make use of the keywords and concepts generated by the later processing steps, the script data must be time-aligned with the STT data. This is accomplished in step (3) by using the Levenshtein Word Edit Distance [12] algorithm to find the best word alignment between script dialog and the STT transcript. The result of this phase of processing is a time-aligned source script that can associate script action and dialog keywords with precise points in time within the video content. This data is stored into a metadata repository.
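The word-level alignment in step (3) can be sketched with standard dynamic programming, as below. This is an illustrative version with unit edit costs and case-insensitive exact matching, which the report does not necessarily use; the returned index pairs are the positions where script and STT words coincide, and are where STT time codes would be copied onto the script.

    def align_words(script_words, stt_words):
        """Word-level Levenshtein alignment between script dialog and an STT transcript.

        Returns (script_index, stt_index) pairs for words that match exactly in an
        optimal alignment, so their STT time codes can be attached to the script.
        """
        n, m = len(script_words), len(stt_words)
        # dp[i][j] = edit distance between the first i script words and first j STT words
        dp = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            dp[i][0] = i
        for j in range(m + 1):
            dp[0][j] = j
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = 0 if script_words[i - 1].lower() == stt_words[j - 1].lower() else 1
                dp[i][j] = min(dp[i - 1][j] + 1,         # drop a script word
                               dp[i][j - 1] + 1,         # skip an STT word
                               dp[i - 1][j - 1] + cost)  # match or substitute
        # Backtrace to recover the matched word pairs.
        pairs, i, j = [], n, m
        while i > 0 and j > 0:
            cost = 0 if script_words[i - 1].lower() == stt_words[j - 1].lower() else 1
            if dp[i][j] == dp[i - 1][j - 1] + cost:
                if cost == 0:
                    pairs.append((i - 1, j - 1))
                i, j = i - 1, j - 1
            elif dp[i][j] == dp[i - 1][j] + 1:
                i -= 1
            else:
                j -= 1
        return list(reversed(pairs))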
4.4 Statistical Generation of Keyword Terms

In the final step for a source text (script, CC, or transcript), the time-coded text elements from the metadata repository are used to build a suffix word N-gram tree that is pruned by N-gram term frequency to discover the most dominant terms, based in large part on the work of Chim and Deng [8]. This is shown as (5) in Figure 1. Before N-gram term generation, we performed a one-time process of selecting a stopword vocabulary specific to the domain of movie scripts. Using frequency statistics computed from a large corpus of scripts, we manually identified a set of stopwords from the most frequently occurring terms. During N-gram term generation, the following steps are followed:

1. Corpus stopwords are removed from the source text.
2. An N-gram term tree with sequences up to length N=4 is created by collecting and counting N-gram occurrences from the script.
3. The resulting suffix tree is then pruned by traversing the tree to collect and rank the topmost M frequent terms. In our experiments, we select the most frequent M=20 keywords.
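A compact rendering of these three steps is given below. It counts N-grams in a flat hash map rather than building the pruned suffix N-gram tree of Chim and Deng [8], and the tokenization, lowercasing, and stopword handling are assumptions made for the example.

    from collections import Counter

    def top_ngram_keywords(sentences, stopwords, max_n=4, top_m=20):
        """Rank word N-grams (N = 1..max_n) by frequency after stopword removal."""
        counts = Counter()
        for sentence in sentences:
            # Step 1: remove corpus stopwords from the source text.
            words = [w.lower() for w in sentence.split() if w.lower() not in stopwords]
            # Step 2: collect and count N-gram occurrences up to length max_n.
            for n in range(1, max_n + 1):
                for i in range(len(words) - n + 1):
                    counts[" ".join(words[i:i + n])] += 1
        # Step 3: keep the top_m most frequent terms.
        return [term for term, _ in counts.most_common(top_m)]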
4.5 Generative Models For Noisy Data

The statistical N-gram methods work well when keywords and phrases are repeated multiple times. While this is often the case for longer or well-formed text input, short or noisy text often results in the majority of (non-stopword) keywords only being mentioned once. With this type of input, statistical models are unable to decipher which keywords are most important. To better handle short or noisy text input, we use a keyword selection method based on generative topic modeling. In this model, we assume that a video comprises a small number of hidden topics, which can be represented as keyword probabilities, and that a video's text is generated from some distribution over those topics. The highly probable keywords in those topics are likely to be most representative of the video content. We use Latent Dirichlet Allocation (LDA) [4] to learn the topics and corresponding topic-keyword probability distribution from the input text. We then combine these topics to form a ranked keyword list.

4.5.1 Generating Topics

To discover the underlying topics in a video, we segment the input text into sentences and perform topic modeling with LDA. The resulting topic-term distribution is a K x V matrix, where K is the number of topics, V is the size of the input vocabulary, and entry (i, j) is the probability of keyword j in topic i. We form an ordered list of keywords for each topic, sorted by their probability in row i of the matrix. This results in K ranked lists of keywords, one per topic, which must then be merged into a single list to select the top M. While simply selecting the top M/K keywords from each topic is one option, we describe a more general solution for merging multiple ranked lists when we discuss our approach to finding related keywords. This method is described in Section 5.3.
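The topic-generation step might be sketched as follows, treating each sentence as a document and normalizing the fitted topic-word weights into per-topic keyword rankings. scikit-learn is used here purely for illustration (the report does not name its LDA implementation), and the defaults echo the K=5, alpha=0.3, beta=0.1 settings reported later in Section 6.1.

    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    def topic_keyword_lists(sentences, k_topics=5, alpha=0.3, beta=0.1, per_topic=10):
        """Fit LDA over sentence 'documents'; return one ranked keyword list per topic."""
        vectorizer = CountVectorizer(stop_words="english")
        counts = vectorizer.fit_transform(sentences)
        lda = LatentDirichletAllocation(n_components=k_topics,
                                        doc_topic_prior=alpha,
                                        topic_word_prior=beta,
                                        random_state=0)
        lda.fit(counts)
        vocab = vectorizer.get_feature_names_out()
        # Normalize each row of components_ into a topic-keyword probability distribution.
        topic_term = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
        ranked = []
        for row in topic_term:
            order = np.argsort(row)[::-1][:per_topic]
            ranked.append([(vocab[i], float(row[i])) for i in order])
        return ranked

The K per-topic lists returned here still have to be merged into a single top-M list, either by taking the top M/K keywords from each topic or by the more general ranked-list merging referenced above.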
4.6 Statistical-Generative Hybrid Method

The LDA model learns keyword probabilities for terms which are separated by whitespace. When possible, however, it is preferable to identify multi-term keywords for advertising. For example, the phrase "relational database" is more specific than either of the individual words "relational" or "database", and thus has higher value to advertisers. To help identify these multi-token keywords in short or noisy text sources, we use a hybrid of statistical and generative techniques. We first process the source text using the N-gram method to identify any significant multi-token keywords. We then edit the source text by removing the whitespace between the terms of these multi-token phrases so they appear as a single token. This modified source text is then processed using the generative model.

4.7 Filtering the Keywords

We apply two filters, when possible, to remove frequently occurring words which are often not useful in the context of matching advertisements. From all input sources, keywords matching a list of English profanity are removed. We also find that main character names are often amongst the top keywords, but generally do not retrieve relevant advertisements. When given a complete script, we remove character names from the keyword list using a dictionary constructed during the parsing and tagging stage. For closed captioning and speech transcripts, however, these names are unknown and thus may still appear in the top keywords. This is more common for closed captioning than speech transcripts, however, as proper names are less likely to be correctly transcribed by the STT engine.

At this point the most dominant (possibly multi-term) keywords which occur in the source text, along with associated time code information, have been identified and can be suggested as advertising keywords relevant to a particular time point of a video. As Ribeiro-Neto et al. [18] describe, however, the keywords chosen directly from a source and the keywords bid on by advertisers may suffer a vocabulary impedance problem. In the next section we describe novel term mining techniques which can provide a richer, more complete set of relevant advertising keywords.

Figure 3: Related Terms From Search Results

After these filtering steps, we construct a vector space model M for this small corpus of "documents" relevant to T. Based on the popular TF-IDF [20] term weighting, we compute the corpus frequency (CF) and inverse-document-frequency (IDF) weight for each term in M, and rank the keywords according to their CF*IDF score. This step is shown in the workflow as (3) in Figure 3, producing the final list of ranked related keywords from search results.
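The CF*IDF ranking over the retrieved result "documents" can be sketched as below. The corpus here is the small set of documents returned for the input term T, and the log-form IDF is an assumption, since the report names only the CF*IDF score itself.

    import math
    from collections import Counter

    def rank_by_cf_idf(docs):
        """Rank terms from a small corpus of search-result documents by CF * IDF.

        docs is a list of token lists, one per retrieved document. CF is a term's
        total frequency across the corpus and IDF is log(N / df), where df is the
        number of documents containing the term.
        """
        n_docs = len(docs)
        cf, df = Counter(), Counter()
        for tokens in docs:
            cf.update(tokens)
            df.update(set(tokens))
        scores = {t: cf[t] * math.log(n_docs / df[t]) for t in cf}
        return sorted(scores, key=scores.get, reverse=True)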
5.2 Mining with Wikipedia

The second data source we analyze for related terms is Wikipedia, an extensive knowledge base with over 3.1 million English articles available at the time of this writing. Whereas in the Web corpus we focused on search results in response to a query, with Wikipedia we direct our attention to hyperlinks. Within the text of a Wikipedia article, numerous inter-wiki links point to other Wikipedia pages, which allows us to model Wikipedia as a directed graph G = {V, E}. We construct the Wikipedia graph where nodes V represent pages in the main article namespace, and edges E denote inter-wiki links between those pages. When building the graph, two article types in the main namespace are processed specially. For ambiguous terms such as "coach", a disambiguation page in Wikipedia lists the available articles for different senses of the term. These pages serve primarily as navigational aids for users, rather than conveying a semantic relationship between terms, and we therefore exclude them from the graph. The second category of pages we process specially are redirection pages, which provide a translation for alternate or misspelled words, inconsistent capitalization, acronyms, and so on, into a canonical form. In our Wikipedia graph, an article and all of the pages which redirect to it are merged into a single node. We use the link structure of the graph to both identify and rank candidate related terms. These steps are described in detail in the following sections.

5.2.1 Identifying Candidate Related Terms

Without clearly defined directed links between individual terms in the Web corpus, the approaches using Web search results described above depend on the assumptions that documents retrieved by the search engine are relevant to the input terms, and that other tags or keywords for those pages are potentially related. That is, we rely on co-occurrence based measures to identify which terms are most likely related. With Wikipedia, however, we have an explicit link structure between articles which can be used as an indicator of relatedness. We require the relatedness between two article nodes a and b to be a symmetric relationship: a is related to b if and only if b is related to a.

Translating this requirement to the Wikipedia graph is relatively straightforward. To identify candidate related terms for term T, we first locate the Wikipedia page with T as the title. (We can relax this requirement and search the text of Wikipedia articles to identify the top page or pages for any particular input term, albeit at a likely reduction in quality of the generated related terms.) Given the node t for T, we identify any nodes in the graph which form a direct cycle with t as candidate related terms. That is, we select the subset of nodes N ⊆ V such that:

    ∀ n ∈ V : n ∈ N ⟹ {t, n} ∈ E ∧ {n, t} ∈ E        (1)

Figure 4 shows a simple example, where for term t, terms n1 and n2 are candidate related terms, but X and Y are not.
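With the graph held as an adjacency map, the candidate set of Equation (1) reduces to a reciprocal-link check, as in the short sketch below; the dictionary-of-sets representation is an assumption made for the example.

    def candidate_related_terms(out_links, t):
        """Nodes forming a direct cycle with node t, per Equation (1).

        out_links maps each article node to the set of nodes it links to, i.e. the
        directed Wikipedia graph after merging redirects and dropping disambiguation
        pages. A node n is a candidate when t links to n and n links back to t.
        """
        return {n for n in out_links.get(t, set())
                if t in out_links.get(n, set())}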
5.2.2 Ranking Candidate Terms

A given set of candidate related terms may be quite large. We now look at how to rank the candidate terms. To be a good suggestion as an advertising keyword, a term should be relatively popular. While we could measure popularity through external sources, such as query log frequency, we chose to utilize the graph structure of Wikipedia. We approximate the relative importance of terms by computing

SR(camera)           WP(camera)                 Combined              SR(advertising)      WP(advertising)              Combined
digital camera       photography                digital camera        product              internet                     internet
lens                 pornography                photography           marketing            newspaper                    product
canon                visual arts                canon                 advertiser           video game                   marketing
nikon                photograph                 nikon                 business             american football            newspaper
zoom                 digital camera             pornography           campaign             magazine                     advertiser
film camera          photojournalism            lens                  advertising agency   world wide web               magazine
digital slr          photographic film          digital photography   internet             marketing                    advertising agency
megapixels           aperture                   photograph            consumer             mtv                          public relations
digital photography  canon                      aperture              job                  blog                         google
compact              photographic lens          shutter speed         newspaper            public broadcasting service  billboard
camcorder            aerial photography         visual arts           agency               mass media                   video game
slr camera           holography                 exposure              public relations     google                       publicity
lense                single-lens reflex camera  viewfinder            company              brand                        product placement
digital slr camera   focal length               movie camera          service              broadcasting                 graphic design
olympus              nikon                      camera phone          budget               music video                  promotion

Table 2: Example Related Terms by Method

6.1 Evaluation Design

We identified the top 20 keywords from each available text source using both the statistical and hybrid approaches described in Section 4. For each of these keywords we use the related term mining techniques of Section 5 to identify the top 10 related terms. These keywords were then evaluated with a user survey. For the topic modeling phase of the hybrid technique, we set the number of topics K=5 with the LDA parameters α=0.3 and β=0.1.

Users were shown a video clip, typically around 3 minutes in length, and a set of keywords. To keep the size of the keyword set manageable, we show 5 of the top 20 keywords for each method from each available text source, and 1 of the top 10 related terms for each of those keywords, all chosen and ordered at random. Users were asked to make a binary assessment on the relevance of each displayed keyword. For the news and educational videos and the amateur clips available on YouTube, users are shown the complete video. For full length films, users are shown the theatrical trailer and asked to make judgements based on the trailer and their prior knowledge of the movie. Over 23 people participated in the survey (personally identifiable information was not required), with a minimum of 9 and an average of 13 users evaluating each video.

6.2 Evaluation Metrics

We evaluate the keywords generated by our methods using four metrics. We refer to the average relevancy of the keywords displayed to users as the precision. Multiple users viewing the same set of keywords may not completely agree on which keywords are relevant. We therefore compute the potential of a source, which measures the fraction of the keywords judged relevant by at least one user. More formally, we define the precision and potential of text source S as:

    Precision(S) = (1/i) Σ_i |K_i(S) ∩ R_i| / |K_i(S)|

    Potential(S) = |R(S)| / |K(S)|

R_i is the set of keywords judged relevant in evaluation i and K_i(S) are the keywords displayed to the user for evaluation i which come from source S. K(S) are the keywords from source S displayed in at least one evaluation, and R(S) are the keywords from source S judged relevant by at least one user, defined as:

    R(S) = ∪_i (K_i(S) ∩ R_i)

    K(S) = ∪_i K_i(S)

The other metrics we define are appeal and popularity, which serve as indicators of how pertinent the keywords are to advertisers. Appeal estimates the likelihood that a keyword deemed relevant to the content will also be meaningful to an advertiser. Popularity measures the average number of advertisers interested in a relevant keyword. We define the appeal and popularity of a source S as:

    Appeal(S) = |R(S) ∩ A| / |R(S)|
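For concreteness, the per-source precision, potential, and appeal could be computed from the raw survey judgments as in the sketch below, assuming each evaluation is stored as the set of keywords from source S that were displayed together with the subset judged relevant; the advertiser keyword set A is taken as given.

    def source_metrics(evaluations, advertiser_terms):
        """Precision, Potential, and Appeal for one text source S.

        evaluations: list of (displayed, relevant) set pairs, i.e. K_i(S) and the
        judged-relevant keywords R_i restricted to what was shown.
        advertiser_terms: the set A of keywords with advertiser interest.
        """
        shown_evals = [(k, r) for k, r in evaluations if k]
        precision = (sum(len(k & r) / len(k) for k, r in shown_evals)
                     / len(shown_evals)) if shown_evals else 0.0
        k_s = set().union(*(k for k, _ in evaluations))        # K(S)
        r_s = set().union(*(k & r for k, r in evaluations))    # R(S)
        potential = len(r_s) / len(k_s) if k_s else 0.0
        appeal = len(r_s & advertiser_terms) / len(r_s) if r_s else 0.0
        return precision, potential, appeal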
Video Type         Precision               Potential
                   Statistical   Hybrid    Statistical   Hybrid
Studio Films       0.268         0.252     0.479         0.480
News/Educational   0.442         0.473     0.548         0.717
User Generated     0.268         0.368     0.390         0.473

Table 4: Precision and Potential for STT

Video Type         WER     Statistical   Hybrid
Studio Films       0.857   0.723         0.690
News/Educational   0.406   0.731         0.961

Table 5: Relative Precision and Word Error Rate

Hauptmann's work indicates that speech-to-text word error rates under 0.4 result in retrieval performance comparable to a perfect transcript [9]. At the 0.4 threshold, relative retrieval precision is approximately 80%. We compute the average word error rate for studio films and news/educational videos (using the default "general" language models for our STT engine), and compare the relative precision of STT with respect to closed captioning for the statistical and hybrid methods, shown in Table 5. User generated videos are not included because no "correct" transcripts are available for the content. As expected, the average word error rates for news and educational videos are substantially lower, though still around 0.4. For this type of content, the relative precision of STT is 96% of the closed captioning. For the higher word error rate of films we can still achieve over 70% average relative precision. These results further support use of the statistical selection methods on longer text inputs and the generative methods on shorter text, and suggest that speech transcripts alone may be sufficient to find meaningful advertising keywords for videos such as news broadcasts.

6.5 Precision and Potential of Related Terms

We next look at the precision and potential of the related terms. Table 6 shows the precision and potential scores for the top 10 related terms from both the statistical (S-Related) and hybrid (H-Related) methods. These results are mostly consistent with Table 3, with the most precise input source (closed captioning) producing the most relevant related keywords. For each method and source, the precision and potential of the source keywords are higher than the related terms.

Source   Precision                Potential
         S-Related   H-Related    S-Related   H-Related
Script   0.254       0.215        0.253       0.222
CC       0.260       0.221        0.262       0.221
STT      0.208       0.186        0.200       0.191

Table 6: Precision and Potential of Related Terms

In our experiments we randomly selected from the top N=10 related terms for each source keyword. We now investigate how the average precision of the related terms is affected as we vary this range for 1 ≤ N ≤ 10. Figure 5 plots the precision of the related keywords for each text source using the statistical selection method. For closed captioning the top 2 related terms give the highest precision, which is lower than the precision of the source terms but significantly higher (p=0.003) than choosing from the top 10. Both script and speech transcript inputs show an increase in precision when selecting from the top 3-6 terms. While the precision is again lower than the source keywords, there is noticeable improvement between selecting from the top N=6 and N=10 for both script (p=0.06) and STT (p=0.03) input. This result suggests that the number of related terms to consider to achieve the maximum overall precision depends on the input text type, with higher precision input like closed captioning achieving its best precision with a smaller number of related terms than scripts or speech transcripts. Results for the hybrid selection method exhibit similar behavior. Another factor to consider when evaluating the precision of the related keywords is the relevancy of the source term being expanded. An irrelevant source term is less likely to result in relevant related terms.

Source   Statistical   S-Related   Hybrid   H-Related
Script   0.726         0.788       0.607    0.792
CC       0.578         0.785       0.543    0.796
STT      0.681         0.827       0.594    0.820

Table 7: Appeal of Keywords by Source

Source   Statistical   S-Related   Hybrid   H-Related
Script   3.59          3.96        3.00     4.18
CC       2.11          3.81        2.00     3.77
STT      2.54          4.39        2.56     4.30

Table 8: Popularity of Keywords by Source

For both appeal and popularity we notice that, while closed captioning was generally considered the most precise source of keywords, it also produces the least meaningful keywords for advertisers. This may be a result of character names appearing in the closed captioning keywords, which we noted earlier are filtered out from script input text and are less likely to retrieve relevant ads. Finally, we look more closely at the popularity of keywords for speech transcripts. Table 9 compares the popularity for source and related keywords for various video types. In all cases the related keywords have higher popularity than the source keywords by a statistically significant margin. It also shows that news and educational content contains less popular keywords for advertisers.

Video Type         Statistical   S-Related   Hybrid   H-Related
Studio Films       2.97          4.35        2.67     4.39
News/Educational   1.69          4.11        2.21     3.50
User Generated     1.89          4.83        2.63     4.75

Table 9: Popularity for Speech Transcripts

6.7 Precision-Popularity Tradeoffs

The results above demonstrate that, when relevant, related keywords are significantly more attractive to advertisers than source keywords. The overall precision of the related terms, however, is lower than source terms. We explore the inherent tradeoff between keyword relevance and popularity by computing a precision-weighted popularity metric:

    PWP(S) = ( Σ_{k ∈ K(S)} A_k · P(S, k) ) / |K(S)|        (4)

where P(S, k) is the precision of keyword k from source S, defined as:

    P(S, k) = Σ_i |{k} ∩ R_i(S)| / Σ_i |{k} ∩ K_i(S)|

Table 10 shows the precision-weighted popularity for the statistical method for each text source using the top 5 related keywords from each source keyword. The results suggest that for script input, the minor improvement in popularity of related keywords (shown in Table 8) may not offset the decrease in precision. For speech transcript input, however, there appears to be some benefit from related terms. We examine STT input further in Table 11, which shows that overall, even with the drop in precision, related keywords are beneficial to advertisers for news and user generated videos when only speech transcripts are available. Although the related keywords for studio film speech transcripts have higher popularity than source keywords (Table 9), the relative increase is noticeably lower than for CC or STT, and the resulting precision-weighted popularity does not offer improvement.
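With the same per-evaluation bookkeeping used for the earlier metrics sketch, Equation (4) could be computed as in the sketch below, where popularity maps a keyword k to A_k, the number of advertisers interested in it; this is an illustration of the metric rather than the exact evaluation code.

    def precision_weighted_popularity(evaluations, popularity):
        """Precision-weighted popularity PWP(S) from Equation (4).

        evaluations: list of (displayed, relevant) keyword-set pairs for source S.
        popularity: dict mapping keyword k to A_k, its number of interested advertisers.
        """
        k_s = set().union(*(k for k, _ in evaluations))        # K(S)
        total = 0.0
        for kw in k_s:
            shows = sum(1 for k, _ in evaluations if kw in k)
            hits = sum(1 for k, r in evaluations if kw in k and kw in r)
            p_sk = hits / shows if shows else 0.0              # P(S, k)
            total += popularity.get(kw, 0) * p_sk
        return total / len(k_s) if k_s else 0.0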
References

[4] D. M. Blei, A. Y. Ng, M. I. Jordan, and J. Lafferty. Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 2003.
[5] A. Broder, M. Fontoura, V. Josifovski, and L. Riedel. A semantic approach to contextual advertising. In SIGIR '07, pages 559-566, 2007.
[6] C. Buckley, G. Salton, J. Allan, and A. Singhal. Automatic query expansion using SMART: TREC 3. In Text REtrieval Conference, 1994.
[7] Y. Chen, G.-R. Xue, and Y. Yu. Advertising keyword suggestion based on concept hierarchy. In WSDM '08, pages 251-260, 2008.
[8] H. Chim and X. Deng. A new suffix tree similarity measure for document clustering. In WWW '07, pages 121-130, 2007.
[9] A. Hauptmann. Lessons for the future from a decade of Informedia video analysis research. In CIVR '05, pages 1-10, 2005.
[10] G. Jeh and J. Widom. SimRank: a measure of structural-context similarity. In KDD '02, pages 538-543, 2002.
[11] A. Joshi and R. Motwani. Keyword generation for search engine advertising. In ICDMW '06, pages 490-496, 2006.
[12] V. Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. In Soviet Physics Doklady, 1966.
[13] H. Ma, H. Yang, I. King, and M. R. Lyu. Learning latent semantic relations from clickthrough data for query suggestion. In CIKM '08, pages 709-718, 2008.
[14] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, New York, NY, USA, 2008.
[15] E. Moxley, T. Mei, X.-S. Hua, W.-Y. Ma, and B. Manjunath. Automatic video annotation through search and mining. In ICME '08, pages 685-688, 2008.
[16] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998.
[17] S. Ravi, A. Broder, E. Gabrilovich, V. Josifovski, S. Pandey, and B. Pang. Automatic generation of bid phrases for online advertising. In WSDM '10, pages 341-350, 2010.
[18] B. Ribeiro-Neto, M. Cristo, P. B. Golgher, and E. Silva de Moura. Impedance coupling in content-targeted advertising. In SIGIR '05, pages 496-503, 2005.
[19] M. Sahami and T. D. Heilman. A web-based kernel function for measuring the similarity of short text snippets. In WWW '06, pages 377-386, 2006.
[20] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. In Information Processing and Management, pages 513-523, 1988.
[21] S. Siersdorfer, J. San Pedro, and M. Sanderson. Automatic video tagging using content redundancy. In SIGIR '09, pages 395-402, 2009.
[22] A. Velivelli and T. S. Huang. Automatic video annotation by mining speech transcripts. In CVPRW '06, page 115, 2006.
[23] E. M. Voorhees. Query expansion using lexical-semantic relations. In SIGIR '94, pages 61-69, 1994.
[24] C. Wang, P. Zhang, R. Choi, and M. D'Eredita. Understanding consumers' attitude toward advertising. In Eighth Americas Conf. on Information Systems, pages 1143-1148, 2002.
[25] W. T. Yih, J. Goodman, and V. R. Carvalho. Finding advertising keywords on web pages. In WWW '06, pages 213-222, 2006.
[26] S. Zanetti, L. Zelnik-Manor, and P. Perona. A walk through the web's video clips. In CVPRW '08, pages 1-8, 2008.