

UCLA Computer Science Department Technical Report #100025

The remainder of this paper is organized as follows. We discuss related work in Section 2 and provide an overview of our goals and approach in Section 3. Section 4 describes our system for extracting keywords from content-based text sources, and in Section 5 we present methods for expanding those terms to address vocabulary mismatch problems between the source keywords and those chosen by advertisers. We evaluate the keywords generated by these methods for various video types and sources of text in Section 6, and discuss our observations, conclusions, and future work in Section 7.

2 Related Work

Sponsored search, or advertising displayed alongside the search results of a user-supplied keyword query, typically involves a complex combination of advertisers bidding on keywords, review of advertisements for relevance, and an auction process to place ads alongside search results. See [2] for an overview of sponsored search. In display or content-match advertising, however, explicit keywords for the content are not provided. In online advertising it is important to display ads relevant to a page's content [24]. Without user-supplied keywords, researchers have investigated numerous keyword identification techniques and approaches to match advertisements with the content of Web pages.

Ontologies or taxonomies have been used in combination with feature identification for semantic approaches to matching advertisements with content [5, 7]. Ontologies are often domain-specific and tedious to construct, and the individual text elements from scripts or dialog are often terse, making the use of classification techniques or ontologies more error prone. Yih, Goodman, and Carvalho [25] establish several features of Web pages and query logs, such as frequency, textual characteristics (e.g. capitalization), and structural cues for identifying advertising keywords. Ribeiro-Neto et al. [18] propose strategies for matching the text of a Web page with text-based advertisements in a known ad inventory. They address the vocabulary impedance problem by representing a page with concepts from its nearest (most similar) neighbors. Ravi et al. [17] propose a two-phase generative model for identifying relevant advertising keywords for a given Web page. They use a popular machine translation method to learn a probabilistic set of keyword mappings from a training corpus of Web pages associated with ads and advertiser-chosen keywords. Term weights are assigned based on HTML features. A bigram language model trained on search queries is used to help rank the generated candidate keywords. Finding related but "less obvious" (and therefore less expensive) keywords from an advertiser's point of view [11, 1] has been addressed as well.

Our work differs from these problems in several ways. Our text sources are plain text which lack explicit structural cues such as HTML markup. We therefore must resort to statistical methods for ranking and selecting keywords from the source text. Several of the above techniques also require tagged training data, language and domain-specific ontologies, or pre-constructed pools of available advertisements. Our methods are language-independent and unsupervised, not requiring any training data.

Related term identification is a well researched problem in the information retrieval domain, where tasks such as query rewriting or expansion are widely studied. Voorhees [23] described the use of lexical relationships contained in WordNet for query expansion. Buckley et al. [6] noted that related terms will typically co-occur non-randomly in documents relevant to a query. More recently, Sahami and Heilman [19] use a similar notion to compute the semantic similarity of short text snippets using Web search results as an opaque context. Our work follows along these lines by using Web search results to discover related co-occurring terms. We also use the implicit semantic relationships captured in the hyperlinked structure of Wikipedia to identify related terms.

Hauptmann summarizes many "lessons learned" regarding speech recognition accuracy and the effects of word error rate on information retrieval precision [9]. In particular their research shows that the best systems achieve word error rates around 0.15 under ideal conditions (such as in-studio anchors for broadcast news), and that retrieval performance degrades relatively gracefully with respect to perfect text transcripts until word error rates approach 0.40. While their work on retrieval is orthogonal to our focus of advertising keyword selection, we consider these findings as we compare our results between perfect text transcripts (closed captioning tracks) and speech transcripts.

Keyword identification for multimedia often utilizes, in part, attributes extracted from images as part of a larger feature space for machine learning. Velivelli and Huang [22] predict tags for videos based on image features and speech transcripts. Using a collection of speech transcripts, they perform a PLSI-based clustering to form k topic themes. Each cluster i is used to generate a unigram language model, and a scene is assumed to be a mixture of these models and an underlying base model. Tags are then predicted using a combination of shot features and keyword co-occurrence based on a constructed training set.

Figure 1: Script Processing Workflow

[...] this stage, our goal is to increase the likelihood of matching an advertiser's keywords while minimizing decline in relevancy of the ads when matches do occur. This related term mining process is described in Section 5. In both steps, keywords are identified and ranked without consulting an inventory of ads or advertiser-supplied keywords.

4 Processing Source Text

In the first stage of processing, we analyze the format and complexities of video-based text sources, such as scripts, and describe methods of text analysis based on traditional statistical analysis and generative models. In this work we consider three sources of text data for a video:

Movie Script - a script or screenplay is a document that outlines all of the visual, audio, behavioral, and spoken elements required to tell a story. Since film production is a highly collaborative medium, the director, cast, editors, and production crew will use various forms of the script to interpret the underlying story during the production filming process. Because numerous individuals are involved in the making of a film, a script must conform to specific standards and conventions that all involved parties understand, and thus uses a specific format with respect to the layout, margins, notation, and other production conventions. This document is intended to structure all of the script elements used in a screenplay.

Closed Captioning (CC) track - a document which contains a series of timecodes and text of the spoken dialog. Each timecode indicates when and for what duration the corresponding text appears on screen. Closed captioning tracks lack additional cues, such as visual information or indicators of the current speaker.

Speech-To-Text (STT) - a process by which audio data containing dialog or narrative content is automatically converted to a text transcription. The output typically consists of a series of words, each with an associated timecode and duration. The source audio may be of poor quality or contain non-speech sounds such as music or sound effect artifacts, which generally contribute to transcription errors. Transcription quality is typically measured by the overall Word Error Rate (WER). A frequent goal of STT systems is to reduce the impact of a high WER, though error rates on heterogeneous content are typically quite high.

Figure 1 outlines the processing workflow for a complete movie script, which includes non-speech elements such as scene headings and action descriptions. We will describe each of these steps next. Note that the workflow is largely the same for closed captioning tracks and speech transcripts, which can be formatted to "look" like a screenplay. In those cases, script-specific processing steps are simply omitted.

4.1 Script Parsing

Television and movie scripts are frequently written in plain text and follow a conventional "screenplay" format which allows human readers to easily differentiate and infer the proper semantics for different script elements, such as dialog or scene headings. For example, scene headings are typically written on a single line in all capital letters, beginning with INT or EXT to denote whether the setting is interior or exterior, and ending with an indicator of time of day such as MORNING or NIGHT. Figure 2 shows a brief snippet of a typical script.

Figure 2: Example Script Snippet

Understanding the semantics of a text element is helpful when processing it. For example, character names appear frequently in a script prior to each of their lines of dialog, though we generally find them to be a poor choice for advertising keywords. We add a machine-readable hierarchical structure and semantics to each text segment of a script using a finite state machine based parser derived from conventional screenplay writing rules. This is depicted as step (1) in Figure 1. Movie script documents are converted into a structured and tagged representation where all script elements (scene headings, action descriptions, dialog lines, etc.) are systematically extracted, tagged, and recorded as objects into a specialized document object model (DOM) for subsequent processing. All objects within the DOM (e.g., entire sentences tagged by their corresponding type and script section) are then processed using both statistical methods to identify keywords of interest, and a natural language processing (NLP) engine that identifies and tags the noun items in each sentence. These extracted and tagged noun elements are then combined with time-alignment information and recorded into a metadata repository. We describe this alignment process next.

4.2 Speech-to-Text (STT) Processing

STT transcripts contain timecode information that plays an important role in associating script keywords to specific points in time in the video content. In this section of the workflow, a video or audio file that contains spoken dialog corresponding to the dialog sections of the input script is read and processed using a Speech-to-Text engine that generates a transcription of the spoken dialog, shown as (2) in Figure 1. For this process, we also perform an important optimization. Automatic speech recognition engines typically incorporate a known vocabulary and probabilistic models of speech (often based on word N-grams). When the dialog data is available from a script, we construct a custom language model to bias the transcription engine towards the expected vocabulary and word sequences, which helps to increase the transcription accuracy.

4.3 Script and STT Transcript Alignment

At this stage, we have tagged and structured script data (without any time information) from step (1), and a noisy, relatively inaccurate STT transcript with very precise timecode information from step (2). To make use of the keywords and concepts generated by the later processing steps, the script data must be time-aligned with the STT data. This is accomplished in step (3) by using the Levenshtein Word Edit Distance [12] algorithm to find the best word alignment between script dialog and the STT transcript. The result of this phase of processing is a time-aligned source script that can associate script action and dialog keywords with precise points in time within the video content. This data is stored into a metadata repository.

4.4 Statistical Generation of Keyword Terms

In the final step for a source text (script, CC, or transcript), the time-coded text elements from the metadata repository are used to build a suffix word N-gram tree that is pruned by N-gram term frequency to discover the most dominant terms, based in large part on the work of Chim and Deng [8]. This is shown as (5) in Figure 1. Before N-gram term generation, we performed a one-time process of selecting a stopword vocabulary specific to the domain of movie scripts. Using frequency statistics computed from a large corpus of scripts, we manually identified a set of stopwords from the most frequently occurring terms. During N-gram term generation, the following steps are followed:

1. Corpus stopwords are removed from the source text.
2. An N-gram term tree with sequences up to length N = 4 is created by collecting and counting N-gram occurrences from the script.
3. The resulting suffix tree is then pruned by traversing the tree to collect and rank the top M most frequent terms. In our experiments, we select the most frequent M = 20 keywords.

4.5 Generative Models For Noisy Data

The statistical N-gram methods work well when keywords and phrases are repeated multiple times. While this is often the case for longer or well-formed text input, short or noisy text often results in the majority of (non-stopword) keywords only being mentioned once. With this type of input, statistical models are unable to decipher which keywords are most important. To better handle short or noisy text input, we use a keyword selection method based on generative topic modeling. In this model, we assume that a video comprises a small number of hidden topics, which can be represented as keyword probabilities, and that a video's text is generated from some distribution over those topics. The highly probable keywords in those topics are likely to be most representative of the video content. We use Latent Dirichlet Allocation (LDA) [4] to learn the topics and corresponding topic-keyword probability distribution from the input text. We then combine these topics to form a ranked keyword list.

4.5.1 Generating Topics

To discover the underlying topics in a video, we segment the input text into sentences and perform topic modeling with LDA. The resulting topic-term distribution $\phi$ is a K x V matrix, where K is the number of topics, V is the size of the input vocabulary, and $\phi[i][j]$ is the probability of keyword j in topic i. We form an ordered list of keywords $k_i$ for each topic, sorted by their probability in $\vec{\phi}[i]$. This results in K ranked lists of keywords, one per topic, which must then be merged into a single list to select the top M. While simply selecting the top M/K keywords from each topic is one option, we describe a more general solution for merging multiple ranked lists when we discuss our approach to finding related keywords. This method is described in Section 5.3.

4.6 Statistical-Generative Hybrid Method

The LDA model learns keyword probabilities for terms which are separated by whitespace. When possible, however, it is preferable to identify multi-term keywords for advertising. For example, the phrase "relational database" is more specific than either of the individual words "relational" or "database", and thus has higher value to advertisers. To help identify these multi-token keywords in short or noisy text sources, we use a hybrid of statistical and generative techniques. We first process the source text using the N-gram method to identify any significant multi-token keywords. We then edit the source text by removing the whitespace between the terms of these multi-token phrases so they appear as a single token. This modified source text is then processed using the generative model.

4.7 Filtering the Keywords

We apply two filters, when possible, to remove frequently occurring words which are often not useful in the context of matching advertisements. From all input sources, keywords matching a list of English profanity are removed. We also find that main character names are often amongst the top keywords, but generally do not retrieve relevant advertisements. When given a complete script, we remove character names from the keyword list using a dictionary constructed during the parsing and tagging stage. For closed captioning and speech transcripts, however, these names are unknown and thus may still appear in the top keywords. This is more common for closed captioning than for speech transcripts, as proper names are less likely to be correctly transcribed by the STT engine.

At this point the most dominant (possibly multi-term) keywords which occur in the source text, along with associated timecode information, have been identified and can be suggested as advertising keywords relevant to a particular time point of a video. As Ribeiro-Neto et al. [18] describe, however, the keywords chosen directly from a source and the keywords bid on by advertisers may suffer a vocabulary impedance problem. In the next section we describe novel term mining techniques which can provide a richer, more complete set of relevant advertising keywords.

Figure 3: Related Terms From Search Results

[...]

After these filtering steps, we construct a vector space model M for this small corpus of "documents" relevant to T. Based on the popular TF-IDF [20] term weighting, we compute the corpus frequency (CF) and inverse-document-frequency (IDF) weight for each term in M, and rank the keywords according to their CF*IDF score. This step is shown in the workflow as (3) in Figure 3, producing the final list of ranked related keywords from search results.

5.2 Mining with Wikipedia

The second data source we analyze for related terms is Wikipedia, an extensive knowledge base with over 3.1 million English articles available at the time of this writing. Whereas in the Web corpus we focused on search results in response to a query, with Wikipedia we direct our attention to hyperlinks. Within the text of a Wikipedia article, numerous inter-wiki links point to other Wikipedia pages, which allows us to model Wikipedia as a directed graph $G = \{V, E\}$. We construct the Wikipedia graph where nodes V represent pages in the main article namespace, and edges E denote inter-wiki links between those pages.

When building the graph, two article types in the main namespace are processed specially. For ambiguous terms such as "coach", a disambiguation page in Wikipedia lists the available articles for different senses of the term. These pages serve primarily as navigational aids for users, rather than conveying a semantic relationship between terms, and we therefore exclude them from the graph. The second category of pages we process specially are redirection pages, which provide a translation for alternate or misspelled words, inconsistent capitalization, acronyms, and so on, into a canonical form. In our Wikipedia graph, an article and all of the pages which redirect to it are merged into a single node.

We use the link structure of the graph to both identify and rank candidate related terms. These steps are described in detail in the following sections.

5.2.1 Identifying Candidate Related Terms

Without clearly defined directed links between individual terms in the Web corpus, the approaches using Web search results described above depend on the assumptions that documents retrieved by the search engine are relevant to the input terms, and that other tags or keywords for those pages are potentially related. That is, we rely on co-occurrence based measures to identify which terms are most likely related. With Wikipedia, however, we have an explicit link structure between articles which can be used as an indicator of relatedness. We require the relatedness between two article nodes a and b to be a symmetric relationship: a is related to b if and only if b is related to a.

Translating this requirement to the Wikipedia graph is relatively straightforward. To identify candidate related terms for term T, we first locate the Wikipedia page with T as the title (see footnote 3). Given the node t for T, we identify any nodes in the graph which form a direct cycle with t as candidate related terms. That is, we select the subset of nodes $N \subseteq V$ such that:

\[ \forall n \in V,\; n \in N \implies \{t, n\} \wedge \{n, t\} \in E \tag{1} \]

Figure 4 shows a simple example, where for term t, terms n1 and n2 are candidate related terms, but X and Y are not.

Footnote 3: We can relax this requirement and search the text of Wikipedia articles to identify the top page or pages for any particular input term, albeit at a likely reduction in quality of the generated related terms.

5.2.2 Ranking Candidate Terms

A given set of candidate related terms may be quite large. We now look at how to rank the candidate terms. To be a good suggestion as an advertising keyword, a term should be relatively popular. While we could measure popularity through external sources, such as query log frequency, we chose to utilize the graph structure of Wikipedia. We approximate the relative importance of terms by computing [...]

SR(camera)           WP(camera)                 Combined             SR(advertising)     WP(advertising)              Combined
digital camera       photography                digital camera       product             internet                     internet
lens                 pornography                photography          marketing           newspaper                    product
canon                visual arts                canon                advertiser          video game                   marketing
nikon                photograph                 nikon                business            american football            newspaper
zoom                 digital camera             pornography          campaign            magazine                     advertiser
film camera          photojournalism            lens                 advertising agency  world wide web               magazine
digital slr          photographic film          digital photography  internet            marketing                    advertising agency
megapixels           aperture                   photograph           consumer            mtv                          public relations
digital photography  canon                      aperture             job                 blog                         google
compact              photographic lens          shutter speed        newspaper           public broadcasting service  billboard
camcorder            aerial photography         visual arts          agency              mass media                   video game
slr camera           holography                 exposure             public relations    google                       publicity
lense                single-lens reflex camera  viewfinder           company             brand                        product placement
digital slr camera   focal length               movie camera         service             broadcasting                 graphic design
olympus              nikon                      camera phone         budget              music video                  promotion

Table 2: Example Related Terms by Method

6.1 Evaluation Design

We identified the top 20 keywords from each available text source using both the statistical and hybrid approaches described in Section 4. For each of these keywords we use the related term mining techniques of Section 5 to identify the top 10 related terms. These keywords were then evaluated with a user survey. For the topic modeling phase of the hybrid technique, we set the number of topics K = 5 with the LDA parameters $\alpha = 0.3$ and $\beta = 0.1$.

Users were shown a video clip, typically around 3 minutes in length, and a set of keywords. To keep the size of the keyword set manageable, we show 5 of the top 20 keywords for each method from each available text source, and 1 of the top 10 related terms for each of those keywords, all chosen and ordered at random. Users were asked to make a binary assessment on the relevance of each displayed keyword. For the news and educational videos and the amateur clips available on YouTube, users are shown the complete video. For full length films, users are shown the theatrical trailer and asked to make judgements based on the trailer and their prior knowledge of the movie. Over 23 people participated in the survey (personally identifiable information was not required), with a minimum of 9 and an average of 13 users evaluating each video.

6.2 Evaluation Metrics

We evaluate the keywords generated by our methods using four metrics. The average relevancy of the keywords displayed to users we call the precision. Multiple users viewing the same set of keywords may not completely agree on which keywords are relevant. We therefore compute the potential of a source, which measures the fraction of the keywords judged relevant by at least one user. More formally, we define the precision and potential of text source S as:

\[ \mathrm{Precision}(S) = \frac{1}{i} \sum_i \frac{|K_i(S) \cap R_i|}{|K_i(S)|} \qquad \mathrm{Potential}(S) = \frac{|R(S)|}{|K(S)|} \]

$R_i$ is the set of keywords judged relevant in evaluation i and $K_i(S)$ are the keywords displayed to the user for evaluation i which come from source S. $K(S)$ are the keywords from source S displayed in at least one evaluation, and $R(S)$ are the keywords from source S judged relevant by at least one user, defined as:

\[ R(S) = \bigcup_i \left( K_i(S) \cap R_i \right) \qquad K(S) = \bigcup_i K_i(S) \]

The other metrics we define are appeal and popularity, which serve as indicators of how pertinent the keywords are to advertisers. Appeal estimates the likelihood that a keyword deemed relevant to the content will also be meaningful to an advertiser. Popularity measures the average number of advertisers interested in a relevant keyword. We define the appeal and popularity of a source S as:

\[ \mathrm{Appeal}(S) = \frac{|R(S) \cap A|}{|R(S)|} \]

[...]

                    Precision              Potential
Video Type          Statistical  Hybrid    Statistical  Hybrid
Studio Films        0.268        0.252     0.479        0.480
News/Educational    0.442        0.473     0.548        0.717
User Generated      0.268        0.368     0.390        0.473

Table 4: Precision and Potential for STT

Video Type          WER     Statistical  Hybrid
Studio Films        0.857   0.723        0.690
News/Educational    0.406   0.731        0.961

Table 5: Relative Precision and Word Error Rate

Hauptmann's work indicates that speech-to-text word error rates under 0.4 result in retrieval performance comparable to a perfect transcript [9]. At the 0.4 threshold, relative retrieval precision is approximately 80%. We compute the average word error rate for studio films and news/educational videos (using the default "general" language models for our STT engine), and compare the relative precision of STT with respect to closed captioning for the statistical and hybrid methods, shown in Table 5. User generated videos are not included because no "correct" transcripts are available for the content. As expected, the average word error rates for news and educational videos are substantially lower, though still around 0.4. For this type of content, the relative precision of STT with the hybrid method is 96% of that of closed captioning. For the higher word error rate of films we can still achieve over 70% average relative precision. These results further support use of the statistical selection methods on longer text inputs and the generative methods on shorter text, and suggest that speech transcripts alone may be sufficient to find meaningful advertising keywords for videos such as news broadcasts.

6.5 Precision and Potential of Related Terms

We next look at the precision and potential of the related terms. Table 6 shows the precision and potential scores for the top 10 related terms from both the statistical (S-Related) and hybrid (H-Related) methods. These results are mostly consistent with Table 3, with the most precise input source (closed captioning) producing the most relevant related keywords. For each method and source, the precision and potential of the source keywords are higher than those of the related terms.

In our experiments we randomly selected from the top N = 10 related terms for each source keyword. We now investigate how the average precision of the related terms is affected as we vary this range for $1 \le N \le 10$. Figure 5 plots the precision of the related keywords for each text source using the statistical selection method. For closed captioning the top 2 related terms give the highest precision, which is lower than the precision of the source terms but significantly higher (p = 0.003) than choosing from the top 10. Both script and speech transcript inputs show an increase in precision when selecting from the top 3-6 terms. While the precision is again lower than the source keywords, there is noticeable improvement between selecting from the top N = 6 and N = 10 for both script (p = 0.06) and STT (p = 0.03) input. This result suggests that the number of related terms to consider to achieve the maximum overall precision depends on the input text type, with higher precision input like closed captioning achieving its best precision with a smaller number of related terms than scripts or speech transcripts. Results for the hybrid selection method exhibit similar behavior.

Another factor to consider when evaluating the precision of the related keywords is the relevancy of the source term being expanded. An irrelevant source term is less likely to result in relevant related [...]

          Precision               Potential
Source    S-Related  H-Related    S-Related  H-Related
Script    0.254      0.215        0.253      0.222
CC        0.260      0.221        0.262      0.221
STT       0.208      0.186        0.200      0.191

Table 6: Precision and Potential of Related Terms

[...]

Source    Statistical  S-Related  Hybrid  H-Related
Script    0.726        0.788      0.607   0.792
CC        0.578        0.785      0.543   0.796
STT       0.681        0.827      0.594   0.820

Table 7: Appeal of Keywords by Source

Source    Statistical  S-Related  Hybrid  H-Related
Script    3.59         3.96       3.00    4.18
CC        2.11         3.81       2.00    3.77
STT       2.54         4.39       2.56    4.30

Table 8: Popularity of Keywords by Source

For both appeal and popularity we notice that, while closed captioning was generally considered the most precise source of keywords, it also produces the least meaningful keywords for advertisers. This may be a result of character names appearing in the closed captioning keywords, which we noted earlier are filtered out from script input text and are less likely to retrieve relevant ads.

Finally, we look closer at the popularity of keywords for speech transcripts. Table 9 compares the popularity for source and related keywords for various video types. In all cases the related keywords have higher popularity than the source keywords by a statistically significant margin. It also shows that news and educational content contains less popular keywords for advertisers.

6.7 Precision-Popularity Tradeoffs

The results above demonstrate that, when relevant, related keywords are significantly more attractive to advertisers than source keywords. The overall precision of the related terms, however, is lower than that of source terms. We explore the inherent tradeoff between keyword relevance and popularity by computing a precision-weighted popularity metric:

\[ \mathrm{PWP}(S) = \frac{\sum_{k \in K(S)} A_k \, P(S, k)}{|K(S)|} \tag{4} \]

where $P(S, k)$ is the precision of keyword k from source S, defined as:

\[ P(S, k) = \frac{\sum_i |\{k\} \cap R_i(S)|}{\sum_i |\{k\} \cap K_i(S)|} \]

Table 10 shows the precision-weighted popularity for the statistical method for each text source using the top 5 related keywords from each source keyword. The results suggest that for script input, the minor improvement in popularity of related keywords (shown in Table 8) may not offset the decrease in precision. For speech transcript input, however, there appears to be some benefit from related terms. We examine STT input further in Table 11, which shows that overall, even with the drop in precision, related keywords are beneficial to advertisers for news and user generated videos when only speech transcripts are available. Although the related keywords for studio film speech transcripts have higher popularity than source keywords (Table 9), the relative increase is noticeably lower than for CC or STT, and the resulting precision-weighted popularity does not offer improvement.
Source              Statistical  S-Related  Hybrid  H-Related
Studio Films        2.97         4.35       2.67    4.39
News/Educational    1.69         4.11       2.21    3.50
User Generated      1.89         4.83       2.63    4.75

Table 9: Popularity for Speech Transcripts

[...]

[4] D. M. Blei, A. Y. Ng, M. I. Jordan, and J. Lafferty. Latent dirichlet allocation. Journal of Machine Learning Research, 3, 2003.
[5] A. Broder, M. Fontoura, V. Josifovski, and L. Riedel. A semantic approach to contextual advertising. In SIGIR '07, pages 559-566, 2007.
[6] C. Buckley, G. Salton, J. Allan, and A. Singhal. Automatic query expansion using SMART: TREC 3. In Text REtrieval Conference, 1994.
[7] Y. Chen, G.-R. Xue, and Y. Yu. Advertising keyword suggestion based on concept hierarchy. In WSDM '08, pages 251-260, 2008.
[8] H. Chim and X. Deng. A new suffix tree similarity measure for document clustering. In WWW '07, pages 121-130, 2007.
[9] A. Hauptmann. Lessons for the future from a decade of informedia video analysis research. In CIVR '05, pages 1-10, 2005.
[10] G. Jeh and J. Widom. Simrank: a measure of structural-context similarity. In KDD '02, pages 538-543, 2002.
[11] A. Joshi and R. Motwani. Keyword generation for search engine advertising. In ICDMW '06, pages 490-496, 2006.
[12] V. Levenshtein. Binary codes capable of correcting deletions, insertions and reversals. In Soviet Physics Doklady, 1966.
[13] H. Ma, H. Yang, I. King, and M. R. Lyu. Learning latent semantic relations from clickthrough data for query suggestion. In CIKM '08, pages 709-718, 2008.
[14] C. D. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge University Press, New York, NY, USA, 2008.
[15] E. Moxley, T. Mei, X.-S. Hua, W.-Y. Ma, and B. Manjunath. Automatic video annotation through search and mining. In ICME '08, pages 685-688, 2008.
[16] L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford Digital Library Technologies Project, 1998.
[17] S. Ravi, A. Broder, E. Gabrilovich, V. Josifovski, S. Pandey, and B. Pang. Automatic generation of bid phrases for online advertising. In WSDM '10, pages 341-350, 2010.
[18] B. Ribeiro-Neto, M. Cristo, P. B. Golgher, and E. Silva de Moura. Impedance coupling in content-targeted advertising. In SIGIR '05, pages 496-503, 2005.
[19] M. Sahami and T. D. Heilman. A web-based kernel function for measuring the similarity of short text snippets. In WWW '06, pages 377-386, 2006.
[20] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. In Information Processing and Management, pages 513-523, 1988.
[21] S. Siersdorfer, J. San Pedro, and M. Sanderson. Automatic video tagging using content redundancy. In SIGIR '09, pages 395-402, 2009.
[22] A. Velivelli and T. S. Huang. Automatic video annotation by mining speech transcripts. In CVPRW '06, page 115, 2006.
[23] E. M. Voorhees. Query expansion using lexical-semantic relations. In SIGIR '94, pages 61-69, 1994.
[24] C. Wang, P. Zhang, R. Choi, and M. D. Eredita. Understanding consumers attitude toward advertising. In Eighth Americas Conf. on Information System, pages 1143-1148, 2002.
[25] W. T. Yih, J. Goodman, and V. R. Carvalho. Finding advertising keywords on web pages. In WWW '06, pages 213-222, 2006.
[26] S. Zanetti, L. Zelnik-Manor, and P. Perona. A walk through the web's video clips. In CVPRW '08, pages 1-8, 2008.



Welch, Junghoo Cho, and Walter Chang (mjwelch@cs.ucla.edu, cho@cs.ucla.edu, wachang@adobe.com)

Abstract: With the proliferation of online distribution methods for videos, content owners require easier and more effective methods for monetization through advertis[...]
