These directions of high correlation can be used to find words that are strongly char


natalia-silvester
Uploaded On 2017-02-23



Presentation Transcript

[...] these directions of high correlation can be used to find words that are strongly characterized by an audio representation. We do so by imposing constraints on CCA that explicitly turn it into a vocabulary selection mechanism. This CCA variant is called sparse CCA.

2 HUMAN AGREEMENT

Recently, we collected the Computer Audition Lab 500 (CAL500) data set [16]: 500 songs by 500 unique artists, each of which has been annotated according to a 173-word vocabulary by a minimum of three individuals. Most of the participants were paid, American, undergraduate students and the testing was conducted in a computer laboratory at UC San Diego. We purposely collected multiple annotations for songs so that we could gauge how consistently a population of college students labels music.

Using this data set, we can calculate a statistic we refer to as human agreement for each word in our vocabulary. The agreement of a word-song pair $(w, s)$ is:

$$A_{w,s} = \frac{\#(\text{positive associations})_{w,s}}{\#(\text{annotations})_{s}}. \tag{1}$$

For example, if 3 out of 4 students label Elvis Presley's `Heartbreak Hotel' as being a `blues' song, then $A_{\text{blues},\,\text{heartbreak hotel}} = 0.75$. We calculate the human agreement for a word by averaging over all the songs in which at least one subject has used the word to describe the song. This can be written as

$$A_{w} = \frac{\sum_{s} A_{w,s}}{\sum_{s} I[A_{w,s} > 0]} \tag{2}$$

where $I$ is an indicator function that is 1 if $A_{w,s}$ is greater than zero, and 0 otherwise. That is, all word-song pairs are valid except those in which nobody associates the word with the song. We expect human agreement to be close to 1 for more `objective' words such as words associated with instrumentation (`cowbell'), and close to 0 for words that are more `subjective' such as those related to song usage (`driving music').

3 ACOUSTIC CORRELATION WITH CCA

Canonical Correlation Analysis, or CCA, is a method of exploring dependencies between data which are represented in two different, but related, vector spaces. For example, consider a set of songs where each song is represented by both a semantic annotation vector and an audio feature vector. An annotation vector for a song is a real-valued (or binary) vector where each element represents the strength of association (e.g., Equation 1) between the song and a word from our vocabulary. An audio feature vector is a real-valued vector of statistics calculated from the digital audio signal. It is assumed that the two spaces share some joint information which can be captured in the form of correlations between the music data that live in these spaces. CCA finds a one-dimensional projection of the data in each space such that the correlation between the projections is maximized.

More formally, consider two data matrices, $A$ and $S$, from two different feature spaces. The rows of $A$ contain music data represented in the audio feature space $\mathcal{A}$. The corresponding rows of $S$ contain the music data represented in the semantic annotation space $\mathcal{S}$ (e.g., annotation vectors). CCA seeks to optimize

$$\max_{w_a \in \mathcal{A},\; w_s \in \mathcal{S}} \; w_a' A' S w_s \tag{3}$$
$$\text{s.t.} \quad w_a' A' A w_a = 1, \qquad w_s' S' S w_s = 1.$$

The objective in Problem 3 is the dot product between projections of data points. By itself, the objective function is unbounded since we can scale the $w$ terms arbitrarily. Thus, we add the constraints to bound the length of the $w$ terms and ensure the result is proportional to a correlation score.

By analyzing the Lagrangian dual function of Problem 3, we find that it is equivalent to a pair of maximum eigenvalue problems,

$$S_{ss}^{-1} S_{sa} S_{aa}^{-1} S_{as} w_s = \lambda^2 w_s \tag{4}$$
$$S_{aa}^{-1} S_{as} S_{ss}^{-1} S_{sa} w_a = \lambda^2 w_a \tag{5}$$

where

$$\begin{bmatrix} S_{aa} & S_{as} \\ S_{sa} & S_{ss} \end{bmatrix} = \begin{bmatrix} A'A & A'S \\ S'A & S'S \end{bmatrix}$$

and $\lambda$ is the maximum of Problem 3. Note that the solution vector $w_s$ can be interpreted as a linear combination of words, learned from the music data, which are highly correlated with the audio representation. In the next section we modify Problem 3 so that a subset of words in our vocabulary is explicitly selected.

3.1 Sparse CCA

The solution vectors, $w_a$ and $w_s$, in Problem 3 can be considered dense since most of the elements of each vector will be non-zero. In many applications it may be of interest to limit the number of non-zero elements in the $w$ terms. This may aid in the interpretability of the result, particularly when the coordinate axes of a vector space have a direct meaning. For example, in bioinformatics experiments, the input space may contain thousands of coordinate axes corresponding to individual genes. We may wish to perform sparse analysis if we suspect that some phenomenon is dependent on a handful of these genes.

In this paper, we impose sparsity on the solution vector $w_s$, corresponding to the semantic space where each coordinate axis describes a word. Our goal is to find a subset of words in a vocabulary that are highly correlated to audio. We expect that these words may be more objective than others in the vocabulary since they are potentially characterized by correlations with the underlying audio signal, and thus, using these words may improve the performance of semantic music analysis systems.

Sparsity has been well studied in the fields of statistics and machine learning [22, 2, 14]. Imposing sparsity [...]

Top 3 words by semantic category

| Category | Agreement | Acoustic Correlation |
|---|---|---|
| overall | male lead vocals, drum set, female lead vocals | rapping, at a party, hip-hop/rap |
| emotion | not angry/agressive, not weird, not tender/soft | arousing/awakening, exciting/thrilling, sad |
| genre | hip-hop/rap, electronica, world | hip-hop/rap, electronica, funk |
| instrument | male lead vocals, drum set, female lead vocals | drum machine, samples, synthesizer |
| general | electric texture, not danceable, high energy | heavy beat, very danceable, synthesized texture |
| usage | driving, at a party, going to sleep | at a party, exercising, getting ready to go out |
| vocals | rapping, emotional, strong | rapping, strong, altered with effects |

Bottom 3 words by semantic category

| Category | Agreement | Acoustic Correlation |
|---|---|---|
| overall | at work, with the family, waking up | not weird, not arousing, not angry/agressive |
| emotion | not powerful/strong, not emotional, weird | not weird, not arousing, not angry/agressive |
| genre | contemporary blues, roots rock, alternative folk | classic rock, bebop, alternative folk |
| instrument | trombone, tamborine, organ | female lead vocals, drum set, acoustic guitar |
| general | changing energy level, minor key tonality, low song quality | constant energy level, changing energy level, not catchy |
| usage | at work, with the family, waking up | going to sleep, cleaning the house, at work |
| vocals | falsetto, spoken, monotone | high pitches, falsetto, emotional |

Table 1. Top and bottom 3 words by semantic category as calculated by agreement and acoustic correlation.

[...] a scale from one to three (e.g., "not happy", "neutral", "happy"); 15 song concepts describing the acoustic qualities of the song, artist and recording (e.g., tempo, energy, sound quality); and 15 usage terms from [5] (e.g., "I would listen to this song while driving, sleeping, etc.").

The 135 concepts are converted to the 174-word vocabulary by first mapping bi-polar concepts to multiple word labels (`Energy Level' maps to `low energy' and `high energy'). Then we prune all words that are represented in five or fewer songs to remove under-represented words. Lastly, we construct a real-valued 174-dimensional annotation vector by averaging the label frequencies of the individual annotators. Details of the summarization process can be found in [16]. In general, each element in the annotation vector contains a real-valued scalar indicating the strength of association.

The Web2131 data set is an annotated collection of 2131 songs and accompanying expert song reviews mined from a web-accessible music database (1) [15]. Exactly 363 songs from Web2131 overlap with the CAL500 songs. The vocabulary consists of 317 words that were hand picked from a list of the common words found in the corpus of song reviews. Common stop words are removed and the resulting words are preprocessed with a custom stemming algorithm. We represent a song review as a binary 317-dimensional annotation vector. The element of a vector is 1 if the corresponding word appears in the song review and 0 otherwise.

(1) AMG All Music Guide, www.allmusic.com

5 EXPERIMENTS

Both human agreement and acoustic correlation may be used to discover words that are musically meaningful and useful in the context of semantic music annotation and retrieval. In this section, we conduct three experiments to highlight potential uses.

| Human Agr. | Avg. rank (std. dev.) | Acoustic Cor. | Avg. rank (std. dev.) |
|---|---|---|---|
| emotion | 53.5 (26.7) | emotion | 127.9 (54.9) |
| instrument | 53.9 (39.5) | vocals | 146.2 (50.0) |
| vocals | 88.2 (40.0) | instrument | 154.5 (39.9) |
| genre | 118.6 (42.4) | genre | 156.7 (41.2) |
| usage | 152.3 (21.9) | usage | 162.5 (37.9) |

Table 2. Average rank of words in a semantic category when ranked by human agreement and acoustic correlation. Columns are sorted downward in increasing average rank. The average rank of the category and std. dev. are shown (lower is better). Note that the order of the categories closely matches across both columns.

5.1 Qualitative Analysis

Table 2 shows the average rank of words in a semantic category when words are ranked by human agreement and acoustic correlation. For human agreement, words are ranked by their agreement score. For acoustic correlation, words are ranked by how long they are kept by sparse CCA as the vocabulary size is reduced. This experiment was run on the CAL500 data set.

A good rank in the human agreement metric suggests that a word is less subjective. This is true by definition since a good human agreement score means that people used that word consistently to describe music. Not surprisingly, we found that more objective categories such as instrumentation are highly ranked on this list and subjective categories such as usage are ranked at the bottom.
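As an illustration of the human agreement statistic in Equations 1 and 2, the following is a minimal Python sketch. The input layout (a dictionary mapping each song to a list of per-annotator word sets), the function name `human_agreement`, and the toy data are assumptions made for this example; they are not the CAL500 format or the authors' implementation.

```python
from collections import defaultdict


def human_agreement(annotations):
    """Compute A_w (Equation 2) for every word that appears at least once.

    annotations: dict mapping a song name to a list of sets, one set of
    words per annotator (a hypothetical layout chosen for illustration).
    """
    totals = defaultdict(float)  # numerator of Eq. 2: sum over songs of A_{w,s}
    counts = defaultdict(int)    # denominator of Eq. 2: songs with A_{w,s} > 0
    for song, label_sets in annotations.items():
        n_annotators = len(label_sets)
        if n_annotators == 0:
            continue  # a song without annotations contributes nothing
        positives = defaultdict(int)
        for labels in label_sets:
            for word in labels:
                positives[word] += 1
        for word, k in positives.items():
            totals[word] += k / n_annotators  # A_{w,s} from Eq. 1
            counts[word] += 1                 # I[A_{w,s} > 0] = 1
    return {word: totals[word] / counts[word] for word in totals}


# Toy data: 3 of 4 annotators call the first song 'blues' (A = 0.75),
# and 1 of 2 annotators call a second, made-up song 'blues' (A = 0.5).
toy = {
    "heartbreak hotel": [{"blues"}, {"blues", "sad"}, {"blues"}, {"rock"}],
    "hypothetical song": [{"blues", "cowbell"}, {"cowbell"}],
}
scores = human_agreement(toy)
print(scores["blues"], scores["cowbell"])
```

In this toy example `blues` averages 0.75 and 0.5 over the two songs where it was used, while `cowbell` is applied by every annotator of the one song where it appears, which matches the expectation that instrumentation words score close to 1.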
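Section 3 states that Problem 3 is equivalent to the eigenvalue problems in Equations 4 and 5 without showing the intermediate steps. The following is a sketch of the standard Lagrangian argument, under the assumptions that $S_{aa}$ and $S_{ss}$ are invertible and that $\lambda \neq 0$. The multipliers $\lambda_a$ and $\lambda_s$ are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
% Lagrangian of Problem 3, using S_{as} = A'S, S_{aa} = A'A, S_{ss} = S'S
L &= w_a' S_{as} w_s
     - \tfrac{\lambda_a}{2}\,(w_a' S_{aa} w_a - 1)
     - \tfrac{\lambda_s}{2}\,(w_s' S_{ss} w_s - 1) \\
% Stationarity in w_a and in w_s
\frac{\partial L}{\partial w_a} = 0 &\;\Rightarrow\; S_{as} w_s = \lambda_a S_{aa} w_a \\
\frac{\partial L}{\partial w_s} = 0 &\;\Rightarrow\; S_{sa} w_a = \lambda_s S_{ss} w_s \\
% Premultiply by w_a' and w_s' and apply the two constraints
\lambda_a = w_a' S_{as} w_s &= w_s' S_{sa} w_a = \lambda_s =: \lambda \\
% Solve the first condition for w_a and substitute into the second
w_a = \tfrac{1}{\lambda} S_{aa}^{-1} S_{as} w_s
  &\;\Rightarrow\; S_{sa} S_{aa}^{-1} S_{as} w_s = \lambda^2 S_{ss} w_s \\
  &\;\Rightarrow\; S_{ss}^{-1} S_{sa} S_{aa}^{-1} S_{as} w_s = \lambda^2 w_s
\end{align*}
```

The last line is Equation 4. Exchanging the roles of $w_a$ and $w_s$ in the substitution gives Equation 5. Since $\lambda = w_a' S_{as} w_s$ is the value of the objective at the stationary point, the largest eigenvalue $\lambda^2$ corresponds to the maximum of Problem 3.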
