The Holy Grail of Sense Definition: Creating a Sense-Disambiguated Corpus from Scratch

Anna Rumshisky, Dept. of Computer Science, Brandeis University, Waltham, MA, arum@cs.brandeis.edu
Marc Verhagen, Dept. of Computer Science, Brandeis University, Waltham, MA, marc@cs.brandeis.edu
Jessica L. Moszkowicz, Dept. of Computer Science, Brandeis University, Waltham, MA, jlittman@cs.brandeis.edu

Abstract

This paper presents a methodology for creating a gold standard for sense definition using Amazon's Mechanical Turk service. We demonstrate how this method can be used to create in a single step, quickly and cheaply, a lexicon of sense inventories and the corresponding sense-annotated lexical sample. We show the results obtained by this method for a sample verb and discuss how it can be improved to produce an exhaustive lexical resource. We then describe how such a resource can be used to further other semantic annotation efforts, using as an example the Generative Lexicon Mark-up Language (GLML) effort.

1 Introduction

The problem of defining a robust procedure for sense definition has been the holy grail of both lexicographic work and the sense annotation work done within the computational community. In recent years, there have been a number of initiatives to create gold standards for sense-annotated data to be used for the training and testing of sense disambiguation and induction algorithms. Such efforts have often been impeded by difficulties in selecting or producing a satisfactory sense inventory. Methodologically, defining a set of standards for creating sense inventories has been an elusive task.

In the past year, Mechanical Turk, introduced by Amazon as "artificial artificial intelligence", has been used successfully to create annotated data for a number of tasks, including sense disambiguation (Snow et al., 2008) as well as for creating a robust evaluation of machine translation systems by reading comprehension (Callison-Burch, 2009). Typically, complex annotation is split into simpler steps. Each step is farmed out to the non-expert annotators employed via Mechanical Turk (henceforth, MTurkers) in the form of a HIT (Human Intelligence Task), defined as a task that is hard to perform automatically, yet very easy to do for people.

In this paper, we propose a method using Mechanical Turk to create sense inventories from scratch and the corresponding sense-annotated lexical sample for any polysemous word. As with the NLP tasks for which MTurk has been used previously, this annotation is quite inexpensive and can be done very quickly. We test this method on a polysemous verb of medium difficulty and compare the results to the groupings created by a professional lexicographer. We then describe how this method can be used to create sense groupings for other words in order to create a sense-enumerated lexicon. Finally, we describe one application in which such a resource would be invaluable, namely, the preprocessing step for the ongoing GLML annotation effort.

2 Problem of Sense Definition

In the past few years, a number of initiatives have been undertaken to create a standardized framework for the testing of word sense disambiguation (WSD) and word sense induction (WSI) systems, including the recent series of SENSEVAL competitions (Agirre et al., 2007; Mihalcea and Edmonds, 2004; Preiss and Yarowsky, 2001), and the shared semantic role labeling tasks at the CoNLL conference (Carreras and Marquez, 2005; Carreras and Marquez, 2004).

Training and testing such systems typically involves using a gold standard corpus in which each occurrence of the target word is marked with the appropriate sense from a given sense inventory. A number of such sense-tagged corpora have been developed over the past few years. Within the context of work done in the computational community, sense inventories used in such annotation are usually taken out of machine-readable dictionaries or lexical databases, such as WordNet (Fellbaum, 1998), Roget's thesaurus (Roget, 1962), Longman Dictionary of Contemporary English (LDOCE, 1978), Hector project, etc. In some cases inventories are (partially or fully) constructed or adapted from an existing resource in a pre-annotation stage, as in PropBank (Palmer et al., 2005) or OntoNotes (Hovy et al., 2006). The quality of the annotated corpora depends directly on the selected sense inventory, so, for example, SemCor (Landes et al., 1998), which uses WordNet synsets, inherits all the associated problems, including using senses that are too fine-grained and in many cases poorly distinguished. At the recent Senseval competitions (Mihalcea et al., 2004; Snyder and Palmer, 2004; Preiss and Yarowsky, 2001), the choice of a sense inventory also frequently presented problems, spurring the efforts to create coarser-grained sense inventories (Navigli, 2006; Hovy et al., 2006; Palmer et al., 2007; Snow et al., 2007). Inventories derived from WordNet by using small-scale corpus analysis and by automatic mapping to top entries in Oxford Dictionary of English were used in the most recent workshop on semantic evaluation, Semeval-2007 (Agirre et al., 2007).

PATTERN 1: [[Anything]] crush [[Physical Object = Hard | Stuff = Hard]]
Explanation: [[Anything]] damages or destroys [[Physical Object | Stuff = Hard]] by sudden and unexpected force
PATTERN 2: [[Physical Object]] crush [[Human]]
Explanation: [[Physical Object]] kills or injures [[Human]] by the pressure of irresistible force
PATTERN 3: [[Institution | Human = Political or Military Leader]] crush [[Activity = Revolt | Independent Action]]
Explanation: [[Institution | Human = Political or Military Leader]] uses force to bring to an end [[Activity = Revolt | Independent Action]] by [[Human Group]]
PATTERN 4: [[Institution | Human 1 = Military or Political Leader]] crush [[Human Group | Human 2 = LEXSET]]
Explanation: [[Institution | Human 1 = Military or Political Leader]] destroys or brings to an end the resistance of [[Human Group | Human 2 = Military or Political Leader]]
PATTERN 5: [[Event]] crush [[Human | Emotion]]
Explanation: [[Event]] causes [[Human]] to lose hope or other positive [[Emotion]] and to feel bad

Figure 1: CPA entry for crush

Establishing a set of senses available to a particular lexical item is a task that is notoriously difficult to formalize. This is especially true for polysemous verbs with their constellations of related meanings. In lexicography, "lumping and splitting" senses during dictionary construction, i.e. deciding when to describe a set of usages as a separate sense, is a well-known problem (Hanks and Pustejovsky, 2005; Kilgarriff, 1997; Apresjan, 1973). It is often resolved on an ad-hoc basis, resulting in numerous cases of "overlapping senses", i.e. instances when the same occurrence may fall under more than one sense category simultaneously. Within lexical semantics, there has also been little consensus on theoretical criteria for sense definition, while a lot of work has been devoted to questions such as when the context selects a distinct sense and when it merely modulates the meaning, what is the regular relationship between related senses, what compositional processes are involved in sense selection, and so on (Pustejovsky, 1995; Cruse, 1995; Apresjan, 1973).

Several current resource-oriented projects attempt to formalize the procedure of creating a sense inventory. FrameNet (Ruppenhofer et al., 2006) attempts to organize lexical information in terms of script-like semantic frames, with semantic and syntactic combinatorial possibilities specified for each frame-evoking lexical unit (word/sense pairing). Corpus Pattern Analysis (CPA) (Hanks and Pustejovsky, 2005) attempts to catalog norms of usage for individual words, specifying them in terms of context patterns.

A number of other projects use somewhat similar corpus analysis techniques. In PropBank (Palmer et al., 2005), verb senses were defined based on their use in the Wall Street Journal corpus and specified in terms of framesets which consist of a set of semantic roles for the arguments of a particular sense. In the OntoNotes project (Hovy et al., 2006), annotators use small-scale corpus analysis to create sense inventories derived by grouping together WordNet senses. The procedure is restricted to maintain 90% inter-annotator agreement.

Annotating each instance with a sense, and especially the creation of a sense inventory, is typically very labor-intensive and requires expert annotation. We propose an empirical solution to the problem of sense definition, building both a sense inventory and an annotated corpus at the same time, using minimal expert contribution. The task of sense definition is accomplished empirically using native speaker, but non-expert, annotators. In the next section, we report on an experiment using this procedure for sense definition, and evaluate the quality of the obtained results.

3 Solution to the Problem (Experiment)

3.1 Task Design

We offered MTurkers an annotation task designed to imitate the process of creating clusters of examples used in CPA (CPA, 2009). In CPA, the lexicographer sorts the set of instances for the given target word into groups using an application that allows him or her to mark each instance in a concordance with a sense number. For each group of examples, the lexicographer then records a pattern that captures the relevant semantic and syntactic context features that allow us to identify the corresponding sense of the target word (Hanks and Pustejovsky, 2005; Pustejovsky et al., 2004; Rumshisky et al., 2006).

For this experiment, we used the verb crush, which has 5 different sense-defining patterns assigned to it in the CPA verb lexicon, and in which some senses appear to be metaphorical extensions of the primary physical sense. We therefore view it as a verb of medium difficulty both for sense inventory creation and for annotation. The patterns from the CPA verb lexicon for crush are given in Figure 1. The CPA verb lexicon has 350 sentences from the BNC that contain the verb crush, sorted according to these patterns.

The task was designed as a sequence of annotation rounds, with each round creating a cluster corresponding to one sense. MTurkers were first given a set of sentences containing the target verb, and one sentence that is randomly selected from this set as a prototype sentence. They were then asked to identify, for each sentence, whether the target verb is used in the same way as in the prototype sentence. If the sense was unclear or it was impossible to tell, they were instructed to pick the "unclear" option. We took the example sentences from the CPA verb lexicon for crush. Figure 2 shows the first screen displayed to MTurkers for this HIT. Ten examples were presented on each screen. Each example was annotated by 5 MTurkers.

Figure 2: Task interface and instructions for the HIT presented to the non-expert annotators

After the first round of annotation was complete, the sentences that were judged as similar to the prototype sentence by the majority vote (3 or more out of 5 annotators) were set apart into a separate cluster corresponding to one sense, and excluded from the set used in further rounds. The procedure was repeated with the remaining set, i.e. a new prototype sentence was selected, and the remaining examples were presented to the annotators. See Figure 3 for an illustration.

Figure 3: Sense-defining iterations on a set of sentences S.

This procedure was repeated until all the remaining examples were classified as "unclear" by the majority vote, or no examples remained. Since some misclassifications are bound to occur, we stopped the process when the remaining set contained 7 examples, several of which were unclear, and others judged to be misclassifications by an expert.

Each round took approximately 30 minutes to an hour to complete, depending on the number of sentences in that round. Each set of 10 sentences took approximately 1 minute on average to complete, and the annotator received $0.03 USD as compensation. The total sum spent on this experiment did not exceed $10 USD.

3.2 Evaluation

3.2.1 Comparison against expert judgements

We evaluated the results of the experiment against the gold standard created by a professional lexicographer for the CPA verb lexicon. In the discussion below, we will use the term cluster to refer to the clusters created by non-experts as described above. Following the standard terminology in sense induction tasks, we will refer to the clusters created by the lexicographer as sense classes.

We used two measures to evaluate how well the clusters produced by non-experts matched the classes in the CPA verb lexicon: the F-score and Entropy. The F-score (Zhao et al., 2005; Agirre and Soroa, 2007) is a set-matching measure. Precision, recall, and their harmonic mean (van Rijsbergen's F-measure) are computed for each cluster/sense class pair. Each cluster is then matched to the class with which it achieves the highest F-measure (multiple clusters may therefore map to the same class). The F-score is computed as a weighted average of the F-measure values obtained for each cluster.

Entropy-related measures, on the other hand, evaluate the overall quality of a clustering solution with respect to the gold standard set of classes. Entropy of a clustering solution, as it has been used in the literature, evaluates how the sense classes are distributed within each derived cluster. It is computed as a weighted average of the entropy of the distribution of senses within each cluster:

$$\mathrm{Entropy}(C, S) = -\sum_{i} \frac{|c_i|}{n} \sum_{j} \frac{|c_i \cap s_j|}{|c_i|} \log \frac{|c_i \cap s_j|}{|c_i|} \qquad (1)$$

where $c_i \in C$ is a cluster from the clustering solution $C$, $s_j \in S$ is a sense from the sense assignment $S$, and $n$ is the total number of instances. We use the standard entropy definition (Cover and Thomas, 1991), so, unlike in the definition used in some of the literature (Zhao et al., 2005), the terms are not multiplied by the inverse of the log of the number of senses. The values obtained for the two measures are shown in the "initial" column of Table 1.

            initial   merged
F-score     65.8      93.0
Entropy     1.1       0.3

Table 1: F-score and entropy of non-expert clustering against expert classes. Second column shows evaluation against merged expert classes.

While these results may appear somewhat disappointing, it is important to recall that the CPA verb lexicon specifies classes corresponding to syntactic and semantic patterns, whereas the non-experts' judgements were effectively producing clusters corresponding to senses. Therefore, these results may partially reflect the fact that several CPA context patterns may represent a single sense, with patterns varying in syntactic structure and/or the encoding of semantic roles relative to the described event.

We investigated this possibility, first automatically, and then through manual examination. Table 2 shows the overlap between non-expert clusters and expert classes.

                          expert classes
                          1     2     3     4     5
non-expert clusters  1    0     2     120   65    9
                     2    83    45    0     3     0
                     3    0     0     2     1     10

Table 2: Overlap between non-expert clusters and expert classes.

These numbers clearly suggest that cluster 1 mapped into classes 3 and 4, cluster 2 mapped into classes 1 and 2, and cluster 3 mapped mostly into class 5. Indeed, here are the prototype sentences associated with each cluster:

C1 By appointing Majid as Interior Minister, President Saddam placed him in charge of crushing the southern rebellion.
C2 The lighter woods such as balsa can be crushed with the finger.
C3 This time the defeat of his hopes didn't crush him for more than a few days.

It is easy to see that the data above indeed reflect the appropriate mapping to the CPA verb lexicon patterns for crush, as shown in Figure 1. The second column of Table 1 shows the values for F-score and Entropy computed for the case when the corresponding classes (i.e. classes 1 and 2 and classes 3 and 4) are merged.

3.2.2 Inter-annotator agreement

We computed pairwise agreement for all participating MTurkers. For each individual sentence, we looked at all pairs of judgements given by the five annotators, and considered the total number of agreements and disagreements in such pairs. We computed Fleiss' kappa statistic for all the judgements collected from the group of participating MTurkers in the course of this experiment. The obtained value of kappa was κ = 57.9 (expressed as a percentage), with the actual agreement value 79.1%. The total number of instances judged was 516.

It is remarkable that these figures were obtained even though we performed no weeding of non-experts who performed poorly on the task. As suggested previously (Callison-Burch, 2009), it is fairly easy to implement the filtering out of MTurkers that do not perform well on the task. In particular, an annotator whose agreement with the other annotators is significantly below the average for the other annotators could be excluded from the majority voting described in Section 3.1.

The data in Table 3 should give some idea regarding the distribution of votes in majority voting.
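
The agreement statistics discussed in this section can be illustrated with a minimal Python sketch. It is not the script used in the experiment: the function names and the toy judgements are assumptions made for the example, and the code assumes that every sentence received the same number of judgements. It computes the pooled pairwise agreement, Fleiss' kappa, and the share of sentences decided by each winning-vote size.

```python
from collections import Counter
from itertools import combinations


def pairwise_agreement(ratings):
    """Share of annotator pairs that agree, pooled over all items."""
    agree = total = 0
    for labels in ratings:
        for a, b in combinations(labels, 2):
            agree += int(a == b)
            total += 1
    return agree / total


def fleiss_kappa(ratings):
    """Fleiss' kappa; every item must carry the same number of labels."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(labels) != n_raters for labels in ratings):
        raise ValueError("each item needs the same number of judgements")

    category_totals = Counter()
    item_agreement = []
    for labels in ratings:
        counts = Counter(labels)
        category_totals.update(counts)
        # Proportion of agreeing annotator pairs for this item.
        agreeing = sum(c * (c - 1) for c in counts.values())
        item_agreement.append(agreeing / (n_raters * (n_raters - 1)))

    observed = sum(item_agreement) / n_items
    # Chance agreement estimated from the overall label proportions.
    expected = sum((t / (n_items * n_raters)) ** 2
                   for t in category_totals.values())
    return (observed - expected) / (1 - expected)


def winning_vote_shares(ratings):
    """Map the size of the winning vote to the share of items it decided."""
    wins = Counter(max(Counter(labels).values()) for labels in ratings)
    return {votes: count / len(ratings) for votes, count in sorted(wins.items())}


# Toy judgements (invented for illustration): 4 sentences, 5 annotators each.
toy_ratings = [
    ["same", "same", "same", "same", "same"],
    ["same", "same", "same", "different", "different"],
    ["different", "different", "different", "different", "unclear"],
    ["same", "same", "same", "same", "different"],
]

if __name__ == "__main__":
    print(pairwise_agreement(toy_ratings))
    print(fleiss_kappa(toy_ratings))
    print(winning_vote_shares(toy_ratings))
```

With an equal number of judgements per sentence, the pooled pairwise agreement coincides with the observed-agreement term of Fleiss' kappa, so the two reported figures differ only by the correction for chance agreement.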

No. of votes   % of judged instances
3 votes        12.8%
4 votes        29.8%
5 votes        55.2%

Table 3: The number of winning votes in majority voting.

4 Sense-Annotated Lexicon

The methodology we used in the described experiment has potential for producing quickly and efficiently, "from the ground up", both the sense inventories and the annotated data for hundreds of polysemous words. Task sets for multiple polysemous words can be run in parallel, potentially reducing the overall time for such a seemingly monstrous effort to mere weeks.

However, a large-scale effort would require some serious planning and a more sophisticated workflow than the one used for this experiment. Mechanical Turk provides developer tools and an API that can help automate this process further. In particular, the following steps need to be automated: (i) construction of HITs in subsequent iterations, and (ii) management of the set of MTurkers (when to accept their results, what requirements to specify, and so forth). In the present experiment, we performed no quality checks on the ratings of each individual MTurker. Also, the first step was only partially automated, that is, after each set of HITs was completed, human intervention was required to run a set of scripts that produce and set up the next set of HITs using the remaining sentences.

In addition, some more conceptual issues need to be addressed:

The clarity of the sense distinctions. High inter-annotator agreement in the present experiment seems to suggest that crush has easily identifiable senses. It is possible, and even likely, that creating a consistent sense inventory would be much harder for other polysemous verbs, many of which are notorious for having convoluted constellations of inter-related and overlapping senses (see, for example, the discussion of drive in Rumshisky and Batiukova (2008)).

The optimal number of MTurkers. We chose to use five MTurkers per HIT based on observations in Snow et al. (2008), but we have no data supporting that five is optimal for this task. This relates to the previous issue since the optimal number can vary given the complexity of the task.

The quality of prototype sentences. We selected the prototype sentences completely randomly in the present experiment. However, it is obvious that if the sense of the target word is somewhat unclear in the prototype sentence, the quality of the associated cluster should fall drastically. This problem could be remedied by introducing an additional step, where another set of MTurkers would be asked to judge the clarity of a particular exemplar sentence. If the current prototype sentence is judged to be unclear by the majority of MTurkers, it would be removed from the data set, and another prototype sentence would be randomly selected and judged.

Finally, we need to contemplate the kind of resource that this approach generates. It seems clear that the resulting resource will contain more coarse-grained sense distinctions than those observed in CPA, as evidenced by the crush example. But how it will actually work out for a large lexicon is still an empirical question.

5 The GLML Annotation Effort

In this section, we discuss some of the implications of the above approach for an existing semantic annotation effort. More explicitly, we survey how sense clustering can underpin the annotation of type coercion and argument selection and provide an outline for how such an annotation effort could proceed.

5.1 Preliminaries

In Pustejovsky et al. (2009) and Pustejovsky and Rumshisky (2009), a procedure for annotating argument selection and coercion was laid out. The aim was to identify the nature of the compositional relation rather than merely annotating surface types of entities involved in argument selection. Consider the example below.

(a) Mary called yesterday.
(b) The Boston office called yesterday.

The distinction between (a) and (b) can be described by semantic typing of the agent, but not by sense tagging and role labeling as used by FrameNet (Ruppenhofer et al., 2006) and PropBank (Palmer et al., 2005). Pustejovsky et al. (2009) focus on two modes of composition: pure selection (the type a function requires is directly satisfied by the argument) and type coercion (the type a function requires is imposed on the argument by exploitation or introduction). They describe three tasks for annotating compositional mechanisms: verb-based annotation, qualia in modification structures, and type selection in modification of dot objects. All three tasks involve two stages: (i) a data preparation phase with selection of annotation domain and construction of sense inventories and (ii) the annotation phase.

For the purpose of this paper we focus on the first annotation task: choosing which selectional mechanism (pure selection or coercion) is used by the predicate over a particular argument. The data preparation phase for this task consists of:

(1) Selecting a set of highly coercive target verbs.
(2) Creating a sense inventory for each verb.
(3) Assigning type templates to each sense.

For example, the "refuse to grant" sense of deny in The authorities denied him the visa will be associated with the template [HUMAN deny HUMAN ENTITY]. The same sense of the verb in They denied shelter to refugees will be associated with the template [HUMAN deny ENTITY to HUMAN].

After this, the annotation phase proceeds in two steps. First, for each example, annotators select the sense of a verb. Then, the annotators specify whether, given the sense selected, the argument matches the type specified in the template. If the type does not match, we have an instance of type coercion and the annotator will be asked what the type of the argument is. For this purpose, a list of about twenty types is provided.

Two pressing issues arise. First, how should the sense inventory be created? This particular choice strongly influences the kind of coercions that are possible and it would be wise to avoid theoretical bias as much as possible. Pustejovsky et al. (2009) chose to use a lexicographically oriented sense inventory provided by the CPA project, but the method described here suggests another route. While the inventory provided by CPA is certainly of very high quality, its use for an annotation effort like GLML is limited because the inventory itself is limited by the need for expert lexicographers to perform a very labor-intensive analysis.

The second problem involves the shallow type system. Notice that this 'type system' informs both the template creation and the selection of actual types of the argument. It is unclear how to pick these types and, again, the choice of types defines the kinds of coercions that are distinguished.

5.2 Adapting GLML Annotation

We now provide a few adaptations of the procedure outlined in the previous subsection. We believe that these changes provide a more bottom-up procedure in distinguishing pure selection and type coercion in argument selection.

In Section 3, we described a fast bottom-up procedure to create sense inventories. Consider that after this procedure we have the following artifacts: (i) a set of senses, represented as clusters of examples, and (ii) the resulting clusters of sentences that illustrate the usage of each sense.

We can now put the resulting sense inventory and the associated examples to work, seeing contributions to both the preparation and annotation phase. Most of the work occurs in the preparation phase, in defining the templates and the shallow list of types. Firstly, the clusters define the sense inventory called for in the data preparation phase of the argument selection task. The clusters also trivially provide for the sense selection, the first step of the annotation phase. Secondly, the exemplars can guide a trained linguist in creating the templates. For each sense, the linguist/lexicographer needs to distinguish between instances of pure selection and instances of type coercion. For each set, a succinct way to describe the type of the argument needs to be defined.

The second item above has a desirable implication for the annotation process. Note that the guidelines can focus on each target verb separately because the set of available types for selection and coercion is defined for each target verb individually. In addition, the set of exemplars provides illustrations for each type. In general, the semantic types can be considered shorthands for sets of easily available data that the annotator can use.

Overall, we have taken the specialist out of the construction of the sense inventory. However, it is still up to the expert to analyze and process the clusters defined by MTurkers. For each cluster of examples, one or more type templates need to be created by the expert. In addition, the expert needs to analyze the types involved in type coercion and add these types to the list of possible type coercions for the target verb.

6 Conclusion

In this paper, we have presented a method for creating sense inventories and the associated sense-tagged corpora from the ground up, without the help of expert annotators. Having a lexicon of sense inventories built in this way would alleviate some of the burden on the expert in many NLP tasks. For example, in CPA, splitting and lumping can be done by the non-expert annotators. The task of the expert is then just to formulate the syntactic and semantic patterns that are characteristic for each sense.

We discussed one application for the resource that can be produced by this method, namely, the GLML annotation effort, but the ability to quickly and inexpensively generate sense clusters without the involvement of experts would assist in a myriad of other projects that involve word sense disambiguation or induction.

References

E. Agirre and A. Soroa. 2007. Semeval-2007 task 02: Evaluating word sense induction and discrimination systems. In Proceedings of SemEval-2007, pages 7–12, Prague, Czech Republic, June. Association for Computational Linguistics.

E. Agirre, L. Marquez, and R. Wicentowski, editors. 2007. Proceedings of SemEval-2007. Association for Computational Linguistics, Prague, Czech Republic, June.

Ju. Apresjan. 1973. Regular polysemy. Linguistics, 142(5):5–32.

Chris Callison-Burch. 2009. Fast, cheap, and creative: Evaluating translation quality using Amazon's Mechanical Turk. In Proceedings of EMNLP 2009.

X. Carreras and L. Marquez. 2004. Introduction to the CoNLL-2004 shared task: Semantic role labeling. In Proceedings of CoNLL-2004, pages 89–97.

X. Carreras and L. Marquez. 2005. Introduction to the CoNLL-2005 shared task: Semantic role labeling. In Proceedings of CoNLL-2005.

T. Cover and J. Thomas. 1991. Elements of Information Theory. John Wiley & Sons.

CPA. 2009. Corpus Pattern Analysis, CPA Project Website, http://nlp.fi.muni.cz/projekty/cpa/. Masaryk University, Brno, Czech Republic.

D. A. Cruse. 1995. Polysemy and related phenomena from a cognitive linguistic viewpoint. In Patrick St. Dizier and Evelyne Viegas, editors, Computational Lexical Semantics, pages 33–49. Cambridge University Press, Cambridge, England.

C. Fellbaum, editor. 1998. Wordnet: an electronic lexical database. MIT Press.

P. Hanks and J. Pustejovsky. 2005. A pattern dictionary for natural language processing. Revue Française de Linguistique Appliquée.

E. Hovy, M. Marcus, M. Palmer, L. Ramshaw, and R. Weischedel. 2006. OntoNotes: The 90% solution. In Proceedings of HLT-NAACL, Companion Volume: Short Papers, pages 57–60, New York City, USA, June. Association for Computational Linguistics.

A. Kilgarriff. 1997. I don't believe in word senses. Computers and the Humanities, 31:91–113.

S. Landes, C. Leacock, and R. I. Tengi. 1998. Building semantic concordances. In C. Fellbaum, editor, Wordnet: an electronic lexical database. MIT Press, Cambridge (Mass.).

LDOCE. 1978. Longman Dictionary of Contemporary English. Longman Group Ltd, Harlow, England.

R. Mihalcea and P. Edmonds, editors. 2004. Senseval-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, Barcelona, Spain, July. Association for Computational Linguistics.

R. Mihalcea, T. Chklovski, and A. Kilgarriff. 2004. The Senseval-3 English lexical sample task. In Rada Mihalcea and Phil Edmonds, editors, Senseval-3, pages 25–28, Barcelona, Spain, July. Association for Computational Linguistics.

R. Navigli. 2006. Meaningful clustering of senses helps boost word sense disambiguation performance. In Proceedings of COLING-ACL 2006, pages 105–112, Sydney, Australia, July. Association for Computational Linguistics.

M. Palmer, D. Gildea, and P. Kingsbury. 2005. The Proposition Bank: An annotated corpus of semantic roles. Computational Linguistics, 31(1):71–106.

M. Palmer, H. Dang, and C. Fellbaum. 2007. Making fine-grained and coarse-grained sense distinctions, both manually and automatically. Journal of Natural Language Engineering.

J. Preiss and D. Yarowsky, editors. 2001. Proceedings of the Second Int. Workshop on Evaluating WSD Systems (Senseval 2). ACL2002/EACL2001.

J. Pustejovsky and A. Rumshisky. 2009. SemEval-2010 Task 7: Argument Selection and Coercion. NAACL HLT 2009 Workshop on Semantic Evaluations: Recent Achievements and Future Directions.

J. Pustejovsky, P. Hanks, and A. Rumshisky. 2004. Automated Induction of Sense in Context. In COLING 2004, Geneva, Switzerland, pages 924–931.

J. Pustejovsky, A. Rumshisky, J. Moszkowicz, and O. Batiukova. 2009. GLML: Annotating argument selection and coercion. IWCS-8.

J. Pustejovsky. 1995. Generative Lexicon. Cambridge (Mass.): MIT Press.

P. M. Roget, editor. 1962. Roget's Thesaurus. Thomas Cromwell, New York, 3rd edition.

A. Rumshisky and O. Batiukova. 2008. Polysemy in verbs: systematic relations between senses and their effect on annotation. In HJCL-2008, Manchester, England. Submitted.

A. Rumshisky, P. Hanks, C. Havasi, and J. Pustejovsky. 2006. Constructing a corpus-based ontology using model bias. In FLAIRS 2006, Melbourne Beach, Florida, USA.

J. Ruppenhofer, M. Ellsworth, M. Petruck, C. Johnson, and J. Scheffczyk. 2006. FrameNet II: Extended Theory and Practice.

R. Snow, S. Prakash, D. Jurafsky, and A. Ng. 2007. Learning to merge word senses. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pages 1005–1014. Association for Computational Linguistics.

R. Snow, B. O'Connor, D. Jurafsky, and A. Y. Ng. 2008. Cheap and fast but is it good? Evaluating non-expert annotations for natural language tasks. In Proceedings of EMNLP 2008.

B. Snyder and M. Palmer. 2004. The English all-words task. In Rada Mihalcea and Phil Edmonds, editors, Senseval-3, pages 41–43, Barcelona, Spain, July. Association for Computational Linguistics.

Ying Zhao, George Karypis, and Usama M. Fayyad. 2005. Hierarchical clustering algorithms for document datasets. 10:141–168.