
... constraints and attribute constraints. The third difference is in the level of abstraction of transfer rule candidates: in Meyers et al. (1998), the source and target patterns of each transfer rule are fully lexicalized (except possibly the terminal nodes), while in our approach the nodes of transfer rules do not have to be lexicalized.

Section 2 describes our approach to transfer rule induction and its integration with data preparation and evaluation. Section 3 describes the data preparation process and resulting data. Section 4 describes the transfer induction process in detail. Section 5 describes the results of our initial evaluation. Finally, Section 6 concludes with a discussion of future directions.

2 Overall Approach

In its most general form, our approach to transfer rule induction includes three different processes: data preparation, transfer rule induction, and evaluation. An overview of each process is provided below; further details are provided in subsequent sections.

The data preparation process creates the following resources from the bi-texts:

- A training set and a test set of source and target parses for the bi-texts, post-processed into a syntactic dependency representation.
- A baseline transfer dictionary, which may include (depending upon availability) lexical transfer rules extracted from the bi-texts using statistical methods, lexical transfer rules from existing bilingual dictionaries, and/or handcrafted lexico-structural transfer rules.

The transfer induction process induces lexico-structural transfer rules from the training set of corresponding source and target parses that, when added to the baseline transfer dictionary, produce transferred parses that are closer to the corresponding target parses. The transfer induction process has the following steps:

- Nodes of the corresponding source and target parses are aligned using the baseline transfer dictionary and some heuristics based on the similarity of part-of-speech and syntactic context.
- Transfer rule candidates are generated based on the sub-patterns that contain the corresponding aligned nodes in the source and target parses.
- The transfer rule candidates are ordered based on their likelihood ratios.
- The transfer rule candidates are filtered, one at a time, in the order of the likelihood ratios, by removing those rule candidates that do not produce an overall improvement in the accuracy of the transferred parses.

The evaluation process has the following steps:

- Both the baseline transfer dictionary and the induced transfer dictionary (i.e., the baseline transfer dictionary augmented with the induced transfer rules) are applied to the test set in order to produce two sets of transferred parses, the baseline set and the (hopefully) improved induced set. For each set, the differences between the transferred parses and target parses are measured, and the improvement in tree accuracy is calculated.
- After performing syntactic realization on the baseline set and the induced set of transferred parses, the differences between the resulting translated strings and the target strings are measured, and the improvement in string accuracy is calculated.
- For a subset of the translated strings, human judgments of accuracy and grammaticality are gathered, and the correlations between the manual and automatic scores are calculated, in order to assess the meaningfulness of the automatic measures.

3 Data Preparation

3.1 Parsing the Bi-texts

In our experiments to date, we have used a corpus consisting of a Korean dialog of 4183 sentences and their English human translations. We ran off-the-shelf parsers on each half of the corpus, namely the Korean parser developed by Yoon et al. (1997) and the English parser developed by Collins (1997). Neither parser was trained on our corpus.
@KOREAN: {po} [class=vbma] (s1 $X [ppca={reul}])
@ENGLISH: look [class=verb] (attr at [class=preposition] (ii $X))
@-2xLOG_LIKELIHOOD: 12.77

Figure 2: Transfer rule for English lexicalization and preposition insertion

@KOREAN: $X [class=vbma ente={ra}]
@ENGLISH: $X [class=verb mood=imp]
@-2xLOG_LIKELIHOOD: 33.37

Figure 3: Transfer rule for imperative forms

... dictionary is created from lexical transfer entries extracted from the bi-texts using statistical methods. To simulate this scenario, we created our baseline transfer dictionary by taking the lexico-syntactic transfer dictionary developed by Han et al. (2000) for this corpus and removing the (more general) rules that were not fully lexicalized. Starting with this purely lexical baseline transfer dictionary enabled us to examine whether these more general rules could be discovered through induction.

4 Transfer Rule Induction

The induced lexico-structural transfer rules are represented in a formalism similar to the one described in Nasr et al. (1997), and extended to also include log likelihood ratios. Figures 2 and 3 illustrate two sample entries that can be used to transfer a Korean syntactic representation for ci-to-reul po-ra to an English syntactic representation for look at the map. The first rule lexicalizes the English predicate and inserts the corresponding preposition, while the second rule inserts the English imperative attribute. This formalism uses notation similar to the syntactic dependency notation shown in Figure 1, augmented with variable arguments prefixed with $ characters.

4.1 Aligning the Parse Nodes

To align the nodes in the source and target parse trees, we devised a new dynamic programming alignment algorithm that performs a top-down, bidirectional beam search for the least cost mapping between these nodes. The algorithm is parameterized by the costs of (1) aligning two nodes whose lexemes are not found in the baseline transfer dictionary; (2) aligning two nodes with differing parts of speech; (3) deleting or inserting a node in the source or target tree; and (4) aligning two nodes whose relative locations differ.

To determine an appropriate part of speech cost measure, we first extracted a small set of parse pairs that could be reliably aligned using lexical matching alone, and then based the cost measure on the co-occurrence counts of the observed parts of speech pairings. The remaining costs were set by hand.

As a result of the alignment process, alignment id attributes (aid) are added to the nodes of the parse pairs. Some nodes may be in alignment with no other node, such as English prepositions not found in the Korean DSyntS.

4.2 Generating Rule Candidates

Candidate transfer rules are generated using three data sources:

- the training set of aligned source and target parses resulting from the alignment process;
- a set of alignment constraints which identify the subtrees of interest in the aligned source and target parses (Section 4.2.1);
- a set of attribute constraints which determine what parts of the aligned subtrees to include in the transfer rule candidates' source and target patterns (Section 4.2.2).

The alignment and attribute constraints are necessary to keep the set of candidate transfer rules manageable in size.

4.2.1 Alignment constraints

Figure 4 shows an example alignment constraint. This constraint, which matches the structural patterns of the transfer rule illustrated in Figure 2, uses the aid alignment attribute to indicate that ...

\[
\log \lambda = \log L(C_{12}, C_1, p) + \log L(C_2 - C_{12}, N - C_1, p) - \log L(C_{12}, C_1, p_1) - \log L(C_2 - C_{12}, N - C_1, p_2)
\]

where, not counting attributes aid, for a transfer rule candidate C:

C_1 = number of source parses containing at least one occurrence of C's source pattern
C_2 = number of target parses containing at least one occurrence of C's target pattern
C_{12} = number of source and target parse pairs containing at least one co-occurrence of C's source pattern and C's target pattern satisfying the alignment constraints
N = number of source and target parse pairs

\[
p = C_2 / N, \qquad p_1 = C_{12} / C_1, \qquad p_2 = (C_2 - C_{12}) / (N - C_1), \qquad L(k, n, x) = x^k (1 - x)^{n - k}
\]

Figure 7: Log likelihood ratios for transfer rule candidates

... each lexeme attribute and for each dependency relationship. In our initial experiments, this simple heuristic has been satisfactory.

4.4 Filtering Rule Candidates

Once the candidate transfer rules have been ordered, error-driven filtering is used to select those that yield improvements over the baseline transfer dictionary. The algorithm works as follows. First, in the initialization step, the set of accepted transfer rules is set to just those appearing in the baseline transfer dictionary, and the current error rate is established by applying these transfer rules to all the source structures and calculating the overall difference between the resulting transferred structures and the target parses. Then, in a single pass through the ordered list of candidates, each transfer rule candidate is tested to see if it reduces the error rate. During each iteration, the candidate transfer rule is provisionally added to the current set of accepted rules and the updated set is applied to all the source structures. If the overall difference between the transferred structures and the target parses is lower than the current error rate, then the candidate is accepted and the current error rate is updated; otherwise, the candidate is rejected and removed from the current set.

@KOREAN: {po} [class=vbma ente={ra}] (s1 $X [ppca={reul}])
@ENGLISH: look [class=verb mood=imp] (attr at [class=preposition] (ii $X))
@-2xLOG_LIKELIHOOD: 11.40

Figure 8: Transfer rule for English imperative with lexicalization and preposition insertion

4.5 Discussion of Induced Rules

Experimentation with the training set of 882 parse pairs described in Section 3.1 produced 12467 source and target sub-tree pairs using the alignment constraints, from which 20569 transfer rule candidates were generated and 7565 were accepted after filtering. We expect that the number of accepted rules per parse pair will decrease with larger training sets, though this remains to be verified.

The rule illustrated in Figure 3 was accepted as the 65th best transfer rule with a log likelihood ratio of 33.37, and the rule illustrated in Figure 2 was accepted as the 189th best transfer rule candidate with a log likelihood ratio of 12.77. An example of a candidate transfer rule that was not accepted is the one that combines the features of the two rules mentioned above, illustrated in Figure 8. This transfer rule candidate had a lower log likelihood ratio of 11.40; consequently, it is only considered after the two rules mentioned above, and since it provides no further improvement upon these two rules, it is filtered out.

In an informal inspection of the top 100 accepted transfer rules, we found that most of them appear to be fairly general rules that would normally be found in a general syntactic-based transfer dictionary. In looking at the remaining rules, we found that the rules tended to become increasingly corpus-specific.
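The ordering by log likelihood ratio (Figure 7) and the single-pass, error-driven filtering of Section 4.4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `error_rate` callable, and the toy error measure (a count of uncovered features) are all hypothetical stand-ins, and only the three scores 33.37, 12.77 and 11.40 are taken from the text.

```python
import math
from typing import Callable, FrozenSet, List, Sequence, Tuple

Rule = FrozenSet[str]  # hypothetical stand-in for a real transfer rule


def log_l(k: int, n: int, x: float) -> float:
    """log L(k, n, x) = k*log(x) + (n - k)*log(1 - x), treating 0*log(0) as 0."""
    total = 0.0
    if k > 0:
        total += k * math.log(x) if x > 0.0 else -math.inf
    if n - k > 0:
        total += (n - k) * math.log(1.0 - x) if x < 1.0 else -math.inf
    return total


def minus_two_log_lambda(c1: int, c2: int, c12: int, n: int) -> float:
    """-2 log(lambda) for one candidate, from the counts defined in Figure 7."""
    p = c2 / n
    p1 = c12 / c1 if c1 > 0 else 0.0
    p2 = (c2 - c12) / (n - c1) if n > c1 else 0.0
    log_lambda = (log_l(c12, c1, p) + log_l(c2 - c12, n - c1, p)
                  - log_l(c12, c1, p1) - log_l(c2 - c12, n - c1, p2))
    return -2.0 * log_lambda


def filter_candidates(
    baseline: Sequence[Rule],
    scored: Sequence[Tuple[float, Rule]],
    error_rate: Callable[[Sequence[Rule]], float],
) -> Tuple[List[Rule], float]:
    """Single pass over candidates in decreasing score order (Section 4.4)."""
    accepted = list(baseline)
    current_error = error_rate(accepted)
    for _, rule in sorted(scored, key=lambda pair: pair[0], reverse=True):
        trial = accepted + [rule]          # provisionally add the candidate
        trial_error = error_rate(trial)
        if trial_error < current_error:    # keep it only if the error drops
            accepted, current_error = trial, trial_error
    return accepted, current_error


# Toy error measure (an assumption): number of needed features not covered.
NEEDED = frozenset({"lexicalize_look_at", "imperative"})


def toy_error(rules: Sequence[Rule]) -> float:
    covered = frozenset().union(*rules)
    return float(len(NEEDED - covered))


fig2 = frozenset({"lexicalize_look_at"})                 # Figure 2 rule
fig3 = frozenset({"imperative"})                         # Figure 3 rule
fig8 = frozenset({"lexicalize_look_at", "imperative"})   # Figure 8 rule

accepted, final_error = filter_candidates(
    [], [(12.77, fig2), (33.37, fig3), (11.40, fig8)], toy_error
)
```

With this toy measure, the Figure 3 and Figure 2 stand-ins are accepted in that order, and the combined Figure 8 stand-in is rejected because it brings no further reduction in error, which mirrors the behaviour described in Section 4.5. A real error measure would apply the rule set to all source structures and compare the transferred structures with the target parses.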
... shown that the induced syntactic transfer rules from Korean to English lead to a modest increase in the accuracy of transferred parses when compared to the target parses. In future work, we hope to demonstrate that a combination of considering a larger set of transfer rule candidates, refining our treatment of rule conflicts, and making use of more training data will lead to further improvements in tree accuracy, and, following syntactic realization, will yield significant improvements in end-to-end results.

Acknowledgements

We thank Richard Kittredge for helpful discussion, Daryl McCullough and Ted Caldwell for their help with evaluation, and Chung-hye Han, Martha Palmer, Joseph Rosenzweig and Fei Xia for their assistance with the handcrafted Korean-English transfer dictionary and the conversion of phrase structure parses to syntactic dependency representations. This work has been partially supported by DARPA TIDES contract no. N66001-00-C-8009.

References

Michael Collins. 1997. Three generative, lexicalised models for statistical parsing. In Proceedings of the 35th Meeting of the Association for Computational Linguistics (ACL'97), Madrid, Spain.

Bonnie Dorr. 1994. Machine translation divergences: A formal description and proposed solution. Computational Linguistics, 20(4):597–635.

C. Han, B. Lavoie, M. Palmer, O. Rambow, R. Kittredge, T. Korelsky, N. Kim, and M. Kim. 2000. Handling structural divergences and recovering dropped arguments in a Korean-English machine translation system. In Proceedings of the Fourth Conference on Machine Translation in the Americas (AMTA'00), Misión del Sol, Mexico.

Benoit Lavoie and Owen Rambow. 1997. RealPro – a fast, portable sentence realizer. In Proceedings of the Conference on Applied Natural Language Processing (ANLP'97), Washington, DC.

C. D. Manning and H. Schütze. 1999. Foundations of Statistical Natural Language Processing. MIT Press.

H. Maruyama and H. Watanabe. 1992. Tree cover search algorithm for example-based translation. In Proceedings of the Fourth International Conference on Theoretical and Methodological Issues in Machine Translation (TMI'92), pages 173–184.

Y. Matsumoto, H. Hishimoto, and T. Utsuro. 1993. Structural matching of parallel texts. In Proceedings of the 31st Annual Meetings of the Association for Computational Linguistics (ACL'93), pages 23–30.

Igor Mel'cuk. 1988. Dependency Syntax. State University of New York Press, Albany, NY.

A. Meyers, R. Yangarber, R. Grishman, C. Macleod, and A. Moreno-Sandoval. 1998. Deriving transfer rules from dominance-preserving alignments. In Proceedings of COLING-ACL'98, pages 843–847.

Makoto Nagao. 1984. A framework of a mechanical translation between Japanese and English by analogy principle. In A. Elithorn and R. Banerji, editors, Artificial and Human Intelligence. NATO Publications.

Alexis Nasr, Owen Rambow, Martha Palmer, and Joseph Rosenzweig. 1997. Enriching lexical transfer with cross-linguistic semantic features. In Proceedings of the Interlingua Workshop at the MT Summit, San Diego, California.

S. Sato and M. Nagao. 1990. Toward memory-based translation. In Proceedings of the 13th International Conference on Computational Linguistics (COLING'90), pages 247–252.

Fei Xia and Martha Palmer. 2001. Converting dependency structures to phrase structures. In Notes of the First Human Language Technology Conference, San Diego, California.

J. Yoon, S. Kim, and M. Song. 1997. New parsing method using global association table. In Proceedings of the 5th International Workshop on Parsing Technology.
