Towards Annotating and Extracting Textual Legal Case Elements

Adam Wyner
University of Leeds

Abstract. In common law contexts, legal cases are decided with respect to precedents rather than legislation as in civil law contexts. Legal professionals must find, analyse, and reason with and about cases drawn from a set of cases (a case base). A range of particular textual elements of a case may be relevant to query and extract. Commercial providers of legal information allow legal professionals to search a case base by keywords and metadata. However, the case base and the search tools are proprietary, of limited, non-extensible functionality, and are restricted access. Moreover, no provider applies natural language processing techniques to the cases for text analysis, XML annotation, or information acquisition. In this paper, we discuss an initial experiment in developing and applying natural language processing tools to cases to produce annotated text which can then support information extraction.

Keywords: Text Analysis, Legal Cases, Ontologies

1. Introduction

In common law contexts, judges and juries decide a legal case by following previously decided cases (precedents) rather than legislation as in civil law contexts.[1] The set of such cases is the legal case base. Legal professionals must find, analyse, and reason with and about cases drawn from the case base in the course of arguing for a decision in a current undecided case. A range of elements of cases may be relevant to query and extract such as the citation index, participants, locale, jurisdiction, representatives, judge, prototypical fact patterns (factors), applicable law, and others. Commercial providers of legal information allow legal professionals to search the case base by keywords and metadata. However, the case base and search tools are proprietary, of limited, non-extensible functionality, and are restricted access. Moreover, no provider works with Semantic Web functionalities such as ontologies or rich XML annotations, nor are natural language processing techniques applied to the cases to support analysis to acquire information.

[1] Correspondence to Adam Wyner, adam@wyner.info.

loait2010.tex; 25/06/2010; 23:41; p.9

Text annotation of unstructured linguistic information is a significant, difficult aspect of the “knowledge bottleneck” in legal information processing. In this paper, we apply natural language processing tools to textual elements in cases, which are unstructured text, to produce annotated text, from which information can be extracted, thus contributing to overcoming the bottleneck. The extracted information can then be submitted to further processes. Where the annotations are associated with an ontology (Wyner and Hoekstra, 2010) along with an associated case based reasoner (Wyner and Bench-Capon, 2007), then we make progress towards a textual case based reasoning system which enables processing from natural language case decisions in the case base to generated decisions in novel cases (Weber et al, 2005). However, this paper focuses on the initial development in annotating cases with respect to case elements.

The paper is a feasibility study for future research on information extraction of case elements.[2] In this paper, we focus on case elements rather than case factors (see (Wyner and Peters, 2010)).

In Section 2, we discuss background and materials. In Section 3, we present the methodology, which uses the General Architecture for Text Engineering (GATE) system, sample components of the system, sample results, and a workflow for further refinement.[3] Finally, in Section 4, we review the paper and outline future work to evaluate and improve our results.

[2] Contact the author for materials.
[3] For GATE, see http://gate.ac.uk/.

2. Background and materials

Legal case based reasoning with factors has been a topic of central concern in artificial intelligence and law. For our purposes, there are two main branches of research. One branch, knowledge representation and reasoning systems, requires a knowledge base that is constructed by manual analysis (cf. (Hafner, 1987), (Ashley, 1990), (Rissland et al, 1996), (Aleven, 1997), (Wyner and Bench-Capon, 2007)). However, this branch of research does not address the knowledge bottleneck, which is the extraction of information to compose the knowledge base.

The other branch, information extraction, addresses the bottleneck using natural language processing techniques which identify informative components of the text and annotate them with XML. The annotated information can be extracted with XQuery. Thus, the content of the documents can be identified from its source linguistic realisation. There is a range of areas where information extraction of legal texts has been carried out: ontology construction ((Lame, 2004) and (Peters, 2009)), text summarisation ((Moens et al, 1997) and (Hachey and Grover, 2006)), extraction of precedent links (Jackson et al, 2003), and factor analysis ((Ashley and Brüninghaus, 2009) and (Wyner and Peters, 2010)). We focus on information extraction of case elements, which contributes to this previous work.

The branches are related since the extracted information can be represented in some knowledge base and reasoned with. For case based reasoning with factors as in (Aleven, 1997), we extract factors; for reasoning about precedential relations among cases (overturned, affirmed, and so on), we extract citation indices and relational terms. As legal cases are not just about the law per se, but about some content area (e.g. intellectual property, family law, etc) and human properties and artifacts (e.g. instruments and property), one might suppose that all of human knowledge and experience is potentially under the scope of the law and so potentially to be extracted, put in a knowledge base, and reasoned with (cf. works on legal knowledge representation (Peters et al, 2007), (Schweighofer and Liebwald, 2007), (Hoekstra et al, 2009), and (Gangemi et al, 2005)). Yet, (Wyner and Hoekstra, 2010) argue that the focus should be on information which has a legal definition or function, leaving aside high level, non-legal domain information (e.g. events/processes, causation, time, and so on).

In this light and in the current paper, we are interested in case information that would be relevant to searching for or extracting information from cases. For reasons of space, we only give a sample of the information we searched for and annotated:

- Case citation, cases cited, precedential relationships.
- Names of parties, judges, attorneys, court sort....
- Roles of parties, meaning plaintiff or defendant, and attorneys, meaning the side they represent.
- Final decision.

With respect to these features, one would want to make a range of queries (using some appropriate query language) such as:

- In what cases has company X been a defendant?
- In what cases has attorney Y worked for company X, where X was a defendant?

As we initially based our work on information extraction from California Criminal Courts in (Bransford-Koons, 2005), developing and modifying lists and rules, we worked with a legal case base of cases from the United States. (Bransford-Koons, 2005) reports working with 47 criminal cases drawn from the California Supreme Court and State Court of Appeals. However, only two cases are given as samples, and these are the only ones to which we have access; for this feasibility study, we give examples from these cases. (Bransford-Koons, 2005) uses GATE (described below) and OPENCYC, which is a repository of common sense rules. We do not consider OPENCYC here. To show the feasibility of the approach, we provide preliminary results on this very small corpus of People v. Coleman, 117 Cal.App.2d 565 and In re James M., 9 Cal.3d 517.

3. Methodology using GATE

We use the GATE framework (Cunningham et al, 2002). GATE Developer is an open source desktop application written in Java for linguists and text engineers. Using a GUI, it allows a variety of text analysis tools to be cascaded and applied to a set of documents. For our purposes, we have applied natural language processing modules such as Tokeniser, Gazetteer, and Java Annotation Patterns Engine (JAPE), each module providing input to the next. The last two modules are explained further below. In addition to these functionalities, one can also use entity extraction and syntactic parsing components.

For a particular domain, it is important to provide gazetteer lists and JAPE rules. In general, there is a cascade from lower level information in the parts of speech and gazetteer lists to higher level information where lower level information is used to compose more complex units of information. As a working strategy, the lists capture simple, unsystematic patterns, leaving the JAPE rules to capture systematic, complex patterns.

Figure 1 represents the workflow (derived from the workflow diagram in (Wyner and Peters, 2010)), where an initial specification guides the definition of gazetteer lists and JAPE rules. The process cascade is applied to the corpus, which results in an annotated text. Examining the results, one determines what to modify in the gazetteer lists and JAPE rules until one achieves desired annotations. Thus, we have an iterative process which supports experimental refinement of the lists and rules that induce annotation.

[Figure 1. A Workflow Diagram]

3.1. GAZETTEER LISTS

A gazetteer is a list of lists. Each list comprises strings that are associated with a central concept or with some elements of the text. The lists annotate the words and strings with the MajorType of the list; they provide the bottom level of annotation on which higher level annotations are constructed using JAPE rules. The gazetteer lists discussed here are manually composed.

We initially worked with gazetteer lists from (Bransford-Koons, 2005). However, while the lists may “work”, they are clearly in need of reconstruction and extension, which we discuss. One observation is that the lists are defined for US case law and particularly the California district courts. Thus, we cannot simply apply the lists to different jurisdictions, e.g. the United Kingdom; the lists and rules must be localised to different contexts. For instance, the term Fifth Appellate District or Municipal Court of .... may not occur in the UK. Similar issues arise with case citations, roles of participants, causes of action, and so on. More technically, lists have alternative graphical (capital or lower case) or morphological forms, which would be better addressed using GATE's Flexible Gazetteer, which homogenises graphical forms and lemmatises words (providing a “root” form). As a general strategy, it is best to create lists with “unique” word forms or fixed phrases rather than those which may otherwise be constructed by JAPE rules.

Taking these considerations into account, we created lists for particularly legal terminology and used the Flexible Gazetteer. The lists thus comprise a conceptual cover term; for example, a search for judgments or legal parties in a corpus will return cases and passages which contain terms found in these lists:

- judgements.lst. Terms related to judgment: grant, deny, reverse, overturn, remand, ....
- legal_parties.lst. Terms for legal roles: amicus curiae, appellant, appellee, counsel, defendant, plaintiff, victim, witness, ....

A range of lists such as the two sampled below bear on “indicators” of structure. For example, “v.” is used in cases to indicate the opposing parties, so it can be used to leverage identification and annotation of parties which appear on either side of the indicator. These are not unproblematic: the indicator might incorrectly label an abbreviated first name. There may be better ways to find judges than the initial “J.”; in particular, as the list of judges is finite and given by the court system, it might be simplest to use such a list rather than applying text mining to find it.
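The role of “v.” as a structural indicator, and the way it can misfire on an abbreviated first name, can be illustrated with a small Python sketch. This is not the GATE implementation: the function name find_case_names, the PARTY pattern, and the fold_indicator_case switch (a rough stand-in for a gazetteer that homogenises upper and lower case) are all assumptions made for illustration, and the sentence about “Officer V. Jones” is invented.

```python
import re

# Hypothetical pattern for a party name: one or more capitalised words.
# Real party names (e.g. "In re James M.") are more varied than this.
PARTY = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"


def find_case_names(text, fold_indicator_case=False):
    """Return (first_party, second_party) pairs found around the indicator "v.".

    With fold_indicator_case=True the indicator also matches "V.", which
    imitates case-insensitive list lookup and exposes the false positive
    on abbreviated first names.
    """
    indicator = r"(?i:v)\." if fold_indicator_case else r"v\."
    pattern = re.compile(rf"({PARTY}) {indicator} ({PARTY})")
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]


if __name__ == "__main__":
    # Case name taken from the corpus discussed in the paper.
    print(find_case_names("People v. Coleman, 117 Cal.App.2d 565"))
    # Invented sentence: "V." is an initial, not the party indicator.
    print(find_case_names("Officer V. Jones testified.", fold_indicator_case=True))
```

In this sketch the strict indicator leaves the invented sentence unannotated, while the case-folded variant wrongly reports a pair of parties, which is the kind of error that has to be weighed against the benefits of case-insensitive lookup.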
- legal_casenames.lst. Terms that can be used to indicate case names: v., In Re, ....
- judgeindicator.lst. The indicator J., which is problematic if it is part of an individual's name.

In other lists, we have phrases, abbreviations, and case citations. For phrases, there are two strategies. (Bransford-Koons, 2005) follows the strategy of listing the possible phrases. The alternative which we adopt is to provide bottom level lists for constituent parts of the phrases, then constructing the complex phrases by rule. The former requires a finite list; it will not annotate a novel phrase. Constructing phrases requires that the output be checked against actual phrases so it does not overgenerate. The treatment of abbreviations in GATE is not entirely clear, though (Bransford-Koons, 2005) simply lists them. For example, one would want to link the abbreviation with the full form, e.g. Fifth Appellate District and Fifth App. Dist., and moreover, there may be a range of alternative abbreviations. One strategy is to have related lists: a list of phrases where the abbreviation of the phrase is a MinorType, and a list of abbreviations where the correlated phrase is a MinorType. In our view, more general solutions are better than specific ones which list information; lists ought to contain arbitrary information, while JAPE rules construct systematic information. Case citations combine the issues of phrases, abbreviations, and alternative forms. We may have a citation such as Cal. App. 3d which abbreviates the California Court of Appeals, Third District. Clearly, each part is a component that can be reused in other citations. Moreover, as spaces matter in text analysis, we must account for alternatives, Cal. App. 3d and Cal.App.3d.

- lower_courts.lst. Phrases for other courts: Municipal Court of, Superior Court of, ....
- legal_code_citations.lst. Code citations: Civ. Code, Penal Code, ....

Some of the terms are functional; that is, both legal parties and counsel names are roles that individuals have with respect to a particular context. In one context, an individual may be a plaintiff, while in another the defendant. In annotating an individual with a functional role, e.g. an individual as plaintiff, we rely on local context within the text and do not presume that the individual's annotation applies across cases.

Finally, (Bransford-Koons, 2005) provides a range of terms which relate to the content of the case. For example, a case of criminal assault is marked by the appearance of terms bearing on weapon or intention.

- weapons.lst. A list of items that are weapons: assault rifle, axe, club, fist, gun, ....
- intention.lst. Terms for intention: intend, expect, ....

While it would be meaningful to index cases according to such content, they present several problems. Clearly, whether something is a weapon or criminal assault is context dependent since in some other context they might not be. How could one bound the range of relevant terms appropriately and give them interpretations that are relevant to the context? For example, isn't any object a possible weapon? These may be terms which, as discussed in (Wyner and Hoekstra, 2010), are developed in independent modules; we do not want to develop a full theory of space, time, instruments, intention, or causation.

3.2. JAPE RULES

Given the bottom-level annotations provided by the lists, we have JAPE rules which make the annotations graphically represented and available for higher level annotations. Below is a partial list of annotations given by JAPE rules.

- AppellantCounsel: annotates the appellant counsel.
- DSACaseName: annotates the case name.
- CauseOfAction: annotates for causes of action.
- DecisionStatement: annotates a sentence as the decision statement.
- JudgeName: annotates the names of judges.

Some of the JAPE rules simply translate the Lookup type into an annotation such as Weapon, while other rules use the Lookup type and context to annotate a text span such as AppellantCounsel and DecisionStatement. In the following sample rule, a sentence which contains a judgment term (e.g. affirm, overturn, etc) followed by a judge's name is labeled a decision statement. The rule relies on a standard format, where the case decision is followed by the judge's name; were similar patterns to appear in the case, then they too might be mis-annotated as a decision of the case.

    Rule: DecisionStatement
    Priority: 10
    (
      {Sentence contains JudgementTerm}
    ):termtemp
    {JudgeName}
    -->
    :termtemp.DecisionStatement = {rule = "DecisionStatement"}
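The logic of the DecisionStatement rule (a sentence containing a judgment term, immediately followed by a judge's name) can be approximated outside GATE. The Python sketch below is only an illustration under several assumptions: the term list is the short sample given for judgements.lst plus "affirm", the regular expressions for judge names and sentence boundaries are simplistic stand-ins for GATE's annotations, and the sample text is invented rather than quoted from either case in the corpus.

```python
import re

# Sample terms from judgements.lst plus "affirm"; the full list is not reproduced.
JUDGEMENT_TERMS = ["grant", "deny", "reverse", "overturn", "remand", "affirm"]

# A listed term plus any word suffix: a crude substitute for lemmatisation
# (it catches "affirmed" but would miss irregular forms such as "denied").
TERM_RE = re.compile(r"\b(?:" + "|".join(JUDGEMENT_TERMS) + r")\w*", re.IGNORECASE)

# Assumed judge-name shape: capitalised surname, comma, optional "P.", then "J."
JUDGE_RE = re.compile(r"([A-Z][A-Za-z]+), (?:P\. )?J\.")

# Naive sentence boundary: a period after two lowercase letters, then a capital.
SENTENCE_BREAK_RE = re.compile(r"(?<=[a-z]{2}\.)\s+(?=[A-Z])")


def decision_statements(text):
    """Return sentences that contain a judgment term and are followed by a judge name."""
    sentences = SENTENCE_BREAK_RE.split(text.strip())
    found = []
    for current, following in zip(sentences, sentences[1:]):
        if TERM_RE.search(current) and JUDGE_RE.match(following):
            found.append(current)
    return found


# Invented text for illustration only.
SAMPLE = (
    "The trial court granted the motion to strike. "
    "Defendant appeals from the order. "
    "The judgment is affirmed. "
    "Smith, J., concurred."
)

if __name__ == "__main__":
    print(decision_statements(SAMPLE))
```

Here the first sentence contains a judgment term but is not followed by a judge's name, so only the third sentence is selected. As with the JAPE rule, any other sentence that happens to match the same pattern would also be labelled, so the output still needs checking against the text.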
3.3. RESULTS

In this section, we give some of the results of running our GATE application over our corpus, presented using the graphical output of GATE. We have the following sample outputs from our lists and rules applied to People v. Coleman, 117 Cal.App.2d 565. The coloured highlights on the case text are associated with the same coloured annotation. We can output an XML representation to indicate the annotation.

In Figure 2, we find the address, court district, citation, case name, counsels for each side, and the roles. The results give a flavour of the annotations, though further work is required to refine them.

[Figure 2. Case Information I]

In Figure 3, we focus on additional information such as structural sections (e.g. Opinion), the name of the judge, and terms having a bearing on criminal assault and weapons. In Figure 4, we identify the decision.

[Figure 3. Case Information II]

[Figure 4. Case Information III]

4. Conclusion

In this paper, we have outlined and extended a proof of concept approach to text mining legal cases in order to extract a range of particular elements of information from the cases. While a relatively small system applied to a very small corpus, the lists and rules approach can be extended further and relatively easily. Further developments using this approach to text mining would be to relate the extracted information to an ontology which is directly incorporated into the GATE pipeline. A second development would be to engage a wide range of users (e.g. law school students) in a collaborative, online annotation task using GATE TeamWare. Not only would this have didactic purposes (to focus the attention of students on close analysis of the text), but it would also help to build up a body of annotated texts for further research as well as development of a gold standard that could be used for machine learning.

References

Aleven, A. (1997), Teaching case-based argumentation through a model and examples. PhD thesis, University of Pittsburgh, 1997.
Ashley, K. (1990), Modelling Legal Argument: Reasoning with Cases and Hypotheticals. Bradford Books/MIT Press, Cambridge, MA, 1990.
Ashley, K. and Brüninghaus, S. (2009), Automatically classifying case texts and predicting outcomes. Artificial Intelligence and Law, 17(2):125–165, 2009.
Bransford-Koons, G. (2005), Dynamic semantic annotation of California case law. Master's thesis, San Diego State University, 2005.
Cunningham, H., Maynard, D., Bontcheva, K., and Tablan, V. (2002), GATE: A framework and graphical development environment for robust NLP tools and applications. In Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL'02), 2002.
Gangemi, A., Sagri, M., and Tiscornia, D. (2005), A constructive framework for legal ontologies. In V.R. Benjamins, P. Casanovas, J. Breuker, and A. Gangemi, editors, Law and the Semantic Web, pages 97–124. Springer Verlag, 2005.
Hachey, B. and Grover, C. (2006), Extractive summarisation of legal texts. Artificial Intelligence and Law, 14(4):305–345, 2006.
Hafner, C. (1987), Conceptual organization of case law knowledge bases. In ICAIL '87: Proceedings of the 1st International Conference on Artificial Intelligence and Law, pages 35–42, New York, NY, USA, 1987. ACM.
Hoekstra, R., Breuker, J., Bello, M., and Boer, A. (2009), LKIF core: Principled ontology development for the legal domain. In Joost Breuker, Pompeu Casanovas, Michel C.A. Klein, and Enrico Francesconi, editors, Law, Ontologies and the Semantic Web, volume 188 of Frontiers in Artificial Intelligence and Applications, pages 21–52. IOS Press, 2009.
Jackson, P., Al-Kofahi, K., Tyrell, A., and Vachher, A. (2003), Information extraction from case law and retrieval of prior cases. Artificial Intelligence, 150(1-2):239–290, November 2003.
Lame, G. (2004), Using NLP techniques to identify legal ontology components: Concepts and relations. Artificial Intelligence and Law, 12(4):379–396, 2004.
Moens, M.-F., Uyttendaele, C., and Dumortier, J. (1997), Abstracting of legal cases: the SALOMON experience. In ICAIL '97: Proceedings of the 6th International Conference on Artificial Intelligence and Law, pages 114–122, New York, NY, USA, 1997. ACM.
Peters, W. (2009), Text-based legal ontology enrichment. In Proceedings of the workshop on Legal Ontologies and AI Techniques, Barcelona, Spain, 2009.
Peters, W., Sagri, M.-T., and Tiscornia, D. (2007), The structuring of legal knowledge in LOIS. Artificial Intelligence and Law, 15(2):117–135, 2007.
Rissland, E., Skalak, D., and Friedman, T. (1996), BankXX: Supporting legal arguments through heuristic retrieval. Artificial Intelligence and Law, 4(1):1–71, 1996.
Schweighofer, E. and Liebwald, D. (2007), Advanced lexical ontologies and hybrid knowledge based systems: First steps to a dynamic legal electronic commentary. Artificial Intelligence and Law, 15(2):103–115, 2007.
Weber, R., Ashley, K., and Brüninghaus, S. (2005), Textual case-based reasoning. Knowledge Engineering Review, 20(3):255–260, 2005.
Wyner, A. and Bench-Capon, T. (2007), Argument schemes for legal case-based reasoning. In Arno R. Lodder and Laurens Mommers, editors, Legal Knowledge and Information Systems. JURIX 2007, pages 139–149, Amsterdam, 2007. IOS Press.
Wyner, A. and Hoekstra, R. (2010), A legal case OWL ontology with an instantiation of Popov v. Hayashi. Knowledge Engineering Review, xx:xx, 2010. To appear.
Wyner, A. and Peters, W. (2010), Towards annotating and extracting textual legal case factors. In Proceedings of the Language Resources and Evaluation Conference Workshop on Semantic Processing of Legal Texts, Malta, 2010. To appear.