Building a word formation based lexicon for Latin. Eleonora Litta Modignani Picozzi, MSCA Fellow, CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Milano.
Slide1
Morphology beyond inflection.
Building a word formation based lexicon for Latin
Eleonora Litta Modignani Picozzi
MSCA Fellow
CIRCSE Research Centre
Università Cattolica del Sacro Cuore, Milano
Marie Skłodowska-Curie grant agreement No 658332-WFL
Slide2
Inflectional vs derivational morphology (for Latin)
Word formation relations are not treated by computational lexical resources or morphological analysers.
Halfway between morphology and semantics.
WFRs not only build new words, but create new words with a shared semantic core, which can be useful for NLP tools.
First tools for modern languages (Czech, Croatian, German).
Slide3
The Word Formation Latin (WFL) project
Started by Passarotti and Mambrini in 2012.
Awarded Marie Curie Fellowship.
Definitive derivational lexicon for Latin.
Slide4
The lexical basis: LemLat
Morphological analyser for Latin.
Data collected from three dictionaries:
Georges and Georges, Ausführliches Lateinisch-Deutsches Handwörterbuch (1913-1918)
Glare, Oxford Latin Dictionary (1982)
Gradenwitz, Laterculi vocum latinarum (1904)
40,014 lexical entries, 43,432 lemmas, and 26,205 lemmas from Forcellini’s Onomasticon (1940).
Lexical basis for WFL: LES (LExical Segment) archive and List of Lemmas.
Slide5
Word formation
Item-and-Arrangement model: word forms are 1) simple morphemes or 2) concatenations of morphemes, satisfying the following conditions:
Baudouin’s assumption that both base and affixes are lexical elements (i.e. they are both morphemes).
They are dualistic, having both form and meaning (Bloomfield’s “sign-base” morpheme theory).
They both exist in a lexicon (Bloomfield’s “lexical morpheme” theory).
Slide6
Formalising WFL
Word formation based lexicon built in three steps:
1) WFRs are detected
2) WFRs are applied to the lexical data
3) Results are manually checked and evaluated
Slide7
Types of word formation
Derivation:
  Affixal:
    Prefixal: duco => con-duco
    Suffixal: amo => am-a-bil-is
  Conversion: bonus (adj.) => bonum (noun)
Compounding: magnus + facio = magnificus
Slide8
Detection of word formation rules (WFRs)
Semi-automatic finding of affixal rules (Passarotti & Mambrini 2012).
List of possible combinations of PoS for conversion (e.g. V-to-V, V-to-N, V-to-A, etc.) and compounding (A+V=N, A+V=A, etc.).
WFRs formalised into a table according to category of change, type of word formation, input PoS and output PoS.
50 additional rules found so far while working through the data.
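As a rough illustration of such a table, the sketch below models one WFR as a record carrying the four attributes listed above. The class and field names (`WFRule`, `category`, `wf_type`, `input_pos`, `output_pos`, `affix`) are assumptions made for this example and not the actual WFL schema; the sample rows only reuse rule shapes mentioned on the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class WFRule:
    """One row of a hypothetical WFR table (all field names are assumptions)."""
    category: str                 # "derivation" or "compounding"
    wf_type: str                  # "prefixal", "suffixal", "conversion", "compounding"
    input_pos: Tuple[str, ...]    # PoS of the input lemma(s), e.g. ("V",) or ("A", "V")
    output_pos: str               # PoS of the output lemma
    affix: Optional[str] = None   # only filled in for affixal rules

    def label(self) -> str:
        """Render the rule in the slides' notation, e.g. 'V-to-V' or 'A+V=N'."""
        if self.category == "compounding":
            return "+".join(self.input_pos) + "=" + self.output_pos
        return f"{self.input_pos[0]}-to-{self.output_pos}"


# Illustrative rows only; they do not reproduce the real WFR list.
RULES = [
    WFRule("derivation", "prefixal", ("V",), "V", affix="sub"),
    WFRule("derivation", "conversion", ("V",), "N"),
    WFRule("compounding", "compounding", ("A", "V"), "N"),
    WFRule("compounding", "compounding", ("A", "V"), "A"),
]


def rules_by_type(rules, wf_type):
    """Select the rules of one word formation type."""
    return [rule for rule in rules if rule.wf_type == wf_type]


labels = [rule.label() for rule in RULES]
print(labels)
```

Keeping the input PoS as a tuple lets the same record cover both one-input rules (derivation) and two-input rules (compounding).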
Slide9
WFR list
Slide10
MySQL relational database
MySQL queries
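The slide itself shows the database only as screenshots, so the following is a guess at the general shape of such a setup, not the WFL schema. It uses Python's built-in `sqlite3` module as a stand-in for MySQL, and the table and column names (`lemma`, `wfr`, `relation`, and so on) are hypothetical. The two sample relations are the affixal examples from Slide7.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lemma (id INTEGER PRIMARY KEY, form TEXT, pos TEXT);
CREATE TABLE wfr (id INTEGER PRIMARY KEY, wf_type TEXT, affix TEXT,
                  input_pos TEXT, output_pos TEXT);
CREATE TABLE relation (input_id INTEGER REFERENCES lemma(id),
                       output_id INTEGER REFERENCES lemma(id),
                       wfr_id INTEGER REFERENCES wfr(id));
""")
conn.executemany("INSERT INTO lemma VALUES (?, ?, ?)", [
    (1, "duco", "V"), (2, "conduco", "V"),
    (3, "amo", "V"), (4, "amabilis", "A"),
])
conn.executemany("INSERT INTO wfr VALUES (?, ?, ?, ?, ?)", [
    (1, "prefixal", "con", "V", "V"),
    (2, "suffixal", "bil", "V", "A"),  # follows the segmentation am-a-bil-is
])
conn.executemany("INSERT INTO relation VALUES (?, ?, ?)", [(1, 2, 1), (3, 4, 2)])


def derived_by_type(wf_type):
    """Return (input lemma, output lemma, affix) for one word formation type."""
    return conn.execute("""
        SELECT src.form, dst.form, wfr.affix
        FROM relation
        JOIN lemma AS src ON src.id = relation.input_id
        JOIN lemma AS dst ON dst.id = relation.output_id
        JOIN wfr ON wfr.id = relation.wfr_id
        WHERE wfr.wf_type = ?
        ORDER BY src.form
    """, (wf_type,)).fetchall()


print(derived_by_type("prefixal"))
```

Storing each relation as a row that links an input lemma, an output lemma and a rule makes it straightforward to query by rule, by affix, by PoS or by lemma.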
Slide11
WFR paired candidates: V-to-V prefix sub-
Slide12
Finding prefixal and suffixal candidates
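A minimal sketch of how prefixal candidates might be paired, assuming the list of lemmas is available as (form, PoS) tuples. The function name `prefix_candidates` and the tiny lemma list are invented for this example; the actual WFL procedure is not shown on the slides. Plain string matching of this kind ignores assimilation and cannot tell real derivations from accidental look-alikes, which is why the output is only a list of candidates for manual checking.

```python
def prefix_candidates(lemmas, prefix, input_pos="V", output_pos="V"):
    """Pair each base lemma with a lemma of the form prefix + base.

    `lemmas` is an iterable of (form, pos) tuples. Only pairs whose base has
    `input_pos` and whose prefixed form has `output_pos` are returned.
    """
    forms_by_pos = {}
    for form, pos in lemmas:
        forms_by_pos.setdefault(pos, set()).add(form)
    bases = forms_by_pos.get(input_pos, set())
    outputs = forms_by_pos.get(output_pos, set())
    return sorted(
        (base, prefix + base) for base in bases if prefix + base in outputs
    )


# Toy lemma list for illustration; not taken from the WFL data.
LEMMAS = [
    ("duco", "V"), ("subduco", "V"),
    ("eo", "V"), ("subeo", "V"),
    ("mitto", "V"), ("submitto", "V"),
    ("amo", "V"),
    ("subsidium", "N"),
]

candidates = prefix_candidates(LEMMAS, "sub")
print(candidates)
```

With the toy list above, the V-to-V rule for the prefix sub- yields three candidate pairs; `amo` has no prefixed partner and the noun `subsidium` is filtered out by PoS.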
Slide13
Manual checking and disambiguating
focaria - n ‘kitchen maid’
focarius - n ‘kitchen servant’
focarius - adj. ‘belonging to fire’
Conversion A-to-N2
Conversion A-to-N1
Conversion N2-to-N1
Slide14
Evaluation
Precision is higher for lower morphotactic mutations (e.g. prefixal rules).
Precision is lower for obscure WFRs (e.g. compounding).
Precision can also depend on workflow.
Recall needs to be calculated once we have finished finding WFRs.
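For reference, the two measures in their standard form: precision is the share of automatically proposed pairs that survive manual checking, and recall is the share of all true relations that were found, which is why it can only be computed once the full set of relations is known. The sketch below uses made-up counts purely to show the arithmetic; they are not WFL results, and the function names are chosen for this example.

```python
def precision(accepted, proposed):
    """Share of proposed candidate pairs accepted after manual checking."""
    if proposed == 0:
        raise ValueError("no candidates were proposed")
    return accepted / proposed


def recall(accepted, total_true):
    """Share of all true relations that the candidate search found."""
    if total_true == 0:
        raise ValueError("the number of true relations must be positive")
    return accepted / total_true


# Hypothetical counts, for illustration only.
prefixal_precision = precision(accepted=90, proposed=100)
compounding_precision = precision(accepted=30, proposed=100)
prefixal_recall = recall(accepted=90, total_true=120)

print(prefixal_precision, compounding_precision, prefixal_recall)
```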
Slide15
Viewing the relations
Visualisation query system (by Chris Culy)
Browsing options:
By WFR
By Affix
By PoS
By Lemma
Slide21
Thank you!
WFL team
Eleonora Litta
Marco Passarotti
Visualisations: Chris Culy (http://chrisculy.net/)
DB Engineering: Paolo Ruffolo
Website:
http://progetti.unicatt.it/progetti-milan-wfl-home
https://www.facebook.com/wordformationlatin/
WFL has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 658332-WFL.