Some Advances in Transformation-Based Part of Speech Tagging
Eric Brill
A Maximum Entropy Approach to Identifying Sentence Boundaries
Jeffrey C. Reynar and Adwait Ratnaparkhi
Presenter
Sawood
Alam
<salam@cs.odu.edu>
Some Advances in Transformation-Based Part of Speech Tagging
Spoken Language Systems Group
Laboratory for Computer Science
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139
brill@goldilocks.lcs.mit.edu
Introduction
Stochastic tagging
Trainable rule-based tagger
Relevant linguistic information with simple non-stochastic rules
Lexical relationships in tagging
Rule-based approach to tagging unknown words
Extended into a k-best tagger
Markov-Model Based Taggers
Find the tag sequence that maximizes
Prob(word|tag) * Prob(tag|previous n tags)
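As a concrete illustration (not from the paper), a bigram Markov-model tagger can find this maximizing tag sequence with Viterbi search; all probabilities below are made-up toy values:

```python
# Illustrative sketch: a bigram Markov-model tagger picks the tag sequence
# maximizing the product of Prob(word|tag) * Prob(tag|previous tag),
# computed here by exhaustive Viterbi-style dynamic programming.
# The emission/transition probabilities are toy values, not trained.

def viterbi(words, tags, emit, trans):
    """Return the most probable tag sequence for `words`.

    emit[(word, tag)]  -> Prob(word|tag)
    trans[(prev, tag)] -> Prob(tag|prev tag); "<s>" marks sentence start.
    """
    # best[t] = (score of best path ending in tag t, that path)
    best = {t: (trans.get(("<s>", t), 0.0) * emit.get((words[0], t), 0.0), [t])
            for t in tags}
    for word in words[1:]:
        new = {}
        for t in tags:
            score, path = max(
                ((s * trans.get((p, t), 0.0) * emit.get((word, t), 0.0), path)
                 for p, (s, path) in best.items()),
                key=lambda x: x[0])
            new[t] = (score, path + [t])
        best = new
    return max(best.values(), key=lambda x: x[0])[1]

tags = ["DT", "NN", "VB"]
emit = {("the", "DT"): 0.9, ("can", "NN"): 0.4, ("can", "VB"): 0.3,
        ("rusts", "VB"): 0.5, ("rusts", "NN"): 0.1}
trans = {("<s>", "DT"): 0.8, ("DT", "NN"): 0.7, ("DT", "VB"): 0.1,
         ("NN", "VB"): 0.6, ("NN", "NN"): 0.2, ("VB", "NN"): 0.3,
         ("VB", "VB"): 0.1}
print(viterbi(["the", "can", "rusts"], tags, emit, trans))  # → ['DT', 'NN', 'VB']
```

Note the trade-off the slides highlight: all linguistic knowledge lives in these probability tables rather than in readable rules.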
Stochastic Tagging
Avoids laborious manual rule construction
Linguistic information is only captured indirectly
Transformation-Based Error-Driven Learning
An Earlier Transformation-Based Tagger
Initially assign most likely tag based on training corpus
Unknown words are tagged based on some features
Change tag a to b when:
The preceding/following word is tagged z
The word two before/after is tagged z
One of the two/three preceding/following words is tagged z
The preceding word is tagged z and the following word is tagged w
The preceding/following word is tagged z and the word two before/after is tagged w
Example: change from noun to verb if the previous word is a modal
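A minimal sketch (not Brill's actual code) of applying one learned transformation of the template "change tag a to b when the preceding word is tagged z", using the example rule from the slide:

```python
# Illustrative sketch: apply one transformation rule left to right over an
# initially tagged sentence. Rules fire against the current tagging state.

def apply_rule(tagged, a, b, z):
    """Change tag a to b wherever the preceding word is tagged z."""
    out = list(tagged)
    for i in range(1, len(out)):
        word, tag = out[i]
        if tag == a and out[i - 1][1] == z:
            out[i] = (word, b)
    return out

# The initial most-likely tagging mislabels "run" as a noun after the
# modal "can"; the noun -> verb-after-modal rule corrects it.
sentence = [("she", "PRP"), ("can", "MD"), ("run", "NN")]
print(apply_rule(sentence, a="NN", b="VB", z="MD"))
# → [('she', 'PRP'), ('can', 'MD'), ('run', 'VB')]
```

Learning consists of repeatedly picking whichever instantiated template most reduces errors on the training corpus, then applying it and repeating.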
Lexicalizing the Tagger
Change tag a to tag b when:
The preceding/following word is w
The word two before/after is w
One of the two preceding/following words is w
The current word is w and the preceding/following word is x
The current word is w and the preceding/following word is tagged z
Examples:
Change from preposition to adverb if the word two positions to the right is "as"
Change from non-3rd person singular present verb to base form verb if one of the previous two words is "n't"
Comparison of Tagging Accuracy With No Unknown Words
Method                       Training Corpus Size (Words)   # of Rules or Context. Probs.   Acc. (%)
Stochastic                   64 K                           6,170                           96.3
Stochastic                   1 Million                      10,000                          96.7
Rule-Based w/o Lex. Rules    600 K                          219                             96.9
Rule-Based With Lex. Rules   600 K                          267                             97.2
Unknown Words
Change the tag of an unknown word (from X) to Y if:
Deleting the prefix x, |x| <= 4, results in a word (x is any string of length 1 to 4)
The first (1,2,3,4) characters of the word are x
Deleting the suffix x, |x| <= 4, results in a word
The last (1,2,3,4) characters of the word are x
Adding the character string x as a suffix results in a word (|x| <= 4)
Adding the character string x as a prefix results in a word (|x| <= 4)
Word W ever appears immediately to the left/right of the word
Character Z appears in the word
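Two of these unknown-word templates can be sketched as predicates against a known-word lexicon (the lexicon and word choices below are hypothetical):

```python
# Illustrative sketch: "deleting the suffix x, |x| <= 4, results in a word"
# and "the last characters of the word are x", tested against a tiny
# hypothetical lexicon of known words.

LEXICON = {"walk", "talk", "play"}  # assumed known vocabulary

def suffix_deletion_yields_word(word, suffix):
    """True if stripping `suffix` (1 to 4 chars) leaves a known word."""
    return (1 <= len(suffix) <= 4 and word.endswith(suffix)
            and word[: -len(suffix)] in LEXICON)

def has_suffix(word, suffix):
    """True if the word simply ends with `suffix`."""
    return word.endswith(suffix)

print(suffix_deletion_yields_word("walked", "ed"))  # → True ("walk" is known)
print(suffix_deletion_yields_word("jumped", "ed"))  # → False ("jump" is not in the toy lexicon)
print(has_suffix("walked", "ed"))                   # → True
```

During learning, each predicate is instantiated with candidate strings and kept only if the resulting tag change reduces errors on held-out tagged text.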
Unknown Words Learning
Change tag:
From common noun to plural common noun if the word has suffix "-s"
From common noun to number if the word has character "."
From common noun to adjective if the word has character "-"
From common noun to past participle verb if the word has suffix "-ed"
From common noun to gerund or present participle verb if the word has suffix "-ing"
To adjective if adding the suffix "-ly" results in a word
To adverb if the word has suffix "-ly"
From common noun to number if the word "$" ever appears immediately to the left
From common noun to adjective if the word has suffix "-al"
From noun to base form verb if the word "would" ever appears immediately to the left
K-Best Tags
Modify "change" to "add" in the transformation templates, so a word can carry multiple candidate tags
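A minimal sketch of this "add" variant (assumed, not Brill's code): the same template as before, but accumulating tags into a set instead of replacing them.

```python
# Illustrative sketch: for k-best tagging, a transformation "adds" tag b
# (when the preceding word carries tag z) rather than replacing the old
# tag, so each word keeps a set of candidate tags.

def apply_add_rule(tagged, b, z):
    """Add tag b to every word whose preceding word carries tag z."""
    out = [(w, set(ts)) for w, ts in tagged]
    for i in range(1, len(out)):
        if z in out[i - 1][1]:
            out[i][1].add(b)
    return out

sentence = [("they", {"PRP"}), ("can", {"MD"}), ("fish", {"NN"})]
print(apply_add_rule(sentence, b="VB", z="MD"))
# "fish" now carries both NN and VB as candidates.
```

Keeping several candidate tags trades a small increase in ambiguity (tags per word) for higher recall, as the results table below shows.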
k-Best Tagging Results
# of Rules   Accuracy   Avg. # of tags per word
0            96.5       1.00
50           96.9       1.02
100          97.4       1.04
150          97.9       1.10
200          98.4       1.19
250          99.1       1.50
Future Work
Apply these techniques to other problems:
Learning pronunciation networks for speech recognition
Learning mappings between sentences and semantic representations
A Maximum Entropy Approach to Identifying Sentence Boundaries
Jeffrey C. Reynar and Adwait Ratnaparkhi
Department of Computer and Information Science
University of Pennsylvania
Philadelphia, Pennsylvania, USA
{jcreynar, adwait}@unagi.cis.upenn.edu
Introduction
Many freely available natural language processing tools require their input to be divided into sentences, but make no mention of how to accomplish this.
Punctuation marks such as ., ?, and ! might be ambiguous.
Issues with abbreviations:
E.g. "The president lives in Washington, D.C."
Previous Work
Previous systems disambiguate sentence boundaries using:
A decision tree (99.8% accuracy on the Brown corpus), or
A neural network (98.5% accuracy on the WSJ corpus)
Approach
Potential sentence boundary Candidates: ., ?, and !
Contextual information:
The Prefix
The Suffix
The presence of particular characters in the Prefix or Suffix
Whether the Candidate is an honorific (e.g. Ms., Dr., Gen.)
Whether the Candidate is a corporate designator (e.g. Corp., S.p.A., L.L.C.)
Features of the word to the left/right of the Candidate
List of abbreviations
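A simplified sketch (not the authors' code) of extracting a few of these contextual features around a candidate period; the honorific list and whitespace tokenization are assumptions:

```python
# Illustrative sketch: features for the token carrying a candidate
# sentence-boundary mark. Tokenization and the honorific list are
# simplified assumptions, not the paper's exact setup.

HONORIFICS = {"Mr.", "Ms.", "Dr.", "Gen."}  # assumed, partial list

def features(tokens, i):
    """Features for the candidate punctuation ending tokens[i]."""
    prefix = tokens[i].rstrip(".?!")            # token text before the mark
    suffix = tokens[i + 1] if i + 1 < len(tokens) else ""
    return {
        "prefix": prefix,
        "suffix": suffix,
        "prefix_is_honorific": tokens[i] in HONORIFICS,
        "prefix_contains_period": "." in prefix,
        "suffix_capitalized": suffix[:1].isupper(),
    }

tokens = ["He", "met", "Dr.", "Smith"]
print(features(tokens, 2))
```

Here the honorific feature lets the model learn that the period in "Dr." is usually not a sentence boundary even though the next word is capitalized.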
Maximum Entropy
Choose the model p that maximizes the entropy
H(p) = - Σ p(b,c) log p(b,c)
under the constraints:
Σ p(b,c) * fj(b,c) = Σ p'(b,c) * fj(b,c),  1 <= j <= k
where p' is the observed distribution in the training data.
A Candidate is labeled a sentence boundary when p(yes|c) > 0.5, where:
p(yes|c) = p(yes,c) / (p(yes,c) + p(no,c))
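The decision step can be sketched in a few lines: given the model's joint scores for the two outcomes in a context, normalize to a conditional and threshold at 0.5 (the scores below are made up for illustration):

```python
# Illustrative sketch: the sentence-boundary decision given joint model
# scores p(yes, c) and p(no, c) for a candidate's context c.
# The numeric scores are invented examples, not model output.

def p_yes_given_c(p_yes_c, p_no_c):
    """Conditional p(yes|c) = p(yes,c) / (p(yes,c) + p(no,c))."""
    return p_yes_c / (p_yes_c + p_no_c)

def is_boundary(p_yes_c, p_no_c):
    """Label the candidate a sentence boundary when p(yes|c) > 0.5."""
    return p_yes_given_c(p_yes_c, p_no_c) > 0.5

print(is_boundary(0.03, 0.01))   # → True  (e.g. period after an ordinary word)
print(is_boundary(0.002, 0.01))  # → False (e.g. period inside an abbreviation)
```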
System Performance
                     WSJ      Brown
Sentences            20478    51672
Candidate P. Marks   32173    61282
Accuracy             98.8%    97.9%
False Positives      201      750
False Negatives      171      506
Conclusions
Achieved accuracy comparable to state-of-the-art systems with far fewer resources.