A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text
Kenneth Ward Church

Ambiguity Resolution in a Reductionist Parser
Atro Voutilainen & Pasi Tapanainen

Presented by Mat Kelly
CS895 – Web-based Information Retrieval
Old Dominion University
October 18, 2011
Church's Ideas

Objective: Develop a system to tag parts of speech and resolve ambiguity.
When ambiguity occurs, use stochastic processes to determine the optimal lexical possibility.

Example: each word of "I see a bird" has one or more candidate tags:

  I    : PPSS (pronoun) or NP (proper noun)
  see  : VB (verb) or UH (interjection)
  a    : AT (article) or IN (preposition)
  bird : NN (noun)
Various Ambiguities

Noun-verb ambiguity: "wind"
  e.g. "wind your watch" versus "the wind blows"
Noun-complementizer ambiguity: "that"
  e.g. "Did you see that?" versus "It is a shame that he is leaving."
Noun-noun versus adjective-noun distinction:
  e.g. "oily FLUID" versus "TRANSMISSION fluid"; the first puts the emphasis on "fluid", the second on "transmission"
Overcoming Lexical Ambiguity

A linear-time dynamic programming algorithm optimizes the product of lexical probabilities.
Recognize that no Markov process can sufficiently capture English grammar (Chomsky),
e.g. "The man who said that statement is arriving today": dependencies spanning a distance greater than one word prevent Markovian analysis.

[Slide diagram: a chain of words with a long-distance "dependent on" arc between non-adjacent words.]
Parsing Difficulties

No amount of syntactic sugar will help resolve the ambiguity* in:
  "Time flies like an arrow."
  "Flying planes can be dangerous."
The parser must allow for multiple possibilities.

* Voutilainen states otherwise.
Parsing Impossibilities

Even a parser that considers likelihood will sometimes be misled by "garden path" sentences:
  "The horse raced past the barn fell."
  ("raced" is ambiguous between a past-tense verb and a passive participle)
Other than these, there is always a unique best interpretation, and it can be found with very limited resources.
Considering Likelihood

Have/VB the students take the exam. (imperative)
Have/AUX the students taken the exam? (question)

Fidditch (a parser) proposed the lexical disambiguation rule [**n+prep] != n [np-starters],
i.e. if a noun/preposition-ambiguous word is followed by something that starts a noun phrase, rule out the noun reading.
Most lexical rules in Fidditch can be reformulated in terms of bigram and trigram statistics, as sketched below.
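Such a reformulation can be sketched in a few lines of Python; the bigram counts below are hypothetical, and the rule is approximated as a bigram preference rather than Fidditch's actual mechanism:

    # Hypothetical bigram counts freq(tag1, tag2) from a tagged corpus.
    bigram_freq = {
        ("NN", "AT"): 12,     # noun followed by article: rare
        ("IN", "AT"): 5400,   # preposition followed by article: common
    }

    def prefer_tag(candidate_tags, next_tag):
        """Pick the candidate tag that most often precedes next_tag."""
        return max(candidate_tags, key=lambda t: bigram_freq.get((t, next_tag), 0))

    # "out" in "ran out the door" is noun/preposition-ambiguous; "the" is AT.
    print(prefer_tag(["NN", "IN"], "AT"))   # -> IN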
...With the Help of a Dictionary

Dictionaries tend to convey possibilities, not likelihoods.
Initially consider all assignments for the words:
  [NP [N I] [N see] [N a] [N bird]]
  [S [NP [N I] [N see] [N a]] [VP [V bird]]]
Some part-of-speech assignments are more likely than others.
Cross-Referencing Likelihood With the Tagged Brown Corpus

Parsing likelihood is based on the Brown Corpus, a manually tagged corpus:

  Word   Part of Speech            Frequency
  I      PPSS (pronoun)            5837
  I      NP (proper noun)          1
  see    VB (verb)                 771
  see    UH (interjection)         1
  a      AT (article)              23013
  a      IN (French preposition)   6
  bird   NN (noun)                 26

Probability that "I" is a pronoun = 5837/5838
  = freq("I" tagged PPSS) / freq("I")
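This estimate can be sketched directly from the counts in the table above:

    # Brown-corpus counts freq(word, tag) from the table above.
    word_tag_freq = {
        ("I", "PPSS"): 5837, ("I", "NP"): 1,
        ("see", "VB"): 771,  ("see", "UH"): 1,
        ("a", "AT"): 23013,  ("a", "IN"): 6,
        ("bird", "NN"): 26,
    }

    def lexical_prob(word, tag):
        """P(tag | word) = freq(word, tag) / freq(word)."""
        word_total = sum(f for (w, _), f in word_tag_freq.items() if w == word)
        return word_tag_freq.get((word, tag), 0) / word_total

    print(lexical_prob("I", "PPSS"))   # 5837/5838, about 0.99983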
Contextual Probability

Estimate the probability of observing part of speech X given the following parts of speech Y and Z:
  P(X | Y, Z) = freq(X, Y, Z) / freq(Y, Z)
e.g. the probability of observing a verb before an article and a noun is freq(VB, AT, NN) / freq(AT, NN).
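A matching sketch for the contextual estimate; the trigram and bigram counts here are hypothetical, not taken from the paper:

    # Hypothetical corpus counts for a trigram and its bigram suffix.
    freq3 = {("VB", "AT", "NN"): 3412}
    freq2 = {("AT", "NN"): 53091}

    def contextual_prob(x, y, z):
        """P(X | Y, Z) = freq(X, Y, Z) / freq(Y, Z)."""
        return freq3.get((x, y, z), 0) / freq2[(y, z)]

    print(contextual_prob("VB", "AT", "NN"))   # about 0.064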
Enumerate All Potential Parsings

For "I see a bird", the candidate tags yield eight possible sequences:

  PPSS  VB  AT  NN
  PPSS  VB  IN  NN
  PPSS  UH  AT  NN
  PPSS  UH  IN  NN
  NP    VB  AT  NN
  NP    VB  IN  NN
  NP    UH  AT  NN
  NP    UH  IN  NN

Score each sequence by the product of its lexical and contextual probabilities, and select the best sequence.
It is not necessary to enumerate all possible assignments, because the scoring algorithm cannot see more than two words away; a sketch of the search follows.
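Church's program scores with trigram context and works from the end of the sentence; for brevity this sketch uses bigram context, runs left to right, takes lexical probabilities from the Brown-corpus counts shown earlier, and substitutes a flat stand-in value for the contextual probabilities:

    import math

    # Candidate tags per word, from the slide above.
    candidates = {"I": ["PPSS", "NP"], "see": ["VB", "UH"],
                  "a": ["AT", "IN"], "bird": ["NN"]}
    # Lexical probabilities from the Brown-corpus counts shown earlier.
    LEX = {("I", "PPSS"): 5837/5838, ("I", "NP"): 1/5838,
           ("see", "VB"): 771/772, ("see", "UH"): 1/772,
           ("a", "AT"): 23013/23019, ("a", "IN"): 6/23019,
           ("bird", "NN"): 1.0}
    CTX = {}  # P(tag | previous tag); a flat stand-in value is used below

    def lex_prob(w, t): return LEX[(w, t)]
    def ctx_prob(t, prev): return CTX.get((t, prev), 0.1)

    def best_sequence(words):
        # Keep only the one best-scoring path per tag at each position,
        # so only O(n) paths are ever alive.
        paths = {t: (math.log(lex_prob(words[0], t)), [t])
                 for t in candidates[words[0]]}
        for word in words[1:]:
            paths = {t: max((s + math.log(ctx_prob(t, p[-1]) * lex_prob(word, t)),
                             p + [t]) for s, p in paths.values())
                     for t in candidates[word]}
        return max(paths.values())[1]

    print(best_sequence(["I", "see", "a", "bird"]))  # ['PPSS', 'VB', 'AT', 'NN']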
Complexity Reduction

Some sequences cannot possibly compete with others during parsing and are abandoned.
Only O(n) paths are enumerated.
An Example

"I see a bird"

Find all assignments of parts of speech and score each partial sequence (log probabilities), working from the end of the sentence:

  (-4.848072   "NN")
  (-7.4453945  "AT" "NN")
  (-15.01957   "IN" "NN")
  (-10.1914    "VB" "AT" "NN")
  (-18.54318   "VB" "IN" "NN")
  (-29.974142  "UH" "AT" "NN")
  (-36.53299   "UH" "IN" "NN")
  (-12.927581  "PPSS" "VB" "AT" "NN")
  (-24.177242  "NP" "VB" "AT" "NN")
  (-35.667458  "PPSS" "UH" "AT" "NN")
  (-44.33943   "NP" "UH" "AT" "NN")
  (-12.262333  "" "" "PPSS" "VB" "AT" "NN")

Result: I/PPSS see/VB a/AT bird/NN .

Note that all four possible paths using the French reading of "a" (IN) score worse than the alternatives, and no additional input can change that, so they are abandoned.
The scoring continues for two more iterations, by which point the path values for words 3 and 4 are out of range.
Further Attempts at Ambiguity Resolution (Voutilainen's Paper)

Annotations are assigned using a finite-state parser.
Knowledge-based reductionist grammatical analysis is facilitated, even though it introduces more ambiguity.
The amount of ambiguity, as shown, does not predict the speed of analysis.
Constraint Grammar (CG) Parsing

1. Preprocessing and morphological analysis
2. Disambiguation of morphological (part-of-speech) ambiguities (see the sketch after this list)
3. Mapping of syntactic functions onto morphological categories
4. Disambiguation of syntactic functions
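The reductionist idea behind step 2 can be sketched as follows: every word starts with all of its lexicon readings, and constraints remove readings that are illegitimate in context. The constraint and categories below are invented for illustration and are not ENGCG's:

    # Each word starts with the full set of readings from the lexicon.
    sentence = [
        ("I",    {"PRON", "ABBR"}),
        ("see",  {"V"}),
        ("a",    {"DET"}),
        ("bird", {"V", "N"}),
    ]

    def apply_constraint(sentence, remove_tag, when_prev):
        """Discard reading remove_tag when the previous word is unambiguously
        of category when_prev, but never remove a word's last reading."""
        out = []
        for i, (word, readings) in enumerate(sentence):
            if i > 0 and sentence[i - 1][1] == {when_prev} and len(readings) > 1:
                readings = readings - {remove_tag}
            out.append((word, readings))
        return out

    # Illustrative constraint: no verb reading directly after an unambiguous determiner.
    print(apply_constraint(sentence, remove_tag="V", when_prev="DET"))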
Morphological Description

All readings for "I see a bird." before disambiguation:

("<*i>"
  ("i" <*> ABBR NOM SG)
  ("I" <*> <NonMod> PRON PERS NOM SG1))
("<see>"
  ("see" <SVO> V SUBJUNCTIVE VFIN)
  ("see" <SVO> V IMP VFIN)
  ("see" <SVO> V INF)
  ("see" <SVO> V PRES -SG3 VFIN))
("<a>"
  ("a" <Indef> DET CENTRAL ART SG))
("<bird>"
  ("bird" <SV> V SUBJUNCTIVE VFIN)
  ("bird" <SV> V IMP VFIN)
  ("bird" <SV> V INF)
  ("bird" <SV> V PRES -SG3 VFIN)
  ("bird" N NOM SG))
("<$.>")

Removing ambiguity leaves one reading per word:

("<*i>"
  ("I" <*> <NonMod> PRON PERS NOM SG1))
("<see>"
  ("see" <SVO> V PRES -SG3 VFIN))
("<a>"
  ("a" <Indef> DET CENTRAL ART SG))
("<bird>"
  ("bird" N NOM SG))
("<$.>")
Disambiguator Performance

The best-known competitors mispredict the part of speech of up to 5% of words.
This (ENGCG) disambiguator makes a false prediction in only up to 0.3% of all cases.
Finite-State Syntax

All three types of structural ambiguity (morphological, clause boundary, and syntactic) are represented in parallel.
No subgrammars for morphological disambiguation are needed; one uniform rule component suffices for expressing the grammar.
The FS parser considers each sentence reading separately; CG only distinguishes between alternative word readings.
FS rules only have to parse one unambiguous sentence reading at a time, which improves parsing accuracy.
The syntax is more expressive than CG: the full power of regular expressions is available.
The Implication Rule

Implication rules express distributions in a straightforward, positive fashion and are very compact.
Several CG rules that express bits and pieces of the same grammatical phenomenon can usually be expressed with one or two transparent finite-state rules, as sketched below.
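A minimal sketch of what an implication rule "X => LC _ RC" checks: every occurrence of X in a sentence reading must sit in one of the licensed contexts, or the whole reading is discarded. The rule and tags below are invented for illustration:

    def satisfies(tags, x, licensed):
        """licensed is a set of (left_tag, right_tag) pairs allowed around x."""
        padded = ["##"] + tags + ["##"]   # sentence-boundary markers
        return all((padded[i - 1], padded[i + 1]) in licensed
                   for i in range(1, len(padded) - 1) if padded[i] == x)

    # Illustrative rule: a determiner may only occur after a verb or at the
    # sentence start, and only immediately before an adjective or a noun.
    rule = {("V", "N"), ("V", "A"), ("##", "N"), ("##", "A")}
    print(satisfies(["PRON", "V", "DET", "N"], "DET", rule))   # True
    print(satisfies(["PRON", "V", "N", "DET"], "DET", rule))   # False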
Experimentation

Experiment: 200 syntactic rules applied to a test text.
Objective: Could morphological ambiguities that are too hard for the ENGCG disambiguator to resolve be resolved with a more expressive grammatical description and a more powerful parsing formalism?
It is difficult to write a parser as mature as ENGCG, so some of the rules were "inspired" by the test text, though all rules were tested against various other corpora to assure generality.
Experimentation

The test data was first analyzed with the ENGCG disambiguator: of 1,400 words, 43 remained ambiguous in morphological category.
The finite-state parser then enriched the text (creating more ambiguities).
After parsing with the FS parser was complete, only 3 words remained morphologically ambiguous.
Thus, introducing more descriptive elements into the sentence resolved almost all 43 ambiguities.
Caveat

Morphological ambiguities went from 43 down to 4, but syntactic ambiguities arose across the 64 test sentences:
  48 sentences (75%) received a single syntactic analysis
  13 sentences received two analyses
  1 sentence received three analyses
  2 sentences received four analyses
A new notation was developed.
A Tagging Example

@@
smoking     PCP1  @mv SUBJ@
@
cigarettes  N     @obj
@
inspires    V     @MV MAINC@
@
the         DET   @>N
@
fat         A     @>N
@
butcher's   N     @>N
@
wife        N     @OBJ
@
and         CC    @CC
@
daughters   N     @OBJ
@
.           FULLSTOP
@@

Boundary markers (@@ and @) are added, and words are given functional tags:
  @mv = main verb in a non-finite construction
  @MV = main verb in a finite construction ("inspires" here)
  MAINC@ = main-clause tag that distinguishes the primary verb
  @>N = determiner or premodifier of a nominal
The attachment ambiguity [[fat butcher's] wife] vs. [fat [butcher's wife]] is irresolvable and is kept covert by the notation.
Uppercase tags are used for finite constructions, which eases the grammarian's task: if these non-finite clauses could not be treated separately, extra checks for further subjects in non-finite clauses would have been necessary.
Grouping Non-Finite Clauses

@@
Henry     N     @SUBJ
@
dislikes  V     @MV MAINC@
@
her       PRON  @subj
@
leaving   PCP1  @mv OBJ@
@
so        ADV   @>A
@
early     ADV   @ADVL
@
.         FULLSTOP
@@

Note the two simplex subjects in the same clause: "Henry" is the subject of the finite clause (main verb "dislikes"), and "her" is the subject of the non-finite clause (main verb "leaving"); the difference in case (@SUBJ versus @subj) marks the distinction.
The adverbial "so early" cannot be attached to either "dislikes" or "leaving": a structurally irresolvable ambiguity, so the description through notation is kept shallow.
Extended Subjects

@@
What        PRON  @SUBJ
@
makes       V     @MV SUBJ@
@
them        PRON  @OBJ
@
acceptable  A     @OC
@/
is          V     @MV MAINC@
@/
that        CS    @CS
@
they        PRON  @SUBJ
@
have        V     @MV SC@
@
different   A     @>N
@
verbal      A     @>N
@
regents     N     @OBJ
@
.           FULLSTOP
@@

"What makes them acceptable" acts as a finite clause functioning as the subject; "that they have different verbal regents" acts as a subject complement.
Deferred Prepositions

@@
What     PRON        @>>P
@
are      V           @AUX
@
you      PRON        @SUBJ
@
talking  PCP1        @MV MAINC@
@
about    <Deferred>  PREP @ADVL
@
?        QUESTION
@@

@>>P signifies a deferred preposition: "about" has no right-hand context, so its complement is either earlier in the sentence or non-existent.
Adverbs can also be deferred in this fashion.
Without a main verb, "Tolstoy her greatest novelist" (second example below) is still granted clause status, signified by the clause boundary symbol @/.
Note that it carries no function tag (only a main verb gets these).

@@
This     PRON  @SUBJ
@
is       V     @MV MAINC@
@
the      DET   @>N
@
house    N     @SC
@/
she      PRON  @SUBJ
@
was      V     @AUX
@
looking  PCP1  @MV N<@
@
for      PREP  @ADVL
@
.        FULLSTOP
@@

@@
Pushkin   N     @SUBJ
@
was       V     @MV MAINC@
@
Russia's  N     @>N
@
greatest  A     @>N
@
poet      N     @SC
@/
,         COMMA
@
and       CC    @CC
@
Tolstoy   N     @SUBJ
@
her       PRON  @>N
@
greatest  A     @>N
@
novelist  N     @SC
@
.         FULLSTOP
@@
Ambiguity Resolution with a Finite-State (FS) Parser

A sentence can have some 10 million morphological sentence readings; 10^32 readings if each boundary between words is made four-ways ambiguous; 10^64 readings if all syntactic ambiguities are added.
In isolation, each word is ambiguous in 1 to 70 ways.
It can be shown that the number of readings alone does not predict parsing complexity, e.g.:
"A pressure lubrication system is employed, the pump, driven from the distributor shaft extension, drawing oil from the sump through a strainer and distributing it through the cartridge oil filter to a main gallery in the cylinder block casting."
Reduction of Parsing Complexity in Reducing Ambiguity

A window of more than 2 or 3 words requires excessively hard computation.
Acquiring collocation matrices based on 4- or 5-grams would require tagged corpora far larger than the current manually validated ones.
Mispredictions accumulate, and more mispredictions are likely to occur in the later stages of such a scheme.
There is no reason to use uncertain probabilistic information as long as well-defined linguistic knowledge is available.
Degree of Complexity Reduction

Illegitimate readings are discarded along the way.
A sentence that is 10^66-ways ambiguous might have only 10^45 ambiguities left after initial processing through an automaton.
This takes a fraction of a second and reduces the readings by a factor of 10^21.
Another rule can then be applied, and the process repeated, reducing ambiguity to an acceptable level quickly.
Applying Rules Prior to Parsing: Four Methods

1. Process rules iteratively: intersect the sentence automaton with one rule automaton, take the result, and repeat with all remaining rules (takes a long time).
2. Order the rule automata before parsing, so that the most efficient rules are applied first.
3. Process all rules together.
4. Use extra information to direct parsing.

[Slide diagram: sentence automaton ∩ rule automaton → intersection result; repeated iteratively with all rules to give the end result.]
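The intersection at the heart of these methods is the standard product construction, sketched below over toy automata (both the sentence and the rule automaton here are invented for illustration):

    # DFAs as dicts: (state, symbol) -> next state, plus a set of accept states.

    def intersect(a, a_accept, b, b_accept, start=(0, 0)):
        """Product construction: accepts exactly the strings both DFAs accept."""
        trans, accept, stack, seen = {}, set(), [start], {start}
        while stack:
            p, q = stack.pop()
            if p in a_accept and q in b_accept:
                accept.add((p, q))
            for (s, sym), s2 in a.items():
                if s == p and (q, sym) in b:
                    nxt = (s2, b[(q, sym)])
                    trans[((p, q), sym)] = nxt
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
        return trans, accept

    # Sentence automaton: "a/DET" followed by "bird" read as either V or N.
    sent = {(0, "DET"): 1, (1, "V"): 2, (1, "N"): 2}
    # Rule automaton: forbid a V reading immediately after a DET.
    rule = {(0, "DET"): 1, (0, "V"): 0, (0, "N"): 0, (1, "N"): 0}
    trans, accept = intersect(sent, {2}, rule, {0, 1})
    print(trans)   # only the DET -> N path survives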
Before Parsing, Reduce the Number of Automata

A set of rule automata can easily be combined using intersection.
Not all rules are needed in parsing, because some categories might not be present in the sentence: select the applicable rules at runtime.

Execution times (sec.):

  Method     1      2    3     4    5
  Non-opt.   31000  730  1500  500  290
  Optimized  7000   840  350   110  30
Summing Up the Process Described

1. Preprocess the text (text normalization and boundary detection).
2. Morphologically analyze and enrich the text with syntactic and clause-boundary ambiguities.
3. Transform each sentence into a finite-state automaton.
4. Select the rules relevant to the sentence.
5. Intersect a couple of rule groups with the sentence automaton.
6. Apply all remaining rules in parallel.
7. Rank any remaining multiple analyses according to heuristic rules and select the best one if a totally unambiguous result is desired.
Conclusions

Church: Tag parts of speech; disregard illegitimate permutations based on their unlikelihood (through probabilistic analysis).
Voutilainen: The grammar rules, not the amount of ambiguity, determine the hardness of ambiguity resolution. Tag parts of speech, but consider finite and non-finite constructions to reduce the complexity of overcoming ambiguity.
References

Church, K. W. (1988). A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text. In Proceedings of the Second Conference on Applied Natural Language Processing (pp. 136-143). Association for Computational Linguistics. http://portal.acm.org/citation.cfm?id=974260

Voutilainen, A., & Tapanainen, P. (1995). Ambiguity Resolution in a Reductionistic Parser. In Proceedings of the Sixth Conference of the European Chapter of the Association for Computational Linguistics (pp. 394-403). Association for Computational Linguistics. http://arxiv.org/abs/cmp-lg/9502013