
A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text

Kenneth Ward Church

Ambiguity Resolution In A Reductionist Parser
Atro Voutilainen & Pasi Tapanainen

Presented by Mat Kelly

CS895 – Web-based Information Retrieval

Old Dominion University

October 18, 2011

Church's Ideas
Objective: Develop a system to tag parts of speech and resolve ambiguity.
When ambiguity occurs, use stochastic processes to determine the optimal lexical possibility.

Example: "I see a bird"
  I    -> {PPSS, NP}
  see  -> {VB, UH}
  a    -> {AT, IN}
  bird -> {NN}

Tag key:
  PPSS = pronoun
  NP   = proper noun
  VB   = verb
  UH   = interjection
  IN   = preposition
  AT   = article
  NN   = noun

Various Ambiguities
Noun-Verb Ambiguity: wind
  e.g. "wind your watch" versus "the wind blows"
Noun-Complementizer Ambiguity: that
  "Did you see that?" vs. "It is a shame that he is leaving."
Noun-Noun vs. Adjective-Noun Distinction
  e.g. "oily FLUID" versus "TRANSMISSION fluid": the first puts stress on "fluid", the second on "transmission"

Overcoming Lexical Ambiguity
A linear-time dynamic programming algorithm that optimizes the product of lexical probabilities.
Recognize that no Markov process can fully capture English grammar (Chomsky).
  e.g. "The man who said that statement is arriving today"
Dependencies spanning a distance > 1 word prevent a purely Markovian analysis.
[diagram: a "dependent on" arc spanning several intervening words]

Parsing Difficulties
No amount of syntactic sugar will help resolve the ambiguity*:
  "Time flies like an arrow."
  "Flying planes can be dangerous."
The parser must allow for multiple possibilities.
* Voutilainen states otherwise

Parsing Impossibilities

Even a parser that considers likelihood will sometimes be confused by "garden path" sentences:
  "The horse raced past the barn fell."
  ("raced": past-tense verb or passive participle?)
Other than these, there is always a unique best interpretation that can be found with very limited resources.

Considering Likelihood
  Have/VB the students take the exam. (imperative)
  Have/AUX the students taken the exam? (question)
Fidditch (a parser) proposed the lexical disambiguation rule [**n+prep] != n [npstarters],
i.e. if a noun/preposition is followed by something that starts a noun phrase, rule out the noun possibility.
Most lexical rules in Fidditch can be reformulated in terms of bigram and trigram statistics.

…With the Help of a Dictionary
Dictionaries tend to convey possibilities, not likelihoods.
Initially consider all assignments for words:
  [NP [N I] [N see] [N a] [N bird]]
  [S [NP [N I] [N see] [N a]] [VP [V bird]]]
Some part-of-speech assignments are more likely than others.

Cross-Referencing Likelihood With the Tagged Brown Corpus
Parsing likelihood based on the Brown Corpus, a manually tagged corpus.

  Word   Part of Speech        Frequency
  I      PPSS (pronoun)        5837
         NP (proper noun)      1
  see    VB (verb)             771
         UH (interjection)     1
  a      AT (article)          23013
         IN (French)           6
  bird   NN (noun)             26

Probability that "I" is a pronoun = 5837/5838
  = frequency(PPSS | "I") / frequency("I")
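The lexical-probability estimate above can be sketched in a few lines of Python using the Brown Corpus counts from the table. This is a toy illustration of the ratio, not Church's actual implementation:

```python
# Toy lexical-probability lookup: P(tag | word) = freq(tag, word) / freq(word),
# using the Brown Corpus counts quoted on the slide.
brown_counts = {
    "I":    {"PPSS": 5837, "NP": 1},
    "see":  {"VB": 771, "UH": 1},
    "a":    {"AT": 23013, "IN": 6},
    "bird": {"NN": 26},
}

def lexical_prob(word, tag):
    counts = brown_counts[word]
    return counts.get(tag, 0) / sum(counts.values())

print(lexical_prob("I", "PPSS"))  # 5837/5838, about 0.99983
```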

Contextual Probability
Estimate the probability of observing part of speech X given the two following parts of speech Y and Z:
  P(X | Y, Z) = freq(X, Y, Z) / freq(Y, Z)
e.g. the probability of observing a verb before an article and a noun is freq(VB, AT, NN) / freq(AT, NN).
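The trigram estimate can be sketched as follows. The tag stream here is invented for illustration, not Brown Corpus data:

```python
from collections import Counter

# Invented toy tag stream, not real corpus data.
tags = ["PPSS", "VB", "AT", "NN", ".", "PPSS", "VB", "AT", "NN", "."]

trigrams = Counter(zip(tags, tags[1:], tags[2:]))
bigrams = Counter(zip(tags, tags[1:]))

def contextual_prob(x, y, z):
    """P(X | followed by Y, Z) = freq(X, Y, Z) / freq(Y, Z)."""
    return trigrams[(x, y, z)] / bigrams[(y, z)] if bigrams[(y, z)] else 0.0

print(contextual_prob("VB", "AT", "NN"))  # 1.0: every "AT NN" here follows a VB
```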

Enumerate All Potential Parsings

  I     see   a    bird
  PPSS  VB    AT   NN
  PPSS  VB    IN   NN
  PPSS  UH    AT   NN
  PPSS  UH    IN   NN
  NP    VB    AT   NN
  NP    VB    IN   NN
  NP    UH    AT   NN
  NP    UH    IN   NN

Score each sequence by the product of its lexical and contextual probabilities, and select the best sequence.
It is not necessary to enumerate all possible assignments, because the scoring algorithm cannot see more than two words away.

Complexity Reduction
Some sequences cannot possibly compete with others in the parsing process and are abandoned.
Only O(n) paths are enumerated.
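The dynamic-programming search can be sketched as below, run right to left as in Church's worked example. All probabilities are invented toy numbers, and a bigram context is used instead of the trigram one, purely to keep the sketch short:

```python
import math

lexical = {("I", "PPSS"): 0.999, ("I", "NP"): 0.001,
           ("see", "VB"): 0.99, ("see", "UH"): 0.01,
           ("a", "AT"): 0.999, ("a", "IN"): 0.001,
           ("bird", "NN"): 1.0}
contextual = {("PPSS", "VB"): 0.5, ("NP", "VB"): 0.2,
              ("VB", "AT"): 0.4, ("UH", "AT"): 0.01,
              ("AT", "NN"): 0.9, ("IN", "NN"): 0.3}
tags_for = {"I": ["PPSS", "NP"], "see": ["VB", "UH"],
            "a": ["AT", "IN"], "bird": ["NN"]}

def best_tagging(words):
    """Keep only the best-scoring suffix per current tag: O(n) paths."""
    last = words[-1]
    best = {t: (math.log(lexical[(last, t)]), [t]) for t in tags_for[last]}
    for word in reversed(words[:-1]):
        new = {}
        for t in tags_for[word]:
            score, seq = max(
                (s + math.log(contextual.get((t, nxt), 1e-9)), q)
                for nxt, (s, q) in best.items())
            new[t] = (score + math.log(lexical[(word, t)]), [t] + seq)
        best = new
    return max(best.values())[1]

print(best_tagging(["I", "see", "a", "bird"]))  # ['PPSS', 'VB', 'AT', 'NN']
```

Keeping only the best path per tag at each position is exactly why only O(n) paths survive: hopeless competitors such as the "IN" readings are discarded as soon as they fall out of range.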

An Example: "I see a bird"

Find all assignments of parts of speech and score each partial sequence (log probabilities):
  (-4.848072 "NN")
  (-7.4453945 "AT" "NN")
  (-15.01957 "IN" "NN")
  (-10.1914 "VB" "AT" "NN")
  (-18.54318 "VB" "IN" "NN")
  (-29.974142 "UH" "AT" "NN")
  (-36.53299 "UH" "IN" "NN")
  (-12.927581 "PPSS" "VB" "AT" "NN")
  (-24.177242 "NP" "VB" "AT" "NN")
  (-35.667458 "PPSS" "UH" "AT" "NN")
  (-44.33943 "NP" "UH" "AT" "NN")
  (-12.262333 "" "" "PPSS" "VB" "AT" "NN")

Result: I/PPSS see/VB a/AT bird/NN .

Note that all four possible paths derived from the French usage of "IN" score lower than the others, and there is no way additional input will make a difference.
The search continues for two more iterations to obtain path values that put words 3 and 4 out of range.

Further Attempts at Ambiguity Resolution (Voutilainen's Paper)

Assign annotations using a finite-state parser.
Knowledge-based reductionist grammatical analysis is facilitated by first enriching the text, which introduces more ambiguity.
The amount of ambiguity, as shown, does not predict the speed of analysis.

Constraint Grammar (CG) Parsing
1. Preprocessing and morphological analysis
2. Disambiguation of morphological (part-of-speech) ambiguities
3. Mapping of syntactic functions onto morphological categories
4. Disambiguation of syntactic functions

Morphological Description

("<*i>"
  ("i" <*> ABBR NOM SG)
  ("I" <*> <NonMod> PRON PERS NOM SG1))
("<see>"
  ("see" <SVO> V SUBJUNCTIVE VFIN)
  ("see" <SVO> V IMP VFIN)
  ("see" <SVO> V INF)
  ("see" <SVO> V PRES -SG3 VFIN))
("<a>"
  ("a" <Indef> DET CENTRAL ART SG))
("<bird>"
  ("bird" <SV> V SUBJUNCTIVE VFIN)
  ("bird" <SV> V IMP VFIN)
  ("bird" <SV> V INF)
  ("bird" <SV> V PRES -SG3 VFIN)
  ("bird" N NOM SG))
("<$.>")

Removing Ambiguity: after disambiguation, one reading per word remains.

("<*i>"
  ("I" <*> <NonMod> PRON PERS NOM SG1))
("<see>"
  ("see" <SVO> V PRES -SG3 VFIN))
("<a>"
  ("a" <Indef> DET CENTRAL ART SG))
("<bird>"
  ("bird" N NOM SG))
("<$.>")

Disambiguator Performance
The best-known competitors mispredict the part of speech of up to 5% of words.
This (ENGCG) disambiguator makes a false prediction in at most 0.3% of all cases.

Finite-State Syntax
All three types of structural ambiguity are represented in parallel: morphological, clause boundary, and syntactic.
No subgrammars for morphological disambiguation are needed; one uniform rule component suffices for expressing the grammar.
The FS parser considers each sentence reading separately; CG only distinguishes between alternative word readings.
FS rules only have to parse one unambiguous sentence at a time, which improves parsing accuracy.
The syntax is more expressive than CG's: the full power of regular expressions is available.

The Implication Rule
Implication rules express distributions in a straightforward, positive fashion and are very compact.
Several CG rules that express bits and pieces of the same grammatical phenomenon can usually be expressed with one or two transparent finite-state rules.
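The flavor of an implication rule, "this pattern is licensed only in these contexts", can be mimicked in plain Python with regular expressions over a flat tag string. The tag names and the rule below are invented for illustration; real rules are compiled into finite-state automata over whole sentence readings:

```python
import re

def implication_rule(target, contexts):
    """Accept a reading iff every occurrence of `target` sits in one of
    the allowed (left-context, right-context) regex pairs."""
    def accepts(reading):
        return all(
            any(re.search(lc + r"$", reading[:m.start()]) and
                re.match(rc, reading[m.end():])
                for lc, rc in contexts)
            for m in re.finditer(target, reading))
    return accepts

# Invented rule: a determiner is licensed only when a noun follows,
# possibly after premodifying adjectives.
det_rule = implication_rule(r"DET ", [(r"", r"(A )*N")])
print(det_rule("PRON VB DET N "))   # True
print(det_rule("PRON VB DET VB "))  # False
```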

Experimentation
Experiment: 200 syntactic rules applied to a test text.
Objective: Were morphological ambiguities too hard for the ENGCG disambiguator resolvable with a more expressive grammatical description and a more powerful parsing formalism?
It is difficult to write a parser as mature as ENGCG, so some of the rules were "inspired" by the test text, though all rules were tested against various other corpora to assure generality.

Experimentation
The test data were first analyzed with the ENGCG disambiguator: of 1,400 words, 43 remained ambiguous as to morphological category.
Then the finite-state parser enriched the text (creating more ambiguities).
After parsing with the FS parser was complete, only 3 words remained morphologically ambiguous.
Thus, introducing more descriptive elements into the sentence resolved almost all 43 ambiguities.

Caveat
Morphological ambiguities went from 43 to 4, but the syntactic ambiguities raised amounted to 64 sentences:
  48 sentences (75%) received a single syntactic analysis
  13 sentences received two analyses
  1 sentence received 3 analyses
  2 sentences received 4 analyses
A new notation was developed.

A Tagging Example

@@ smoking PCP1 @mv SUBJ@
@  cigarettes N @obj
@  inspires V @MV MAINC@
@  the DET @>N
@  fat A @>N
@  butcher's N @>N
@  wife N @OBJ
@  and CC @CC
@  daughters N @OBJ
@  . FULLSTOP @@

Add boundary markers (@ between words, @@ at sentence boundaries).
Give words functional tags:
  @mv = main verb in a non-finite construction
  @MV = "inspires" is the main verb in a finite construction
  @MAINC = main clause tag, to distinguish the primary verb
  @>N = determiner or premodifier of a nominal
[[fat butcher's] wife] vs. [fat [butcher's wife]] is an irresolvable ambiguity, kept covert by the notation.
Uppercase is used for finite constructions; this eases the grammarian's task.
If we could not treat these non-finite clauses separately, extra checks for further subjects in non-finite clauses would have been necessary!

@@ Henry N @SUBJ
@  dislikes V @MV MAINC@
@  her PRON @subj
@  leaving PCP1 @mv OBJ@
@  so ADV @>A
@  early ADV @ADVL
@  . FULLSTOP @@

Grouping Non-Finite Clauses
Note the two simplex subjects in the same clause:
  "Henry" is the subject in the finite clause with main verb "dislikes".
  "her" is the subject in the non-finite clause with main verb "leaving".
Note the difference in case: uppercase @SUBJ for the finite clause, lowercase @subj for the non-finite one.
The parser is unable to attach the adverbial "so early" to either "dislikes" or "leaving": a structurally irresolvable ambiguity.
Description through notation is shallow.

@@ What PRON @SUBJ
@  makes V @MV SUBJ@
@  them PRON @OBJ
@  acceptable A @OC
@/ is V @MV MAINC@
@/ that CS @CS
@  they PRON @SUBJ
@  have V @MV SC@
@  different A @>N
@  verbal A @>N
@  regents N @OBJ
@  . FULLSTOP @@

Extended Subjects
"What makes them acceptable" acts as a finite clause functioning as the subject.
"that they have different verbal regents" acts as a subject complement.

@@ What PRON @>>P
@  are V @AUX
@  you PRON @SUBJ
@  talking PCP1 @MV MAINC@
@  about <Deferred> PREP @ADVL
@  ? QUESTION @@

Deferred Prepositions
@>>P signifies a delayed preposition, i.e. "about" has no right-hand context; its complement is either prior or non-existent.
Adverbs can also be deferred in this fashion.

Without a main verb, "Tolstoy her greatest novelist" is still granted clause status, signified by the clause boundary symbol @\.
Note that it carries no function tag (only a main verb gets these).

@@ This PRON @SUBJ
@  is V @MV MAINC@
@  the DET @>N
@  house N @SC
@/ she PRON @SUBJ
@  was V @AUX
@  looking PCP1 @MV N<@
@  for PREP @ADVL
@  . FULLSTOP @@

@@ Pushkin N @SUBJ
@  was V @MV MAINC@
@  Russia's N @>N
@  greatest A @>N
@  poet N @SC
@\ , COMMA
@  and CC @CC
@  Tolstoy N @SUBJ
@  her PRON @>N
@  greatest A @>N
@  novelist N @SC
@  . FULLSTOP @@

Ambiguity Resolution with a Finite-State (FS) Parser

10 million sentence readings.
10^32 readings if each boundary between words is made four-ways ambiguous.
10^64 readings if all syntactic ambiguities are added.
In isolation, each word is ambiguous in 1-70 ways.
We can show that the number of readings alone does not predict parsing complexity.

Example sentence: "A pressure lubrication system is employed, the pump, driven from the distributor shaft extension, drawing oil from the sump through a strainer and distributing it through the cartridge oil filter to a main gallery in the cylinder block casting."

Reduction of Parsing Complexity in Reducing Ambiguity
A window of more than 2 or 3 words requires excessively hard computation.
Acquiring collocation matrices based on 4- or 5-grams requires tagged corpora far larger than the current manually validated ones.
Mispredictions accumulate, and more mispredictions are likely to occur in later stages with this scheme.
There is no reason to use unsure probabilistic information as long as we can use defined linguistic knowledge.

Degree of Complexity Reduction
Illegitimate readings are discarded along the way.
A sentence that is 10^66-way ambiguous might have only 10^45 ambiguities left after initial processing through an automaton.
This takes a fraction of a second and reduces the readings by a factor of 10^21.
Another rule can then be applied, and so on; ambiguity is quickly reduced to an acceptable level.

Applying Rules Prior to Parsing: Four Methods
1. Process rules iteratively: takes a long time
2. Order rule automata before parsing: the most efficient rules are applied first
3. Process all rules together
4. Use extra information to direct parsing

[diagram: sentence automaton intersected with a rule automaton yields an intermediate result; iteratively repeat with all rules to reach the end result]
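The iterative intersection loop can be caricatured with sets: the "sentence automaton" becomes the set of surviving readings, and each "rule automaton" a filter applied in turn. The rules below are invented for the toy sentence; a real FS parser intersects genuine automata without ever enumerating the readings:

```python
from itertools import product

# Ambiguity classes per word for the toy sentence "I see a bird".
classes = [("PPSS", "NP"), ("VB", "UH"), ("AT", "IN"), ("NN",)]
readings = set(product(*classes))  # 2 * 2 * 2 * 1 = 8 candidate readings

rules = [                          # invented toy constraints
    lambda r: "UH" not in r,       # no sentence-medial interjection
    lambda r: "IN" not in r,       # "a" is not the French preposition
    lambda r: r[0] == "PPSS",      # this clause starts with a pronoun
]

for rule in rules:                 # intersect with one rule automaton at a time
    readings = {r for r in readings if rule(r)}
    print(len(readings))           # ambiguity shrinks: 4, then 2, then 1

print(readings)                    # {('PPSS', 'VB', 'AT', 'NN')}
```

Applying the most restrictive filters first is the set-level analogue of ordering the rule automata so the most efficient rules run before parsing.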

Before Parsing, Reduce the Number of Automata
A set of automata can easily be combined using intersection.
Not all rules are needed in parsing, because some categories might not be present in the sentence; select the applicable rules at runtime.

Execution times (sec.):

  Method     1      2    3     4    5
  Non-opt.   31000  730  1500  500  290
  Optimized  7000   840  350   110  30

Summing Up the Process Described
1. Preprocess the text (text normalization and boundary detection)
2. Morphologically analyze and enrich the text with syntactic and clause-boundary ambiguities
3. Transform each sentence into a finite-state automaton
4. Select the relevant rules for the sentence
5. Intersect a couple of rule groups with the sentence automaton
6. Apply all remaining rules in parallel
7. Rank the resulting multiple analyses according to heuristic rules and select the best one, if a totally unambiguous result is desired

Conclusions
Church: Tag parts of speech; disregard illegitimate permutations based on their unlikelihood (through probabilistic analysis).
Voutilainen: The grammar rules, not the amount of ambiguity, determine the hardness of ambiguity resolution. Tag parts of speech, but consider finite and non-finite constructions to reduce the complexity of overcoming ambiguity.

References
Church, K. W. (1988). A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text. Proceedings of the Second Conference on Applied Natural Language Processing (pp. 136-143). Association for Computational Linguistics. http://portal.acm.org/citation.cfm?id=974260

Voutilainen, A., & Tapanainen, P. (1995). Ambiguity Resolution in a Reductionistic Parser. Proceedings of the Sixth Conference of the European Chapter of the Association for Computational Linguistics (pp. 394-403). Association for Computational Linguistics. http://arxiv.org/abs/cmp-lg/9502013