Presentation Transcript

Slide1

Context-Free Grammar
CSCI-GA.2590 – Lecture 3

Ralph Grishman

NYU

Slide2

A Grammar Formalism
We have informally described the basic constructs of English grammar.
Now we want to introduce a formalism for representing these constructs,
a formalism that we can use as input to a parsing procedure.

Slide3

Context-Free Grammar
A context-free grammar consists of:
  a set of non-terminal symbols A, B, C, … ∈ N
  a set of terminal symbols a, b, c, … ∈ T
  a start symbol S ∈ N
  a set of productions P, of the form N → (N ∪ T)*

Slide4

A Simple Context-Free Grammar
A simple CFG:
  S → NP VP
  NP → cats
  NP → the cats
  NP → the old cats
  NP → mice
  VP → sleep
  VP → chase NP
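For concreteness, this grammar could be stored in a program as a simple table. The following Python sketch is illustrative only; the representation and the names (SIMPLE_CFG, START_SYMBOL, is_nonterminal) are assumptions, not part of the lecture.

# A possible Python encoding of the simple CFG above: each nonterminal maps to
# a list of alternative right-hand sides; terminals are plain word strings.
SIMPLE_CFG = {
    "S":  [["NP", "VP"]],
    "NP": [["cats"], ["the", "cats"], ["the", "old", "cats"], ["mice"]],
    "VP": [["sleep"], ["chase", "NP"]],
}
START_SYMBOL = "S"

def is_nonterminal(symbol):
    # Convention assumed by this sketch: the nonterminals are the table's keys.
    return symbol in SIMPLE_CFG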

Slide5

Derivation and Language
If A → β is a production of the grammar, we can rewrite α A γ ⇒ α β γ.
A derivation is a sequence of such rewrite operations ⇒ … ⇒ …
The language generated by a CFG is the set of strings (sequences of terminals) which can be derived from the start symbol: S ⇒ … ⇒ … ⇒ some string in T*.
Example:
  S ⇒ NP VP ⇒ cats VP ⇒ cats chase NP ⇒ cats chase mice
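A derivation can be simulated directly from such a grammar. The Python sketch below (illustrative only; the function name rewrite_leftmost is mine, not from the lecture) rewrites the leftmost nonterminal at each step and reproduces the derivation above.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["cats"], ["the", "cats"], ["the", "old", "cats"], ["mice"]],
    "VP": [["sleep"], ["chase", "NP"]],
}

def rewrite_leftmost(sentential_form, choice):
    # Replace the leftmost nonterminal using the production selected by `choice`.
    for i, symbol in enumerate(sentential_form):
        if symbol in GRAMMAR:
            return sentential_form[:i] + GRAMMAR[symbol][choice] + sentential_form[i + 1:]
    return sentential_form      # no nonterminal left: the form is a terminal string

form = ["S"]
for choice in (0, 0, 1, 3):     # S -> NP VP, NP -> cats, VP -> chase NP, NP -> mice
    form = rewrite_leftmost(form, choice)
    print(" ".join(form))
# prints: NP VP / cats VP / cats chase NP / cats chase mice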

Slide6

Preterminals
It is convenient to include a set of symbols called preterminals (corresponding to the parts of speech) which can be directly rewritten as terminals (words).
This allows us to separate the productions into a set which generates sequences of preterminals (the “grammar”) and those which rewrite the preterminals as terminals (the “dictionary”).

Slide7

A Grammar with Preterminals
grammar:
  S → NP VP
  NP → N
  NP → ART N
  NP → ART ADJ N
  VP → V
  VP → V NP
dictionary:
  N → cats
  N → mice
  ADJ → old
  ART → the
  V → sleep
  V → chase
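One way to keep the two parts separate in code is sketched below (an assumed Python rendering, not from the lecture; the table names GRAMMAR and DICTIONARY are mine). The dictionary maps each word to its possible parts of speech.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["N"], ["ART", "N"], ["ART", "ADJ", "N"]],
    "VP": [["V"], ["V", "NP"]],
}
DICTIONARY = {
    "cats": ["N"], "mice": ["N"], "old": ["ADJ"],
    "the": ["ART"], "sleep": ["V"], "chase": ["V"],
}

def parts_of_speech(word):
    # A word may have several parts of speech; return all of them (empty if unknown).
    return DICTIONARY.get(word, [])

print(parts_of_speech("chase"))   # ['V']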

Slide8

Grouping Alternates
To make the grammar more compact, we group productions with the same left-hand side:
  S → NP VP
  NP → N | ART N | ART ADJ N
  VP → V | V NP

Slide9

A grammar can be used to:
  generate
  recognize
  parse
Why parse?
Parsing assigns the sentence a structure that may be helpful in determining its meaning.

Slide10

vs. Finite State Languages
CFGs are more powerful than finite-state grammars (regular expressions).
An FSG cannot generate center embeddings:
  S → ( S ) | x
Even if an FSG can capture the language, it may be unable to assign the nested structures we want.
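To see the extra power concretely, here is a tiny recursive-descent recognizer (an illustrative Python sketch, not from the lecture) for the center-embedded language S → ( S ) | x, which has to track nesting depth and therefore lies beyond any finite-state device.

# Recognizer for S -> ( S ) | x, i.e. strings of the form (^n x )^n.
def match_S(s, i=0):
    # Return the position just after an S starting at i, or None if no S starts here.
    if i < len(s) and s[i] == "x":
        return i + 1
    if i < len(s) and s[i] == "(":
        j = match_S(s, i + 1)
        if j is not None and j < len(s) and s[j] == ")":
            return j + 1
    return None

def recognize(s):
    return match_S(s) == len(s)

print(recognize("((x))"))   # True
print(recognize("((x)"))    # False: unbalanced nesting is rejected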

Slide11

A slightly bigger CFG
  sentence → np vp
  np → ngroup | ngroup pp
  ngroup → n | art n | art adj n
  vp → v | v np | v vp | v np pp    (auxiliary)
  pp → p np    (pp = prepositional phrase)

Slide12

Ambiguity
Most sentences will have more than one parse.
Generally different parses will reflect different meanings …
“I saw the man with a telescope.”
Can attach pp (“with a telescope”) under np or vp.
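As an illustration, the two readings can be written as bracketed structures. The tuple notation below is hypothetical, chosen only to make the two attachments visible.

# The two attachments of "with a telescope", as nested bracketings.
np_attachment = ("S", ("NP", "I"),
                      ("VP", ("V", "saw"),
                             ("NP", ("NP", "the man"),
                                    ("PP", "with a telescope"))))   # the man has the telescope
vp_attachment = ("S", ("NP", "I"),
                      ("VP", ("V", "saw"),
                             ("NP", "the man"),
                             ("PP", "with a telescope")))           # the seeing is done with the telescope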

Slide13

A CFG with just 2 nonterminals
  S → NP V | NP V NP
  NP → N | ART N | ART ADJ N
We will use this grammar for tracing our parsers.

Slide14

Top-down parser
repeat:
  expand leftmost non-terminal using first production (save any alternative productions on backtrack stack)
  if we have matched entire sentence, quit (success)
  if we have generated a terminal which doesn't match sentence, pop choice point from stack (if stack is empty, quit (failure))
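As a concrete illustration, the same search can be written out in Python. This is only a sketch: it uses recursion over the production alternatives in place of the explicit backtrack stack described above, the function names (expand, parse) are mine, and it uses the two-nonterminal grammar of Slide13 with a small assumed dictionary.

GRAMMAR = {
    "S":  [["NP", "V"], ["NP", "V", "NP"]],
    "NP": [["N"], ["ART", "N"], ["ART", "ADJ", "N"]],
}
DICTIONARY = {
    "the": "ART", "old": "ADJ", "cat": "N", "cats": "N",
    "mice": "N", "chases": "V", "chase": "V", "sleep": "V",
}

def expand(symbols, words):
    # Try to match the list of grammar symbols against the list of words.
    # Yield the number of words consumed by every complete match of `symbols`.
    if not symbols:
        yield 0
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:                                  # nonterminal: try each production in turn
        for production in GRAMMAR[first]:
            for used in expand(production + rest, words):
                yield used
    elif words and DICTIONARY.get(words[0]) == first:     # preterminal matches the next word
        for used in expand(rest, words[1:]):
            yield 1 + used
    # otherwise: dead end -- backtrack to the previous choice point

def parse(sentence):
    words = sentence.split()
    return any(used == len(words) for used in expand(["S"], words))

print(parse("the cat chases mice"))   # True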

Slide15

Top-down parser
Input: the cat chases mice
Tree: [0: S]

Slide16

Top-down parser
Input: the cat chases mice
Tree: [0: S  [1: NP]  [2: V]]
Backtrack table:
  0: S → NP V NP

Slide17

Top-down parser
Input: the cat chases mice
Tree: [0: S  [1: NP  [3: N]]  [2: V]]
Backtrack table:
  0: S → NP V NP
  1: NP → ART ADJ N
  1: NP → ART N

Slide18

Top-down parser
Input: the cat chases mice
Tree: [0: S  [1: NP  [3: ART]  [4: N]]  [2: V]]
Backtrack table:
  0: S → NP V NP
  1: NP → ART ADJ N

Slide19

Top-down parser
Input: the cat chases mice
Tree: [0: S  [1: NP  [3: ART]  [4: ADJ]  [5: N]]  [2: V]]
Backtrack table:
  0: S → NP V NP

Slide20

Top-down parser
Input: the cat chases mice
Tree: [0: S  [1: NP]  [2: V]  [3: NP]]
Backtrack table: (empty)

Slide21

Top-down parser
Input: the cat chases mice
Tree: [0: S  [1: NP  [3: N]]  [2: V]  [3: NP]]
Backtrack table:
  1: NP → ART ADJ N
  1: NP → ART N

Slide22

Top-down parser
Input: the cat chases mice
Tree: [0: S  [1: NP  [4: ART]  [5: N]]  [2: V]  [3: NP]]
Backtrack table:
  1: NP → ART ADJ N

Slide23

Top-down parser
Input: the cat chases mice
Tree: [0: S  [1: NP  [4: ART]  [5: N]]  [2: V]  [3: NP  [6: N]]]
parse!
Backtrack table:
  1: NP → ART ADJ N
  3: NP → ART ADJ N
  3: NP → ART N

Slide24

Bottom-up parser
Builds a table where each row represents a parse tree node spanning the words from start up to end.

symbol  start  end  constituents
N       0      1    -
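A row of this table could be represented, for example, as a small record type. This is a sketch only; the field names simply follow the column headers above.

from collections import namedtuple

# One parse-tree node in the bottom-up table: the symbol it covers, the word
# positions it spans (start inclusive, end exclusive), and the row numbers of
# its constituents (None for part-of-speech rows, which cover a single word).
Row = namedtuple("Row", ["symbol", "start", "end", "constituents"])

table = [Row("N", 0, 1, None)]   # the single-row table shown on this slide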

Slide25

Bottom-up parser
We initialize the table with the parts of speech of each word …

symbol  start  end  constituents
ART     0      1    -
N       1      2    -
V       2      3    -
N       3      4    -

Slide26

Bottom-up parser
We initialize the table with the parts of speech of each word …
… remembering that many English words have several parts of speech.

symbol  start  end  constituents
ART     0      1    -
N       1      2    -
V       2      3    -
N       2      3    -
N       3      4    -

Slide27

Bottom-up parser
Then if there is a production A → B C and we have entries for B and C with end(B) = start(C), we add an entry for A with start = start(B) and end = end(C).
[see lecture notes for handling general productions]

node #  symbol  start  end  constituents
0       ART     0      1    -
1       N       1      2    -
2       V       2      3    -
3       N       2      3    -
4       N       3      4    -
5       NP      0      2    [0, 1]

Slide28

Bottom-up parser

node #  symbol  start  end  constituents
0       ART     0      1    -
1       N       1      2    -
2       V       2      3    -
3       N       2      3    -
4       N       3      4    -
5       NP      0      2    [0, 1]
6       NP      1      2    [1]
7       NP      2      3    [3]
8       NP      3      4    [4]
9       S       0      4    [5, 2, 8]
10      S       1      4    [6, 2, 8]
… several more S’s

parse!
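Putting the last few slides together, the table-building procedure could be rendered roughly as follows. This is a Python sketch under stated assumptions: the function names (bottom_up_parse, matches), the dictionary (with “chases” listed as both V and N, as on Slide26), and the exact row numbering are mine and may differ from the slides.

GRAMMAR = [
    ("S",  ["NP", "V"]),
    ("S",  ["NP", "V", "NP"]),
    ("NP", ["N"]),
    ("NP", ["ART", "N"]),
    ("NP", ["ART", "ADJ", "N"]),
]
DICTIONARY = {"the": ["ART"], "cat": ["N"], "chases": ["V", "N"], "mice": ["N"]}

def matches(table, rhs, start=None):
    # Yield lists of row numbers whose symbols spell out `rhs` over contiguous spans
    # (each constituent's start equals the previous constituent's end).
    if not rhs:
        yield []
        return
    for idx, (symbol, s, e, _) in enumerate(table):
        if symbol == rhs[0] and (start is None or s == start):
            for rest in matches(table, rhs[1:], e):
                yield [idx] + rest

def bottom_up_parse(words):
    # Initialize the table with every part of speech of every word.
    table = []
    for i, word in enumerate(words):
        for pos in DICTIONARY[word]:
            table.append((pos, i, i + 1, None))
    # Keep applying productions until no new entry can be added.
    added = True
    while added:
        added = False
        for lhs, rhs in GRAMMAR:
            for rows in matches(table, rhs):
                start = table[rows[0]][1]
                end = table[rows[-1]][2]
                entry = (lhs, start, end, rows)
                if entry not in table:
                    table.append(entry)
                    added = True
    return table

words = "the cat chases mice".split()
for n, row in enumerate(bottom_up_parse(words)):
    print(n, row)
# A row (S, 0, 4, ...) spanning the whole sentence signals a successful parse.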