Presentation Transcript

Slide1

Knowledge and Tree-Edits in Learnable Entailment Proofs

Asher Stern, Amnon Lotan, Shachar Mirkin, Eyal Shnarch, Lili Kotlerman, Jonathan Berant and Ido Dagan
TAC, November 2011, NIST, Gaithersburg, Maryland, USA
Download at: http://www.cs.biu.ac.il/~nlp/downloads/biutee

BIUTEE

Slide2

RTE

Classify a (T, H) pair as ENTAILING or NON-ENTAILING.

Example:
T: The boy was located by the police.
H: Eventually, the police found the child.

Slide3

Matching vs. Transformations

Matching

Sequence of transformations (a proof):
T = T0 → T1 → T2 → ... → Tn = H

Tree-Edits:
- Complete proofs
- Estimate confidence

Knowledge-based Entailment Rules:
- Linguistically motivated
- Formalize many types of knowledge

Slide4

Transformation based RTE - Example

T = T0 → T1 → T2 → ... → Tn = H

Text: The boy was located by the police.
Hypothesis: Eventually, the police found the child.

Slide5

Transformation based RTE - Example

T = T0 → T1 → T2 → ... → Tn = H

Text: The boy was located by the police.
→ The police located the boy.
→ The police found the boy.
→ The police found the child.
Hypothesis: Eventually, the police found the child.

Slide6

Transformation based RTE - Example

T = T0 → T1 → T2 → ... → Tn = H

Slide7

BIUTEE Goals

Tree Edits:
- Complete proofs
- Estimate confidence

Entailment Rules:
- Linguistically motivated
- Formalize many types of knowledge

BIUTEE integrates the benefits of both worlds.

Slide8

Challenges / System Components

How to:
1. generate linguistically motivated complete proofs?
2. estimate proof confidence?
3. find the best proof?
4. learn the model parameters?

Slide9

1. Generate linguistically motivated complete proofs


Slide10

Entailment Rules

boy → child

Rule types: Generic Syntactic, Lexical Syntactic, Lexical.

Bar-Haim et al. 2007. Semantic inference at the lexical-syntactic level.
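To make the mechanism concrete, here is a minimal sketch of applying a lexical entailment rule such as boy → child. This is not BIUTEE's code (the released system is a Java package); the class and function names are illustrative assumptions, and real rules rewrite parse-tree nodes rather than flat token lists.

```python
# Illustrative sketch of a lexical entailment rule (not BIUTEE's actual API).
# A rule rewrites a word in the text toward the hypothesis, e.g. boy -> child.

from dataclasses import dataclass

@dataclass
class LexicalRule:
    lhs: str       # left-hand side, e.g. "boy"
    rhs: str       # right-hand side, e.g. "child"
    score: float   # confidence from the knowledge resource (e.g. WordNet)

def apply_lexical_rule(tokens: list[str], rule: LexicalRule) -> list[str]:
    """Return a new token sequence with every occurrence of lhs replaced by rhs."""
    return [rule.rhs if t == rule.lhs else t for t in tokens]

rule = LexicalRule("boy", "child", score=0.9)
print(apply_lexical_rule("the police found the boy".split(), rule))
# ['the', 'police', 'found', 'the', 'child']
```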

Slide11

Extended Tree Edits (On The Fly Operations)

Predefined custom tree edits:
- Insert node on the fly
- Move node / move sub-tree on the fly
- Flip part of speech
- ...

These heuristically capture linguistic phenomena. Each operation has an operation definition and a features definition.

Slide12

Proof over Parse Trees - Example

T = T0 → T1 → T2 → ... → Tn = H

Text: The boy was located by the police.
→ Passive to active: The police located the boy.
→ X locate Y → X find Y: The police found the boy.
→ Boy → child: The police found the child.
→ Insertion on the fly: Eventually, the police found the child.
Hypothesis: Eventually, the police found the child.
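The proof chain can be pictured as a list of (operation, result) steps. The sketch below lets strings stand in for parse trees; in BIUTEE the transformations rewrite dependency trees, so this illustrates only the control flow.

```python
# A minimal sketch of a proof as a sequence of transformations (illustrative).

steps = [
    ("passive to active",            "The police located the boy."),
    ("DIRT: X locate Y -> X find Y", "The police found the boy."),
    ("lexical: boy -> child",        "The police found the child."),
    ("insertion on the fly",         "Eventually, the police found the child."),
]

text = "The boy was located by the police."
hypothesis = "Eventually, the police found the child."

current = text
for op, result in steps:
    print(f"{current!r} --[{op}]--> {result!r}")
    current = result
assert current == hypothesis  # the proof ends when T has been rewritten into H
```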

Slide13

2. Estimate proof confidence

13

Slide14

Cost based Model

- Define operation cost: it assesses the operation's validity.
- Represent each operation as a feature vector; the cost is a linear combination of the feature values.
- Define proof cost as the sum of the operations' costs.
- Classify: ENTAILING if and only if the proof cost is smaller than a threshold.

Slide15

Feature vector representation

Define operation cost. Represent each operation as a feature vector.

Features: (Insert-Named-Entity, Insert-Verb, …, WordNet, Lin, DIRT, …)

An operation:
The police located the boy.
DIRT: X locate Y → X find Y (score = 0.9)
The police found the boy.

Feature vector that represents the operation: (0, 0, …, 0.457, …, 0). The DIRT feature's value is a downward function of the rule's score; all other features are 0.
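A sketch of this score-to-feature mapping appears below. The -log mapping is an assumption: the slide only says the value is a decreasing function of the score (and its 0.457 for a 0.9-score rule implies some other concrete mapping), so treat the formula as illustrative.

```python
import math

# Sketch: turning a rule's confidence score into a feature value.
# High-confidence rules should get low feature values (low cost); -log is one
# natural decreasing function, but the exact mapping here is an assumption.

FEATURES = ["Insert-Named-Entity", "Insert-Verb", "WordNet", "Lin", "DIRT"]

def operation_vector(feature_name: str, score: float) -> list[float]:
    """Sparse vector: zero everywhere except the feature that fired."""
    value = -math.log(score)  # decreasing in score (assumed form)
    return [value if f == feature_name else 0.0 for f in FEATURES]

print(operation_vector("DIRT", 0.9))  # [0.0, 0.0, 0.0, 0.0, 0.105...]
```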

Slide16

Cost based Model

Define operation cost: the cost is a linear combination of the feature values.

cost = weight-vector · feature-vector

The weight vector is learned automatically.
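Putting the last two slides together: each operation's cost is a dot product, a proof's cost is a sum of those, and classification compares the sum to a threshold. A minimal sketch (the weights, threshold, and vectors are made-up numbers):

```python
# Sketch of the cost-based model: cost(op) = w . f(op), cost(proof) = sum of
# operation costs, and the pair is ENTAILING iff cost(proof) < threshold b.

def dot(w: list[float], f: list[float]) -> float:
    return sum(wi * fi for wi, fi in zip(w, f))

def proof_cost(w: list[float], operation_vectors: list[list[float]]) -> float:
    return sum(dot(w, f) for f in operation_vectors)

def classify(w, b, operation_vectors) -> bool:
    return proof_cost(w, operation_vectors) < b   # True = ENTAILING

w = [0.8, 0.6, 0.3, 0.4, 0.2]                     # learned weight vector (dummy)
ops = [[0, 0, 0, 0, 0.105], [0, 0.5, 0, 0, 0]]    # two operations' feature vectors
print(classify(w, b=1.0, operation_vectors=ops))  # True
```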

Slide17

Confidence Model

Define operation cost. Represent each operation as a feature vector. Define proof cost as the sum of the operations' costs:

cost(proof) = w · f(proof)

where w is the weight vector and f(proof), the sum of the operations' feature vectors, is the vector that represents the proof.

Slide18

Feature vector representation - example

T = T0 → T1 → T2 → ... → Tn = H

Text: The boy was located by the police.
→ Passive to active: The police located the boy.   (0, 0, …, 1, 0)
→ X locate Y → X find Y: The police found the boy.   (0, 0, …, 0.457, …, 0, 0)
→ Boy → child: The police found the child.   (0, 0, …, 0.5, …, 0, 0)
→ Insertion on the fly: Eventually, the police found the child.   (0, 0, 1, …, 0, 0)
Hypothesis: Eventually, the police found the child.

Sum, the vector that represents the proof: (0, 0, 1, …, 0.5, …, 0.457, …, 1, 0)
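In code, the proof's vector is just the element-wise sum of the operations' vectors. Using the slide's numbers, with the elided positions collapsed into a five-feature toy space:

```python
# Sketch: the proof's feature vector is the element-wise sum of its operations'
# vectors, so cost(proof) = w . f(proof). Vector layout is illustrative.

op_vectors = [
    [0, 0, 0, 0, 1],      # passive to active
    [0, 0, 0, 0.457, 0],  # DIRT: X locate Y -> X find Y
    [0, 0, 0.5, 0, 0],    # lexical: boy -> child
    [1, 0, 0, 0, 0],      # insertion on the fly
]

proof_vector = [sum(col) for col in zip(*op_vectors)]
print(proof_vector)  # [1, 0, 0.5, 0.457, 1]
```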

Slide19

Cost based Model

Define operation cost. Represent each operation as a feature vector. Define proof cost as the sum of the operations' costs.

Classify: ENTAILING if and only if the proof cost is smaller than a threshold. The weight vector and the threshold are learned.

Slide20

3. Find the best proof


Slide21

Search the best proof

[Diagram: many candidate proofs (Proof #1, Proof #2, Proof #3, Proof #4) lead from T to H.]

Slide22

Search the best proof

Need to find the "best" proof:
- "Best proof" = the proof with the lowest cost, assuming a weight vector is given.
- The search space is exponential.
- An AI-style search algorithm is used.
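The slide does not spell the algorithm out, so the sketch below shows a generic best-first search of the kind the phrase "AI-style search algorithm" suggests. successors() and is_goal() are placeholders you would supply, and BIUTEE's actual search may differ.

```python
import heapq

# Sketch of a best-first search for the lowest-cost proof. The real search
# space (all transformation sequences from T to H) is exponential; states
# must be hashable for the visited set.

def best_first_search(start, successors, is_goal, max_steps=10_000):
    """successors(state) yields (step_cost, next_state); returns (cost, goal_state)."""
    frontier = [(0.0, 0, start)]  # (accumulated cost, tiebreaker, state)
    tiebreak = 0
    seen = set()
    while frontier and max_steps > 0:
        cost, _, state = heapq.heappop(frontier)
        if is_goal(state):
            return cost, state      # cheapest proof found
        if state in seen:
            continue
        seen.add(state)
        for step_cost, nxt in successors(state):
            tiebreak += 1
            heapq.heappush(frontier, (cost + step_cost, tiebreak, nxt))
        max_steps -= 1
    return None                     # no proof found within the step budget
```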

Slide23

4. Learn model parameters


Slide24

Learning

Goal: learn the parameters (w, b).
Use a linear learning algorithm: logistic regression, SVM, etc.
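Given one feature vector per training pair (its best proof's vector), any off-the-shelf linear learner yields (w, b). A sketch with scikit-learn's logistic regression and dummy data:

```python
# Sketch: learning (w, b) from proof feature vectors with a linear model.
# scikit-learn's logistic regression is one choice; the slide also mentions SVM.

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1, 0, 0.5, 0.457, 1],   # proof vectors (dummy data)
              [3, 1, 2.0, 1.900, 4],
              [0, 0, 0.2, 0.100, 1],
              [2, 2, 1.5, 2.300, 3]])
y = np.array([1, 0, 1, 0])             # 1 = ENTAILING, 0 = NON-ENTAILING

clf = LogisticRegression().fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0] # the learned weight vector and bias
print(w, b)
```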

Slide25

Inference vs. Learning

[Diagram: training samples → feature extraction (find the best proofs) → one vector representation per sample → learning algorithm → (w, b); the same feature extraction is used at inference time.]


Slide27

Iterative Learning Scheme

1. w = reasonable guess
2. Find the best proofs under the current w
3. Learn new w and b
4. Repeat from step 2
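The circularity (the best proof depends on w, while w is learned from the best proofs) is resolved by alternating the two steps. A sketch, with find_best_proof_vectors() and learn_linear_model() as assumed placeholders:

```python
# Sketch of the iterative scheme: alternate between finding best proofs under
# the current weights and re-learning the weights from those proofs.

def iterative_training(samples, find_best_proof_vectors, learn_linear_model,
                       w0, b0, iterations=10):
    w, b = w0, b0                                  # 1. start from a reasonable guess
    for _ in range(iterations):
        X = find_best_proof_vectors(samples, w, b) # 2. best proofs under current w
        y = [s.label for s in samples]
        w, b = learn_linear_model(X, y)            # 3. learn new w and b
    return w, b                                    # 4. loop repeats step 2
```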

Slide28

Summary- System Components

How to:
- Generate linguistically motivated complete proofs? Entailment rules and on-the-fly operations (extended tree-edit operations).
- Estimate proof validity? The confidence model.
- Find the best proof? The search algorithm.
- Learn the model parameters? The iterative learning scheme.

Slide29

Results

RTE7:

ID      Knowledge Resources                                                            Precision %   Recall %   F1 %
BIU1    WordNet, Directional Similarity                                                38.97         47.40      42.77
BIU2    WordNet, Directional Similarity, Wikipedia                                     41.81         44.11      42.93
BIU3    WordNet, Directional Similarity, Wikipedia, FrameNet, Geographical database    39.26         45.95      42.34

BIUTEE 2011 on RTE6 (F1 %):

Baseline (uses IR top-5 relevance)    34.63
Median (September 2010)               36.14
Best (September 2010)                 48.01
Our system                            49.54

Slide30

Conclusions

- Inference via a sequence of transformations
- Knowledge-based entailment rules
- Extended tree edits
- Proof confidence estimation
- Results: better than the median on RTE7; best on RTE6
- Open source: http://www.cs.biu.ac.il/~nlp/downloads/biutee

Slide31

Thank You

http://www.cs.biu.ac.il/~nlp/downloads/biutee