Toward an Open Source Textual Entailment Platform

Presentation Transcript

Slide1

Toward an Open Source Textual Entailment Platform (Excitement Project)

Bernardo Magnini (on behalf of the Excitement consortium)

STS workshop, NYC March 12-13 2012

Slide2

Excitement Project

EXploring Customer Interactions through Textual EntailMENT

Started 1/1/2012; duration 3 years; 3.5M € funding

Academic partners
- Bar-Ilan University, Ramat Gan, Israel (I. Dagan)
- DFKI, Saarbrücken, Germany (G. Neumann)
- Fondazione Bruno Kessler, Povo, Italy (B. Magnini)
- University of Heidelberg, Germany (S. Pado)

Industrial partners
- NICE, Ra'anana, Israel (English analytics provider, coordinator)
- OMQ (German IT support company)
- AlmaViva, Roma, Italy (Italian analytics provider)

Slide3

Scientific objectives

Scientific goal: Develop and advance a “MOSES-style” platform for multi-lingual textual inference
- A Generic Multilingual Architecture for Component-Based Textual Entailment
- Algorithmic Progress in Textual Inference

An open-source multi-lingual textual entailment platform

Slide4

Application to Customer interaction analytics

Exploration graphs

Slide5

Excitement Platform: Desiderata

Dual goal: OS Platform + Industrial Application

Open Source Platform: Generality
- Easy integration of external language analysis tools
- Accommodate as many entailment mechanisms as possible
- Reusability of components
- Convince end users to use the platform

Industrial Application: Efficiency
- Flexible mapping of application tasks onto “core” entailment
- Practical integration into industrial architectures

Slide6

The Excitement Open Source TE Platform

[Architecture diagram]
Input Data -> Linguistic Analysis -> Core Engine -> entailment / contradiction / unknown
Core Engine: Entailment Decision Algorithm (EDA), Dynamic Components (Algorithms), Static Components (Resources)
Common Library: Machine Learning, Search, Evaluation
Data: RTE and task-specific datasets, pretrained entailment models

Slide7

Terminology

- Platform: everything together
- Entailment engine: configured instantiation of platform
- Linguistic analysis: linguistic analysis tool chain (tagger, NER, parser, …)
- Core engine: Entailment Decision Algorithm + Components
- Entailment Decision Algorithm: {H,T} -> Decision
- Component: everything (re-)usable by an EDA or another component

Slide8

Instantiation example: Stanford-style Entailment

EDA
- First, compute word alignment
- Second, compute match features
- Third, weighted sum of features

Component 1: Word alignment
Component 2: Syntactic match features
Component 3: Semantic match features

[Diagram labels] Parsing; dependency trees; WordNet; distr. similarity; output: entailment / unknown
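
The three EDA steps on this slide (align, extract match features, take a weighted sum) can be sketched in a few lines of Python. This is only a toy illustration, not the Excitement platform's API: the class `WeightedSumEDA`, the feature functions, the weights and the threshold are all hypothetical, and plain token matching stands in for the dependency trees, WordNet and distributional similarity that the slide names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# For each hypothesis token: index of the aligned text token, or None.
Alignment = List[Optional[int]]


def tokenize(sentence: str) -> List[str]:
    """Lowercase whitespace tokenizer that drops surrounding punctuation."""
    stripped = (tok.strip(".,;:!?").lower() for tok in sentence.split())
    return [tok for tok in stripped if tok]


def align_words(text: List[str], hypothesis: List[str]) -> Alignment:
    """Component 1 (toy): align each hypothesis token to the first identical text token."""
    return [text.index(tok) if tok in text else None for tok in hypothesis]


def coverage_feature(alignment: Alignment) -> float:
    """Toy stand-in for a semantic match feature: share of hypothesis tokens aligned."""
    if not alignment:
        return 0.0
    return sum(a is not None for a in alignment) / len(alignment)


def order_feature(alignment: Alignment) -> float:
    """Toy stand-in for a syntactic match feature: share of adjacent aligned pairs in text order."""
    aligned = [a for a in alignment if a is not None]
    if len(aligned) < 2:
        return 1.0 if aligned else 0.0
    in_order = sum(1 for x, y in zip(aligned, aligned[1:]) if x < y)
    return in_order / (len(aligned) - 1)


@dataclass
class WeightedSumEDA:
    """Hypothetical EDA: maps a {T, H} pair to a decision via a weighted feature sum."""
    weights: Dict[str, float]
    threshold: float

    def decide(self, text: str, hypothesis: str) -> str:
        t, h = tokenize(text), tokenize(hypothesis)
        alignment = align_words(t, h)                      # step 1: word alignment
        features = {                                       # step 2: match features
            "coverage": coverage_feature(alignment),
            "order": order_feature(alignment),
        }
        score = sum(self.weights[name] * value for name, value in features.items())  # step 3
        return "entailment" if score >= self.threshold else "unknown"


# Weights and threshold are made-up values for illustration only.
eda = WeightedSumEDA(weights={"coverage": 0.7, "order": 0.3}, threshold=0.8)
print(eda.decide("A man is playing a guitar on stage", "A man is playing a guitar"))
print(eda.decide("A man is playing a guitar on stage", "A woman sings"))
```

Keeping the alignment and feature functions separate from the decision class mirrors the slide's split between reusable components and the EDA that combines them.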

Slide9

Instantiation example: “EDITS-style” EDA (without rules)

EDA: Linear combination of scores

Component 1: compute all-word overlap
Component 2: compute content-word overlap
Component 3: compute edit distance

[Diagram labels] Raw text; example scores 0.5, 0.99, 0.7; output: entailment / unknown
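
As a rough illustration of this configuration, the sketch below computes the three component scores from raw text and combines them linearly. It is not the EDITS implementation: the stopword list, the weights, the threshold, and the choice to turn edit distance into a similarity by normalizing it by the longer token sequence are all assumptions made for the example.

```python
from typing import List, Sequence

# Tiny illustrative stopword list (an assumption, not taken from EDITS).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "on", "in", "to", "and"}


def tokens(text: str) -> List[str]:
    """Lowercase whitespace tokenizer that drops surrounding punctuation."""
    stripped = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in stripped if w]


def content_words(toks: Sequence[str]) -> List[str]:
    return [t for t in toks if t not in STOPWORDS]


def overlap(text: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Components 1 and 2: share of hypothesis tokens that also occur in the text."""
    if not hypothesis:
        return 0.0
    text_set = set(text)
    return sum(tok in text_set for tok in hypothesis) / len(hypothesis)


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Component 3: Levenshtein distance with unit costs, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def edit_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Map edit distance to [0, 1] so it can be combined with the overlap scores."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def edits_style_score(text: str, hypothesis: str, weights=(0.3, 0.5, 0.2)) -> float:
    """Linear combination of the three component scores (weights are hypothetical)."""
    t, h = tokens(text), tokens(hypothesis)
    scores = (
        overlap(t, h),
        overlap(content_words(t), content_words(h)),
        edit_similarity(t, h),
    )
    return sum(w * s for w, s in zip(weights, scores))


def decide(text: str, hypothesis: str, threshold: float = 0.7) -> str:
    return "entailment" if edits_style_score(text, hypothesis) >= threshold else "unknown"


print(decide("The cat sat on the mat", "The cat sat"))
print(decide("The cat sat on the mat", "A dog barked"))
```

In a trained system the weights and threshold would be fitted on an RTE-style dataset rather than set by hand.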

Slide10

Req. 1a: Data model / file formats

“Low overhead”: CoNLL shared task format
- column-based plain text format
- extensible with new columns, but messy (?)
- no data model

“High overhead”: UIMA CAS (Common Analysis Structure)
- one graph; stand-off; support for meta-data information (Req. 1d)
- data model + XML serialization (XMI)
- backed by IBM and the Apache foundation
- Java API; existing wrappers for a number of NLP software packages

Slide11
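
To make the "low overhead" option of Req. 1a concrete, here is a minimal reader for a column-based format. It assumes the ten tab-separated CoNLL-X columns and blank lines between sentences; other CoNLL shared tasks use different column sets. The function `read_conll`, the `EXTRA1`, `EXTRA2`, … names for additional columns, and the sample data are hypothetical. The reader also shows the trade-off noted on the slide: extra columns are easy to add, but nothing in the file says what they mean.

```python
from typing import Dict, List, Sequence

# Column names of the CoNLL-X dependency format (an assumption about which
# CoNLL shared task format is meant).
CONLL_X_COLUMNS = ["ID", "FORM", "LEMMA", "CPOSTAG", "POSTAG",
                   "FEATS", "HEAD", "DEPREL", "PHEAD", "PDEPREL"]

Token = Dict[str, str]


def read_conll(data: str, columns: Sequence[str] = CONLL_X_COLUMNS) -> List[List[Token]]:
    """Parse tab-separated, blank-line-delimited sentences into lists of token dicts."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in data.splitlines():
        if not line.strip():              # a blank line ends the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split("\t")
        # Columns beyond the known ones get generic names: the format has no schema.
        extra = [f"EXTRA{i}" for i in range(1, len(fields) - len(columns) + 1)]
        current.append(dict(zip(list(columns) + extra, fields)))
    if current:                           # file may not end with a blank line
        sentences.append(current)
    return sentences


# Made-up sample; the second sentence carries one additional (NER-like) column.
SAMPLE = (
    "1\tJohn\tjohn\tN\tNNP\t_\t2\tSBJ\t_\t_\n"
    "2\tsleeps\tsleep\tV\tVBZ\t_\t0\tROOT\t_\t_\n"
    "\n"
    "1\tMary\tmary\tN\tNNP\t_\t2\tSBJ\t_\t_\tPERSON\n"
    "2\tlaughs\tlaugh\tV\tVBZ\t_\t0\tROOT\t_\t_\tO\n"
)

sentences = read_conll(SAMPLE)
for sentence in sentences:
    print([(tok["FORM"], tok["HEAD"], tok["DEPREL"]) for tok in sentence])
```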

From TE to Textual Inferences

[Architecture diagram]
Input Data -> Linguistic Analysis -> Core Engine -> Textual inference
Core Engine: Textual Inference Decision Algorithm, Dynamic Components (Algorithms), Static Components (Resources)
Textual inference: Equivalence (similarity), Entailment, Contradiction, Causality, Temporal relation
Common Library: Machine Learning, Search, Evaluation
Data: RTE and task-specific datasets, pretrained inference models