Encode-Attend-Refine-Decode: Enriching Encoder Decoder Models with Better Context Representation

Uploaded On 2019-11-03



Presentation Transcript

Encode-Attend-Refine-Decode: Enriching Encoder Decoder Models with Better Context Representation
Preksha Nema*, Mitesh M. Khapra*, Anirban Laha*^, Balaraman Ravindran*
* Indian Institute of Technology Madras, India. ^ IBM Research Labs, India
Interactive Intelligence Lab - IIT Madras

ENCODE ATTEND DECODE
Popular Deep Learning Paradigm
Used in a wide range of NLP and Vision Problems

Neural Encode-Attend-Decode Framework [Bahdanau et al., 2015]
ENCODER: Word Embedding → Encoder States
En: How was your day today?
Hi: Aaj aapka din kaisa tha

Neural Encode-Attend-Decode Framework [Bahdanau et al., 2015]
ENCODER: Word Embedding → Encoder States
En: How was your day today?
Attention Mechanism → Decoder States → Output, built up one word per slide: Aaj / Aaj aapka / Aaj aapka din kaisa tha
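As a rough illustration of one attend step in this framework, the sketch below computes attention weights over the encoder states and the resulting context vector. It assumes additive scoring in the style of Bahdanau et al.; the parameter names `W`, `U`, `v` and all dimensions are illustrative choices, not taken from the slides.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

def additive_attention(decoder_state, encoder_states, W, U, v):
    """One attention step (assumed additive scoring).

    decoder_state:  (Hd,)   previous decoder state
    encoder_states: (T, He) one row per source word
    W: (A, Hd), U: (A, He), v: (A,) are hypothetical learned parameters
    """
    # Score each source position against the decoder state: shape (T,)
    scores = np.tanh(encoder_states @ U.T + W @ decoder_state) @ v
    weights = softmax(scores)            # attention weights, sum to 1
    context = weights @ encoder_states   # weighted sum of encoder states, (He,)
    return weights, context

# Toy usage with random parameters (6 source words, as in the example sentence).
rng = np.random.default_rng(0)
T, He, Hd, A = 6, 4, 3, 5
enc = rng.normal(size=(T, He))
dec = rng.normal(size=Hd)
weights, context = additive_attention(
    dec, enc, rng.normal(size=(A, Hd)), rng.normal(size=(A, He)), rng.normal(size=A)
)
```

The decoder consumes `context` together with its own state to emit the next output word; a new context vector is computed at every time step.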

Used in a wide range of NLP Tasks

Various NLP Applications
(Each is shown on the same diagram: ENCODER with Word Embedding and Encoder States, Attention Mechanism, Decoder States, Output.)
Machine Translation: "How was your day today?" → "Aaj aapka din kaisa tha"
Summarization: "Roger Federer wins a record eighth singles title at Wimbledon 2017" → "Federer wins 8th Wimbledon Title"
Question Generation: "Roger Federer wins a record eighth singles title at Wimbledon 2017" → "Who won the Wimbledon title in 2017?"
Textual Entailment: "It was raining all night" → "The ground is wet"
Paraphrase Generation: "What is Sachin Tendulkar’s birth place" → "Where was Sachin Tendulkar born?"
… And so on

But… there are a few problems
Some task agnostic problems
Some task specific problems


Repetition of Words/Phrases [Baskaran et al., 2016]


Do Not Exploit Task Specific Behavior
Input: Structured Data
Output: Natural Language
Example 1: V. Balakrishnan (born 1943) is an Indian theoretical physicist known for his work on particle physics
Example 2: What is the nationality of V. Balakrishnan?
More on this later

We focus on:
Avoiding Repetition of Phrases/Words [ACL 2017]
Exploiting Task Specific Behavior [Under Review]

We study the repetition problem in the context of “Query Based Abstractive Summarization”.

Extractive vs Abstractive Summarization
Roger Federer wins a record eighth men’s singles title at Wimbledon on Sunday. He defeated Marin Cilic in straight sets with 6-3, 6-1, 6-4. Cilic appeared to struggle with a foot injury but the Swiss was in imperious form on Centre Court, winning the final in one hour and 41 minutes. It is Federer’s 19th grand slam title and his second of 2017 following victory at the Australian Open in January.
Extractive Summarization: Roger Federer wins a record eighth men’s singles title at Wimbledon on Sunday.
Abstractive Summarization: Roger Federer wins record eighth Wimbledon title against Marin Cilic.

Query-based Abstractive Summarization

Encode-Attend-Decode Solution
Diagram: DOCUMENT and QUERY each pass through ENCODE, followed by ATTEND and DECODE, producing the SUMMARY.

Issue: Repetition of Phrases
Diagram: DOCUMENT and QUERY are encoded and attended over; the DECODER (Decoder States → Output) emits "Federer Federer won 8th title".
Hypothesis: the context vectors fed to the decoder at successive time steps may be very similar. Hence we get the same word at successive time steps.
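One way to probe this hypothesis is to measure how similar the context vectors are from one decoding step to the next. The sketch below uses cosine similarity for that; the choice of cosine similarity and the function names are assumptions made for illustration, not something the slides specify.

```python
import numpy as np

def cosine_similarity(a, b, eps=1e-12):
    # eps guards against division by zero for all-zero vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def successive_similarities(contexts):
    """Cosine similarity between each context vector and the one before it."""
    return [
        cosine_similarity(contexts[t - 1], contexts[t])
        for t in range(1, len(contexts))
    ]

# Toy context vectors: the first two are nearly identical, the third differs.
contexts = [np.array([1.0, 0.0]), np.array([1.0, 0.01]), np.array([0.0, 1.0])]
sims = successive_similarities(contexts)
```

Values close to 1 at consecutive steps would be consistent with the decoder receiving almost the same information twice, which is the situation the hypothesis blames for repeated words.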

Proposed Solution: Diversity Based Model for Query-based Abstractive Summarization [Nema et al., 2017]

Some Basic Notation
Defined on a diagram of the Encoder, Attention Mechanism, and Decoder: the decoder state at time step t, the encoder state corresponding to word j, the attention weights, and the context vector.
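The symbols on this slide did not survive extraction. For reference, one common notation for attention-based encoder-decoder models is given below. The symbol names and the additive scoring function are assumptions in the style of Bahdanau et al., not necessarily what the slide used.

```latex
\begin{align}
s_t &: \text{decoder state at time step } t \\
h_j &: \text{encoder state corresponding to word } j \\
\alpha_{t,j} &= \frac{\exp(e_{t,j})}{\sum_{k} \exp(e_{t,k})},
  \qquad e_{t,j} = v^\top \tanh(W s_{t-1} + U h_j) \\
c_t &= \sum_{j} \alpha_{t,j}\, h_j
\end{align}
```

Here the attention weights at each step sum to 1 over the source words, and the context vector is the corresponding weighted average of encoder states.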

Approach 1 (D1)
Make successive context vectors orthogonal.
The context vector at time step t is orthogonal to the context vector at time step t-1. Hence DIFFERENT.
However, it need not be orthogonal to the context vectors at earlier time steps.

Approach 1 (D1)
But what about the history?
Or what about the context vectors at t-2, t-3, …?
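A minimal sketch of Approach 1, assuming the orthogonalization subtracts from each context vector its projection onto the previous (already orthogonalized) one. That projection form and the function names are assumptions, since the slide's equation was not recovered. The toy example also shows the limitation raised above: the result is orthogonal to the previous step but not necessarily to older ones.

```python
import numpy as np

def orthogonalize(current, previous, eps=1e-12):
    """Remove from `current` its component along `previous`."""
    denom = float(previous @ previous)
    if denom < eps:
        # Nothing to project onto; leave the vector unchanged.
        return current.copy()
    return current - (current @ previous) / denom * previous

def diverse_contexts(contexts):
    """Orthogonalize each context vector against the previous diverse one."""
    out = []
    prev = None
    for c in contexts:
        d = c.copy() if prev is None else orthogonalize(c, prev)
        out.append(d)
        prev = d
    return out

contexts = [
    np.array([1.0, 0.0, 0.0]),
    np.array([1.0, 1.0, 0.0]),
    np.array([1.0, 1.0, 1.0]),
]
d0, d1, d2 = diverse_contexts(contexts)
```

In this example `d2` is orthogonal to `d1`, yet its dot product with `d0` is nonzero, so information from two steps back can still reappear.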

Approach 2 (D2): Account for history
Introduce an LSTM cell to keep track of the history of context vectors.
Diagram: Encoder, Attention Mechanism, LSTM Cell, Decoder.
But what about orthogonalizing successive context vectors?

Approach 2 (D2): Account for history
Well, modify the LSTM equations: orthogonalize the cell content to that of the previous time step.
(Equations shown on the slides: the standard LSTM equations, the added diversity step, and a summary of the combined equations.)
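The slide equations were not recovered, so the following is only a sketch of what such a modified LSTM step could look like. It assumes a standard LSTM cell that takes the context vector as input, then projects the new cell content so that it is orthogonal to the previous cell content. The parameter layout (`W`, `U`, `b` stacked for the four gates), the choice to carry the orthogonalized cell forward, and all names are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def diversity_lstm_step(d_t, h_prev, c_prev, params, eps=1e-12):
    """One LSTM step over context vector d_t with an added diversity step.

    params["W"]: (4H, D), params["U"]: (4H, H), params["b"]: (4H,)
    hold the input, forget, output and candidate parameters stacked.
    """
    H = h_prev.shape[0]
    z = params["W"] @ d_t + params["U"] @ h_prev + params["b"]
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    o = sigmoid(z[2 * H:3 * H])    # output gate
    g = np.tanh(z[3 * H:4 * H])    # candidate cell content
    c = i * g + f * c_prev         # standard LSTM cell update

    # Diversity step (assumed form): remove the component along c_prev.
    denom = float(c_prev @ c_prev)
    if denom > eps:
        c = c - (c @ c_prev) / denom * c_prev

    h = o * np.tanh(c)             # refined context passed to the decoder
    return h, c

rng = np.random.default_rng(0)
D, H = 5, 4
params = {
    "W": rng.normal(size=(4 * H, D)),
    "U": rng.normal(size=(4 * H, H)),
    "b": np.zeros(4 * H),
}
h0, c0 = np.zeros(H), np.zeros(H)
h1, c1 = diversity_lstm_step(rng.normal(size=D), h0, c0, params)
h2, c2 = diversity_lstm_step(rng.normal(size=D), h1, c1, params)
```

Because the forget gate still mixes in older cell content before the projection, the cell keeps a summary of the history while each new cell state stays orthogonal to the one immediately before it.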

Approach 3 (SD): Soft Orthogonalization
D1 → SD1 (soft version of D1)
D2 → SD2 (soft version of D2)
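The transcript does not say how the soft versions are computed. One plausible reading, sketched here as an assumption, is that a gate value scales how much of the projection onto the previous vector is removed, so the model can choose between full orthogonalization and none. The `gate` argument is a hypothetical stand-in for a learned quantity and is simply passed in.

```python
import numpy as np

def soft_orthogonalize(current, previous, gate, eps=1e-12):
    """Subtract a gated fraction of the projection of `current` on `previous`.

    gate = 1.0 gives hard orthogonalization; gate = 0.0 leaves `current` as is.
    """
    denom = float(previous @ previous)
    if denom < eps:
        return current.copy()
    projection = (current @ previous) / denom * previous
    return current - gate * projection

current = np.array([1.0, 1.0])
previous = np.array([1.0, 0.0])
hard = soft_orthogonalize(current, previous, gate=1.0)
half = soft_orthogonalize(current, previous, gate=0.5)
none = soft_orthogonalize(current, previous, gate=0.0)
```

With a gate between 0 and 1, the overlap with the previous vector is reduced rather than eliminated, which lets some repeated content through when it is actually needed.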

Illustration
SOURCE: Roger Federer wins a record eighth men’s singles title at Wimbledon on Sunday.
Output: Roger Federer won the Wimbledon
Diagram: ENCODER (Word Embedding → Encoder States), Attention Mechanism, Diversity Cell, DECODER (Decoder States → Output).

Diversity Model for Query-based Abstractive Summarization [Nema et al., 2017]

DOCUMENT: Roger Federer wins a record eighth men’s singles title at Wimbledon on Sunday. He defeated Marin Cilic in straight sets with 6-3, 6-1, 6-4.
QUERY: Margin of victory
OUTPUT: Federer won in straight sets
Diagram: QUERY ENCODER (Word Embeddings → Query States), DOC ENCODER (Word Embedding → Encoder States), Query Attention, Document Attention, Diversity Cell (REFINE), DECODER (Decoder States → Output).

New Dataset for Query Based Abstractive Summarization
Documents and summaries for a given query
Crawled from Debatepedia: an encyclopedia of pro and con arguments
Each debate topic has a set of queries associated with it
Each query has a set of documents and an abstractive summary associated with each document
Triples (12695): (Query, Document, Summary)

Experiments and Results

Models                    ROUGE-1   ROUGE-2   ROUGE-L
Vanilla e-a-d             13.73      2.06     12.84
Query enc                 20.87      3.39     19.38
Query attn                29.28     10.24     28.21
M1 [Chen et al., 2016]    33.06     13.35     32.17
M2 [Chen et al., 2016]    18.42      4.47     17.45
D1                        33.85     13.65     32.99
SD1                       31.36     11.23     30.5
D2                        38.12     16.76     37.31
SD2                       41.26     18.75     40.43

Anecdotal Examples
Source Text: Fuel cell critics point out that hydrogen is flammable, but so is gasoline. Unlike gasoline, which can pool up and burn for a long time, hydrogen dissipates rapidly. Gas tanks tend to be easily punctured, thin-walled containers, while the latest hydrogen tanks are made from Kevlar. Also, gaseous hydrogen isn’t the only method of storage under consideration – BMW is looking at liquid storage while other researchers are looking at chemical compound storage, such as boron pellets.
Query: safety are hydrogen fuel cell vehicles safe
Reference: hydrogen in cars is less dangerous than gasoline
Query att: hydrogen is hydrogen hydrogen hydrogen fuel energy
SD1: hydrogen in cars is reduce risk than fuel
SD2: hydrogen in cars is less dangerous than gasoline
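To make the link between repetition and ROUGE concrete, here is a simplified sketch of ROUGE-N recall: the clipped n-gram overlap divided by the number of n-grams in the reference. It is an approximation for illustration only. The official ROUGE toolkit has further options (such as stemming) and also reports precision and F-measure, and the transcript does not state which variant the reported scores use.

```python
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the reference n-gram count."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram can be matched at most as often as it occurs,
    # so repeating a word in the candidate earns no extra credit.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total

reference = "hydrogen in cars is less dangerous than gasoline"
repeated = "hydrogen is hydrogen hydrogen hydrogen fuel energy"
score_repeated = rouge_n_recall(repeated, reference, n=1)
score_exact = rouge_n_recall(reference, reference, n=1)
```

In the repeated output only "hydrogen" and "is" match the reference, so four of its seven words are spent on a single matching unigram that is counted once.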

Anecdotal Examples
Source Text: The basis of all animal rights should be the Golden Rule: we should treat them as we would wish them to treat us, were any other species in our dominant position.
Query: do animals have rights that makes eating them inappropriate
Reference: animals should be treated as we would want to be treated
Query att: animals should be treated as we would protect to be treated
D1: animals should be treated as we most individual to be treated
SD1: animals should be treated as we would physically to be treated
D2: animals should be treated as we would illegal to be treated
SD2: animals should be treated as those would want to be treated

Summary
ENCODE ATTEND REFINE DECODE
Refine context vectors to avoid repetition

Thank You!!