
Slide1

Dhruv Batra (Virginia Tech)
Larry Zitnick (Facebook AI Research)
Devi Parikh (Virginia Tech)
Stanislaw Antol (Virginia Tech)
Aishwarya Agrawal (Virginia Tech)

Overview of Challenge

Slide2

Outline

Overview of Task and Dataset

Overview of Challenge

Winner Announcements

Analysis of Results


Slide7

VQA Task


Slide8

VQA Task

What is the mustache made of?


Slide9

VQA Task

What is the mustache made of?

AI System


Slide10

VQA Task

What is the mustache made of?

bananas

AI System


Slide11

Real images (from COCO)

Tsung-Yi Lin et al. "Microsoft COCO: Common Objects in COntext." ECCV 2014.
http://mscoco.org/

Slide12

and abstract scenes.


Slide13

Questions

Stump a smart robot!

Ask a question that a human can answer, but a smart robot probably can’t!

Slide14

VQA Dataset


Slide15

Dataset Stats

>250K images (COCO + 50K Abstract Scenes)

>750K questions (3 per image)

~10M answers (10 w/ image + 3 w/o image)


Slide16

Two modalities of answering

Open Ended

Multiple Choice (18 choices):
1 correct answer
3 plausible choices
10 most popular answers
Rest random answers
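
For concreteness, here is a minimal sketch of how an 18-way choice set with this composition could be assembled. The function and argument names are hypothetical, not the dataset's actual generation code:

    import random

    def build_choices(correct, plausible, popular, answer_pool, k=18):
        """Assemble a multiple-choice set: the correct answer, 3 plausible
        answers, the 10 most popular dataset answers, and random answers
        to pad out to k options (duplicates removed)."""
        choices = [correct] + plausible[:3] + popular[:10]
        choices = list(dict.fromkeys(choices))  # dedupe, preserve order
        while len(choices) < k:
            candidate = random.choice(answer_pool)  # rest are random answers
            if candidate not in choices:
                choices.append(candidate)
        random.shuffle(choices)
        return choices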

Slide17

Accuracy Metric

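
Per the VQA paper (Antol et al., ICCV 2015), a predicted answer is scored as min(# humans that provided that answer / 3, 1): it counts as fully correct if at least 3 of the 10 annotators gave it. The official evaluation additionally averages this over all subsets of 9 annotators; a minimal sketch of the simplified form:

    from collections import Counter

    def vqa_accuracy(predicted, human_answers):
        """Simplified VQA accuracy: min(#matching humans / 3, 1)."""
        counts = Counter(a.strip().lower() for a in human_answers)
        return min(counts[predicted.strip().lower()] / 3.0, 1.0)

    # Example: 4 of the 10 annotators answered "bananas"
    answers = ["bananas"] * 4 + ["fruit"] * 6
    print(vqa_accuracy("bananas", answers))  # 1.0 (4 >= 3 annotators agree)
    print(vqa_accuracy("yellow", answers))   # 0.0 (no annotator agrees)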

Slide18

Human Accuracy (Real)

                   Overall   Yes/No   Number   Other
Open Ended           83.30    95.77    83.39   72.67
Multiple Choice      91.54    97.40    86.97   87.91


Slide20

Human Accuracy (Abstract)

                   Overall   Yes/No   Number   Other
Open Ended           87.49    95.96    95.04   75.33
Multiple Choice      93.57    97.78    96.71   88.73


Slide22

Outline

Overview of Task and Dataset

Overview of Challenge

Winner Announcements

Analysis of Results


Slide23

VQA Challenges on www.codalab.org

Real Open Ended
Real Multiple Choice
Abstract Open Ended
Abstract Multiple Choice



Slide27

Real Image Challenges: Dataset

             Images   Questions   Answers
Training        80K        240K      2.4M
Validation      40K        120K      1.2M
Test            80K        240K

Dataset size is approximate.

Slide28

Real Image Challenges: Test Dataset

80K test images, four splits of 20K images each:

Test-dev (development): debugging and validation; unlimited submissions to the evaluation server.
Test-standard (publications): used to score entries for the Public Leaderboard.
Test-challenge (competitions): used to rank challenge participants.
Test-reserve (check overfitting): used to estimate overfitting; scores on this set are never released.

Slide adapted from: MSCOCO Detection/Segmentation Challenge, ICCV 2015.
Dataset size is approximate.
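
For quick reference, the split roles above can be collected in a small lookup table; a sketch (structure assumed, not part of the official tooling):

    # Roles of the four 20K-image test splits, as described above
    TEST_SPLITS = {
        "test-dev":       "debugging and validation; unlimited submissions",
        "test-standard":  "scores entries for the public leaderboard",
        "test-challenge": "ranks challenge participants",
        "test-reserve":   "estimates overfitting; scores never released",
    }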

Slide29

VQA Challenges on www.codalab.org

Real Open Ended
Real Multiple Choice
Abstract Open Ended
Abstract Multiple Choice


Slide32

Abstract Scene Challenges: Dataset

             Images   Questions   Answers
Training        20K         60K      0.6M
Validation      10K         30K      0.3M
Test            20K         60K

Slide33

Outline

Overview of Task and Dataset

Overview of Challenge

Winner Announcements

Analysis of Results


Slide34

Award GPUs!!!


Slide35

Abstract Scene Challenges

Open-Ended Challenge: 5 teams, 5 institutions, 3 countries
Multiple-Choice Challenge: 4 teams, 4 institutions, 3 countries

Top 3 teams are the same for Open Ended and Multiple Choice.

Slide36

Abstract Scene Challenges: Winner Team

MIL-UT: Andrew Shin*, Kuniaki Saito*, Yoshitaka Ushiku, Tatsuya Harada

Open Ended Challenge Accuracy: 67.39
Multiple Choice Challenge Accuracy: 71.18

Slide37

Real Image Challenges

Open-Ended Challenge: 25 teams, 26 institutions, 8 countries
Multiple-Choice Challenge: 15 teams, 17 institutions, 6 countries

Top 5 teams are the same for Open Ended and Multiple Choice.

Slide38

Real Image Challenges: Honorable Mention

Brandeis: Aaditya Prakash

Open Ended Challenge Accuracy: 62.80
Multiple Choice Challenge Accuracy: 65.17

Slide39

Real Image Challenges: Runner-Up Team

Naver Labs: Hyeonseob Nam, Jeonghee Kim

Open Ended Challenge Accuracy: 64.89
Multiple Choice Challenge Accuracy: 69.37

Slide40

Real Image Challenges: Winner Team

UC Berkeley & Sony: Akira Fukui, Dong Huk Park, Daylen Yang, Anna Rohrbach, Trevor Darrell, Marcus Rohrbach

Open Ended Challenge Accuracy: 66.90
Multiple Choice Challenge Accuracy: 70.52

Slide41

Outline

Overview of Task and Dataset

Overview of Challenge

Winner Announcements

Analysis of Results


Slide42

Real Open-Ended Challenge

(Results plot; labels: ICCV15, arXiv v6)

Slide43

Real Open-Ended Challenge

+12.76% absolute improvement

Slide44

Statistical Significance

Bootstrap: 5000 samples @ 99% confidence
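
A minimal sketch of how such a paired bootstrap comparison between two entries might look (function and variable names are hypothetical; per-question accuracies are resampled with replacement):

    import random

    def bootstrap_diff_ci(acc_a, acc_b, n_samples=5000, conf=0.99):
        """Paired bootstrap over per-question accuracies of methods A and B.
        Returns a confidence interval for the difference in mean accuracy;
        A is significantly better than B if the whole interval is above 0."""
        n = len(acc_a)
        diffs = []
        for _ in range(n_samples):
            idx = [random.randrange(n) for _ in range(n)]  # resample questions
            mean_a = sum(acc_a[i] for i in idx) / n
            mean_b = sum(acc_b[i] for i in idx) / n
            diffs.append(mean_a - mean_b)
        diffs.sort()
        lo = diffs[int(n_samples * (1 - conf) / 2)]
        hi = diffs[int(n_samples * (1 + conf) / 2) - 1]
        return lo, hi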

Slide45

Real Open-Ended Challenge


Slide46

Easy vs. Difficult Questions (Real Open-Ended Challenge)


Slide48

Easy vs. Difficult Questions (Real Open-Ended Challenge)

80.6% of questions can be answered by at least 1 method!

(Plot label: Difficult Questions)
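
The 80.6% figure is an oracle-style union over all submitted methods: a question counts as answerable (easy) if at least one method gets it right. A sketch of that computation, assuming per-question correctness flags for each method:

    def oracle_coverage(per_method_correct):
        """per_method_correct: one list of booleans per method, aligned
        over the same questions. Returns the fraction of questions that
        at least one method answers correctly."""
        n_questions = len(per_method_correct[0])
        answered = sum(
            1 for q in range(n_questions)
            if any(method[q] for method in per_method_correct)
        )
        return answered / n_questions

    # Example: 3 methods, 4 questions -> 3 of 4 answered by someone
    flags = [[True, False, False, True],
             [False, True, False, True],
             [True, False, False, False]]
    print(oracle_coverage(flags))  # 0.75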

Slide49

Easy vs. Difficult Questions (Real Open-Ended Challenge)

(Plot labels: Difficult Questions, Easy Questions)

Slide50

Difficult Questions with Rare Answers

Slide51

Difficult Questions with Rare Answers

What is the name of …
What is the number on …
What is written on the …
What does the sign say?
What time is it?
What kind of …
What type of …
Why …

Slide52

Easy vs. Difficult Questions (Real Open-Ended Challenge)

Slide53

Easy vs. Difficult Questions (Real Open-Ended Challenge)

(Plot labels: Difficult Questions with Frequent Answers, Easy Questions)

Slide54

Success Cases

Q: What is the woman holding?  GT A: laptop  Machine A: laptop
Q: Is this a casino?  GT A: no  Machine A: no
Q: Is it going to rain soon?  GT A: yes  Machine A: yes
Q: What room is the cat located in?  GT A: kitchen  Machine A: kitchen

Slide55

Failure Cases

Q: What is the woman holding?  GT A: book  Machine A: knife
Q: Is the hydrant painted a new color?  GT A: yes  Machine A: no
Q: Why is there snow on one side of the stream and clear grass on the other?  GT A: shade  Machine A: yes
Q: Where is the blue and white umbrella?  GT A: on left  Machine A: right

Slide56

Easy vs. Difficult Questions (Real Open-Ended Challenge)

(Plot labels: Difficult Questions, Easy Questions)

Slide57

Easy vs. Difficult Questions (Real Open-Ended Challenge)

Slide58

Answer Type and Question Type Analyses

Per Answer Type: no team is statistically significantly better than the winner.
Per Question Type: no team is statistically significantly better than the winner.

Slide59

Results of the Poll

25 responses


Slide60

Image Modelling


Slide61

Question Modelling


Slide62

Question Word Modelling


Slide63

Attention on Images


Slide64

Attention on Questions


Slide65

Use of Ensemble


Slide66

Use of External Data Sources


Slide67

Question Type Specific Mechanisms


Slide68

Classification vs. Generation of Answers


Slide69

Future Plans

VQA Challenge 2017?
What changes do you want?
Sub-tasks?
More difficult/easy dataset?
Dialogue/conversational QA?
New evaluation metric?
Other annotations?

Slide70

Thanks! Questions?

