Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches - PowerPoint Presentation

Uploaded by calandra-battersby on 2018-10-27.

Presentation Transcript

Slide1

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

Jure Zbontar, Yann LeCun

Slide2

Background
Motivation
Problem Formulation
Methodology
Training Data
Suggested Net Architectures
Sequential Steps
Results
Conclusion

Table of Contents

Slide3

Background
Motivation
Problem Formulation
Methodology
Training Data
Suggested Net Architectures
Sequential Steps
Results
Conclusion

Table of Contents

Slide4

Given input: two images (left and right), acquired at different horizontal positions.

Required output: the disparity for each pixel in the left image.

Disparity - the difference in horizontal location (x-axis) of an object between the left and right images.

Motivation

Stereo Matching

Slide5

Motivation

Stereo Matching

Slide6

Motivation

Stereo Matching

Slide7

Given the disparity d at each pixel, the depth z can be obtained by

z = F * b / d

b - distance between camera centers
F - focal length

Applications in autonomous driving, robotics, 3D scene reconstruction and more.
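As a quick numerical illustration of the relation above, a minimal Python sketch; the camera parameters are assumed example values, not taken from the slides:

```python
import numpy as np

def depth_from_disparity(d, focal_length, baseline):
    """z = F * b / d; zero-disparity pixels map to infinite depth."""
    d = np.asarray(d, dtype=float)
    depth = np.full_like(d, np.inf)
    valid = d > 0
    depth[valid] = focal_length * baseline / d[valid]
    return depth

# Assumed KITTI-like parameters: F = 721 px, b = 0.54 m (illustrative only).
print(depth_from_disparity([36.05, 72.1, 0.0], 721.0, 0.54))
```

Note the guard on zero disparity: objects at infinity produce zero disparity, so the division is only applied to valid pixels.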

Motivation

Applications

Slide8

Stereo matching steps [Scharstein & Szeliski, 2002]:

1. Matching cost computation
2. Cost aggregation
3. Optimization
4. Disparity refinement

Focus of this work: matching cost initialization.

Problem Formulation

Stereo Matching Steps

Matching cost initialization

Slide9

Matching cost example - sum of absolute differences:

C_AD(p, d) = sum over q in N_p of |I_L(q) - I_R(q - d)|

I_L(q), I_R(q) - intensities of the left and right images at position q
N_p - the set of locations within a fixed rectangular window centered at p
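A minimal NumPy sketch of this window-based cost; the toy images and the 3x3 window are illustrative assumptions:

```python
import numpy as np

def sad_cost(left, right, p, d, half_win=1):
    """C_AD(p, d): sum of |I_L(q) - I_R(q - d)| over the window N_p."""
    y, x = p
    lw = left[y - half_win:y + half_win + 1, x - half_win:x + half_win + 1]
    rw = right[y - half_win:y + half_win + 1,
               x - d - half_win:x - d + half_win + 1]
    return float(np.abs(lw - rw).sum())

# Toy pair: the right image equals the left shifted by 2 pixels, so the
# cost should be minimized at disparity d = 2.
left = np.arange(100, dtype=float).reshape(10, 10)
right = np.roll(left, -2, axis=1)
costs = [sad_cost(left, right, p=(5, 5), d=d) for d in range(4)]
print(int(np.argmin(costs)))  # -> 2
```

Evaluating this cost for every pixel and every candidate disparity yields the cost volume that the later aggregation and optimization steps operate on.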

Problem Formulation

Stereo Matching Steps

Slide10

Problem Formulation

Goal

Matching cost initialization via convolutional neural networks

Slide11

Background
Motivation
Problem Formulation
Methodology
Training Data
Suggested Net Architectures
Sequential Steps
Results
Conclusion

Table of Contents

Slide12

Data sets: KITTI and Middlebury.

For each image position with known disparity d: one negative and one positive training example.

Positive example: the right image patch center is shifted by an offset o_pos, drawn from a small interval around zero so the patch stays close to the true match.

Negative example: the right image patch center is shifted by an offset o_neg, whose magnitude is drawn from an interval of larger values so the patch lies away from the true match.

Methodology

Training Data

Slide13
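The construction of one positive and one negative example per labeled position can be sketched as follows; the offset ranges (pos_max, neg_low, neg_high) are illustrative stand-ins, since the exact intervals are not legible in this transcript:

```python
import random

def make_pair_centers(x, y, d, rng, pos_max=1.0, neg_low=4.0, neg_high=10.0):
    """Right-image patch centers for one positive and one negative example
    at left-image position (x, y) with ground-truth disparity d."""
    o_pos = rng.uniform(-pos_max, pos_max)                        # near the true match
    o_neg = rng.choice([-1, 1]) * rng.uniform(neg_low, neg_high)  # far from it
    positive = (x - d + o_pos, y)
    negative = (x - d + o_neg, y)
    return positive, negative

pos, neg = make_pair_centers(x=100, y=50, d=30, rng=random.Random(0))
print(pos, neg)
```

Both examples stay on the same scanline (same y), since rectified stereo restricts matches to horizontal shifts.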

Example from KITTI dataset:

Example from Middlebury dataset:

Methodology

Training Data

Slide14

Data augmentation procedure: artificial expansion of the data set from existing samples.

Tweak - small deviations between the paired image patches.

Selected actions:

Rotation
Scaling
Horizontal scaling
Horizontal shearing
Horizontal transformation
Brightness & contrast adjustment

Methodology

Training Data

Slide15

Two suggested architectures: fast versus accurate.

Common ground for both architectures: Siamese network.

Methodology

Suggested Net Architectures

Slide16

Methodology

Fast Architecture

Slide17
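In the underlying paper, the fast architecture scores a patch pair by extracting a feature vector from each patch with the shared (siamese) convolutional layers and taking the dot product of the L2-normalized vectors, i.e. cosine similarity. The sketch below assumes precomputed feature vectors and shows only this comparison step:

```python
import numpy as np

def cosine_similarity(f_left, f_right, eps=1e-8):
    """Dot product of L2-normalized feature vectors."""
    f_left = f_left / (np.linalg.norm(f_left) + eps)
    f_right = f_right / (np.linalg.norm(f_right) + eps)
    return float(f_left @ f_right)

f = np.array([0.2, -1.3, 0.7, 0.4])   # stand-in for a sub-network output
print(cosine_similarity(f, f))        # identical features -> similarity near 1
print(cosine_similarity(f, -f))       # opposite features  -> similarity near -1
```

Because the comparison is a single dot product, scores for all disparities can be computed cheaply once the feature vectors are known, which is what makes this variant fast.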

Training cost function - hinge loss:

loss = max(0, m + s_neg - s_pos)

m - margin
s_neg - net output for the negative sample
s_pos - net output for the positive sample

The loss is zero when the similarity of the positive example is greater than the similarity of the negative example by at least the margin.
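A one-line sketch of this loss; the margin value is an assumed example, not read from the slide:

```python
import numpy as np

def hinge_loss(s_pos, s_neg, margin=0.2):
    """max(0, m + s_neg - s_pos): zero once the positive similarity
    beats the negative one by at least the margin."""
    return float(np.maximum(0.0, margin + s_neg - s_pos))

print(hinge_loss(s_pos=0.9, s_neg=0.1))   # constraint satisfied -> 0.0
print(hinge_loss(s_pos=0.5, s_neg=0.45))  # constraint violated -> positive loss
```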

Methodology

Fast Architecture

Slide18

Methodology

Accurate Architecture

Slide19

Training cost function - cross-entropy loss:

loss = -[t * log(s) + (1 - t) * log(1 - s)]

t - sample class (1 for a positive sample, 0 for a negative sample)
s - net output
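A minimal sketch of this loss for a single sample; the clipping epsilon is an implementation assumption added for numerical safety:

```python
import numpy as np

def cross_entropy(t, s, eps=1e-12):
    """-[t*log(s) + (1-t)*log(1-s)] for class t in {0, 1} and net output s."""
    s = np.clip(s, eps, 1.0 - eps)
    return float(-(t * np.log(s) + (1 - t) * np.log(1 - s)))

print(cross_entropy(t=1, s=0.9))  # confident and correct -> small loss
print(cross_entropy(t=0, s=0.9))  # confident and wrong   -> large loss
```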

Methodology

Accurate Architecture

Slide20

Obtained matching cost:

C_CNN(p, d) = -s(P_L(p), P_R(p - d))

P_L(p), P_R(p - d) - patches from the left and right images
s - the similarity score computed by the network

Cross-based cost aggregation (CBCA) - local averaging of the matching cost
Semiglobal matching - enforcement of smoothness constraints on the disparity map
Disparity image computation and enhancement
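After aggregation and smoothing, the disparity image is typically read off the cost volume by winner-take-all selection, picking the lowest-cost disparity at each pixel; a minimal sketch on a random toy cost volume:

```python
import numpy as np

def winner_take_all(cost_volume):
    """cost_volume: (H, W, D) matching costs -> (H, W) disparity map."""
    return np.argmin(cost_volume, axis=2)

rng = np.random.default_rng(0)
cost_volume = rng.random((4, 5, 16))   # toy H x W x max_disparity volume
disparity = winner_take_all(cost_volume)
print(disparity.shape)  # -> (4, 5)
```

The subsequent enhancement steps (subpixel interpolation, filtering) refine this integer-valued map.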

Methodology

Sequential Steps

Slide21

The outputs of the two sub-networks need to be computed only once per location, and not for every disparity under consideration.

The output of the two sub-networks can be computed for all pixels in a single forward pass by propagating full-resolution images, instead of small image patches.

The fully connected layer forms the bottleneck.

Methodology

Key insights

Slide22

Background
Motivation
Problem Formulation
Methodology
Training Data
Suggested Net Architectures
Sequential Steps
Results
Conclusion

Table of Contents

Slide23

Results

Success Measure

Error rate = (number of misclassified pixels) / (total number of pixels)

Slide24
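The success measure can be sketched as below; counting a pixel as misclassified when its disparity error exceeds 3 pixels follows a common KITTI convention and is an assumption here, as the threshold is not stated on the slide:

```python
import numpy as np

def error_rate(pred, gt, threshold=3.0):
    """Fraction of pixels whose disparity error exceeds the threshold."""
    return float(np.mean(np.abs(np.asarray(pred) - np.asarray(gt)) > threshold))

pred = np.array([10.0, 12.0, 30.0, 5.0])
gt = np.array([10.5, 20.0, 30.0, 5.2])
print(error_rate(pred, gt))  # 1 of 4 pixels off by > 3 -> 0.25
```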

Results

KITTI2012 Dataset

Slide25

Results

KITTI2015 Dataset

Slide26

Results

Middlebury Dataset

Slide27

Results

Data Augmentation

Slide28

Results

Runtimes

Slide29

Results

Training Data Size

Slide30

Results

Transfer Learning

Slide31

Results

Hyperparameters

Remark: patch size is directly determined by the number of convolutional layers.

Slide32

Results

Visual Examples (KITTI)

Slide33

Results

Visual Examples (KITTI)

Slide34

Results

Visual Examples (Middlebury)

Slide35

Background
Motivation
Problem Formulation
Methodology
Training Data
Suggested Net Architectures
Sequential Steps
Results
Conclusion

Table of Contents

Slide36

Two CNN architectures for learning a similarity measure on image patches were presented.

The two architectures were used for stereo matching.

A relatively simple CNN outperformed all previous methods on the well-studied problem of stereo.

Conclusion