
3rd Workshop on Semantic Perception, Mapping and Exploration (SPME), Karlsruhe, Germany, 2013. Semantic Parsing for Priming Object Detection in RGB-D Scenes, by Cesar Cadena and Jana Kosecka.



Presentation Transcript

Slide1

3rd Workshop on Semantic Perception, Mapping and Exploration (SPME)
Karlsruhe, Germany, 2013

Semantic Parsing for Priming Object Detection in RGB-D Scenes

Cesar Cadena and Jana Kosecka

Slide2

Motivation

5/5/2013

Long-term robotic operation

The semantic information about the surrounding environment is important for high-level robotic tasks.
It is difficult to know a priori all the possible instances or classes of objects that the robot will find in real operation.
Even if we know many of them, it is unreasonable and expensive to run all the specific object detectors at the same time.

Slide6

Motivation

However:

There are things we can assume to be present (almost) always.
Generic "detachable" objects also share some characteristics.

Urban: Ground, Buildings, Sky, Objects
Indoors: Ground, Walls, Ceiling, Objects
Today: Ground, Structure, Furniture, Props

Slide10

Our Problem

To efficiently segment RGB+3D scenes into these general classes, to be used as a prior for task-specific detectors.

Slide12

NYU Depth v2

1449 labeled frames. 26 scene classes.
Labeling spans 894 different classes.

N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, Indoor segmentation and support inference from RGBD images, in ECCV, 2012.

Thanks to N. Silberman for providing the mapping from 894 to 4 classes.

Slide13

The System

Semantic Segmentation
MAP
Marginals

Slide14

Different approaches

Semantic Segmentation
MAP
Marginals

N. Silberman et al., ECCV 2012
C. Couprie et al., CoRR 2013
X. Ren et al., CVPR 2012
D. Munoz et al., ECCV 2010
I. Endres and D. Hoiem, ECCV 2010

They have at least one of:
Expensive over-segmentation
Expensive features
Expensive inference

Slide15

Our approach

MAP
Marginals

Semantic Segmentation

Conditional Random Fields
Potentials
Graph Structure
Inference
Preprocessing

Slide16

Outline

Conditional Random Fields
(1) Preprocessing
(2) Graph Structure
(3) Potentials
(4) Inference
MAP, Marginals
(5) Results
(6) Conclusions

Slide17

Preprocessing: Over-segmentation

SLIC superpixels

R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Susstrunk, SLIC superpixels compared to state-of-the-art superpixel methods, PAMI, 2012.

Slide18

Graph Structure

Classical choice on images

Slide19

Graph Structure: Our choice

Minimum Spanning Tree over 3D
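The graph choice above can be illustrated with a minimal sketch. It assumes each superpixel is summarized by a 3D centroid and that edge weights are Euclidean distances between centroids over the complete graph; the slide does not state either detail, and the function name `minimum_spanning_tree` and the sample `centroids` are hypothetical. The sketch uses Prim's algorithm in plain Python.

```python
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph over 3D points.

    points: list of (x, y, z) superpixel centroids (hypothetical input).
    Returns a list of (parent, child) edges forming a spanning tree.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n    # cheapest known connection to the tree
    best_parent = [-1] * n
    best_dist[0] = 0.0            # start growing the tree from node 0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_parent[u] >= 0:
            edges.append((best_parent[u], u))
        # Update the cheapest connection of the remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_parent[v] = u
    return edges

# Assumed example data: three nearby centroids and one far away.
centroids = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (0.0, 0.1, 1.05), (2.0, 0.0, 3.0)]
tree_edges = minimum_spanning_tree(centroids)
```

A tree over n superpixels always has n - 1 edges and no cycles, which is what later allows exact inference.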

Slide21

Potentials: Pairwise CRFs
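For reference, a pairwise CRF in its generic textbook form factorizes as below. This is not necessarily the notation used on the slide; all symbols are assumptions: x are the superpixel labels, z the observations, N the set of nodes, E the set of graph edges, phi and psi the unary and pairwise potentials, and Z(z) the partition function.

```latex
p(\mathbf{x} \mid \mathbf{z}) =
  \frac{1}{Z(\mathbf{z})}
  \prod_{i \in \mathcal{N}} \phi(x_i, \mathbf{z})
  \prod_{(i,j) \in \mathcal{E}} \psi(x_i, x_j, \mathbf{z})
```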

Slide24

Potentials: unary

frequency of label j in a k-NN query
frequency of label j in the database

J. Tighe and S. Lazebnik, Superparsing: Scalable nonparametric image parsing with superpixels, ECCV 2010.

The database is a kd-tree of features from training data.
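A minimal sketch of how the two frequencies above could be combined, assuming they enter as a ratio that is then normalized over classes (in the spirit of Superparsing; the exact formula on the slide is not preserved). A brute-force neighbor search stands in for the kd-tree, and the function `knn_unary` and the toy data are hypothetical.

```python
import math
from collections import Counter

def knn_unary(query, database, labels, classes, k=3):
    """Normalized class scores for one superpixel feature vector.

    database: list of training feature vectors; labels: their class labels.
    Assumes 1 <= k <= len(database). Brute-force search replaces the kd-tree.
    """
    order = sorted(range(len(database)),
                   key=lambda i: math.dist(query, database[i]))
    knn_counts = Counter(labels[i] for i in order[:k])
    db_counts = Counter(labels)
    ratios = {}
    for c in classes:
        if db_counts[c] == 0:
            ratios[c] = 0.0   # class never seen in training data
        else:
            # frequency in the k-NN set over frequency in the database
            ratios[c] = (knn_counts[c] / k) / (db_counts[c] / len(labels))
    total = sum(ratios.values())
    return {c: r / total for c, r in ratios.items()}

# Assumed toy data: 2D features instead of the 12D features of the talk.
database = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0)]
labels = ["ground", "ground", "ground", "props", "props", "structure"]
classes = ["ground", "structure", "props"]
scores = knn_unary((0.6, 0.6), database, labels, classes, k=3)
```

Dividing by the database frequency keeps frequent classes from dominating the score. A common way to turn such scores into a potential is the negative logarithm, though that step is also an assumption here.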

Slide25

Features 12D

From image:
mean of Lab color space (3D)
vertical pixel location (1D)
entropy from vanishing points (1D)

From 3D:
height and depth (2D)
mean and std of differences on depth (2D)
local planarity (1D)
neighboring planarity (1D)
vertical orientation (1D)

Slide26

Features

From image:
entropy from vanishing points

Slide27

Features

From 3D:
mean and std of differences on depth
local planarity
neighboring planarity
vertical orientation

Slide30

Potentials: pairwise

Lab color

Slide31

Inference

We use belief propagation:
Exact results in MAP/marginals
Efficient computation, in [complexity expression lost in extraction], thanks to our graph structure choice!

Slide32
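The inference step (belief propagation on the tree-structured graph) can be sketched as a two-pass sum-product routine. This is a generic textbook version, not the authors' implementation: the function `tree_marginals`, the shared symmetric `pairwise` table, and the toy potentials are assumptions. On a tree with n nodes and L labels, two passes of this kind cost on the order of n·L² operations and return exact marginals.

```python
def tree_marginals(n, edges, unary, pairwise):
    """Exact marginals on a tree by two-pass sum-product.

    n: number of nodes; edges: list of (i, j) pairs forming a tree.
    unary[i][a]: positive score of label a at node i.
    pairwise[a][b]: positive, symmetric compatibility shared by all edges
    (an assumption made for brevity).
    """
    num_labels = len(unary[0])
    nbrs = {i: [] for i in range(n)}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    # Breadth-first order from node 0, recording each node's parent.
    parent = {0: None}
    order = [0]
    for u in order:
        for v in nbrs[u]:
            if v not in parent:
                parent[v] = u
                order.append(v)

    msg = {}  # msg[(u, v)][b]: message from u to v about v's label b

    def send(u, v):
        incoming = [msg[(w, u)] for w in nbrs[u] if w != v]
        out = []
        for b in range(num_labels):
            total = 0.0
            for a in range(num_labels):
                belief = unary[u][a]
                for m in incoming:
                    belief *= m[a]
                total += belief * pairwise[a][b]
            out.append(total)
        s = sum(out)
        msg[(u, v)] = [x / s for x in out]  # normalize for stability

    for u in reversed(order[1:]):   # pass 1: leaves to root
        send(u, parent[u])
    for u in order[1:]:             # pass 2: root to leaves
        send(parent[u], u)

    marginals = []
    for i in range(n):
        b = list(unary[i])
        for w in nbrs[i]:
            b = [x * y for x, y in zip(b, msg[(w, i)])]
        s = sum(b)
        marginals.append([x / s for x in b])
    return marginals

# Assumed toy problem: a 3-node chain with 2 labels.
unary = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
pairwise = [[2.0, 1.0], [1.0, 2.0]]
marginals = tree_marginals(3, [(0, 1), (1, 2)], unary, pairwise)
```

Replacing the sum over `a` with a maximum (and keeping back-pointers) gives the MAP labeling with the same two-pass structure.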

Results: NYU-D v2 Dataset

[Figure: ground truth (GT) and MAP segmentations]

Slide33

Results: NYU-D v2 Dataset

Confusion matrix:
Comparisons:

Slide35

Results: NYU-D v2 Dataset

Some failures:
[Figure: ground truth (GT) and MAP segmentations]

Slide37

Marginal probabilities

Provide very useful information for specific tasks, e.g.:
Specific object detection
Support inference

P(Ground)
P(Structure)
P(Furniture)
P(Props)
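One way the marginals could prime a specific detector is sketched below. It assumes per-superpixel marginals over the four classes and a simple threshold; the function `prime_detector`, the threshold value, and the sample numbers are hypothetical and not taken from the talk.

```python
CLASSES = ["Ground", "Structure", "Furniture", "Props"]

def prime_detector(marginals, class_index, threshold=0.5):
    """Indices of superpixels whose marginal for one class reaches a threshold.

    marginals: list of per-superpixel probability lists, ordered as CLASSES.
    A task-specific detector would then be run only on these superpixels.
    """
    return [i for i, p in enumerate(marginals) if p[class_index] >= threshold]

# Assumed example marginals for four superpixels.
marginals = [
    [0.90, 0.05, 0.03, 0.02],
    [0.10, 0.20, 0.60, 0.10],
    [0.05, 0.10, 0.25, 0.60],
    [0.02, 0.08, 0.45, 0.45],
]
candidates = prime_detector(marginals, CLASSES.index("Props"), threshold=0.4)
```

Using the marginals rather than the MAP labeling keeps uncertain regions, such as the last superpixel above, available to the detector.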

Slide38

Conclusions

We have presented a computationally efficient approach to semantic segmentation for priming object detection in indoor scenes.
Our approach effectively uses 3D and image cues.
Depth discontinuities are evidence for occlusions.
The MST over 3D keeps intra-class components coherently connected.

Slide39

Discussion

Silberman et al. 2012
Features: bunch of engineered features (>1000D)
Local classifier: Logistic Regression
Graph structure: dense connections on the image

Couprie et al. 2013
Features: learned features (>1000D)
Local classifier: Neural Networks
Graph structure: none

Ours
Features: select meaningful features (12D)
Local classifier: k-NN
Graph structure: MST over 3D

Slide40

Thanks!!

Cesar Cadena ccadenal@gmu.edu
Jana Kosecka kosecka@.cs.gmu.edu
Funded by the US Army Research Office Grant W911NF-1110476.

Slide41

Working on:

People detection by Shenghui Zhou

Slide42

Multi-view and video:
