Stacked Hierarchical Labeling
Dan Munoz, Drew Bagnell, Martial Hebert
The Labeling Problem
Input image and our predicted labels: Road, Tree, Fgnd, Bldg, Sky
The Labeling Problem
Needed: better representation & interactions (Ohta '78)
Using Regions
Input image with ideal regions vs. actual regions (slides from T. Malisiewicz)
Using Regions + Interactions
Image representation: from small regions to big regions
Ideal probabilistic graphical model: high-order, expressive interactions
Using Regions + Interactions
Actual PGM: restrictive interactions, and still NP-hard
Learning with Approximate Inference
PGM learning requires exact inference; otherwise, learning may diverge (Kulesza and Pereira '08)
(Diagram: learning path from simple models to random fields)
PGM Approach
Input → PGM Inference → Output
Our Approach
Input → f_1 → … → f_N → Output
A sequence of simple problems (Cohen '05, Daumé III '06)
A Sequence of Simple Problems
Training simple modules to get the desired output:
- No searching in an exponential space
- Not optimizing any joint distribution/energy
- Not that we were necessarily doing so before (Kulesza & Pereira '08)
Input → f_1 → … → f_N → Output
Stacked Hierarchical Labeling
Our Contribution
An effective PGM alternative for labeling:
- Training a hierarchical procedure of simple problems
- Naturally analyzes multiple scales
- Robust to imperfect segmentations
- Enables more expressive interactions, beyond pair-wise smoothing
Related Work
Learning with multi-scale configurations (from small regions to big regions)
- Joint probability distribution: Bouman '94, Feng '02, He '04, Borenstein '04, Kumar '05
- Joint score/energy: Tu '03, S.C. Zhu '06, L. Zhu '08, Munoz '09, Gould '09, Ladicky '09
- Mitigating the intractable joint optimization: Cohen '05, Daumé III '06, Kou '07, Tu '08, Ross '10
(Segmentation hierarchy diagram: levels 1, 2, 3, …)
In this work, the segmentation tree is given; we use the technique from Arbelaez '09
Label Coarse To Fine
Segmentation tree (Arbelaez '09), levels 1-4, labeled by modules f_1 through f_4
- The parent sees the big picture; naturally handles scales
- Break the problem into simple tasks
- Predict label mixtures at each level
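To make the coarse-to-fine pass concrete, here is a minimal sketch (assumed data structures, not the authors' code): `levels` lists the segmentation tree coarsest level first, each region carries a feature vector and a pointer to its parent, and scikit-learn's LogisticRegression stands in for each trained module f_l.

```python
import numpy as np

def label_coarse_to_fine(levels, modules):
    """levels: list (coarse to fine) of lists of regions; modules: one trained
    classifier per level exposing predict_proba (e.g. sklearn LogisticRegression)."""
    predictions = {}  # region -> predicted label mixture
    for regions, f_l in zip(levels, modules):
        for r in regions:
            # Context: the parent's predicted mixture (uniform at the root level).
            n_classes = len(f_l.classes_)
            parent_mix = (predictions[r.parent] if r.parent is not None
                          else np.full(n_classes, 1.0 / n_classes))
            x = np.concatenate([r.features, parent_mix])
            predictions[r] = f_l.predict_proba(x[None, :])[0]
    return predictions
```

Each finer level reuses the coarser prediction as extra input, which is how the parent's "big picture" reaches its children.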
Handling Real Segmentation
Each f_i predicts a mixture of labels for each region
(Input image and segmentation map)
Actual Predicted Mixtures
P(Tree), P(Building), P(Fgnd) maps (brighter = higher probability)
Training Overview
- How to train each module f_i?
- How to use previous predictions?
- How to train the hierarchical sequence?
Training Overview: how to train each module f_i?
Modeling Heterogeneous Regions
- Count the true labels present in each region r to form its label distribution P_r
- Train a model Q to match each P_r: logistic regression with min_Q H(P_r, Q), i.e., weighted logistic regression
- Image features: texture, color, etc. (Gould '08)
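As a rough illustration of that training step (a sketch under assumptions, not the released implementation): minimizing the cross-entropy H(P_r, Q) is the same as a logistic regression in which every region contributes each class as a target, weighted by that class's proportion P_r[k]. The helper name and array layout below are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_region_classifier(region_features, label_mixtures):
    """region_features: (R, D) array; label_mixtures: (R, K) rows summing to 1."""
    R, K = label_mixtures.shape
    X = np.repeat(region_features, K, axis=0)   # each region repeated once per class
    y = np.tile(np.arange(K), R)                # targets 0..K-1 for every region
    w = label_mixtures.reshape(-1)              # weight = that label's proportion in the region
    keep = w > 0                                # drop zero-weight copies
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[keep], y[keep], sample_weight=w[keep])
    return clf
```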
Training Overview: how to use previous predictions?
Using Parent Predictions
- Use broader context in the finer regions
- Allow finer regions access to all parent predictions
- Create & append 3 types of context features (Kumar '05, Sofman '06, Shotton '06, Tu '08)
(Parent regions and child regions)
Parent Context
Refining the parent (parent vs. child regions)
Detailed In Paper
- Image-wise (co-occurrence): Σ over regions
- Spatial Neighborhood (center-surround)
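One plausible way to assemble such context features, sketched with assumed names (`child.features` and the prediction arrays) rather than the paper's exact definitions:

```python
import numpy as np

def context_features(child, parent_pred, level_preds, neighbor_preds):
    """parent_pred: (K,) mixture of the child's parent.
    level_preds: (M, K) mixtures of all regions at the previous level.
    neighbor_preds: (N, K) mixtures of parent regions adjacent to the parent."""
    image_wise = level_preds.mean(axis=0)                  # co-occurrence over the image
    surround = (neighbor_preds.mean(axis=0)                # center-surround style context
                if len(neighbor_preds) else np.zeros_like(parent_pred))
    return np.concatenate([child.features, parent_pred, image_wise, surround])
```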
Training Overview: how to train the hierarchical sequence?
Approach #1
- Train each module independently
- Use ground-truth context features
Problem: cascades of errors
- Modules depend on perfect context features
- They observe no mistakes during training
- They propagate mistakes during testing
Approach #2
Solution: train in a feed-forward manner (Viola-Jones '01, Kumar '05, Wainwright '06, Ross '10)
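A naive feed-forward training loop might look like the sketch below (hypothetical data layout; `fit_region_classifier` is the weighted logistic regression sketched earlier). Note that here each module's context comes from predictions made on its own training data, which is exactly what the stacking step later corrects.

```python
import numpy as np

def train_feed_forward(levels, num_classes):
    """levels: per tree level, the training regions with .features, .parent,
    and a target label mixture .mixture."""
    modules, predictions = [], {}
    for regions in levels:
        X, P = [], []
        for r in regions:
            parent_mix = (predictions[r.parent] if r.parent is not None
                          else np.full(num_classes, 1.0 / num_classes))
            X.append(np.concatenate([r.features, parent_mix]))
            P.append(r.mixture)
        f_l = fit_region_classifier(np.asarray(X), np.asarray(P))
        # Predictions on the *training* regions become context for the next level.
        for r, x in zip(regions, X):
            predictions[r] = f_l.predict_proba(x[None, :])[0]
        modules.append(f_l)
    return modules
```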
Training Feed-Forward
(Figure: the level-l module f_l is a logistic regression whose parameters are learned from training images A, B, C; f_l is then applied back to A, B, C to produce context for the next level)
Cascades of Overfitting
Solution: stacking (Wolpert '92, Cohen '05)
- Similar to cross-validation
- Don't predict on data used for training
(Confusion matrices: feed-forward train, feed-forward test, stacking test)
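A minimal sketch of the stacking step, assuming folds over regions for brevity (the talk holds out whole training images) and reusing the weighted logistic regression helper from above; all names are illustrative.

```python
import numpy as np
from sklearn.model_selection import KFold

def stacked_predictions(X, P, fit_fn, n_splits=5):
    """X: (R, D) region features; P: (R, K) target label mixtures;
    fit_fn: e.g. fit_region_classifier above. Returns held-out mixtures used
    as context for the next level, plus a model fit on everything for test time."""
    held_out = np.zeros_like(P, dtype=float)
    for train_idx, test_idx in KFold(n_splits=n_splits, shuffle=True).split(X):
        f = fit_fn(X[train_idx], P[train_idx])
        # Assumes every class appears in each training fold so the K columns line up.
        held_out[test_idx] = f.predict_proba(X[test_idx])
        # Each region's context now comes from a model that never saw it.
    return held_out, fit_fn(X, P)
```

At test time only the model trained on all the data is used; the held-out predictions exist purely so the next level is trained on realistic, imperfect context.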
Stacking
(Figure: for training images A, B, C, the level-l logistic regression f_l is fit on the other images and then predicts on the held-out image, so each image's context predictions come from a model that never trained on it)
Learning to Fix Mistakes
Segments and current output at levels 5, 6, 7:
- Level 5: the person is part of an incorrect segment
- Level 6: the person is segmented, but the prediction relies on the parent
- Level 7: the prediction fixes the previous mistake
Level 1/8 Predictions
(Segmentation, P(Foreground), P(Tree), P(Building), P(Road) panels; predicted region mixtures shown as 15%, 18%, 12%, 31%; current output: Road)
Levels 2/8 through 8/8 Predictions
(For each level: segmentation, P(Foreground), P(Tree), P(Building), P(Road) panels, and the current output)
Levels 1/8 through 8/8 Predictions (second example)
(For each level: segmentation, P(Foreground), P(Tree), P(Building), P(Road) panels, and the current output)
Stanford Background Dataset
8 classes, 715 images
Inference time (segmentation & image features held constant):

  Method              sec/image
  Gould ICCV '09      30 - 600
  SHL (Proposed)      10 - 12

  Method              Avg. Class Accuracy
  Gould ICCV '09      65.5
  LogReg (Baseline)   58.0
  SHL (Proposed)      66.2
MSRC-21
21 classes, 591 images

  Method              Avg. Class Accuracy
  Gould IJCV '08      64
  LogReg (Baseline)   60
  SHL (Proposed)      71
  Ladicky ICCV '09    75
  LogReg (Baseline)   69
  SHL (Proposed)      75
  Lim ICCV '09        67
  Tu PAMI '09         69
  Zhu NIPS '08        74
Ongoing Work
Labeling 3-D point clouds (with Xuehan Xiong)
Classes: Building, Car, Ground, Veg, Tree Trunk, Pole
Conclusion
An effective structured prediction alternative:
- High performance with no graphical model
- Beyond site-wise representations
- Robust to imperfect segmentations & multiple scales
- Prediction is a series of simple problems
- Stacked to avoid cascading errors and overfitting
Input → f_1 → … → f_N → Output
Thank You
Acknowledgements: QinetiQ North America Robotics Fellowship; ONR MURI: Reasoning in Reduced Information Spaces; Reviewers, S. Ross, A. Grubb, B. Becker, J.-F. Lalonde
Questions?
Image-wise context: Σ over regions
Spatial neighborhood context
Interactions
Described in this talk vs. described in the paper
SHL vs. M3N (comparison figures)