Slide 1
Accurate Binary Image Selection from Inaccurate User Input
Kartic Subr, Sylvain Paris, Cyril Soler, Jan Kautz
University College London, Adobe Research, INRIA Grenoble

Slide 2
Selection is a common operation in image editing.

Slide 3
For example …

Slide 4
Related work: tools available
- Simple brush and lasso
- Magnetic lasso [MB95]
- Edge-aware brush [CPD07, OH08]
- Tools for multi-touch screens [BWB06]
User indication:
- Bounding box [RKB04]
- Scribbles [BJ01, ADA*04, LSS09, LAA08]

Slide 5
Accurate marking can be tedious.

Slide 6
Precise input requires skill …

Slide 7
… and patience.

Slide 8
Related work: robust approaches
- Soft selection: AppProp [AP08], instant propagation [LJH10]
- Interactive segmentation: dynamic, iterative graph cut [SUA12]

Slide 9
Summary of related work
- Accurate methods require precise input and are unstable.
- Robust methods are not accurate; marking is not intuitive.

Slide 10
Ours: intuitive, robust, fast
(Figure: input image with a foreground scribble and a background scribble; output image separated into foreground and background)

Slide 11
Bird’s-eye view
- Formulate selection as an image-labeling problem; labels: foreground, background, void.
- Input: a histogram of probabilities over the labels at each pixel.
- Output: each pixel assigned a foreground/background label.

Slide 12
Formulate as a labeling problem
(Figure: pixel grid → labeling → labeled pixel grid; each pixel carries a probability histogram over the labels f, b, v)

Slide 13
(Figure: same labeling diagram, animated build)

Slide 14
Histogram of probabilities using the input scribbles
(Figure: example per-pixel histograms over fg / bg / void, e.g. 0.5 / 0.25 / 0.25, 0.25 / 0.5 / 0.25, and 0.33 / 0.33 / 0.33)
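The deck does not spell out how these per-pixel histograms are estimated from the scribbles. Below is a minimal numpy sketch of one plausible estimator, using Gaussian affinities to the mean colours of the scribbled regions, with the "void" label absorbing pixels that match neither; the estimator itself and the `sigma` value are illustrative assumptions, not the paper's method.

```python
import numpy as np

def label_histograms(image, fg_mask, bg_mask, sigma=0.2):
    """Per-pixel probability histogram over (fg, bg, void).

    image:   (H, W, 3) float array in [0, 1]
    fg_mask: (H, W) bool array, True under the foreground scribble
    bg_mask: (H, W) bool array, True under the background scribble
    """
    H, W, _ = image.shape
    pixels = image.reshape(-1, 3)

    # Crude label models: mean colour of each scribbled region.
    fg_mean = image[fg_mask].mean(axis=0)
    bg_mean = image[bg_mask].mean(axis=0)

    # Gaussian affinity of every pixel to each model.
    fg_aff = np.exp(-np.sum((pixels - fg_mean) ** 2, axis=1) / (2 * sigma**2))
    bg_aff = np.exp(-np.sum((pixels - bg_mean) ** 2, axis=1) / (2 * sigma**2))

    # "void" absorbs pixels that match neither model well.
    void = np.maximum(0.0, 1.0 - fg_aff - bg_aff)

    probs = np.stack([fg_aff, bg_aff, void], axis=1)
    probs /= probs.sum(axis=1, keepdims=True)  # each row is a histogram
    return probs.reshape(H, W, 3)
```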
Slide 15
Labeling: MAP inference in a dense CRF [KK2011]
- approximate
- fast: 0.2 s (50K variables)
- reduces to bilateral filtering

Slide 16
Labeling: MAP inference in a dense CRF [KK2011]
- approximate
- fast: 0.2 s (50K variables)
- reduces to bilateral filtering
- low memory requirement: does not store the full distance matrix between pixels
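To make the structure of the [KK2011] update concrete, here is a deliberately brute-force numpy sketch of mean-field inference in a fully connected CRF with a single Gaussian kernel and Potts compatibility. The real method evaluates the message-passing step as a fast bilateral/Gaussian filter instead of the explicit O(N²) product below, which is what makes 50K variables run in about 0.2 s; parameter names and values here are illustrative.

```python
import numpy as np

def dense_crf_mean_field(unary, feats, sigma=1.0, w=3.0, n_iters=5):
    """Mean-field for a fully connected CRF (single Gaussian kernel, Potts).

    unary: (N, L) negative log-probabilities per pixel and label
    feats: (N, p) feature vector per pixel (e.g. position + colour)
    """
    d2 = np.sum((feats[:, None, :] - feats[None, :, :]) ** 2, axis=2)
    K = np.exp(-d2 / (2 * sigma**2))   # Gaussian pairwise kernel
    np.fill_diagonal(K, 0.0)           # no message from a pixel to itself

    Q = np.exp(-unary)
    Q /= Q.sum(axis=1, keepdims=True)
    for _ in range(n_iters):
        msg = K @ Q                    # the step [KK2011] turns into filtering
        # Potts compatibility: a label is penalised by support for the others.
        pairwise = w * (msg.sum(axis=1, keepdims=True) - msg)
        Q = np.exp(-unary - pairwise)
        Q /= Q.sum(axis=1, keepdims=True)
    return Q.argmax(axis=1)            # approximate MAP labels
```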
Slide 17
Fast inference assumes a Gaussian kernel [KK2011]
- The pairwise potential (between pixels) is a linear sum of Gaussians.
- The Gaussians are defined over a Euclidean feature space.
- We generalize to arbitrary kernels!

Slide 18
Generalizing MAP inference to arbitrary kernels
Deep in the details (see paper) lurks a Gaussian kernel + Euclidean feature space …
K(f_i, f_j) = exp(-1/2 (f_i - f_j)^T Σ^{-1} (f_i - f_j)),
where f_i, f_j are feature vectors in a Euclidean space.
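For concreteness, a tiny sketch of evaluating this kernel on bilateral-style features (pixel position concatenated with colour); the feature layout and the Σ values are assumptions chosen for illustration:

```python
import numpy as np

def gaussian_kernel(fi, fj, Sigma):
    """K(f_i, f_j) = exp(-1/2 (f_i - f_j)^T Sigma^{-1} (f_i - f_j))."""
    d = fi - fj
    return float(np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)))

# Example: 5-D features (x, y, r, g, b) with a diagonal covariance.
fi = np.array([10.0, 12.0, 0.8, 0.4, 0.2])
fj = np.array([11.0, 14.0, 0.7, 0.5, 0.2])
Sigma = np.diag([16.0, 16.0, 0.01, 0.01, 0.01])
print(gaussian_kernel(fi, fj, Sigma))
```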
Slide 19
Generalizing MAP inference to arbitrary kernels
Deep in the details (see paper) lurks a Gaussian kernel + Euclidean feature space …
K(f_i, f_j) = exp(-1/2 (f_i - f_j)^T Σ^{-1} (f_i - f_j))
But what if the input is an arbitrary dissimilarity measure D(p_i, p_j) between pixels?

Slide 20
We need a Euclidean embedding!
embedding: D(p_i, p_j)  →  K(f_i, f_j) = exp(-1/2 (f_i - f_j)^T Σ^{-1} (f_i - f_j))

Slide 21
Contributions
- Generalized-kernel approximate mean-field inference for the fully connected CRF.
- Application: interactive binary image selection, robust to inaccurate input.

Slide 22
Overview
(Pipeline diagram: image + scribbles → dissimilarity → embedded pixels → [KK11] inference → selection)

Slide 23
(Pipeline diagram, animated build)

Slide 24
Provide a dissimilarity measure D(p_i, p_j) between pixels.
(Pipeline diagram, dissimilarity stage)

Slide 25
(Pipeline diagram, animated build)

Slide 26
Approximately-Euclidean pixel embedding
(Figure: pixels p_i and p_j indexing the dissimilarity matrix)

Slide 27
(Figure: the embedding maps pixels p_i, p_j to points q_i, q_j)

Slide 28
For each pixel p_i, find a point q_i so that
D(p_i, p_j) ≈ ||q_i - q_j||^2
holds for all pairs of pixels.

Slide 29
Landmark multidimensional scaling (LMDS) [dST02]
- The distance matrix might be huge: 10^12 elements for a 1 MPix image.
- Stochastic sampling approach: Nyström approximation.
- Complexity: time O(N(c + p^2) + c^3), space O(Nc).

Slide 30
Landmark multidimensional scaling (LMDS) [dST02], continued
- N: # pixels; c: # stochastic samples; p: dimensionality of the embedding.
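A minimal numpy sketch of Landmark MDS in the spirit of [dST02]: classical MDS on the c landmark pixels, followed by distance-based triangulation of everything else. The landmark choice (the stochastic sampling) and the dissimilarity D are left to the caller; clipping negative eigenvalues is an added guard for the case where D is only approximately Euclidean.

```python
import numpy as np

def landmark_mds(D_landmarks, D_to_landmarks, p):
    """Embed N pixels in R^p from distances to c landmark pixels.

    D_landmarks:    (c, c) dissimilarities among the landmarks
    D_to_landmarks: (N, c) dissimilarities from all pixels to the landmarks
    Returns (N, p) points whose Euclidean distances approximate D.
    """
    c = D_landmarks.shape[0]
    Delta = D_landmarks ** 2

    # Classical MDS on the landmarks: double-centre, then eigendecompose.
    J = np.eye(c) - np.ones((c, c)) / c
    B = -0.5 * J @ Delta @ J
    evals, evecs = np.linalg.eigh(B)           # ascending order
    idx = np.argsort(evals)[::-1][:p]          # keep the top-p spectrum
    lam = np.maximum(evals[idx], 1e-12)        # clip: D only approx. Euclidean
    V = evecs[:, idx]

    # Triangulate every pixel against the landmarks (Nystrom-style).
    mean_delta = Delta.mean(axis=0)            # (c,)
    pseudo = V / np.sqrt(lam)                  # columns v_k / sqrt(lambda_k)
    return -0.5 * (D_to_landmarks ** 2 - mean_delta) @ pseudo
```

Only the (N, c) distance block is ever formed, which is where the O(Nc) space bound on the slide comes from.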
Slide 31
Overview (pipeline diagram revisited)

Slide 32
Overview (pipeline diagram revisited)

Slide 33
Thank you

Slide 34
You can’t be serious! What about results?
- Importance of the embedding
- Role of the fully connected CRF (FC-CRF)
- Validation
- Comparison with related work
- Examples

Slide 35
The embedding allows the use of arbitrary dissimilarities
(Figure: input; Euclidean distance in RGB [KK11]; chi-squared distance on local histograms + FC-CRF)
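The chi-squared dissimilarity on local histograms can be sketched as below; the deck does not define the histograms precisely, so the patch radius, bin count, and use of single-channel (gray-level) histograms are illustrative assumptions:

```python
import numpy as np

def chi2_local_histogram_distance(image, pi, pj, radius=5, bins=8):
    """Chi-squared distance between histograms of the patches around
    pixels pi and pj: a texture-sensitive dissimilarity.

    image: (H, W) float array in [0, 1]; pi, pj: (row, col) tuples.
    """
    def local_hist(p):
        r, c = p
        patch = image[max(r - radius, 0):r + radius + 1,
                      max(c - radius, 0):c + radius + 1]
        h, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
        return h / max(h.sum(), 1)  # normalise to a probability histogram

    hi, hj = local_hist(pi), local_hist(pj)
    # chi^2(h, g) = 1/2 * sum_k (h_k - g_k)^2 / (h_k + g_k)
    denom = hi + hj
    mask = denom > 0
    return 0.5 * np.sum((hi[mask] - hj[mask]) ** 2 / denom[mask])
```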
Slide 36
The embedding alone is not sufficiently accurate
(Figure: chi-squared distance on local histograms + nearest-neighbour labeling, compared against the input, Euclidean distance in RGB [KK11], and chi-squared distance on local histograms + FC-CRF)

Slide 37
Validation: accurate output even for high input error
(Plot: precision of the output vs. precision of the input)

Slide 38
Color-dominant vs. texture-dominant selection
(Figure: a color-dominant example and a texture-dominant example)

Slide 39
Qualitative comparison
(Figure: ours vs. [LJH10], [CLT12], [FFL10])

Slide 40
Quantitative comparison
(Plot: ours highlighted among competing methods)

Slide 41
Quantitative comparison
(Second plot: ours highlighted)

Slide 42
Summary
- Our selection is robust.
- It relies on a relative indication of foreground and background.

Slide 43
Conclusion
- Most selection algorithms require precise input; ours is relatively robust.
- Generalising dissimilarities is powerful in the context of pairwise potentials for FC-CRFs.
- Two distance metrics stood out:
  - RGB distance (colour-dominant images)
  - chi-squared distance on local histograms (texture-dominant images)

Slide 44
Thank you

Slide 45
(Figure: additional comparison of ours vs. [LJH10], [CLT12], [FFL10])

Slide 46
… and with human scribbles.