
Slide1

776 Computer Vision

Jan-Michael Frahm, Enrique Dunn

Spring 2013

Face/Object detection

Slide2

Previous Lecture

The Viola/Jones Face Detector: a seminal approach to real-time object detection. Training is slow, but detection is very fast.

Key ideas:

Integral images for fast feature evaluation

Boosting for feature selection

Attentional cascade for fast rejection of non-face windows

P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.

P. Viola and M. Jones. Robust real-time face detection. IJCV 57(2), 2004.

Slide3

Image Features

“Rectangle filters”

Value = ∑(pixels in white area) − ∑(pixels in black area)

Slide4

Computing the integral image

Cumulative row sum: s(x, y) = s(x-1, y) + i(x, y)

Integral image: ii(x, y) = ii(x, y-1) + s(x, y)

MATLAB: ii = cumsum(cumsum(double(i)), 2);
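With the integral image in hand, any rectangle sum, and hence any rectangle filter, costs only four array references. A minimal MATLAB sketch, assuming a grayscale input; the file name and rectangle coordinates are illustrative:

% Build a zero-padded integral image so border rectangles need no special case.
img = double(imread('face.png'));        % illustrative grayscale input
ii  = zeros(size(img) + 1);              % one extra leading row and column
ii(2:end, 2:end) = cumsum(cumsum(img), 2);

% O(1) sum over rows y1..y2 and columns x1..x2 (inclusive):
rectsum = @(y1, x1, y2, x2) ii(y2+1, x2+1) - ii(y1, x2+1) ...
                          - ii(y2+1, x1) + ii(y1, x1);

% Two-rectangle feature: sum(pixels in white area) - sum(pixels in black area).
value = rectsum(10, 5, 20, 10) - rectsum(10, 11, 20, 16);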

Slide5

Boosting for face detection

First two features selected by boosting. This feature combination can yield a 100% detection rate with a 50% false positive rate.

Slide6

The AdaBoost Algorithm
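The algorithm itself appears as an image in the original deck. For reference, a minimal MATLAB sketch of discrete AdaBoost; trainstump and evalstump are hypothetical weak-learner helpers, and the number of rounds T is illustrative:

% Discrete AdaBoost. X: n-by-d features, y: n-by-1 labels in {-1,+1}.
n = size(X, 1);
T = 200;                                 % number of boosting rounds
w = ones(n, 1) / n;                      % uniform example weights
for t = 1:T
    stump{t} = trainstump(X, y, w);      % best weak learner under weights w
    h        = evalstump(stump{t}, X);   % its predictions in {-1,+1}
    err      = sum(w .* (h ~= y));       % weighted training error
    alpha(t) = 0.5 * log((1 - err) / err);
    w = w .* exp(-alpha(t) * (y .* h));  % up-weight misclassified examples
    w = w / sum(w);                      % renormalize to a distribution
end
% Strong classifier: H(x) = sign(sum_t alpha(t) * evalstump(stump{t}, x)).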

Slide7

Attentional cascade

Chain classifiers that are progressively more complex and have lower false positive rates:

(Figure: an image sub-window passes through Classifier 1, Classifier 2, Classifier 3, and so on; a positive result (T) forwards the window to the next stage, while a negative result (F) immediately rejects it as a non-face. The trade-off between false positives and false negatives at each stage is determined by a point on its receiver operating characteristic curve, which plots % detection against % false positives.)
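At test time the cascade is just early exit: a sub-window is declared a face only if every stage accepts it. A sketch, with evalstage as a hypothetical per-stage evaluator:

% Attentional cascade: reject a sub-window at the first failing stage.
function isface = cascade(stages, window)
isface = false;
for k = 1:numel(stages)
    if ~evalstage(stages{k}, window)   % hypothetical boosted-stage evaluator
        return;                        % fast rejection of non-face windows
    end
end
isface = true;                         % survived every stage
end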

Slide8

Boosting vs. SVM

Advantages of boosting

Integrates classifier training with feature selection

Complexity of training is linear instead of quadratic in the number of training examples

Flexibility in the choice of weak learners, boosting scheme

Testing is fast

Easy to implement

Disadvantages

Needs many training examples

Training is slow

Often doesn’t work as well as SVM (especially for many-class problems)

Slide9

Face Recognition

N. Kumar, A. C. Berg, P. N. Belhumeur, and S. K. Nayar, "Attribute and Simile Classifiers for Face Verification," ICCV 2009.

Attributes for training

Similes for training

Slide10

Face Recognition with Attributes

(Figure: verification pipeline. Low-level features (RGB, HOG, LBP, SIFT) are extracted from each of the two input images; attribute classifiers such as Male, Round Jaw, Asian, and Dark Hair score each image (+ or -); comparing the attribute outputs yields the verification decision, e.g., Different.)

Slide11

Learning an attribute classifier

(Figure: training pipeline for a gender classifier. Training images of males and females are described by low-level features (RGB, HoG, HSV) computed over face regions; feature selection picks region-feature pairs such as RGB/Nose, HoG/Eyes, HSV/Hair, and Edges/Mouth; the trained classifier then outputs, e.g., Male with score 0.87.)

Slide12

Using simile classifiers for verification

(Figure: outputs of simile classifiers on both images feed a verification classifier.)

Slide13

Eigenfaces

Slide14

Principal Component Analysis

An N x N pixel image of a face, represented as a vector, occupies a single point in N²-dimensional image space.

Images of faces, being similar in overall configuration, are not randomly distributed in this huge image space. Therefore, they can be described by a low-dimensional subspace.

Main idea of PCA for faces: find the vectors that best account for the variation of face images within the entire image space. These vectors are called eigenvectors.

Construct a face space and project the images into this face space (eigenfaces).

Slide15

Image Representation

A training set of m images of size N x N is represented by vectors of size N²: x1, x2, x3, …, xm.

Slide16

Average Image and Difference Images

The average face is defined by μ = (1/m) ∑i=1..m xi.

Each face differs from the average by the vector ri = xi − μ.

Slide17

Covariance Matrix

The covariance matrix is constructed as C = AAᵀ, where A = [r1, …, rm]. This matrix is of size N² x N², so finding its eigenvectors directly is intractable. Hence, use the m x m matrix AᵀA and find the eigenvectors of this much smaller matrix.
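Numerically, this trick is three lines in MATLAB. A sketch, assuming A already holds the difference vectors ri as columns:

[V, D] = eig(A' * A);     % small m-by-m eigenproblem
U = A * V;                % columns u_i = A*v_i are eigenvectors of C = A*A'
U = U ./ vecnorm(U);      % renormalize each column to unit length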

Slide18

Eigenvalues and Eigenvectors - Definition

If v is a nonzero vector and λ is a number such that Av = λv, then v is said to be an eigenvector of A with eigenvalue λ.

Example: for A = [2 0; 0 1], the vector v = (1, 0)ᵀ satisfies Av = 2v, so v is an eigenvector with eigenvalue λ = 2.

Slide19

Eigenvectors of Covariance Matrix

Consider the eigenvectors vi of AᵀA, such that AᵀAvi = λi vi.

Premultiplying both sides by A, we have AAᵀ(Avi) = λi (Avi).

Slide20

Face Space

The eigenvectors of the covariance matrix are ui = Avi.

The ui resemble ghostly facial images, hence they are called eigenfaces.

A face image can be projected into this face space by pk = Uᵀ(xk − μ), where k = 1, …, m.

Slide21

Projection into Face Space

Slide22

EigenFaces Algorithm

Slide23

EigenFaces Algorithm

Slide24

EigenFaces Algorithm
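The algorithm steps on these slides are images in the original deck. A compact MATLAB sketch of the training stage, assuming X is an N²-by-m matrix with one vectorized face per column; the number of retained eigenfaces k is illustrative:

% Eigenfaces training.
k  = 20;                              % illustrative number of eigenfaces
mu = mean(X, 2);                      % average face
A  = X - mu;                          % difference images r_i = x_i - mu
[V, D] = eig(A' * A);                 % small m-by-m eigenproblem (Slide 17)
[~, order] = sort(diag(D), 'descend');
V = V(:, order(1:k));                 % keep the k largest eigenvalues
U = A * V;                            % eigenfaces u_i = A*v_i
U = U ./ vecnorm(U);                  % unit-norm columns
P = U' * A;                           % projections p_k = U'*(x_k - mu)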

Slide25

EigenFaces for Detection

Slide26

EigenFaces for Recognition
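The recognition slide is an image in the deck; conceptually, recognition projects the query into face space and takes the nearest stored projection. A sketch continuing the variables from the training sketch above; the threshold theta is illustrative:

% Recognize a query face xq (an N^2-by-1 vector).
theta = 2.5e3;                     % illustrative acceptance threshold
pq = U' * (xq - mu);               % project query into face space
d  = vecnorm(P - pq);              % distances to all training projections
[dmin, idx] = min(d);
if dmin < theta
    fprintf('Closest match: training face %d\n', idx);
else
    fprintf('Unknown face\n');
end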

Slide27

Limitations of Eigenfaces Approach

Variations in lighting conditions: different lighting conditions for enrolment and query; bright light causing image saturation.

Differences in pose (head orientation): 2D feature distances appear to distort.

Expression: changes in feature location and shape.

Slide28

Linear Discriminant Analysis

PCA does not use class information: PCA projections are optimal for reconstruction from a low-dimensional basis, but they may not be optimal from a discrimination standpoint.

LDA is an enhancement to PCA: it constructs a discriminant subspace that minimizes the scatter within images of the same class and maximizes the scatter between images of different classes, as in the two-class sketch below.
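For two classes, the LDA direction has a closed form, w = Sw⁻¹(μ1 − μ2), with Sw the within-class scatter matrix. A minimal sketch, assuming X1 and X2 hold the two classes as columns:

% Two-class Fisher LDA. X1: d-by-n1, X2: d-by-n2 data matrices.
mu1 = mean(X1, 2);  mu2 = mean(X2, 2);
Sw  = (X1 - mu1) * (X1 - mu1)' + (X2 - mu2) * (X2 - mu2)';  % within-class scatter
w   = Sw \ (mu1 - mu2);            % discriminant direction
% Classify a sample x by thresholding the projection w' * x.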

Slide29

More sliding window detection: discriminative part-based models

Many slides based on P. Felzenszwalb

Slide30

Challenge: Generic object detection

Slide31

Slide32

Slide33

Slide34

Pedestrian detection

Features: Histograms of oriented gradients (HOG)

Learn a pedestrian template using a linear support vector machine

At test time, convolve the feature map with the template

N. Dalal and B. Triggs, Histograms of Oriented Gradients for Human Detection, CVPR 2005

(Figure: pedestrian template, HOG feature map, detector response map.)
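The "convolve feature map with template" step is a dense dot product at every window position. A sketch, assuming H is an h-by-w-by-nb HOG feature map and T a th-by-tw-by-nb linear template (HOG extraction itself is omitted):

% Slide a linear template over a HOG feature map.
[th, tw, ~] = size(T);
R = zeros(size(H,1) - th + 1, size(H,2) - tw + 1);   % response map
for i = 1:size(R, 1)
    for j = 1:size(R, 2)
        win = H(i:i+th-1, j:j+tw-1, :);              % features under window
        R(i, j) = win(:)' * T(:);                    % SVM score (bias omitted)
    end
end
% Detections: threshold R and apply non-maximum suppression.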

Slide35

Slide36

Discriminative part-based models

P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, Object Detection with Discriminatively Trained Part Based Models, PAMI 32(9), 2010

(Figure: root filter, part filters, deformation weights.)

Slide37

Discriminative part-based models

Slide38

Object hypothesis

Multiscale model: the resolution of part filters is twice the resolution of the root

Slide39

Scoring an object hypothesis

The score of a hypothesis is the sum of filter scores minus the sum of deformation costs

Filters

Subwindow

features

Deformation weights

Displacements
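Written out directly, the score loops over the parts. A sketch, with filterscore a hypothetical helper returning the dot product between a filter and the HOG features under a placement, and anchor{i} the default location of part i:

% Score one hypothesis: root placement p0 plus part placements p{1..n}.
score = filterscore(F0, H, p0);                  % root filter score
for i = 1:n
    score = score + filterscore(F{i}, H, p{i});  % part filter score
    dx = p{i}.x - anchor{i}.x;                   % displacement from the
    dy = p{i}.y - anchor{i}.y;                   % part's default location
    score = score - dot(d{i}, [dx^2, dy^2]);     % deformation cost
end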

Slide40

Scoring an object hypothesis

The score of a hypothesis is the sum of filter scores minus the sum of deformation costsRecall: pictorial structures

Matching cost

Deformation cost

Filters

Subwindow

features

Deformation weights

Displacements

Slide41

Scoring an object hypothesis

The score of a hypothesis is the sum of filter scores minus the sum of deformation costs

Concatenation of filter and deformation weights

Concatenation of

subwindow

features and displacements

Filters

Subwindow

features

Deformation weights

Displacements

Slide42

Detection

Define the score of each root filter location as the score given the best part placements:

Slide43

Detection

Define the score of each root filter location as the score given the best part placements.

Efficient computation: generalized distance transforms. For each "default" part location, find the best-scoring displacement.

(Figure: head filter, head filter responses, distance transform.)
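A generalized distance transform computes, for every location, the best response minus the deformation penalty. The brute-force O(n²) version below makes the definition concrete (the paper's lower-envelope algorithm does this in O(n)); the quadratic penalty weight a is illustrative:

% 1-D generalized distance transform (brute force, for clarity):
% D(p) = max over q of ( R(q) - a*(p - q)^2 ).
function D = gdt1d(R, a)
n = numel(R);
D = -inf(1, n);
for p = 1:n
    for q = 1:n
        D(p) = max(D(p), R(q) - a * (p - q)^2);
    end
end
end
% The 2-D transform applies gdt1d along rows, then along columns.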

Slide44

Detection

Slide45

Matching result

Slide46

Training

Training data consists of images with labeled bounding boxes.

Need to learn the filters and deformation parameters.

Slide47

Training

Our classifier has the form f(x) = max over z of w · Φ(x, z), where w are the model parameters and z are latent hypotheses.

Latent SVM training: initialize w and iterate:

Fix w and find the best z for each positive training example (detection)

Fix z and solve for w (standard SVM training)

Issue: too many negative examples. Do "data mining" to find "hard" negatives. A schematic of this alternation follows below.
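The alternation can be sketched as two steps inside a loop; detectbest, minehardnegatives, and trainsvm are hypothetical helpers standing in for the detection and standard SVM machinery:

% Latent SVM training by alternation (schematic).
for iter = 1:niters
    % Step 1: fix w, find the best latent hypothesis z for each positive.
    for i = 1:numel(pos)
        z{i} = detectbest(w, pos{i});        % highest-scoring placement
    end
    % Step 2: fix z, retrain w by standard SVM training, using
    % "data mining" for hard negatives from background images.
    neg = minehardnegatives(w, backgrounds);
    w   = trainsvm(pos, z, neg);
end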

Slide48

Car model

Component 1

Component 2

Slide49

Car detections

Slide50

Person model

Slide51

Person detections

Slide52

Cat model

Slide53

Cat detections

Slide54

Bottle model

Slide55

More detections

Slide56

Quantitative results (PASCAL 2008)

7 systems competed in the 2008 challenge.

Out of 20 classes: first place in 7 classes and second place in 8 classes.

(Figure: per-class results for Bicycles, Person, and Bird, with the proposed approach highlighted in each plot.)

Slide57

Summary