Slide1
Video Face Recognition: A Literature Review
Hao Zhang
Computer Science Department
Slide2
Problem Statement
Verification
A, B: same or different persons?
Identification
A; B, C, D: which has the same identity as A?
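The two tasks above can be contrasted with a minimal sketch. It assumes each face (or video) has already been reduced to a feature vector; the cosine similarity, the `threshold` value, and the toy vectors are illustrative assumptions, not part of the slides.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two 1-D feature vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def verify(feat_a, feat_b, threshold=0.5):
    """Verification: decide whether A and B show the same person.
    The threshold is a hypothetical value chosen for illustration."""
    return cosine_similarity(feat_a, feat_b) >= threshold

def identify(probe, gallery):
    """Identification: return the gallery label most similar to the probe."""
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))

# Toy feature vectors (assumed, not real face descriptors).
a = np.array([1.0, 0.0, 0.2])
gallery = {
    "B": np.array([0.0, 1.0, 0.0]),
    "C": np.array([0.9, 0.1, 0.3]),
    "D": np.array([-1.0, 0.0, 0.5]),
}
same_ab = verify(a, gallery["B"])   # pairwise decision
best_match = identify(a, gallery)   # closed-set search over B, C, D
```

Verification returns a yes/no answer for one pair, while identification ranks a closed gallery; a method that can only do the latter cannot answer the former for unseen subjects.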
Slide3
Solutions
Extensions of still face recognition algorithms
3D model reconstruction
Employing temporal information
Set-to-set matching methods
Slide4
Extensions of still face recognition algorithms
Joint sparse representation
Data: k-th partition of a query video (probe)
Dictionary: a concatenation of all dictionaries of the k-th partition of training videos (gallery)
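As a rough illustration of classifying a probe against a concatenated dictionary, the sketch below uses plain sparse-representation classification for a single partition, solved with iterative soft thresholding (ISTA). This is a simplified stand-in, not the joint formulation of [4]; the function names, the `lam` value, and the toy dictionaries are assumptions.

```python
import numpy as np

def ista(D, y, lam=0.01, n_iter=200):
    """Minimize 0.5*||y - D x||^2 + lam*||x||_1 by iterative soft thresholding."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - y))                        # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)  # soft threshold
    return x

def classify_by_residual(D, labels, y, lam=0.01):
    """Pick the subject whose atoms alone best reconstruct the probe sample y."""
    x = ista(D, y, lam)
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)  # keep only coefficients of subject c
        residuals[int(c)] = float(np.linalg.norm(y - D @ x_c))
    return min(residuals, key=residuals.get), residuals

# Hypothetical per-subject dictionaries for one partition (columns are atoms).
per_subject = {0: np.eye(4)[:, :2], 1: np.eye(4)[:, 2:]}
D = np.hstack(list(per_subject.values()))  # concatenated gallery dictionary
labels = np.concatenate([np.full(d.shape[1], s) for s, d in per_subject.items()])

y = np.array([0.05, 0.0, 0.9, 0.3])        # toy probe sample
predicted, residuals = classify_by_residual(D, labels, y)
```

Because the decision is an argmin over the subjects present in the dictionary, a probe of an unseen person is still assigned to some gallery subject.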
Slide5
Extensions of still face recognition algorithms
Joint sparse representation: Conclusion
Only suitable for face identification
Cannot handle new faces
Violates the protocol of face verification
Slide6
Extensions of still face recognition algorithms
Multiple metric learning (MML)
Figure labels: Video, Volumes, Patches, Feature Extraction, MML
* A part of this figure is from [5]
Slide7
Extensions of still face recognition algorithms
Multiple metric learning (MML): Conclusion
It can be easily adapted to solve both still and video problems.
It discards additional information in the video.
Slide8
3D model reconstruction
From a single frontal image: Analysis
Terms labeled in the slide's equation:
reconstructed 3D shape
mean training 3D shape
PCA projection matrix of training 3D shapes
2D mappings of (symbols not recovered)
input 2D shape
scale and translation term
* The two above images are from [8]
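The labeled terms suggest a linear PCA shape model fitted to 2D landmarks. The sketch below shows one way such a fit can be written, as a ridge-regularized least-squares problem on the x and y coordinates. It is an assumption about the general technique, not necessarily the procedure of [8]; it omits the scale and translation term by assuming the input is already aligned, and all names and the `ridge` value are hypothetical.

```python
import numpy as np

def reconstruct_3d_shape(shape_2d, mean_3d, P, ridge=1e-6):
    """Fit PCA shape coefficients to 2D landmarks, then rebuild the 3D shape.

    shape_2d: (n, 2) input 2D shape, assumed already aligned to the model
    mean_3d:  (n, 3) mean training 3D shape
    P:        (3n, m) PCA basis for shapes flattened as x1, y1, z1, x2, ...
    """
    n = mean_3d.shape[0]
    xy_rows = np.array([3 * i + k for i in range(n) for k in (0, 1)])
    P_2d = P[xy_rows]                      # 2D mapping of the PCA basis
    mean_2d = mean_3d[:, :2].reshape(-1)   # 2D mapping of the mean shape
    rhs = shape_2d.reshape(-1) - mean_2d
    # Ridge-regularized normal equations for the shape coefficients.
    G = P_2d.T @ P_2d + ridge * np.eye(P.shape[1])
    alpha = np.linalg.solve(G, P_2d.T @ rhs)
    return mean_3d + (P @ alpha).reshape(n, 3), alpha

# Synthetic check: project a known model shape to 2D and recover it.
rng = np.random.default_rng(0)
n_points, n_modes = 5, 2
mean_3d = rng.standard_normal((n_points, 3))
P, _ = np.linalg.qr(rng.standard_normal((3 * n_points, n_modes)))
true_alpha = np.array([0.5, -0.3])
true_3d = mean_3d + (P @ true_alpha).reshape(n_points, 3)
rec_3d, alpha = reconstruct_3d_shape(true_3d[:, :2], mean_3d, P)
```

The depth values are never observed; they come entirely from the training statistics encoded in the basis, which is why the quality of the 2D input matters.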
Slide9
3D model reconstruction
Reconstruction from a single image: Synthesis
Pose
Illumination
Expression
* This figure is from [8]
Slide10
3D model reconstruction
Reconstruction from a single image: Conclusion
Handles pose and illumination variations
Requires 2D images of good quality
Synthesis of lighting and expression is far from perfect
Slide11
Employing temporal information
Dynamic system model, ARMA
x(t+1) = A x(t) + v(t)
y(t) = C x(t) + w(t)
x(t): state vector encoding pose at time t
y(t): face appearance at time t
v(t), w(t): noise terms
Video similarity is computed using an observability matrix formed by A and C.
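A minimal sketch of comparing two videos through observability matrices built from A and C. The truncation length `n_blocks` and the use of principal angles between column spaces as the distance are assumptions; [1] may use a different subspace-angle-based metric.

```python
import numpy as np

def observability_matrix(A, C, n_blocks):
    """Stack C, CA, CA^2, ..., CA^(n_blocks-1) vertically (finite truncation)."""
    blocks = [C]
    for _ in range(n_blocks - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def subspace_distance(O1, O2):
    """Norm of the principal angles between the column spaces of O1 and O2."""
    Q1, _ = np.linalg.qr(O1)
    Q2, _ = np.linalg.qr(O2)
    cosines = np.clip(np.linalg.svd(Q1.T @ Q2, compute_uv=False), -1.0, 1.0)
    return float(np.linalg.norm(np.arccos(cosines)))

rng = np.random.default_rng(0)
A = np.diag([0.9, 0.5])
C = rng.standard_normal((4, 2))

# The same system in a different state basis: A' = T A T^-1, C' = C T^-1.
T = np.array([[2.0, 1.0], [0.0, 1.0]])
T_inv = np.linalg.inv(T)

O_a = observability_matrix(A, C, 3)
O_same = observability_matrix(T @ A @ T_inv, C @ T_inv, 3)
O_other = observability_matrix(np.diag([0.2, -0.7]), rng.standard_normal((4, 2)), 3)

d_same = subspace_distance(O_a, O_same)    # near zero: same column space
d_other = subspace_distance(O_a, O_other)  # a different model
```

Comparing column spaces rather than the matrices themselves makes the distance independent of the arbitrary choice of state basis, which is why `d_same` is close to zero.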
Slide12
Employing temporal information
Dynamic system model: Conclusion
Incorporates time information for recognition
Relies on a linear assumption
Manifold learning methods can be applied using the observability matrix
Slide13
Employing temporal information
Probabilistic model
(symbol not recovered): image I's distance to the manifold of the k-th video
(symbol not recovered): probability of image I's projection in (remainder not recovered)
Can be adapted to handle occlusion
* The figure is from [9]
Slide14
Employing temporal information
Probabilistic model: Conclusion
Incorporates time information to make decisions more robustly
Error can propagate
Majority voting
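For reference, majority voting over per-frame decisions can be sketched as follows. The frame labels are toy data, and how voting relates to the probabilistic model is not spelled out on the slide, so this shows only the voting step itself.

```python
from collections import Counter

def majority_vote(frame_predictions):
    """Return the identity predicted for the most frames.
    Ties go to the identity that appeared first in the sequence."""
    return Counter(frame_predictions).most_common(1)[0][0]

# Hypothetical per-frame identification results for one video.
frames = ["A", "A", "B", "A", "C", "A", "B"]
video_identity = majority_vote(frames)
```

Each frame is decided independently here, so the temporal ordering of the frames plays no role in the result.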
Slide15
Set-to-set matching
Manifold-manifold distance
Figure labels: Manifold A, Manifold B, distance
Clustering criteria:
Slide16
Set-to-set matching
Manifold-manifold distance: Conclusion
Overcomes the drawbacks of voting methods
Clustering results vary between runs because of random initialization
Slide17
Set-to-set matching
Affine Hull Representation
Convex hull
Affine hull
Reduced affine hull:
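A minimal sketch of a set-to-set distance between two affine hulls, assuming each image set is given as a matrix with one sample per column. Each hull is modeled as the set mean plus a linear subspace, and the distance is the smallest gap between any point of one hull and any point of the other. The function names and the `tol` cutoff are assumptions, and the reduced (bounded-coefficient) variant is not covered.

```python
import numpy as np

def affine_hull(X, tol=1e-10):
    """X: (d, n) samples as columns. Return the mean and an orthonormal
    basis of the directions spanned by the centered samples."""
    mu = X.mean(axis=1)
    U, s, _ = np.linalg.svd(X - mu[:, None], full_matrices=False)
    return mu, U[:, s > tol]

def affine_hull_distance(X1, X2):
    """min over a, b of ||(mu1 + U1 a) - (mu2 + U2 b)||, via least squares."""
    mu1, U1 = affine_hull(X1)
    mu2, U2 = affine_hull(X2)
    U = np.hstack([U1, -U2])
    coef, *_ = np.linalg.lstsq(U, mu2 - mu1, rcond=None)
    return float(np.linalg.norm(mu2 - mu1 - U @ coef))

# Toy sets: points on the x-axis, and points on a line parallel to the
# y-axis at height z = 1. The two lines are exactly 1 apart.
X1 = np.array([[0.0, 1.0, 2.0],
               [0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]])
X2 = np.array([[0.0, 0.0, 0.0],
               [0.0, 1.0, 2.0],
               [1.0, 1.0, 1.0]])
d = affine_hull_distance(X1, X2)
```

An affine hull extends beyond the observed samples, so this distance can be smaller than the distance between the corresponding convex hulls.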
Slide18
Set-to-set matching
Affine Hull Representation: Conclusion
"Size changeable" affine hulls
Unclear which representation is better
Which to use: convex hull, affine hull or linear span?
Slide19
Set-to-set matching
Statistical methods on Grassmann manifolds
Local mapping using the exponential map preserves geodesic distance
Distribution is defined on the tangent plane at the Karcher mean
Slide20
Set-to-set matching
Statistical methods on Grassmann manifolds: Conclusion
Distribution models on manifold
A video is simply represented as a linear space
Too few samples
Thoughts: Partition the video to obtain multiple points on the Grassmann manifold
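The partitioning idea can be sketched as follows, assuming a video is stored as a matrix with one vectorized frame per column. Each temporal chunk is reduced to an orthonormal basis of its dominant subspace, which is one point on the Grassmann manifold. The function name, the use of SVD, and the chunk and subspace sizes are illustrative assumptions.

```python
import numpy as np

def video_to_grassmann_points(frames, n_parts, dim):
    """frames: (d, n_frames), one vectorized frame per column.
    Split along time and return one (d, dim) orthonormal basis per part.
    Each part needs at least `dim` frames."""
    points = []
    for part in np.array_split(frames, n_parts, axis=1):
        U, _, _ = np.linalg.svd(part, full_matrices=False)
        points.append(U[:, :dim])  # leading left singular vectors span the subspace
    return points

rng = np.random.default_rng(0)
frames = rng.standard_normal((20, 30))   # toy video: 30 frames of dimension 20
points = video_to_grassmann_points(frames, n_parts=3, dim=2)
```

One video now contributes three samples instead of one, which is the point of the suggestion when too few samples are available to fit a distribution.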
Slide21
A summary for each category
Approach          Summary
Still extensions  Largely inherit properties of still algorithms
3D model          Handle pose and illumination variations; 2D image of good quality; synthesis is not good
Temporal          Encode face dynamics; error may propagate
Set-to-set        Solid mathematical background; generally less computational burden
Slide22
Important Datasets
Timeline (figure): 2001, 2003, 2009, 2011, 2013
Slide23
Comparing Results?
Data set  SR              MML        MBGS       ARMA         Prob    Affine              M2M          Stat
MoBo      x               x          x          x            x       0.98 (1,3)          0.94 (rand)  x
Honda     0.97 (#frames)  x          x          0.9 (15,30)  0.92 ?  0.92 (20,39,noise)  0.97 (rand)  x
MBGC      0.88 (s234)     x          x          x            x       x                   x            0.71 (s234)
YTF       x               0.79 (cr)  0.76 (cr)  x            x       x                   x            x
Category labels over the algorithm columns: Still extensions, Temporal, Set-to-set
Slide24
Summary
Current trends:
Extensions of still face recognition algorithms
Set-to-set matching methods
Common issues:
Computational burden
Pose variations
Thoughts: good training data and transfer learning
Need common protocols and datasets
Much better recently
Slide25
References
[1] G. Aggarwal, A. K. R. Chowdhury, and R. Chellappa. A system identification approach for video-based face recognition. In Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on, volume 4, pages 175–178. IEEE, 2004.
[2] J. R. Beveridge, P. J. Phillips, D. Bolme, B. A. Draper, G. H. Givens, Y. M. Lui, M. N. Teli, H. Zhang, W. T. Scruggs, K. W. Bowyer, et al. The challenge of face recognition from digital point-and-shoot cameras. IEEE Conference on Biometrics: Theory, Applications and Systems, 2013.
[3] H. Cevikalp and B. Triggs. Face recognition based on image sets. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pages 2567–2573. IEEE, 2010.
[4] Y.-C. Chen, V. Patel, S. Shekhar, R. Chellappa, and P. Phillips. Video-based face recognition via joint sparse representation. In Automatic Face and Gesture Recognition (FG), 2013 10th IEEE International Conference and Workshops on, pages 1–8, 2013.
[5] Z. Cui, W. Li, D. Xu, S. Shan, and X. Chen. Fusing robust face region descriptors via multiple metric learning for face recognition in the wild. In Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on, pages 3554–3561, 2013.
[6] G. Doretto, A. Chiuso, Y. N. Wu, and S. Soatto. Dynamic textures. International Journal of Computer Vision, 51(2):91–109, 2003.
[7] R. Gross and J. Shi. The CMU Motion of Body (MoBo) database. Technical Report CMU-RI-TR-01-18, Robotics Institute, Pittsburgh, PA, June 2001.
[8] D. Jiang, Y. Hu, S. Yan, L. Zhang, H. Zhang, and W. Gao. Efficient 3D reconstruction for face recognition. Pattern Recognition, 38(6):787–798, 2005.
[9] K.-C. Lee, J. Ho, M.-H. Yang, and D. Kriegman. Video-based face recognition using probabilistic appearance manifolds. In Computer Vision and Pattern Recognition, 2003. Proceedings. 2003 IEEE Computer Society Conference on, volume 1, pages I–313. IEEE, 2003.
[10] P. J. Phillips, P. J. Flynn, J. R. Beveridge, W. T. Scruggs, A. J. O'Toole, D. Bolme, K. W. Bowyer, B. A. Draper, G. H. Givens, Y. M. Lui, et al. Overview of the multiple biometrics grand challenge. In Advances in Biometrics, pages 705–714. Springer, 2009.
[11] J. B. Tenenbaum, V. De Silva, and J. C. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 290(5500):2319–2323, 2000.
[12] P. Turaga, A. Veeraraghavan, A. Srivastava, and R. Chellappa. Statistical computations on Grassmann and Stiefel manifolds for image and video-based recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 33(11):2273–2286, 2011.
[13] R. Wang, S. Shan, X. Chen, and W. Gao. Manifold-manifold distance with application to face recognition based on image set. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1–8. IEEE, 2008.
[14] L. Wolf, T. Hassner, and I. Maoz. Face recognition in unconstrained videos with matched background similarity. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 529–534. IEEE, 2011.