Slide 1: Human Vision and Cameras
CS 678, Spring 2018
Slide 2: Outline
Human vision system
Human vision for computer vision
Cameras and image formation
Projection geometry
Reading: Chapter 1 (F&P book); Chapter 2 (Szeliski)
Slide 3: Human Eyes
The human eye is the organ which gives us the sense of sight, allowing us to observe and learn more about the surrounding world than we do with any of the other four senses. The eye allows us to see and interpret the shapes, colors, and dimensions of objects in the world by processing the light they reflect or emit. The eye is able to detect bright light or dim light, but it cannot sense objects when light is absent.
Slide 4: Anatomy of the Human Eye
http://www.tedmontgomery.com/the_eye/
Slide 5: Some Concepts
Retina – The retina is the innermost layer of the eye and is comparable to the film inside a camera. It is composed of nerve tissue that senses the light entering the eye.
Slide 6: Concepts
The macula lutea is the small, yellowish central portion of the retina. It is the area providing the clearest, most distinct vision.
The center of the macula is called the fovea centralis, an area where all of the photoreceptors are cones; there are no rods in the fovea.
Learn more concepts at
http://www.tedmontgomery.com/the_eye/
Slide 7: Rods and Cones
The retina contains two types of photoreceptors, rods and cones. The rods are more numerous, some 120 million, and are more sensitive than the cones. However, they are not sensitive to color. The 6 to 7 million cones provide the eye's color sensitivity and they are much more concentrated in the central yellow spot known as the macula.
Slide 8: The Electromagnetic Spectrum
Slide 9: Human Vision System
We do not “see” with our eyes, but with our brains
Slide 10: Human Vision for Computer Vision
"Human vision is vastly better at recognition than any of our current computer systems, so any hints of how to proceed from biology are likely to be very useful."
– David Lowe
Slide 11: Feedforward Processing
Thorpe & Fabre-Thorpe 2001
LGN – The lateral geniculate nucleus
V1 – The primary visual cortex
V2 – Visual area V2
IT – Inferior temporal cortex
Slide 12: A Hierarchical Model
Contains alternating layers called simple and complex cell units, creating increasing complexity [Hubel & Wiesel 1962; Riesenhuber & Poggio 1999; Serre et al. 2007]:
Simple cell (linear operation) – selective
Complex cell (nonlinear operation) – invariant
Slide 13: S1 Layer
Being selective: apply Gabor filters to the input image, setting the parameters to match what is known about the primate visual system (8 bands and 4 orientations).
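A minimal sketch of what such a filter bank looks like in code. The kernel size, wavelength, sigma, and aspect ratio below are illustrative placeholders, not the tuned values from the primate-vision literature.

```python
import numpy as np

def gabor_kernel(size, wavelength, sigma, theta, gamma=0.3):
    """Real-valued Gabor kernel: a Gaussian envelope times an oriented
    cosine carrier (illustrative parameters, not Serre et al.'s tuning)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates to the filter orientation
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    g = envelope * carrier
    return g - g.mean()  # zero-mean, so flat image regions give zero response

# A tiny "S1-like" bank: one scale, 4 orientations (0, 45, 90, 135 degrees)
bank = [gabor_kernel(11, wavelength=5.0, sigma=3.0, theta=t)
        for t in np.arange(4) * np.pi / 4]
```

Convolving an image with each kernel in the bank gives one oriented response map per filter, which is what the S1 layer computes.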
Slide 14: Models with Four Layers (S1, C1, S2, C2)
Powerful for object category recognition. Representative works include Riesenhuber & Poggio '99; Serre et al. '05, '07; Mutch & Lowe '06.
Example categories: airplanes, motorbikes, faces, cars
Slide 15: Serre et al.'s model (PAMI'07)
(Diagram: S1 – Gabor filtering; C1 – pixelwise MAX and spatial pooling (MAX); S2 – template matching against pre-learned prototypes p1, …, pN, producing responses v1, …, vN; C2 – global MAX.)
Prototype learning is done off-line; feature computation runs on-line.
Slide 16: Biologically Inspired Models
T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio. Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell., 29(3):411–426, 2007.
G. Guo, G. Mu, Y. Fu, and T. S. Huang. Human age estimation using bio-inspired features. In IEEE CVPR, 2009.
Any other biologically inspired models?
Optional homework: read Serre et al.'s paper or other newer models, discuss and compare those models, and think about their advantages and disadvantages.
Slide 17: Cameras
Slide 18: Image Formation
Digital Camera
Film
Alexei Efros' slide
Slide 19: How do we see the world?
Let's design a camera. Idea 1: put a piece of film in front of an object. Do we get a reasonable image?
Slide by Steve Seitz
Slide 20: Pinhole camera
Add a barrier to block off most of the rays. This reduces blurring. The opening is known as the aperture.
Slide by Steve Seitz
Slide 21: Pinhole camera model
Pinhole model: captures a pencil of rays – all rays through a single point. The point is called the center of projection (focal point). The image is formed on the image plane.
Slide by Steve Seitz
Slide 22: Dimensionality Reduction Machine (3D to 2D)
Figures © Stephen E. Palmer, 2002
3D world
2D image
What have we lost?
Angles
Distances (lengths)
Slide by A. Efros
Slide 23: Projection properties
Many-to-one: any points along same ray map to same point in image
Points → points
But projection of points on focal plane is undefined
Lines → lines (collinearity is preserved)
But line through focal point projects to a point
Planes → planes (or half-planes)
But plane through focal point projects to line
Slide 24: Projection properties
Parallel lines converge at a vanishing point. Each direction in space has its own vanishing point. But lines parallel to the image plane remain parallel. All directions in the same plane have vanishing points on the same line.
How do we construct the vanishing point/line?
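A toy numerical check of the construction, assuming the pinhole projection (f·x/z, f·y/z) introduced on the projection slides later in the deck. The line origins and direction below are made-up values: for a direction d = (dx, dy, dz) with dz ≠ 0, every line with that direction projects toward (f·dx/dz, f·dy/dz).

```python
# Parallel 3D lines share a direction d; their projections converge to the
# vanishing point (f*dx/dz, f*dy/dz) as points recede along the lines.
f = 1.0
d = (1.0, 2.0, 4.0)                      # common direction (made up), dz != 0

def project(p, f=f):
    """Pinhole projection of a camera-frame point onto the image plane."""
    x, y, z = p
    return (f * x / z, f * y / z)

def point_on_line(origin, d, t):
    """Point at parameter t along the line origin + t*d."""
    return tuple(o + t * di for o, di in zip(origin, d))

vp = (f * d[0] / d[2], f * d[1] / d[2])  # predicted vanishing point
# Two different parallel lines, sampled very far away:
far1 = project(point_on_line((0.0, 0.0, 1.0), d, 1e6))
far2 = project(point_on_line((5.0, -3.0, 2.0), d, 1e6))
```

Both distant projections land next to the same vanishing point even though the two lines start at different places, which is exactly the many-to-one behavior the slide describes.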
Slide 25: Vanishing points
Each set of parallel lines meets at a different point – the vanishing point for that direction. Sets of parallel lines on the same plane lead to collinear vanishing points; the line is called the horizon for that plane.
Good ways to spot faked images
scale and perspective don’t work
vanishing points behave badly
Slide 26: Distant objects are smaller
Size is inversely proportional to distance.
Slide 27: Perspective distortion
What does a sphere project to?
Slide 28: Shrinking the aperture
Why not make the aperture as small as possible?
Less light gets through
Diffraction effects…
Slide by Steve Seitz
Slide 29: Shrinking the aperture
Slide 30: The reason for lenses
Slide 31: Adding a lens
A lens focuses light onto the film
Rays passing through the center are not deviated
Slide by Steve Seitz
Slide 32: Adding a lens
A lens focuses light onto the film
Rays passing through the center are not deviated
All parallel rays converge to one point (the focal point) on a plane located at the focal length f
Slide by Steve Seitz
Slide 33: Adding a lens
A lens focuses light onto the film
There is a specific distance at which objects are "in focus"; other points project to a "circle of confusion" in the image
Slide by Steve Seitz
Slides 34–35: Thin lens formula
(Diagram: object of height y at distance D from the lens; image of height y' at distance D'; focal length f. Frédo Durand's slides.)
Similar triangles everywhere!
Slides 36–37: Thin lens formula
Similar triangles everywhere!
y'/y = D'/D
y'/y = (D' - f)/f
Frédo Durand's slides
Slide 38: Thin lens formula
Combining the two similar-triangle ratios gives the thin lens equation:
1/D + 1/D' = 1/f
Any point satisfying the thin lens equation is in focus.
Frédo Durand's slide
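The thin lens equation can be solved directly for the in-focus image distance D'. A minimal sketch (the helper name is mine, not from the slides):

```python
def image_distance(D, f):
    """Solve the thin lens equation 1/D + 1/D' = 1/f for D'.

    D is the object distance and f the focal length, in the same units.
    Requires D > f for a real image to form behind the lens."""
    if D <= f:
        raise ValueError("object at or inside the focal length: no real image")
    return 1.0 / (1.0 / f - 1.0 / D)

# An object 1 m (1000 mm) in front of a 50 mm lens focuses just
# beyond the focal plane:
Dp = image_distance(1000.0, 50.0)   # 1000/19 mm, about 52.6 mm
```

Note the classic special case: an object at D = 2f images at D' = 2f, and as D grows toward infinity, D' shrinks toward f.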
Slide 39: Depth of Field
http://www.cambridgeincolour.com/tutorials/depth-of-field.htm
Slide by A. Efros
Slide 40: How can we control the depth of field?
Changing the aperture size affects depth of field. A smaller aperture increases the range in which the object is approximately in focus. But a small aperture reduces the amount of light, so exposure must be increased.
Slide by A. Efros
Slide 41: Varying the aperture
Large aperture = small DOF
Small aperture = large DOF
Slide by A. Efros
Slide 42: Nice Depth of Field effect
Source: F. Durand
Slide 43: Field of View (Zoom)
Slide by A. Efros
Slide 44: Field of View (Zoom)
Slide by A. Efros
Slide45f
Field of View
Smaller FOV = larger Focal Length
Slide by A. Efros
f
FOV depends on focal length and size of the camera retina
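The relationship can be made concrete with the standard formula FOV = 2·atan(d / 2f), where d is the sensor extent in the direction of interest. A small sketch (the 36 mm / 50 mm numbers are the usual full-frame example, not from the slides):

```python
import math

def field_of_view_deg(sensor_size, focal_length):
    """Angular field of view: FOV = 2 * atan(d / 2f), with the sensor
    size d and focal length f in the same units. Returns degrees."""
    return math.degrees(2.0 * math.atan(sensor_size / (2.0 * focal_length)))

# Full-frame sensor width (36 mm) with a 50 mm lens:
h_fov = field_of_view_deg(36.0, 50.0)   # roughly 40 degrees
```

Doubling the focal length with the same sensor shrinks the FOV, which is the "smaller FOV = larger focal length" statement on the slide.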
Slide 46: Field of View / Focal Length
Large FOV, small f
Camera close to car
Small FOV, large f
Camera far from the car
Sources: A. Efros, F. Durand
Slide 47: Same effect for faces
standard
wide-angle
telephoto
Source: F. Durand
Slide 48: Approximating an affine camera
Source: Hartley & Zisserman
Slide 49: Lens systems
A good camera lens may contain 15 elements and cost a thousand dollars
The best modern lenses may contain aspherical elements
Slide 50: Lens Flaws: Chromatic Aberration
The lens has a different refractive index for different wavelengths, which causes color fringing.
Near Lens Center
Near Lens Outer Edge
Slide 51: Lens flaws: Spherical aberration
Spherical lenses don't focus light perfectly. Rays farther from the optical axis focus closer.
Slide 52: Lens flaws: Vignetting
Slide 53: Radial Distortion
(Figure: no distortion, pincushion, barrel.)
Caused by imperfect lenses
Deviations are most noticeable for rays that pass through the edge of the lens
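A common way to model this effect, not stated on the slide, is the polynomial radial distortion model used in camera calibration: image coordinates are scaled by (1 + k1·r² + k2·r⁴), so points far from the center move the most. A minimal sketch with made-up coefficients:

```python
import numpy as np

def radial_distort(x, y, k1, k2=0.0):
    """Map ideal normalized image coordinates to distorted ones using the
    common polynomial model: scale by (1 + k1*r^2 + k2*r^4).
    In this convention, k1 < 0 pulls edge points inward (barrel),
    k1 > 0 pushes them outward (pincushion)."""
    r2 = x**2 + y**2
    s = 1.0 + k1 * r2 + k2 * r2**2
    return x * s, y * s

# Points near the center barely move; edge points move the most:
xd, yd = radial_distort(np.array([0.0, 0.1, 1.0]), np.zeros(3), k1=-0.2)
```

This matches the slide's observation: the deviation grows with distance from the optical axis, so distortion is most visible at the edges of the image.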
Slide 54: Digital camera
A digital camera replaces film with a sensor array. Each cell in the array is a light-sensitive diode that converts photons to electrons. Two common types: Charge-Coupled Device (CCD) and Complementary Metal-Oxide-Semiconductor (CMOS).
http://electronics.howstuffworks.com/digital-camera.htm
Slide by Steve Seitz
Slide 55: CCD vs. CMOS
CCD: transports the charge across the chip and reads it at one corner of the array. An analog-to-digital converter (ADC) then turns each pixel's value into a digital value by measuring the amount of charge at each photosite and converting that measurement to binary form.
CMOS: uses several transistors at each pixel to amplify and move the charge using more traditional wires. The CMOS signal is digital, so it needs no ADC.
http://www.dalsa.com/shared/content/pdfs/CCD_vs_CMOS_Litwiller_2005.pdf
http://electronics.howstuffworks.com/digital-camera.htm
Slide 56: Color sensing in camera: Color filter array
Source: Steve Seitz
Bayer grid: each pixel senses only one color; estimate the missing components from neighboring values (demosaicing).
Why more green? Green matches the peak of the human luminance sensitivity function.
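A toy sketch of the simplest possible demosaicing step, filling in the green channel of an RGGB mosaic by averaging neighbors. Real demosaicing uses edge-aware interpolation; this naive version is exactly what produces the moiré artifacts discussed on the next slide.

```python
import numpy as np

def bilinear_green(bayer):
    """Fill in the green channel of an RGGB Bayer mosaic by averaging the
    (up to 4) green neighbors at each red/blue site. Naive toy version."""
    h, w = bayer.shape
    G = np.zeros_like(bayer, dtype=float)
    rows, cols = np.indices((h, w))
    # In an RGGB pattern, green samples sit where (row + col) is odd
    green_mask = (rows + cols) % 2 == 1
    G[green_mask] = bayer[green_mask]
    for r in range(h):
        for c in range(w):
            if not green_mask[r, c]:
                nbrs = [bayer[rr, cc]
                        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < h and 0 <= cc < w]
                G[r, c] = sum(nbrs) / len(nbrs)
    return G

# On a flat scene every interpolated value equals its neighbors:
G = bilinear_green(np.full((4, 4), 9.0))
```

On smooth regions this works well; on fine black-and-white detail the averaged neighbors disagree with the true value, and the per-channel errors show up as false color.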
Slide 57: Problem with demosaicing: color moiré
Slide by F. Durand
Slide 58: The cause of color moiré
Fine black-and-white detail in the image is misinterpreted as color information by the detector.
Slide by F. Durand
Slide 59: Color sensing in camera: Prism
Requires three chips (CCD(R), CCD(G), CCD(B)) and precise alignment. More expensive.
Slide 60: Color sensing in camera: Foveon X3
Source: M. Pollefeys
http://en.wikipedia.org/wiki/Foveon_X3_sensor
http://www.foveon.com/article.php?a=67
CMOS sensor that takes advantage of the fact that red, green, and blue light penetrate silicon to different depths, giving better image quality.
Slide 61: Issues with digital cameras
Noise: low light is where you most notice noise; light sensitivity (ISO) / noise tradeoff; stuck pixels
Resolution: are more megapixels better? requires a higher quality lens; noise issues
In-camera processing: oversharpening can produce halos
RAW vs. compressed: file size vs. quality tradeoff
Blooming: charge overflowing into neighboring pixels
Color artifacts: purple fringing from microlenses, artifacts from Bayer patterns, white balance
More info online: http://electronics.howstuffworks.com/digital-camera.htm, http://www.dpreview.com/
Slide by Steve Seitz
Slide 62: Historical context
Pinhole model: Mozi (470–390 BCE), Aristotle (384–322 BCE)
Principles of optics (including lenses): Alhacen (965–1039 CE)
Camera obscura: Leonardo da Vinci (1452–1519), Johann Zahn (1631–1707)
First photo: Joseph Nicéphore Niépce (1822)
Daguerréotypes (1839)
Photographic film (Eastman, 1889)
Cinema (Lumière Brothers, 1895)
Color photography (Lumière Brothers, 1908)
Television (Baird, Farnsworth, Zworykin, 1920s)
First consumer camera with CCD: Sony Mavica (1981)
First fully digital camera: Kodak DCS100 (1990)
Niepce, “La Table Servie,” 1822
CCD chip
Alhacen’s notes
Slide 63: Modeling projection
The coordinate system:
We will use the pinhole model as an approximation
Put the optical center (O) at the origin
Put the image plane (Π') in front of O
Source: J. Ponce, S. Seitz
Slide 64: Modeling projection
Projection equations: compute the intersection with Π' of the ray from P = (x, y, z) to O. Derived using similar triangles:
(x, y, z) → (f x/z, f y/z, f)
We get the image coordinates by throwing out the last coordinate: (f x/z, f y/z).
Source: J. Ponce, S. Seitz
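The projection equations translate directly to code. A minimal sketch of the pinhole map, which also demonstrates the earlier "distant objects are smaller" slide:

```python
def project(point, f):
    """Pinhole projection: P = (x, y, z) in the camera frame maps to
    (f*x/z, f*y/z) on the image plane at distance f from the center of
    projection. Undefined for z == 0 (points on the focal plane)."""
    x, y, z = point
    if z == 0:
        raise ValueError("point on the focal plane: projection undefined")
    return (f * x / z, f * y / z)

# Doubling the depth halves the image size ("distant objects are smaller"):
near = project((1.0, 2.0, 4.0), f=2.0)   # (0.5, 1.0)
far = project((1.0, 2.0, 8.0), f=2.0)    # (0.25, 0.5)
```

The z == 0 guard mirrors the projection-properties slide: the projection of points on the focal plane is undefined.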
Slide 65: Homogeneous coordinates
Is this a linear transformation? No – division by z is nonlinear.
Trick: add one more coordinate:
homogeneous image coordinates: (x, y) ⇒ (x, y, 1)
homogeneous scene coordinates: (x, y, z) ⇒ (x, y, z, 1)
Converting from homogeneous coordinates: divide by the last coordinate, e.g. (x, y, w) ⇒ (x/w, y/w).
Slide by Steve Seitz
Slide 66: Perspective Projection Matrix
Projection is a matrix multiplication using homogeneous coordinates, followed by dividing by the third coordinate:
[1 0 0 0; 0 1 0 0; 0 0 1/f 0] (x, y, z, 1)ᵀ = (x, y, z/f)ᵀ ⇒ (f x/z, f y/z)
Slide 67: Perspective Projection Matrix
Projection is a matrix multiplication using homogeneous coordinates, followed by dividing by the third coordinate.
In practice, there are lots of coordinate transformations:
2D point (3×1) = camera-to-pixel coord. trans. matrix (3×3) × perspective projection matrix (3×4) × world-to-camera coord. trans. matrix (4×4) × 3D point (4×1)
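The full chain can be sketched in a few lines of numpy. All numeric values below (translation, focal length, pixel scale, principal point) are made up for illustration. The projection matrix is written here with f on the diagonal; the diag(1, 1, 1/f) form differs only by an overall scale, which cancels when dividing by the third coordinate.

```python
import numpy as np

# World-to-camera: 4x4 rigid transform (here just a translation along z)
T = np.eye(4)
T[2, 3] = 5.0

# Perspective projection: 3x4, with focal length f folded in
f = 2.0
P = np.array([[f, 0.0, 0.0, 0.0],
              [0.0, f, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

# Camera-to-pixel: 3x3 (pixel scale and principal point, illustrative)
K = np.array([[100.0, 0.0, 320.0],
              [0.0, 100.0, 240.0],
              [0.0, 0.0, 1.0]])

X = np.array([1.0, -1.0, 5.0, 1.0])   # homogeneous 3D world point (4x1)
p = K @ P @ T @ X                     # homogeneous 2D point (3x1)
u, v = p[:2] / p[2]                   # divide by the third coordinate
```

Composing the three matrices once and reusing the 3×4 product is the usual optimization, since the chain is the same for every point in a frame.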
Slide 68: Weak perspective
Assume object points are all at the same depth −z₀; the projection then reduces to uniform scaling by the magnification m = f/z₀.
Slide 69: Orthographic Projection
Special case of perspective projection
Distance from the center of projection to the image plane is infinite
Also called "parallel projection"
What's the projection matrix?
Image
World
Slide by Steve Seitz
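The answer to the slide's question can be written out explicitly. A sketch of the standard orthographic and weak-perspective projection matrices in homogeneous coordinates (the f and z₀ values are illustrative):

```python
import numpy as np

# Orthographic ("parallel") projection simply drops the z coordinate:
P_ortho = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0]])

# Weak perspective: orthographic projection followed by uniform scaling
# with magnification m = f / z0 for a reference depth z0
f, z0 = 2.0, 10.0
m = f / z0
P_weak = np.array([[m, 0.0, 0.0, 0.0],
                   [0.0, m, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]])

X = np.array([3.0, 4.0, 9.5, 1.0])   # a point near the reference depth z0
p = P_weak @ X
u, v = p[:2] / p[2]                  # depth is ignored: (0.6, 0.8)
```

Note that the third row of both matrices is (0, 0, 0, 1), so the homogeneous divide is by 1 and the map is linear in the point coordinates, which is exactly why these models have "simpler mathematics" than full perspective.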
Slide 70: Pros and Cons of These Models
Weak perspective (including orthographic) has simpler mathematics
Accurate when object is small relative to its distance.
Most useful for recognition.
Perspective is much more accurate for scenes.
Used in structure from motion.
When accuracy really matters, we must model the real camera
Use perspective projection with other calibration parameters (e.g., radial lens distortion)