Slide 1: Human Vision and Cameras
CS 678, Spring 2018
Slide 2: Outline
Human vision system
Human vision for computer vision
Cameras and image formation
Projection geometry
Reading: Chapter 1 (F&P book); Chapter 2 (Szeliski)
Slide 3: Human Eyes
The human eye is the organ which gives us the sense of sight, allowing us to observe and learn more about the surrounding world than we do with any of the other four senses. The eye allows us to see and interpret the shapes, colors, and dimensions of objects in the world by processing the light they reflect or emit. The eye is able to detect bright light or dim light, but it cannot sense objects when light is absent.
Slide 4: Anatomy of the Human Eye
http://www.tedmontgomery.com/the_eye/
Slide 5: Some Concepts
Retina – The retina is the innermost layer of the eye and is comparable to the film inside a camera. It is composed of nerve tissue that senses the light entering the eye.
Slide 6: Concepts
The macula lutea is the small, yellowish central portion of the retina. It is the area providing the clearest, most distinct vision.
The center of the macula is called the fovea centralis, an area where all of the photoreceptors are cones; there are no rods in the fovea.
Learn more concepts at
http://www.tedmontgomery.com/the_eye/
Slide 7: Rods and Cones
The retina contains two types of photoreceptors, rods and cones. The rods are more numerous, some 120 million, and are more sensitive than the cones. However, they are not sensitive to color. The 6 to 7 million cones provide the eye's color sensitivity and they are much more concentrated in the central yellow spot known as the macula.
Slide 8: The Electromagnetic Spectrum
Slide 9: Human Vision System
We do not “see” with our eyes, but with our brains
Slide 10: Human Vision for Computer Vision
"Human vision is vastly better at recognition than any of our current computer systems, so any hints of how to proceed from biology are likely to be very useful."
– David Lowe
Slide 11: Feedforward Processing
Thorpe & Fabre-Thorpe 2001
LGN – The lateral geniculate nucleus
V1 – The primary visual cortex
V2 – Visual area V2
IT – Inferior temporal cortex
Slide 12: A Hierarchical Model
Contains alternating layers called simple and complex cell units, creating increasing complexity [Hubel & Wiesel 1962; Riesenhuber & Poggio 1999; Serre et al. 2007]:
Simple cell (linear operation) – selective
Complex cell (nonlinear operation) – invariant
Slide 13: S1 Layer
Being selective: apply Gabor filters to the input image, setting the parameters to match what is known about the primate visual system (8 bands and 4 orientations).
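A minimal sketch of what such a filter bank looks like in code. The kernel size, wavelength, sigma, and aspect ratio below are illustrative placeholders, not the tuned values from the primate-vision literature.

```python
import numpy as np

def gabor_kernel(size, wavelength, sigma, theta, gamma=0.3):
    """Real-valued Gabor kernel: a Gaussian envelope times an oriented
    cosine carrier (illustrative parameters, not Serre et al.'s tuning)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate coordinates to the filter orientation
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / wavelength)
    g = envelope * carrier
    return g - g.mean()  # zero-mean, so flat image regions give zero response

# A tiny "S1-like" bank: one scale, 4 orientations (0, 45, 90, 135 degrees)
bank = [gabor_kernel(11, wavelength=5.0, sigma=3.0, theta=t)
        for t in np.arange(4) * np.pi / 4]
```

Convolving an image with each kernel in the bank gives one oriented response map per filter, which is what the S1 layer computes.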
Slide 14: Models with Four Layers (S1, C1, S2, C2)
Powerful for object category recognition. Representative works include Riesenhuber & Poggio '99; Serre et al. '05, '07; Mutch & Lowe '06.
Example categories: airplanes, motorbikes, faces, cars
Slide 15: Serre et al.'s model (PAMI'07)
(Diagram: S1 – Gabor filtering; C1 – pixelwise MAX and spatial pooling (MAX); S2 – template matching against pre-learned prototypes p1, …, pN, producing responses v1, …, vN; C2 – global MAX.)
Prototype learning is done off-line; feature computation runs on-line.
Slide 16: Biologically Inspired Models
T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio. Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell., 29(3):411–426, 2007.
G. Guo, G. Mu, Y. Fu, and T. S. Huang. Human age estimation using bio-inspired features. In IEEE CVPR, 2009.
Any other biologically inspired models?
Optional homework: read Serre et al.'s paper or other newer models, discuss and compare those models, and think about their advantages and disadvantages.
Slide 17: Cameras
Slide 18: Image Formation
Digital Camera
Film
Alexei Efros' slide
Slide 19: How do we see the world?
Let's design a camera. Idea 1: put a piece of film in front of an object. Do we get a reasonable image?
Slide by Steve Seitz
Slide 20: Pinhole camera
Add a barrier to block off most of the rays. This reduces blurring. The opening is known as the aperture.
Slide by Steve Seitz
Slide 21: Pinhole camera model
Pinhole model: captures a pencil of rays – all rays through a single point. The point is called the center of projection (focal point). The image is formed on the image plane.
Slide by Steve Seitz
Slide 22: Dimensionality Reduction Machine (3D to 2D)
Figures © Stephen E. Palmer, 2002
3D world
2D image
What have we lost?
Angles
Distances (lengths)
Slide by A. Efros
Slide 23: Projection properties
Many-to-one: any points along same ray map to same point in image
Points → points
But projection of points on focal plane is undefined
Lines → lines (collinearity is preserved)
But line through focal point projects to a point
Planes → planes (or half-planes)
But plane through focal point projects to line
Slide 24: Projection properties
Parallel lines converge at a vanishing point. Each direction in space has its own vanishing point. But lines parallel to the image plane remain parallel. All directions in the same plane have vanishing points on the same line.
How do we construct the vanishing point/line?
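A toy numerical check of the construction, assuming the pinhole projection (f·x/z, f·y/z) introduced on the projection slides later in the deck. The line origins and direction below are made-up values: for a direction d = (dx, dy, dz) with dz ≠ 0, every line with that direction projects toward (f·dx/dz, f·dy/dz).

```python
# Parallel 3D lines share a direction d; their projections converge to the
# vanishing point (f*dx/dz, f*dy/dz) as points recede along the lines.
f = 1.0
d = (1.0, 2.0, 4.0)                      # common direction (made up), dz != 0

def project(p, f=f):
    """Pinhole projection of a camera-frame point onto the image plane."""
    x, y, z = p
    return (f * x / z, f * y / z)

def point_on_line(origin, d, t):
    """Point at parameter t along the line origin + t*d."""
    return tuple(o + t * di for o, di in zip(origin, d))

vp = (f * d[0] / d[2], f * d[1] / d[2])  # predicted vanishing point
# Two different parallel lines, sampled very far away:
far1 = project(point_on_line((0.0, 0.0, 1.0), d, 1e6))
far2 = project(point_on_line((5.0, -3.0, 2.0), d, 1e6))
```

Both distant projections land next to the same vanishing point even though the two lines start at different places, which is exactly the many-to-one behavior the slide describes.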
Slide 25: Vanishing points
Each set of parallel lines meets at a different point – the vanishing point for that direction. Sets of parallel lines on the same plane lead to collinear vanishing points; the line is called the horizon for that plane.
Good ways to spot faked images
scale and perspective don’t work
vanishing points behave badly
Slide 26: Distant objects are smaller
Size is inversely proportional to distance.
Slide 27: Perspective distortion
What does a sphere project to?
Slide 28: Shrinking the aperture
Why not make the aperture as small as possible?
Less light gets through
Diffraction effects…
Slide by Steve Seitz
Slide 29: Shrinking the aperture
Slide 30: The reason for lenses
Slide 31: Adding a lens
A lens focuses light onto the film
Rays passing through the center are not deviated
Slide by Steve Seitz
Slide 32: Adding a lens
A lens focuses light onto the film
Rays passing through the center are not deviated
All parallel rays converge to one point (the focal point) on a plane located at the focal length f
Slide by Steve Seitz
Slide 33: Adding a lens
A lens focuses light onto the film
There is a specific distance at which objects are "in focus"; other points project to a "circle of confusion" in the image
Slide by Steve Seitz
Slides 34–35: Thin lens formula
(Diagram: object of height y at distance D from the lens; image of height y' at distance D'; focal length f. Frédo Durand's slides.)
Similar triangles everywhere!
Slides 36–37: Thin lens formula
Similar triangles everywhere!
y'/y = D'/D
y'/y = (D' - f)/f
Frédo Durand's slides
Slide 38: Thin lens formula
Combining the two similar-triangle ratios gives the thin lens equation:
1/D + 1/D' = 1/f
Any point satisfying the thin lens equation is in focus.
Frédo Durand's slide
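The thin lens equation can be solved directly for the in-focus image distance D'. A minimal sketch (the helper name is mine, not from the slides):

```python
def image_distance(D, f):
    """Solve the thin lens equation 1/D + 1/D' = 1/f for D'.

    D is the object distance and f the focal length, in the same units.
    Requires D > f for a real image to form behind the lens."""
    if D <= f:
        raise ValueError("object at or inside the focal length: no real image")
    return 1.0 / (1.0 / f - 1.0 / D)

# An object 1 m (1000 mm) in front of a 50 mm lens focuses just
# beyond the focal plane:
Dp = image_distance(1000.0, 50.0)   # 1000/19 mm, about 52.6 mm
```

Note the classic special case: an object at D = 2f images at D' = 2f, and as D grows toward infinity, D' shrinks toward f.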
Slide 39: Depth of Field
http://www.cambridgeincolour.com/tutorials/depth-of-field.htm
Slide by A. Efros
Slide 40: How can we control the depth of field?
Changing the aperture size affects depth of field. A smaller aperture increases the range in which the object is approximately in focus. But a small aperture reduces the amount of light, so exposure must be increased.
Slide by A. Efros
Slide 41: Varying the aperture
Large aperture = small DOF
Small aperture = large DOF
Slide by A. Efros
Slide 42: Nice Depth of Field effect
Source: F. Durand
Slide 43: Field of View (Zoom)
Slide by A. Efros
Slide 44: Field of View (Zoom)
Slide by A. Efros
Slide45f
Field of View
Smaller FOV = larger Focal Length
Slide by A. Efros
f
FOV depends on focal length and size of the camera retina
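The relationship can be made concrete with the standard formula FOV = 2·atan(d / 2f), where d is the sensor extent in the direction of interest. A small sketch (the 36 mm / 50 mm numbers are the usual full-frame example, not from the slides):

```python
import math

def field_of_view_deg(sensor_size, focal_length):
    """Angular field of view: FOV = 2 * atan(d / 2f), with the sensor
    size d and focal length f in the same units. Returns degrees."""
    return math.degrees(2.0 * math.atan(sensor_size / (2.0 * focal_length)))

# Full-frame sensor width (36 mm) with a 50 mm lens:
h_fov = field_of_view_deg(36.0, 50.0)   # roughly 40 degrees
```

Doubling the focal length with the same sensor shrinks the FOV, which is the "smaller FOV = larger focal length" statement on the slide.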
Slide 46: Field of View / Focal Length
Large FOV, small f
Camera close to car
Small FOV, large f
Camera far from the car
Sources: A. Efros, F. Durand
Slide 47: Same effect for faces
standard
wide-angle
telephoto
Source: F. Durand
Slide 48: Approximating an affine camera
Source: Hartley & Zisserman
Slide 49: Lens systems
A good camera lens may contain 15 elements and cost a thousand dollars
The best modern lenses may contain aspherical elements
Slide 50: Lens Flaws: Chromatic Aberration
The lens has a different refractive index for different wavelengths, which causes color fringing.
Near Lens Center
Near Lens Outer Edge
Slide 51: Lens flaws: Spherical aberration
Spherical lenses don't focus light perfectly. Rays farther from the optical axis focus closer.
Slide 52: Lens flaws: Vignetting
Slide 53: Radial Distortion
(Figure: no distortion, pincushion, barrel.)
Caused by imperfect lenses
Deviations are most noticeable for rays that pass through the edge of the lens
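A common way to model this effect, not stated on the slide, is the polynomial radial distortion model used in camera calibration: image coordinates are scaled by (1 + k1·r² + k2·r⁴), so points far from the center move the most. A minimal sketch with made-up coefficients:

```python
import numpy as np

def radial_distort(x, y, k1, k2=0.0):
    """Map ideal normalized image coordinates to distorted ones using the
    common polynomial model: scale by (1 + k1*r^2 + k2*r^4).
    In this convention, k1 < 0 pulls edge points inward (barrel),
    k1 > 0 pushes them outward (pincushion)."""
    r2 = x**2 + y**2
    s = 1.0 + k1 * r2 + k2 * r2**2
    return x * s, y * s

# Points near the center barely move; edge points move the most:
xd, yd = radial_distort(np.array([0.0, 0.1, 1.0]), np.zeros(3), k1=-0.2)
```

This matches the slide's observation: the deviation grows with distance from the optical axis, so distortion is most visible at the edges of the image.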
Slide 54: Digital camera
A digital camera replaces film with a sensor array. Each cell in the array is a light-sensitive diode that converts photons to electrons. Two common types: Charge-Coupled Device (CCD) and Complementary Metal-Oxide-Semiconductor (CMOS).
http://electronics.howstuffworks.com/digital-camera.htm
Slide by Steve Seitz
Slide 55: CCD vs. CMOS
CCD: transports the charge across the chip and reads it at one corner of the array. An analog-to-digital converter (ADC) then turns each pixel's value into a digital value by measuring the amount of charge at each photosite and converting that measurement to binary form.
CMOS: uses several transistors at each pixel to amplify and move the charge using more traditional wires. The CMOS signal is digital, so it needs no ADC.
http://www.dalsa.com/shared/content/pdfs/CCD_vs_CMOS_Litwiller_2005.pdf
http://electronics.howstuffworks.com/digital-camera.htm
Slide 56: Color sensing in camera: Color filter array
Source: Steve Seitz
Bayer grid: each pixel senses only one color; estimate the missing components from neighboring values (demosaicing).
Why more green? Green matches the peak of the human luminance sensitivity function.
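A toy sketch of the simplest possible demosaicing step, filling in the green channel of an RGGB mosaic by averaging neighbors. Real demosaicing uses edge-aware interpolation; this naive version is exactly what produces the moiré artifacts discussed on the next slide.

```python
import numpy as np

def bilinear_green(bayer):
    """Fill in the green channel of an RGGB Bayer mosaic by averaging the
    (up to 4) green neighbors at each red/blue site. Naive toy version."""
    h, w = bayer.shape
    G = np.zeros_like(bayer, dtype=float)
    rows, cols = np.indices((h, w))
    # In an RGGB pattern, green samples sit where (row + col) is odd
    green_mask = (rows + cols) % 2 == 1
    G[green_mask] = bayer[green_mask]
    for r in range(h):
        for c in range(w):
            if not green_mask[r, c]:
                nbrs = [bayer[rr, cc]
                        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < h and 0 <= cc < w]
                G[r, c] = sum(nbrs) / len(nbrs)
    return G

# On a flat scene every interpolated value equals its neighbors:
G = bilinear_green(np.full((4, 4), 9.0))
```

On smooth regions this works well; on fine black-and-white detail the averaged neighbors disagree with the true value, and the per-channel errors show up as false color.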
Slide 57: Problem with demosaicing: color moiré
Slide by F. Durand
Slide 58: The cause of color moiré
Fine black-and-white detail in the image is misinterpreted as color information by the detector.
Slide by F. Durand
Slide 59: Color sensing in camera: Prism
Requires three chips (CCD(R), CCD(G), CCD(B)) and precise alignment. More expensive.
Slide 60: Color sensing in camera: Foveon X3
Source: M. Pollefeys
http://en.wikipedia.org/wiki/Foveon_X3_sensor
http://www.foveon.com/article.php?a=67
CMOS sensor that takes advantage of the fact that red, green, and blue light penetrate silicon to different depths, giving better image quality.
Slide 61: Issues with digital cameras
Noise: low light is where you most notice noise; light sensitivity (ISO) / noise tradeoff; stuck pixels
Resolution: are more megapixels better? requires a higher quality lens; noise issues
In-camera processing: oversharpening can produce halos
RAW vs. compressed: file size vs. quality tradeoff
Blooming: charge overflowing into neighboring pixels
Color artifacts: purple fringing from microlenses, artifacts from Bayer patterns, white balance
More info online: http://electronics.howstuffworks.com/digital-camera.htm, http://www.dpreview.com/
Slide by Steve Seitz
Slide 62: Historical context
Pinhole model: Mozi (470–390 BCE), Aristotle (384–322 BCE)
Principles of optics (including lenses): Alhacen (965–1039 CE)
Camera obscura: Leonardo da Vinci (1452–1519), Johann Zahn (1631–1707)
First photo: Joseph Nicéphore Niépce (1822)
Daguerréotypes (1839)
Photographic film (Eastman, 1889)
Cinema (Lumière Brothers, 1895)
Color photography (Lumière Brothers, 1908)
Television (Baird, Farnsworth, Zworykin, 1920s)
First consumer camera with CCD: Sony Mavica (1981)
First fully digital camera: Kodak DCS100 (1990)
Niepce, “La Table Servie,” 1822
CCD chip
Alhacen’s notes
Slide 63: Modeling projection
The coordinate system:
We will use the pinhole model as an approximation
Put the optical center (O) at the origin
Put the image plane (Π') in front of O
Source: J. Ponce, S. Seitz
Slide 64: Modeling projection
Projection equations: compute the intersection with Π' of the ray from P = (x, y, z) to O. Derived using similar triangles:
(x, y, z) → (f x/z, f y/z, f)
We get the image coordinates by throwing out the last coordinate: (f x/z, f y/z).
Source: J. Ponce, S. Seitz
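The projection equations translate directly to code. A minimal sketch of the pinhole map, which also demonstrates the earlier "distant objects are smaller" slide:

```python
def project(point, f):
    """Pinhole projection: P = (x, y, z) in the camera frame maps to
    (f*x/z, f*y/z) on the image plane at distance f from the center of
    projection. Undefined for z == 0 (points on the focal plane)."""
    x, y, z = point
    if z == 0:
        raise ValueError("point on the focal plane: projection undefined")
    return (f * x / z, f * y / z)

# Doubling the depth halves the image size ("distant objects are smaller"):
near = project((1.0, 2.0, 4.0), f=2.0)   # (0.5, 1.0)
far = project((1.0, 2.0, 8.0), f=2.0)    # (0.25, 0.5)
```

The z == 0 guard mirrors the projection-properties slide: the projection of points on the focal plane is undefined.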
Slide 65: Homogeneous coordinates
Is this a linear transformation? No – division by z is nonlinear.
Trick: add one more coordinate:
homogeneous image coordinates: (x, y) ⇒ (x, y, 1)
homogeneous scene coordinates: (x, y, z) ⇒ (x, y, z, 1)
Converting from homogeneous coordinates: divide by the last coordinate, e.g. (x, y, w) ⇒ (x/w, y/w).
Slide by Steve Seitz
Slide 66: Perspective Projection Matrix
Projection is a matrix multiplication using homogeneous coordinates, followed by dividing by the third coordinate:
[1 0 0 0; 0 1 0 0; 0 0 1/f 0] (x, y, z, 1)ᵀ = (x, y, z/f)ᵀ ⇒ (f x/z, f y/z)
Slide 67: Perspective Projection Matrix
Projection is a matrix multiplication using homogeneous coordinates, followed by dividing by the third coordinate.
In practice, there are lots of coordinate transformations:
2D point (3×1) = camera-to-pixel coord. trans. matrix (3×3) × perspective projection matrix (3×4) × world-to-camera coord. trans. matrix (4×4) × 3D point (4×1)
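The full chain can be sketched in a few lines of numpy. All numeric values below (translation, focal length, pixel scale, principal point) are made up for illustration. The projection matrix is written here with f on the diagonal; the diag(1, 1, 1/f) form differs only by an overall scale, which cancels when dividing by the third coordinate.

```python
import numpy as np

# World-to-camera: 4x4 rigid transform (here just a translation along z)
T = np.eye(4)
T[2, 3] = 5.0

# Perspective projection: 3x4, with focal length f folded in
f = 2.0
P = np.array([[f, 0.0, 0.0, 0.0],
              [0.0, f, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

# Camera-to-pixel: 3x3 (pixel scale and principal point, illustrative)
K = np.array([[100.0, 0.0, 320.0],
              [0.0, 100.0, 240.0],
              [0.0, 0.0, 1.0]])

X = np.array([1.0, -1.0, 5.0, 1.0])   # homogeneous 3D world point (4x1)
p = K @ P @ T @ X                     # homogeneous 2D point (3x1)
u, v = p[:2] / p[2]                   # divide by the third coordinate
```

Composing the three matrices once and reusing the 3×4 product is the usual optimization, since the chain is the same for every point in a frame.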
Slide 68: Weak perspective
Assume object points are all at the same depth −z₀; the projection then reduces to uniform scaling by the magnification m = f/z₀.
Slide 69: Orthographic Projection
Special case of perspective projection
Distance from the center of projection to the image plane is infinite
Also called "parallel projection"
What's the projection matrix?
Image
World
Slide by Steve Seitz
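The answer to the slide's question can be written out explicitly. A sketch of the standard orthographic and weak-perspective projection matrices in homogeneous coordinates (the f and z₀ values are illustrative):

```python
import numpy as np

# Orthographic ("parallel") projection simply drops the z coordinate:
P_ortho = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0]])

# Weak perspective: orthographic projection followed by uniform scaling
# with magnification m = f / z0 for a reference depth z0
f, z0 = 2.0, 10.0
m = f / z0
P_weak = np.array([[m, 0.0, 0.0, 0.0],
                   [0.0, m, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0]])

X = np.array([3.0, 4.0, 9.5, 1.0])   # a point near the reference depth z0
p = P_weak @ X
u, v = p[:2] / p[2]                  # depth is ignored: (0.6, 0.8)
```

Note that the third row of both matrices is (0, 0, 0, 1), so the homogeneous divide is by 1 and the map is linear in the point coordinates, which is exactly why these models have "simpler mathematics" than full perspective.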
Slide 70: Pros and Cons of These Models
Weak perspective (including orthographic) has simpler mathematics
Accurate when object is small relative to its distance.
Most useful for recognition.
Perspective is much more accurate for scenes.
Used in structure from motion.
When accuracy really matters, we must model the real camera
Use perspective projection with other calibration parameters (e.g., radial lens distortion)