Video Processing and Communications / Yao Wang, Jörn Ostermann, Ya-Qin Zhang.
Handbook of Image and Video Processing / Alan C. Bovik.
(Topics in) Video Processing
Computer Science, Semester B

Yacov Hel-Or [email protected]
Yossi Rubner [email protected]

Some slides were taken from: Bahadir Gunturk, Yung-Yu Chuang, Ran Eshel

Administration
• Pre-requisites / prior knowledge
• Regular course – not a seminar
• Course Home Page:
  – Lecture slides and handouts
  – “What’s new”
  – Homework, grades
• Exercises:
  – Programming in Matlab, ~3 Assignments
  – Final project

Administration (Cont.)
• Matlab software:
  – Available in PC labs
  – Student version
  – For next week: Run Matlab “demo” and read Matlab primer until section 13.
• Grading policy:
  – Final Grade will be based on: Exercises (60%), Final project (40%)
  – Exercises will be weighted
  – Exercises can be submitted in pairs
• Office Hours: by email appointment to [email protected]

Schedule
Date        Subject
27.02.07    Introduction
06.03.07    Acquisition
13.03.07    Post-acquisition processing 1
20.03.07    Post-acquisition processing 2
27.03.07    Registration
03.04.07    Passover holiday
10.04.07    Passover holiday
17.04.07    Panorama and stitching
24.04.07    Independence day
01.05.07    Super-resolution
08.05.07    High-Dynamic Range
15.05.07    Guest lecture
22.05.07    Shavuot
29.05.07    Tracking / Recognition (project presentation)
05.06.07    Video Coding (guest lecture)
Further Reading
• Multidimensional Signal, Image, and Video Processing and Coding / John W. Woods
• Digital Video Processing / Murat Tekalp
• Video Processing and Communications / Yao Wang, Jörn Ostermann, Ya-Qin Zhang
• Handbook of Image and Video Processing / Alan C. Bovik

Syllabus
• Introduction
  – Pinhole camera model
  – Shading models
  – Light and color
  – HVS pathway
• Acquisition
  – Camera pipe-line
  – Sensors
  – Temporal sampling (interlacing/progressive)
  – Spatial sampling (Bayer)
  – Noise models & distortions
  – Camera parameters trade-offs
  – Video formats
• Post-Acquisition Processing
  – Geometrical distortion rectification
  – White balancing
  – De-interlacing
  – De-mosaicing
  – De-noising
• Image Registration
  – Global motion registration
  – Dense motion: optical flow
• Spatio-Temporal Processing
  – Mosaicing: panorama, stitching, blending
  – Video summarizing
  – Video in-painting
• Enhancement & Restoration
  – Super-resolution: spatial/temporal
  – High Dynamic Range
• Tracking (tentative)
  – Kalman-filtering
  – Particle-filtering
  – Mean-Shift
• Recognition
  – Action detection
  – Anomaly behavior detection
• Coding
  – Video Compression

Introduction (today)
• What is an image?
• What is a color?
Acquisition
– Camera pipe-line
– Sensors
– Temporal sampling (interlacing/progressive)
– Spatial sampling (Bayer)
– Noise models & distortions
– Camera parameters trade-offs
– Video formats

Post-acquisition Processing
– Geometrical distortion rectification
– White Balancing
– De-interlacing
– De-mosaicing
– De-noising
[Figure: Correcting radial distortion (from Helmut Dersch)]
[Figure: White Balancing: automatic white balance vs. warmer +3]
[Figure: De-interlacing]
[Figure: Image De-mosaicing]

Image Registration
– Global motion registration
– Dense motion: Optical Flow

[Figure: Global motion registration]
[Figure: Optical Flow]
Spatio-Temporal Processing
– Mosaicing: panorama, stitching, blending
– Video summarizing
– Video in-painting

[Figure: frames in the (x, y, t) volume combined (+) into a mosaic]
example: http://www.cs.washington.edu/education/courses/cse590ss/01wi/projects/project1/students/dougz/index.html
[Figure: Panorama]
[Figure: Video Panorama]
[Figure: Video summarizing]
[Figure: Video inpainting]
Enhancement and Restoration
– Super-resolution: spatial/temporal
– High Dynamic Range

HDR
[Figure: exposure trade-offs over Aperture and Shutter Duration]
• High Aperture: Narrow depth of field
• Long Shutter: Motion blur
• Over Exposure: Saturated image
• Under Exposure: Bad signal/noise ratio

[Figures: HDR examples]
[Figure: Example – Low Light]
[Figure: Example – Super-resolution]
Action detection / recognition
– Action detection
– Anomaly behavior detection

[Figure: Anomaly behavior detection]

Video Coding
• Compression
• Video formats

Video Processing Introduction
The Visual Sciences
Image/Video Processing - Computer Vision

[Diagram: the visual sciences arranged from Low Level (2D Images) to High Level (3D Object, Model)]
• Image/video Processing: acquisition, representation, compression, transmission; image enhancement; edge/feature extraction
• Computer Vision: pattern matching; image "understanding" (Recognition, 3D)
• Rendering
• Geometric Modeling
What is an Image?
• An image is a projection of a 3D scene onto a 2D projection plane.
• An image can be defined as a two-variable function I(x,y), where for each position (x,y) in the projection plane, I(x,y) defines the light intensity at this point.

Today’s Plan
• Light and the EM spectrum
• The H.V.S. and Color Perception
Camera trial #1
[Figure: scene, film]
Put a piece of film in front of an object.
source: Yung-Yu Chuang

Pinhole camera
[Figure: scene, barrier, film]
Add a barrier to block off most of the rays.
• It reduces blurring
• The pinhole is known as the aperture
• The image is inverted
source: Yung-Yu Chuang

The Pinhole Camera Model (where)
[Figure: axes X, Y, Z; a scene point (X,Y,Z) projects through the center of projection (pinhole) to the image point (x,y)]
d – focal length
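The projection described by the pinhole model can be sketched in a few lines. This is a minimal illustration, assuming the pinhole sits at the origin, the image plane is at distance d, and the image is inverted (so x = -dX/Z and y = -dY/Z); the function name `project_pinhole` is hypothetical.

```python
def project_pinhole(X, Y, Z, d):
    """Project the scene point (X, Y, Z) through a pinhole at the origin.

    Assumed convention: homogeneous image point (x, y, w) = (X, Y, -Z/d),
    so the image coordinates are (x/w, y/w) = (-d*X/Z, -d*Y/Z).
    """
    if Z == 0:
        raise ValueError("point lies in the plane of the pinhole (Z = 0)")
    w = -Z / d            # homogeneous scale factor
    return X / w, Y / w   # perspective division


# A point 4 units away, seen with focal length 1: the image is inverted.
print(project_pinhole(2.0, 1.0, 4.0, d=1.0))
```

Doubling Z halves both image coordinates, which is the familiar "farther objects look smaller" effect of perspective projection.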
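In homogeneous coordinates, the pinhole projection is:

$$\begin{pmatrix} x \\ y \\ w \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1/d & 0 \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$$

The Shading Model (what)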
Shading Model: Given the illumination incident at a point on a surface, what is reflected?

Shading Model Parameters
• The factors determining the shading effects are:
  – The light source properties: positions, electromagnetic spectrum, shape.
  – The surface properties: position, orientation, reflectance properties.
  – The eye (camera) properties: position, orientation, sensor spectrum sensitivities.
Light and the Visible Spectrum
Newton’s Experiment, 1665 Cambridge. Discovering the fundamental spectral components of light.

The Light Spectrum
Electromagnetic Radiation - Spectrum
[Figure: the electromagnetic spectrum by wavelength in meters (about 10^-12 to 10^8): gamma, X rays, ultraviolet, visible light, infrared, radar, FM, TV, short wave, AM, AC electricity. Visible light spans about 400 nm to 700 nm (wavelength in nanometers).]
Monochromators
Monochromators measure the power or energy at different wavelengths.

Spectral Power Distribution
The Spectral Power Distribution (SPD) of a light is a function e(λ) which defines the energy at each wavelength.
[Figure: relative power (0 to 1) vs. wavelength λ (400 to 700 nm)]
Examples of Spectral Power Distributions
[Figure: four SPD plots, relative power (0 to 1) vs. wavelength (400 to 700 nm): Blue Skylight, Tungsten bulb, Red monitor phosphor, Monochromatic light]

Surface Parameters
[Figure: incident light, surface normal, specular reflection, diffuse reflection]
• Specular reflection: mirror-like reflection at the surface.
• Diffuse (Lambertian) reflection: light is reflected randomly between color particles; reflection is equal in all directions.
Spectral Property of Lambertian Surfaces
[Figure: Surface Body Reflectances (albedo), 0.2 to 1, vs. wavelength (nm), 400 to 700, for four surfaces: Yellow, Red, Blue, Gray]

Different Types of Surfaces
[Figure: reflection geometry with surface normal N, light direction L, reflection direction R, viewing direction V, and angle θ]

Each term is a product of surface properties, light properties, and geometry:

Ambient reflection: $I_{amb} = K(\lambda)\, e_a(\lambda)$
Diffuse reflection: $I_{diff} = K(\lambda)\, e_p(\lambda)\, (N \cdot L)$
Specular reflection: $I_{spec} = K_s(\lambda)\, e_p(\lambda)\, (R \cdot V)^n$

• $e_a$, $e_p$ - the ambient and point light intensities.
• $K$, $K_s \in [0,1]$ - the surface ambient/diffuse and specular reflectivity.
• N - the surface normal, L - the light direction, V - the viewing direction.

• The final illumination equation:
$$I(\lambda) = I_{amb} + I_{diff} + I_{spec}$$
• If several light sources are placed in the scene:
$$I(\lambda) = I_{amb} + \sum_k \left( I^k_{diff} + I^k_{spec} \right)$$

[Figures: Ambient surface; Diffuse surface; Diffuse + Specular]
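The illumination equation can be evaluated directly for one wavelength and one point light. The sketch below is only an illustration under stated assumptions: R is taken to be the mirror reflection of L about N, negative dot products are clamped to zero (a common convention the slides do not state), and all names (`shade`, `reflect`, and so on) are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def reflect(L, N):
    """Mirror the unit light direction L about the unit normal N (assumed definition of R)."""
    s = 2.0 * dot(N, L)
    return tuple(s * n - l for n, l in zip(N, L))


def shade(K, Ks, n, ea, ep, N, L, V):
    """I = K*ea + K*ep*(N.L) + Ks*ep*(R.V)^n at a single wavelength."""
    N, L, V = normalize(N), normalize(L), normalize(V)
    n_dot_l = max(dot(N, L), 0.0)          # assumption: no light from behind the surface
    r_dot_v = max(dot(reflect(L, N), V), 0.0) if n_dot_l > 0 else 0.0
    ambient = K * ea
    diffuse = K * ep * n_dot_l
    specular = Ks * ep * r_dot_v ** n
    return ambient + diffuse + specular


# Light and viewer both along the normal: all three terms contribute fully.
print(shade(K=0.5, Ks=0.25, n=10, ea=0.2, ep=1.0,
            N=(0, 0, 1), L=(0, 0, 1), V=(0, 0, 1)))
```

For several light sources, the diffuse and specular terms would be summed over the lights while the ambient term is added once.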
Composition of Light Sources

The Human Visual System
[Figure: eye anatomy: Cornea, Pupil, Iris, Lens, Vitreous Humor, Retina, Fovea, Optic Disc, Optic Nerve, Ocular Muscle]
Hebrew terms:
• Cornea - קרנית
• Pupil - אישון
• Iris - קשתית
• Retina - רשתית

The Visual Pathway
Retina → Optic Nerve → Optic Chiasm → Lateral Geniculate Nucleus (LGN) → Visual Cortex
The Human Retina
[Figure: retinal layers: cones, rods, horizontal, bipolar, amacrine, and ganglion cells; direction of incoming light]

Eye vs. Camera
(Yao Wang’s slides)

• The retina contains 2 types of photo-receptors:
  – Cones: day vision, can perceive color tone
  – Rods: night vision, perceive brightness only
Cones:
• High illumination levels (Photopic vision)
• Sensitive to color (there are three cone types: L, M, S)
• Produce high-resolution vision
• 6-7 million cone receptors, located primarily in the central portion of the retina
[Figure: Cone Spectral Sensitivity: relative sensitivity (0 to 1) vs. wavelength (400 to 700 nm) for the L, M, and S cones]

A side note:
• Humans and some monkeys have three types of cones (trichromatic vision); most other mammals have two types of cones (dichromatic vision).
• Marine mammals have one type of cone.
• Most birds and fish have four types.
• Lacking one or more types of cones results in color blindness.
Rods:
• Low illumination levels (Scotopic vision).
• Highly sensitive (respond to a single photon).
• Produce lower-resolution vision.
• 100 million rods in each eye.
• No rods in fovea.
[Figure: Rod Spectral Sensitivity: relative sensitivity (0 to 1) vs. wavelength (400 to 700 nm)]

Photoreceptor Distribution
[Figure: foveal and peripheral photoreceptors: S-cones, L/M-cones, rods]
Cone Receptor Mosaic (Roorda and Williams, 1999)
[Figure: mosaic of L-cones, M-cones, and S-cones]

Cone Distribution:
• L-cones (red) make up about 65% of the cones throughout the retina.
• M-cones (green) make up about 30% of the cones.
• S-cones (blue) make up about 2-5% of the cones (Why so few?).

Distribution of rod and cone photoreceptors
[Figure: receptors per square mm (×10^4) vs. degrees of visual angle (-60 to 60, fovea at 0), for rods and cones]
The Cone Responses
Assuming Lambertian Surfaces
[Diagram: Illuminant → Surface → Sensors → Output]

$$L = \int l(\lambda)\, e(\lambda)\, k(\lambda)\, d\lambda$$
$$M = \int m(\lambda)\, e(\lambda)\, k(\lambda)\, d\lambda$$
$$S = \int s(\lambda)\, e(\lambda)\, k(\lambda)\, d\lambda$$

e(λ) – fixed, point source illuminant
k(λ) – surface’s reflectance
l(λ), m(λ), s(λ) – cone responsivities

Metamer - two lights that appear the same visually. They might have different SPDs (spectral power distributions).
[Figure: power vs. wavelength (nm), 400 to 700, for a Tungsten light and a Monitor emission]
The phosphors of the monitor were set to match the tungsten light.

The Trichromatic Color Theory
Trichromatic: “tri”=three, “chroma”=color; color vision is based on three primaries (i.e., it is 3D).
Thomas Young (1773-1829) - A few different retinal receptors operating with different wavelength sensitivities will allow humans to perceive the number of colors that they do. Suggested 3 receptors.
Helmholtz & Maxwell (1850) - Color matching with 3 primaries.

Color Matching Experiment
• Given a set of 3 primaries, one can determine, for every spectral distribution, the intensity of the guns required to match the color of that spectral distribution.
• The 3 numbers can serve as a color representation.
[Figure: test light T(λ), match, Primaries]
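Returning to the cone-response integrals for a Lambertian surface, they can be approximated by a sum over sampled wavelengths. The sketch below is only an illustration: the Gaussian responsivities, their peaks and widths, and the step-shaped reflectance are hypothetical stand-ins, not measured data.

```python
import math

WAVELENGTHS = range(400, 701, 10)   # sample points in nm
STEP = 10.0                         # spacing between samples in nm


def gaussian(peak, width):
    """Hypothetical bell-shaped responsivity sampled at WAVELENGTHS."""
    return [math.exp(-((w - peak) / width) ** 2) for w in WAVELENGTHS]


# Assumed cone responsivities l(λ), m(λ), s(λ); real curves differ.
l_bar = gaussian(570, 50)
m_bar = gaussian(540, 45)
s_bar = gaussian(445, 30)


def cone_responses(e, k):
    """Approximate L, M, S = ∫ sens(λ) e(λ) k(λ) dλ by a Riemann sum."""
    def integrate(sens):
        return STEP * sum(s * ei * ki for s, ei, ki in zip(sens, e, k))
    return integrate(l_bar), integrate(m_bar), integrate(s_bar)


flat_light = [1.0] * len(WAVELENGTHS)                            # e(λ)
reddish = [0.2 if w < 580 else 0.9 for w in WAVELENGTHS]         # k(λ)

L, M, S = cone_responses(flat_light, reddish)
print(L > S)   # a reddish surface excites the L cones more than the S cones
```

Two different SPDs that produce the same (L, M, S) triple under this computation would be metamers in the sense defined above.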
[Figure: the intensity of each primary R(λ), G(λ), B(λ) is adjusted (+/-) until the mixture matches the test light]

$$T(\lambda) \equiv r\,R(\lambda) + g\,G(\lambda) + b\,B(\lambda)$$

Color matching experiment for Monochromatic lights
[Figure: Primary Intensities r(λ), g(λ), b(λ) vs. wavelength (nm), 400 to 700]

Stiles & Burch (1959) Color matching functions. Primaries are: 444.4, 526.3 and 645.2 nm.
Problems: Some perceived colors cannot be generated. This is true for any choice of visible primaries.
• Observation - Color matching is linear:
  – if (S≡P) then (S+N≡P+N)
  – if (S≡P) then (αS≡αP)
• Outcome 1: Any T(λ) can be matched:

$$r = \int T(\lambda)\, r(\lambda)\, d\lambda; \quad g = \int T(\lambda)\, g(\lambda)\, d\lambda; \quad b = \int T(\lambda)\, b(\lambda)\, d\lambda$$

• Outcome 2: CMF can be calculated for any chosen primaries U(λ), V(λ), W(λ):

$$\begin{pmatrix} u \\ v \\ w \end{pmatrix} = \begin{pmatrix} c_{ru} & c_{gu} & c_{bu} \\ c_{rv} & c_{gv} & c_{bv} \\ c_{rw} & c_{gw} & c_{bw} \end{pmatrix} \begin{pmatrix} r \\ g \\ b \end{pmatrix}$$

The CIE Color Standard
• The CIE (Commission Internationale de l’Éclairage) defined three hypothetical lights X, Y, and Z whose matching functions are positive everywhere.
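The linearity observation and the two outcomes can be checked numerically on sampled spectra. This is a toy sketch: the four-sample color matching functions and the spectra are made-up numbers, and the names `match` and `change_primaries` are hypothetical.

```python
def match(T, cmfs, step=1.0):
    """Outcome 1: (r, g, b) = sum over samples of T(λ) times each matching function."""
    return tuple(step * sum(t * c for t, c in zip(T, cmf)) for cmf in cmfs)


def change_primaries(C, rgb):
    """Outcome 2: apply a 3x3 matrix C to move to another set of primaries."""
    return tuple(sum(C[i][j] * rgb[j] for j in range(3)) for i in range(3))


# Made-up matching functions r(λ), g(λ), b(λ) on four wavelength samples.
CMFS = (
    [0.0, 0.1, 0.6, 0.9],
    [0.1, 0.8, 0.4, 0.0],
    [0.9, 0.2, 0.0, 0.0],
)

S = [1.0, 2.0, 3.0, 4.0]
N = [0.5, 0.0, 1.0, 0.0]
alpha = 2.0

mixed = [alpha * s + n for s, n in zip(S, N)]
lhs = match(mixed, CMFS)
rhs = tuple(alpha * a + b for a, b in zip(match(S, CMFS), match(N, CMFS)))
print(all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs)))   # linearity holds
```

Because both steps are linear, changing primaries before or after matching gives the same coordinates, which is why one set of matching functions determines all others.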
Tristimulus
Let X, Y, and Z be the tristimulus values.
A color can be specified by its trichromatic coefficients, defined as

$$x = \frac{X}{X+Y+Z} \ \text{(X ratio)}, \qquad y = \frac{Y}{X+Y+Z} \ \text{(Y ratio)}, \qquad z = \frac{Z}{X+Y+Z} \ \text{(Z ratio)}$$

Two trichromatic coefficients are enough to specify a color (x + y + z = 1).
From: Bahadir Gunturk

CIE Chromaticity Diagram
[Figures: an input light spectrum and its position in the (x, y) chromaticity diagram; the boundary of the diagram runs from 380nm to 700nm; light composition]
From: Bahadir Gunturk
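The trichromatic coefficients are a simple normalization of the tristimulus values. The sketch below also checks one property behind the "light composition" figures: the chromaticity of a mixture of two lights lies on the line between their chromaticities. The tristimulus values used are arbitrary illustrative numbers.

```python
def chromaticity(X, Y, Z):
    """Return (x, y, z) with x + y + z = 1 from tristimulus values."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined for zero tristimulus values")
    return X / total, Y / total, Z / total


def mix(a, b):
    """Additive mixture: tristimulus values of the two lights add."""
    return tuple(p + q for p, q in zip(a, b))


light_a = (0.4, 0.2, 0.1)   # arbitrary example values
light_b = (0.1, 0.3, 0.6)

ax, ay, _ = chromaticity(*light_a)
bx, by, _ = chromaticity(*light_b)
mx, my, _ = chromaticity(*mix(light_a, light_b))

# Zero cross product means the three chromaticity points are collinear.
cross = (bx - ax) * (my - ay) - (by - ay) * (mx - ax)
print(abs(cross) < 1e-12)
```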
CIE Chromaticity Diagram
The CIE chromaticity diagram is helpful to determine the range of colors that can be obtained from any given colors in the diagram.
Gamut: The range of colors that can be produced by the given primaries.

The sRGB Color Standard
• sRGB is a device-independent color space. It was created in 1996 by HP and Microsoft for use on monitors and printers.
• It is the most commonly used color space.
• It is defined by a transformation from the XYZ color space.

Source: http://hyperphysics.phy-astr.gsu.edu/hbase/vision/visioncon.html#c1
http://www.brucelindbloom.com/index.html?Eqn_ChromAdapt.html
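As an illustration of sRGB being defined by a transformation from XYZ, the sketch below applies the commonly published XYZ-to-linear-RGB matrix (D65 white point, coefficients rounded to four decimals) followed by the sRGB transfer curve. Clamping out-of-gamut values to [0, 1] is a simplification assumed here, and the function names are hypothetical.

```python
# Rounded XYZ -> linear sRGB matrix for the D65 white point.
XYZ_TO_LINEAR_RGB = (
    (3.2406, -1.5372, -0.4986),
    (-0.9689, 1.8758, 0.0415),
    (0.0557, -0.2040, 1.0570),
)


def linear_to_srgb(c):
    """Apply the sRGB transfer curve to one linear channel."""
    c = min(max(c, 0.0), 1.0)          # assumption: clamp out-of-gamut values
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1 / 2.4) - 0.055


def xyz_to_srgb(X, Y, Z):
    """Convert XYZ (with Y = 1 for white) to non-linear sRGB in [0, 1]."""
    xyz = (X, Y, Z)
    linear = [sum(m * v for m, v in zip(row, xyz)) for row in XYZ_TO_LINEAR_RGB]
    return tuple(linear_to_srgb(c) for c in linear)


# The D65 white point should map (approximately) to (1, 1, 1).
print(xyz_to_srgb(0.9505, 1.0, 1.0890))
```

Values that need clamping correspond to colors outside the gamut of the sRGB primaries.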
Color Appearance
Color matching predicts matches, not appearance.
[Figures: color appearance examples]

Color Spaces

RGB Color Space (additive)
• Define colors with (r, g, b) amounts of red, green, and blue

CMY Color Space (subtractive)
• Cyan, magenta, and yellow are the complements of red, green, and blue
  – We can use them as filters to subtract from white
  – The space is the same as RGB except the origin is white instead of black

HSV color space
• Hue - the color we see (red, green, purple).
• Saturation - how pure the color is (how far the color is from gray).
• Value (brightness) - how bright the color is.
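The relations between these spaces are easy to try out. The sketch below assumes channel values in [0, 1], writes CMY as the complement of RGB (the helper `rgb_to_cmy` is hypothetical), and uses `colorsys.rgb_to_hsv` from the Python standard library for HSV.

```python
import colorsys


def rgb_to_cmy(r, g, b):
    """CMY is RGB measured from white instead of black (values in [0, 1])."""
    return 1.0 - r, 1.0 - g, 1.0 - b


red = (1.0, 0.0, 0.0)
gray = (0.5, 0.5, 0.5)

print(rgb_to_cmy(*red))             # no cyan, full magenta and yellow
print(colorsys.rgb_to_hsv(*red))    # fully saturated, full value
print(colorsys.rgb_to_hsv(*gray))   # zero saturation: gray has no hue
```

In `colorsys`, hue is returned as a fraction of a full turn in [0, 1) rather than in degrees.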
HSV - a more intuitive color space
[Figure: HSV color solid with Hue, Saturation, and Value axes]

Opponent Color Space
• Observation: Color bands are highly correlated in high spatial frequencies
[Figure: convolution of the image with a filter h(x,y)]
[Figures: joint histograms of the band derivatives, r_x vs. g_x, g_x vs. b_x, and r_x vs. b_x (axes: Red derivative, Green derivative, Blue derivative)]
[Figure: joint histograms of R vs. G for low-pass images]
[Figure: joint histogram of r_x vs. g_x with the l and c1 axes overlaid]

• Define a new color basis (l, c1, c2):

$$\begin{pmatrix} l \\ c_1 \\ c_2 \end{pmatrix} = T \begin{pmatrix} R \\ G \\ B \end{pmatrix} \quad \text{where} \quad T = n \begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \\ 1 & 1 & -2 \end{pmatrix}$$

l – luminance value
c1 – Red-Green
c2 – Blue-Yellow
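The opponent basis (l, c1, c2) is a fixed linear transform of (R, G, B). A minimal sketch follows, assuming the normalization factor n is simply 1 (the slides leave n unspecified) and using the hypothetical name `to_opponent`.

```python
# Rows of T (without the factor n): luminance, red-green, blue-yellow.
T = (
    (1, 1, 1),
    (1, -1, 0),
    (1, 1, -2),
)


def to_opponent(rgb, n=1.0):
    """Return (l, c1, c2) = n * T * (R, G, B)."""
    return tuple(n * sum(t * c for t, c in zip(row, rgb)) for row in T)


print(to_opponent((0.5, 0.5, 0.5)))   # gray: luminance only, zero chrominance
print(to_opponent((1.0, 0.0, 0.0)))   # pure red: positive red-green channel
```

The rows of T are mutually orthogonal, so the transform separates luminance from the two chrominance directions without mixing them.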
Comments:
– The l channel encodes the color luminance.
– C1 and C2 encode the chrominance.
– In the chrominance channels, high frequencies are attenuated.
– In the luminance channel, high frequencies are maintained.
– The 3 opponent channels are uncorrelated in the high frequencies.
– Efficient for encoding
[Figure: the luminance channel carries the high freq. details; the two chrominance channels carry low freq. details]

Claim: The HVS’s high spatial sensitivity in the luminance domain and low spatial sensitivity in the chrominance domains is a direct outcome of the statistical properties of color images!

[Figure: Original Image]
[Figure: After blurring C1 and C2 bands]
[Figure: After blurring l band as well]

Opponent Color Spaces
• The standard representation used in TV broadcasting
• Backwards compatibility with B/W TV
• Low bit rate is needed in the chrominance channels
• There are various opponent representations:
  – YIQ - used for NTSC color TV
  – YUV (also called YCbCr) - used for PAL TV and video
• Question: why are S cones sparsely populated?

THE END