Computer Vision
-- Image Processing 1
Zhong Fan (钟凡)
zhongfan@sdu.edu.cn
How to describe a pixel?
Spatial coordinate (x, y)
Pixel color ( RGB, YUV, … )
(x, y, R, G, B)
Basic Image Processing
Operates on spatial coordinates (x, y)
Geometric processing: ……
Operates on pixel color ( RGB, YUV, … )
Algebraic processing: ……
Algebraic Operations: Pixel Gray-Level Transformation
Gray-Level Transformation
The simplest image processing task.
Input: an image. Output: the transformed image.
Processes the image pixel by pixel, applying a transform to the gray level of each pixel:
Intensity Adjustment
Contrast Adjustment
…
$L' = T(L), \quad L, L' \in [0, 255]$
typedef unsigned char uchar;

// data: first pixel; step: bytes per row; nc: channels (bytes) per pixel
void scan_pixels(uchar *data, int width, int height, int step, int nc)
{
    uchar *row=data;
    for(int yi=0; yi<height; ++yi, row+=step)
    {
        uchar *px=row;
        for(int xi=0; xi<width; ++xi, px+=nc)
        {
            // px now addresses the pixel (xi, yi)
        }
    }
}
// Single-channel image; Transform stands for any gray-level mapping T(L).
void gray_transform(uchar *data, int width, int height, int step, …)
{
    uchar *row=data;
    for(int yi=0; yi<height; ++yi, row+=step)
    {
        uchar *px=row;
        for(int xi=0; xi<width; ++xi, px++)
        {
            uchar L=*px;
            *px=Transform(L);
        }
    }
}
// Table-driven version: T[L] holds the precomputed output for every gray level L.
void gray_transform(uchar *data, int width, int height, int step, const uchar T[256])
{
    uchar *row=data;
    for(int yi=0; yi<height; ++yi, row+=step)
    {
        uchar *px=row;
        for(int xi=0; xi<width; ++xi, px++)
        {
            *px=T[*px];
        }
    }
}
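As a minimal sketch of how such a table might be filled, the block below builds a brightness-shift table and applies it. The names `make_brightness_lut` and `apply_lut` and the `offset` parameter are assumptions made for this illustration, not part of the slides; the clamping to [0, 255] follows the stated range of L'.

```cpp
#include <algorithm>

typedef unsigned char uchar;

// Build a lookup table for a brightness shift: T[L] = clamp(L + offset, 0, 255).
// `offset` is a hypothetical parameter chosen for this sketch.
void make_brightness_lut(int offset, uchar T[256])
{
    for (int L = 0; L < 256; ++L)
        T[L] = (uchar)std::min(255, std::max(0, L + offset));
}

// Same traversal as the table-driven gray_transform, for a single-channel image.
// `step` is the number of bytes per row, so padding bytes are left untouched.
void apply_lut(uchar *data, int width, int height, int step, const uchar T[256])
{
    uchar *row = data;
    for (int yi = 0; yi < height; ++yi, row += step)
        for (int xi = 0; xi < width; ++xi)
            row[xi] = T[row[xi]];
}
```

Because the table has only 256 entries, any per-pixel function, however expensive, is evaluated 256 times rather than once per pixel.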
Intensity
[Figure: intensity examples labeled Bright and Dark.]
Intensity Adjustment: Other Functions
Intensity adjustment by other functions
[Figure: two transfer-function plots on 0 to 255 Intensity axes, with marked values 32, 128, and 218.]
Transformation Functions
Log Transformations
$s = c \log(1 + r)$
$c$: constant
Power-Law Transformations
$s = c\,r^{\gamma}$
$c, \gamma$: positive constants
Gamma correction
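The two families above can be tabulated for 8-bit images as sketched below. The normalization is an assumption of this sketch: for the log transform, c is chosen so that r = 255 maps to s = 255, and the power law is applied to intensities scaled to [0, 1]. The function names are hypothetical.

```cpp
#include <cmath>

typedef unsigned char uchar;

// Log transform s = c*log(1 + r), with c chosen so that r = 255 maps to s = 255.
void make_log_lut(uchar T[256])
{
    const double c = 255.0 / std::log(256.0);
    for (int r = 0; r < 256; ++r)
        T[r] = (uchar)(c * std::log(1.0 + r) + 0.5);   // +0.5 rounds to nearest
}

// Power-law transform on normalized intensities: s = 255 * (r/255)^gamma.
// gamma < 1 brightens dark regions, gamma > 1 darkens them.
void make_gamma_lut(double gamma, uchar T[256])
{
    for (int r = 0; r < 256; ++r)
        T[r] = (uchar)(255.0 * std::pow(r / 255.0, gamma) + 0.5);
}
```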
[Figure: Some basic gray-level transformation functions used for image enhancement.]
Linear: Negative, Identity
Logarithmic: Log, Inverse Log
Power-Law: nth power, nth root
[Figure: Plots of the equation $s = c\,r^{\gamma}$ for various values of $\gamma$ ($c = 1$ in all cases).]
[Figure: (a) Aerial image. (b)-(d) Results of applying the power-law transformation with $c = 1$ and $\gamma = 3.0$, $4.0$, and $5.0$, respectively. Panels are laid out a b / c d. (Original image for this example courtesy of NASA.)]
Gamma Transformation
$s = r^{\gamma}$
Gamma Correction
Contrast Adjustment
[Figure: examples of High and Low contrast.]
[Figure: original Mona Lisa image and its contrast-stretched version.]
[Figure: original image, with contrast increased (↑) and contrast decreased (↓).]
Contrast Adjustment
To increase the dynamic range of the gray levels in the
image being processed.
[Figure: transfer-function plots on 0 to 255 axes illustrating contrast increase (↑), contrast decrease (↓), and conversion to a binary image.]
Quantize & Threshold
[Figure: a quantized image and a thresholded image.]
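Both operations fit the lookup-table pattern used for gray-level transforms. The sketch below is one possible formulation; the function names and the parameters `t` and `levels` are assumptions of this example, and the slides do not specify how the quantization bins are chosen.

```cpp
typedef unsigned char uchar;

// Threshold: gray levels below `t` become 0, the rest become 255 (binary image).
void make_threshold_lut(int t, uchar T[256])
{
    for (int L = 0; L < 256; ++L)
        T[L] = (L >= t) ? 255 : 0;
}

// Quantize: keep only `levels` evenly spaced gray levels (2 <= levels <= 256).
void make_quantize_lut(int levels, uchar T[256])
{
    for (int L = 0; L < 256; ++L) {
        int bin = L * levels / 256;                // bin index in [0, levels-1]
        T[L] = (uchar)(bin * 255 / (levels - 1));  // spread the bins over [0, 255]
    }
}
```

With `levels = 2` the quantization table coincides with thresholding at 128, which shows that thresholding is the coarsest case of quantization.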
Contrast Adjustment
Contrast Stretching: Piecewise-Linear
[Figure: piecewise-linear transfer functions plotted on 0 to 255 axes.]
[Figure: Contrast stretching. (a) Form of transformation function. (b) A low-contrast image. (c) Result of contrast stretching. (d) Result of thresholding. Panels are laid out a b / c d. (Original image courtesy of Dr. Roger Heady, Research School of Biological Sciences, Australian National University, Canberra, Australia)]
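A piecewise-linear stretch can be tabulated as sketched below, assuming the transfer function passes through (0, 0), two breakpoints (r1, s1) and (r2, s2), and (255, 255). The breakpoint names and the function name are assumptions of this sketch; the values plotted in the slides are not reproduced here.

```cpp
typedef unsigned char uchar;

// Piecewise-linear contrast stretch through (0,0), (r1,s1), (r2,s2), (255,255).
// Assumes 0 < r1 < r2 < 255 and s1 <= s2; the breakpoints are the caller's choice.
void make_stretch_lut(int r1, int s1, int r2, int s2, uchar T[256])
{
    for (int r = 0; r < 256; ++r) {
        double s;
        if (r < r1)            // first segment: (0,0) to (r1,s1)
            s = (double)s1 * r / r1;
        else if (r <= r2)      // middle segment: (r1,s1) to (r2,s2)
            s = s1 + (double)(s2 - s1) * (r - r1) / (r2 - r1);
        else                   // last segment: (r2,s2) to (255,255)
            s = s2 + (double)(255 - s2) * (r - r2) / (255 - r2);
        T[r] = (uchar)(s + 0.5);
    }
}
```

A middle segment steeper than 1 raises contrast in the mid-tones; pushing r1 and r2 together until the middle segment is nearly vertical approaches thresholding.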
Contrast: Sigmoid
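Algebraic Operations: Multiple Images
Algebraic Processing of Multiple Images
The inputs are several images of the same size.
The output image has the same size as the input images.
The computation is performed on corresponding pixels:
$O(x, y) = f[I_1(x, y), I_2(x, y), \ldots, I_N(x, y)]$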
[Figure: image A × image B = image C.]
$C(x, y) = A(x, y) \cdot B(x, y)$
Used with color images.
XOR
g(x,y) = f(x,y) XOR h(x,y)
[Figure: f XOR h = g, with black = 0 and white = 1.]
OR
g(x,y) = f(x,y) OR h(x,y)
[Figure: f OR h = g, with black = 1 and white = 0.]
AND
g(x,y) = f(x,y) AND h(x,y)
[Figure: f AND h = g, with black = 1 and white = 0.]
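The operations above share one traversal: visit corresponding pixels of two equally sized images and combine them. A minimal sketch follows, assuming single-channel images with binary values stored as 0 / 255; the names `pixelwise` and `op_*` are hypothetical. Whether black or white plays the role of logical 1 is only a display convention and does not change the code.

```cpp
typedef unsigned char uchar;

// Combine two equally sized single-channel images pixel by pixel: out = op(a, b).
// Steps are in bytes per row.
template <typename Op>
void pixelwise(const uchar *a, int astep, const uchar *b, int bstep,
               uchar *out, int ostep, int width, int height, Op op)
{
    for (int yi = 0; yi < height; ++yi, a += astep, b += bstep, out += ostep)
        for (int xi = 0; xi < width; ++xi)
            out[xi] = op(a[xi], b[xi]);
}

// Bitwise operators for binary images stored as 0 / 255.
inline uchar op_and(uchar p, uchar q) { return (uchar)(p & q); }
inline uchar op_or (uchar p, uchar q) { return (uchar)(p | q); }
inline uchar op_xor(uchar p, uchar q) { return (uchar)(p ^ q); }

// Multiplication with both operands read as fractions of 255, rounded to nearest.
inline uchar op_mul(uchar p, uchar q) { return (uchar)((p * q + 127) / 255); }
```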
Alpha Blending
[Figure: an RGBA image = RGB color channels + A (alpha) channel.]
Alpha Blending
Image compositing: $C = \alpha F + (1 - \alpha) B$
[Figure: foreground F, background B, and composite C.]
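The compositing equation C = αF + (1 − α)B can be applied per channel as sketched below. The memory layout is an assumption of this example: the foreground is stored as 4 bytes per pixel (R, G, B, A), the background and output as 3 bytes per pixel, and alpha is an 8-bit value where 255 means fully opaque. The function names are hypothetical.

```cpp
typedef unsigned char uchar;

// Composite one channel value: C = alpha*F + (1 - alpha)*B, with alpha in [0, 255].
inline uchar blend(uchar F, uchar B, uchar alpha)
{
    return (uchar)((alpha * F + (255 - alpha) * B + 127) / 255);
}

// Composite an RGBA foreground (4 bytes per pixel) over an RGB background
// (3 bytes per pixel), writing RGB output. Steps are in bytes per row.
void alpha_composite(const uchar *fg, int fstep, const uchar *bg, int bstep,
                     uchar *out, int ostep, int width, int height)
{
    for (int yi = 0; yi < height; ++yi, fg += fstep, bg += bstep, out += ostep)
        for (int xi = 0; xi < width; ++xi) {
            const uchar *f = fg + 4 * xi;
            const uchar a = f[3];                  // per-pixel alpha
            for (int c = 0; c < 3; ++c)
                out[3 * xi + c] = blend(f[c], bg[3 * xi + c], a);
        }
}
```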
The inverse of compositing?
Given C, how do we solve for $\alpha$, F, and B?
[Figure: F, B, and C.]
Image Matting (masking)
Extracts a specific object or region from an image, i.e. recovers the alpha map.
[Figure: input image C.]
Needs to handle semi-transparent regions, hair, and similar fine structures.
Do we also need to recover F and B?
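[Figure: matting starts from C and asks for F and B; compositing combines F and B into a new image C'.]
Use C as F?
Obtain F by foreground restoration.
[Figure: source image; result using C as F; result using the correct F.]
Background color B?
….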
Background Subtraction
[Figure: current frame I and background B.]
[Figure: foreground masks for thresholds T = 500, 1000, 2000, and 4000.]
$I(x, y) = (r, g, b)$
$B(x, y) = (r, g, b)$
$\mathrm{Diff}(x, y) = \| I(x, y) - B(x, y) \|^2$
A pixel is foreground if $\mathrm{Diff}(x, y) > T$ ($T$ is a given threshold).
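A minimal sketch of this test is given below. It assumes that Diff is the squared Euclidean distance between the two RGB vectors, which is consistent with threshold values in the thousands for 8-bit channels, and that both images are stored as 3 bytes per pixel. The function name and the 0 / 255 mask encoding are choices made for this example.

```cpp
typedef unsigned char uchar;

// Mark pixel (x, y) as foreground (255) when the squared RGB distance between the
// current frame I and the background B exceeds T; otherwise mark it 0.
// I and B hold 3 bytes per pixel, mask holds 1 byte per pixel; steps are in bytes.
void background_subtract(const uchar *I, int istep, const uchar *B, int bstep,
                         uchar *mask, int mstep, int width, int height, int T)
{
    for (int yi = 0; yi < height; ++yi, I += istep, B += bstep, mask += mstep)
        for (int xi = 0; xi < width; ++xi) {
            int diff = 0;
            for (int c = 0; c < 3; ++c) {
                const int d = I[3 * xi + c] - B[3 * xi + c];
                diff += d * d;                     // Diff = ||I - B||^2
            }
            mask[xi] = (diff > T) ? 255 : 0;
        }
}
```

A small T marks noise and shadows as foreground, while a large T loses parts of the object, which is why the choice of threshold matters.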
Basic Geometric Processing
The Geometry of an Image
A 2D array of points (pixels)
Basic geometric processing
Horizontal & vertical flip
Resize (zoom in / zoom out / scale)
Rotation
……
Affine transform
Perspective transform
……
Image warping
Flip
Horizontal Flip
Vertical Flip
Implementation?
void vflip(const void *in, int width, int height, int istep, int pix_size, void *out, int ostep)
{
    // vertical
}
void hflip(const void *in, int width, int height, int istep, int pix_size, void *out, int ostep)
{
    // horizontal
}
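#include <string.h>

// Vertical flip: copy source row yi to destination row (height-1-yi).
void vflip(const void *in, int width, int height, int istep, int pix_size, void *out, int ostep)
{
    const char *src=(const char*)in;
    char *dst=(char*)out+(height-1)*ostep;   // start at the last output row
    for(int yi=0; yi<height; ++yi, src+=istep, dst-=ostep)
    {
        memcpy(dst, src, width*pix_size);
    }
}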
// Horizontal flip: copy source pixel xi to destination pixel (width-1-xi) in each row.
void hflip(const void *in, int width, int height, int istep, int pix_size, void *out, int ostep)
{
    const char *_in=(const char*)in;
    char *_out=(char*)out+(width-1)*pix_size;   // last pixel of the first output row
    for(int yi=0; yi<height; ++yi, _in+=istep, _out+=ostep)
    {
        const char *in_x=_in;
        char *out_x=_out;
        for(int xi=0; xi<width; ++xi, in_x+=pix_size, out_x-=pix_size)
            memcpy(out_x, in_x, pix_size);
    }
}
Resize (zoom in / zoom out)
Zoom In
What is the correspondence between pixels before and after the resize?
$[x, y] \rightarrow [x', y']$
$[x', y'] = [s_x \cdot x,\ s_y \cdot y]$
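Zoom In
How do we fill the newly added pixels?
Solution 1?
Projection (from source to target):
For each pixel in the small image, compute its corresponding pixel in the large image, then copy the small image's pixel value to the large image.
[Figure: $(x, y)$ in the small image maps to $(x', y')$ in the large image.]
Solution 2?
Lookup (from target to source):
For each pixel in the large image, compute its corresponding pixel in the small image, then copy the small image's pixel value to the large image.
[Figure: $(x', y')$ in the large image maps back to $(x, y)$ in the small image.]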
Projection vs. Lookup?
….
Projection (forward map $f$, from source to target):
Some target pixels may not be filled.
Lookup (inverse map $f^{-1}$, from target to source):
How do we get the pixel color at fractional coordinates, e.g. (1.3, 2.7)?
Resampling
Compute the color value at a non-integer position from the values of neighboring pixels.
Nearest neighbor
Bilinear interpolation
Bicubic interpolation
[Figure: results of nearest-neighbor, bilinear, and bicubic resampling.]
Nearest Neighbor
Find the nearest source pixel and output its color:
$x \leftarrow \mathrm{int}(x + 0.5)$
$y \leftarrow \mathrm{int}(y + 0.5)$
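Combining the lookup strategy with nearest-neighbor rounding gives a complete, if crude, resize. The sketch below assumes a single-channel image, maps each target pixel back with x = x'/s_x and y = y'/s_y, rounds with int(v + 0.5) as above, and clamps at the border; the function name and the clamping are choices made for this example.

```cpp
typedef unsigned char uchar;

// Resize a single-channel image by lookup: for each target pixel (x', y'),
// map back to the source with x = x'/sx, y = y'/sy and take the nearest pixel.
void resize_nearest(const uchar *src, int sw, int sh, int sstep,
                    uchar *dst, int dw, int dh, int dstep)
{
    const float sx = (float)dw / sw, sy = (float)dh / sh;   // scale factors
    for (int yd = 0; yd < dh; ++yd) {
        int ys = (int)(yd / sy + 0.5f);          // round to the nearest source row
        if (ys > sh - 1) ys = sh - 1;            // clamp at the border
        for (int xd = 0; xd < dw; ++xd) {
            int xs = (int)(xd / sx + 0.5f);      // round to the nearest source column
            if (xs > sw - 1) xs = sw - 1;
            dst[yd * dstep + xd] = src[ys * sstep + xs];
        }
    }
}
```

Because every target pixel is visited exactly once, no holes can appear, which is the main advantage of lookup over projection.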
Bilinear Interpolation
For the 4 neighboring pixels, apply 1D linear interpolation horizontally and vertically, in either order:
Two horizontal interpolations, then one vertical
Two vertical interpolations, then one horizontal
[Figure: 2 horizontal + 1 vertical vs. 2 vertical + 1 horizontal.]
Are the two orders equivalent?
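One way to compare the two orders is to expand both. The derivation below is a sketch under assumed notation: the four neighbors are named a, b (one row) and c, d (the next row), and (d_x, d_y) is the fractional offset of the sample point from a, with both offsets in [0, 1].

```latex
% Horizontal first, then vertical
h_1 = (1 - d_x)\,a + d_x\,b, \qquad h_2 = (1 - d_x)\,c + d_x\,d
(1 - d_y)\,h_1 + d_y\,h_2
  = (1 - d_x)(1 - d_y)\,a + d_x(1 - d_y)\,b + (1 - d_x)\,d_y\,c + d_x d_y\,d

% Vertical first, then horizontal
v_1 = (1 - d_y)\,a + d_y\,c, \qquad v_2 = (1 - d_y)\,b + d_y\,d
(1 - d_x)\,v_1 + d_x\,v_2
  = (1 - d_x)(1 - d_y)\,a + d_x(1 - d_y)\,b + (1 - d_x)\,d_y\,c + d_x d_y\,d
```

Both expansions give the same weighted sum of the four neighbors, so under these assumptions the order of interpolation does not affect the result.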
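Implementation
What is the minimum number of multiplications needed for one bilinear interpolation?
float bilinear(float a, float b, float c, float d, float dx, float dy)
{
    float h1=a+dx*(b-a); // = (1-dx)*a + dx*b
    float h2=c+dx*(d-c);
    return h1+dy*(h2-h1);
}
[Figure: neighbors a b on one row and c d on the next, with the sample point at offset $(d_x, d_y)$ from a.]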
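To answer the lookup question of how to read a color at fractional coordinates, the interpolation routine can be wrapped in a sampler. The sketch below repeats `bilinear` so that it is self-contained, and assumes a single-channel image with coordinates clamped to the image border; `sample_bilinear` is a hypothetical name.

```cpp
typedef unsigned char uchar;

// Interpolate between four neighbors: a b on one row, c d on the next.
static float bilinear(float a, float b, float c, float d, float dx, float dy)
{
    float h1 = a + dx * (b - a);
    float h2 = c + dx * (d - c);
    return h1 + dy * (h2 - h1);
}

// Sample a single-channel image at fractional (x, y), clamping to the border.
float sample_bilinear(const uchar *img, int width, int height, int step,
                      float x, float y)
{
    if (x < 0) x = 0;
    if (y < 0) y = 0;
    if (x > width - 1)  x = (float)(width - 1);
    if (y > height - 1) y = (float)(height - 1);
    const int x0 = (int)x, y0 = (int)y;                 // top-left neighbor
    const int x1 = (x0 + 1 < width)  ? x0 + 1 : x0;     // stay inside the image
    const int y1 = (y0 + 1 < height) ? y0 + 1 : y0;
    const uchar *r0 = img + y0 * step, *r1 = img + y1 * step;
    return bilinear(r0[x0], r0[x1], r1[x0], r1[x1], x - x0, y - y0);
}
```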
NN vs Bilinear
[Figure: nearest-neighbor (NN) result vs. bilinear result.]
Bicubic Interpolation
$y = ax + b$ ?
$y = ax + b \;\rightarrow\; y = ax^3 + bx^2 + cx + d$
$a, b, c, d$ ?
Given the samples $f(0), f(1), f(2), f(3)$:
$f(0) = d$
$f(1) = a + b + c + d$
$f(2) = 8a + 4b + 2c + d$
$f(3) = 27a + 9b + 3c + d$
Solve these four equations for $[a, b, c, d]$.
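Derivative-based constraints
Given the samples $f(-1), f(0), f(1), f(2)$:
$y' = 3ax^2 + 2bx + c$
$f(0) = d$
$f(1) = a + b + c + d$
$f'(0) = c$
$f'(1) = 3a + 2b + c$
Solve these four equations for $[a, b, c, d]$.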
The derivative of a discrete function
Given the samples $f(-1), f(0), f(1), f(2)$:
$f'(x) = \frac{[f(x+1) - f(x)] + [f(x) - f(x-1)]}{2}$
$f'(x) = \frac{f(x+1) - f(x-1)}{2}$
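Putting the derivative constraints and the central difference together yields closed-form coefficients for the 1D cubic. The sketch below solves f(0) = d, f'(0) = c, f(1) = a + b + c + d, and f'(1) = 3a + 2b + c by hand and evaluates the result; the function name `cubic` and the parameter t in [0, 1] are assumptions of this example. A 2D (bicubic) version would apply it along rows and then along the resulting column.

```cpp
// Cubic through f(0) and f(1) whose end slopes are the central differences
// f'(0) = (f(1) - f(-1))/2 and f'(1) = (f(2) - f(0))/2. Evaluate at t in [0, 1].
float cubic(float fm1, float f0, float f1, float f2, float t)
{
    const float d  = f0;                        // f(0)  = d
    const float c  = 0.5f * (f1 - fm1);         // f'(0) = c
    const float s1 = 0.5f * (f2 - f0);          // f'(1) = 3a + 2b + c
    const float a  = s1 + c - 2.0f * (f1 - f0);
    const float b  = 3.0f * (f1 - f0) - 2.0f * c - s1;
    return ((a * t + b) * t + c) * t + d;       // Horner form of a t^3 + b t^2 + c t + d
}
```

For samples taken from a straight line the coefficients a and b vanish, so the cubic reduces to linear interpolation in that case.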
Bilinear & Bicubic
[Figure: bilinear vs. bicubic results.]
[Figure: nearest-neighbor (NN), bilinear, and bicubic results side by side.]
Super-Resolution
Common Geometric Operations
Horizontal & vertical flip
Resize (zoom in / zoom out / scale)
Rotation
……
Affine transform
Perspective transform
……
Image warping
Image Transform
$D = f(I)$
[Figure: pixel $(x, y)$ in the source image $I$ maps to $(x', y')$ in the target image $D$.]
Rotation
$f: [x', y'] = [x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta]$
The center of rotation
Rotation around a specified center?
$f(x, y \mid \theta) = [x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta]$
$f(x, y \mid \theta, c_x, c_y) = \ ??$
[Figure: rotation about the center $(c_x, c_y)$.]
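More complex cases
Rotate by some angle about p1, then by some angle about p2, …
Rotate about N centers in sequence?
……
Compositions of rotation, translation, scaling, and so on?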
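Homogeneous Coordinates & Transform Matrix
Homogeneous coordinates: $[x, y] \rightarrow [x, y, 1] \sim [\lambda x, \lambda y, \lambda]$
2D affine transform:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = A_{2\times 3} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \quad\Leftrightarrow\quad \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
2D perspective transform:
$$\begin{bmatrix} x' \\ y' \\ w \end{bmatrix} = A_{3\times 3} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \quad\Leftrightarrow\quad \begin{bmatrix} x' \\ y' \\ w \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$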
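Applying either kind of matrix to a point follows the same steps: lift the point to homogeneous coordinates, multiply, and divide by the last component. The sketch below assumes a row-major 3x3 array; `transform_point` is a hypothetical name.

```cpp
// Apply a 3x3 transform (row-major) to the point (x, y) in homogeneous coordinates.
// For an affine matrix the last row is [0 0 1], so w == 1 and the division
// leaves the result unchanged.
void transform_point(const double M[9], double x, double y, double *xo, double *yo)
{
    const double xh = M[0] * x + M[1] * y + M[2];
    const double yh = M[3] * x + M[4] * y + M[5];
    const double w  = M[6] * x + M[7] * y + M[8];
    *xo = xh / w;   // assumes w != 0
    *yo = yh / w;
}
```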
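Transform Matrix
Translation:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 & d_x \\ 0 & 1 & d_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Scale:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Rotation:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$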
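Transform Matrix
X shear:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x + s y \\ y \end{bmatrix} = \begin{bmatrix} 1 & s & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Y shear:
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x \\ s x + y \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ s & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$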
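Transformations in Matrix Form
[Figure: rotation about $(c_x, c_y)$, shown relative to the origin $(0, 0)$.]
Rotation about $(c_x, c_y)$ = translation by $(-c_x, -c_y)$ + rotation by the angle $\theta$ + translation by $(c_x, c_y)$
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & c_x \\ 0 & 1 & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -c_x \\ 0 & 1 & -c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$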
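Transformations in Matrix Form
A composition of several transforms can be expressed as the product of their transform matrices.
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & c_x \\ 0 & 1 & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -c_x \\ 0 & 1 & -c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & -c_x\cos\theta + c_y\sin\theta + c_x \\ \sin\theta & \cos\theta & -c_x\sin\theta - c_y\cos\theta + c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$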
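The composition can be carried out numerically with a small 3x3 product. The sketch below builds the rotation about a center as T(c_x, c_y) · R(θ) · T(−c_x, −c_y), assuming row-major storage and the counter-clockwise rotation matrix [cos θ, −sin θ; sin θ, cos θ]; `mat3_mul` and `rotation_about` are hypothetical names.

```cpp
#include <cmath>

// C = A * B for row-major 3x3 matrices. C must not alias A or B.
void mat3_mul(const double A[9], const double B[9], double C[9])
{
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j) {
            double s = 0;
            for (int k = 0; k < 3; ++k) s += A[3 * i + k] * B[3 * k + j];
            C[3 * i + j] = s;
        }
}

// M = T(cx, cy) * R(theta) * T(-cx, -cy): rotation by theta (radians) about (cx, cy).
void rotation_about(double theta, double cx, double cy, double M[9])
{
    const double c = std::cos(theta), s = std::sin(theta);
    const double Tpos[9] = {1, 0, cx,   0, 1, cy,   0, 0, 1};
    const double R[9]    = {c, -s, 0,   s, c, 0,    0, 0, 1};
    const double Tneg[9] = {1, 0, -cx,  0, 1, -cy,  0, 0, 1};
    double RT[9];
    mat3_mul(R, Tneg, RT);     // the rightmost factor is applied to the point first
    mat3_mul(Tpos, RT, M);
}
```

Rotating about N centers in sequence is then N calls to `rotation_about` followed by N − 1 products, with each later transform multiplied on the left.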
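More complex cases?
Rotate by some angle about p1, then by some angle about p2, …
Rotate about N centers in sequence?
……
Compositions of rotation, translation, scaling, and so on?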
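Affine Transform
Linear transform:
$f(a + b) = f(a) + f(b)$
$f(ka) = k f(a)$
Affine = linear + translation: $f(a) + t$
E.g. the 2D affine transform: $A(x, y) = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} u \\ v \end{bmatrix}$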
Affine Transform
Preserves the following geometric properties:
Collinearity
Ratios of lengths of collinear vectors
Barycentric coordinates
[Figure: three points $p_0$, $p_1$, $p_2$.]
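Special Affine Transforms
Rigid transformation
Contains only translation and rotation
Preserves the shape (angles) and size of objects
Corresponds to an orthogonal transform
Similarity transformation
Contains only translation, rotation, and uniform scaling
Preserves the shape of objects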
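Perspective Transform
Affine transform:
A transform within a single plane
Perspective transform:
Can represent the transform between views of the same plane from different viewpoints, or between different planes seen from the same viewpoint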
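Application: Image Matching
[Figure: frame $t$ and frame $t+1$.]
Image matching based on an affine transform:
First compute the affine transform $A$ from frame $t$ to frame $t+1$.
Then warp the image of frame $t$ with $A$, and use the result as the image registered to frame $t+1$.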
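Estimating the affine transform between two images?
[Figure: the transform $A$ maps points $p_0, p_1, p_2$ in one image to $p_0', p_1', p_2'$ in the other.]
Three non-collinear planar point pairs determine a 2D affine transform.
With $p_i = [u_i, v_i]$:
$$\begin{bmatrix} u_i' \\ v_i' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \end{bmatrix} \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix}$$
$u_0 a + v_0 b + c = u_0'$
$u_0 d + v_0 e + f = v_0'$
$u_1 a + v_1 b + c = u_1'$
$u_1 d + v_1 e + f = v_1'$
$u_2 a + v_2 b + c = u_2'$
$u_2 d + v_2 e + f = v_2'$
More than 3 point pairs?
$A = \arg\min_A \sum_i \| A p_i - p_i' \|^2$
Where do the point pairs come from?
They are obtained by feature-point tracking.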
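One way to carry out the minimization is through the normal equations, sketched below. The two rows [a b c] and [d e f] decouple, and each satisfies the same 3x3 system (Σ h hᵀ) x = Σ h u', where h = [u, v, 1]. Solving by Cramer's rule, the function names, and the singularity tolerance 1e-12 are all choices made for this example, not the method prescribed by the slides; a production implementation would more likely use a robust linear solver.

```cpp
#include <cmath>

static double det3(const double m[9])
{
    return m[0] * (m[4] * m[8] - m[5] * m[7])
         - m[1] * (m[3] * m[8] - m[5] * m[6])
         + m[2] * (m[3] * m[7] - m[4] * m[6]);
}

// Solve the 3x3 system N x = r by Cramer's rule; returns false if N is singular.
static bool solve3(const double N[9], const double r[3], double x[3])
{
    const double D = det3(N);
    if (std::fabs(D) < 1e-12) return false;     // tolerance is an assumption
    for (int k = 0; k < 3; ++k) {
        double Nk[9];
        for (int i = 0; i < 9; ++i) Nk[i] = N[i];
        for (int i = 0; i < 3; ++i) Nk[3 * i + k] = r[i];   // replace column k
        x[k] = det3(Nk) / D;
    }
    return true;
}

// Least-squares affine fit: minimize sum_i ||A p_i - p_i'||^2 over A = [a b c; d e f].
// p holds n (u, v) pairs, q holds n (u', v') pairs; needs n >= 3 non-collinear points.
bool estimate_affine(const double *p, const double *q, int n, double A[6])
{
    double N[9] = {0}, ru[3] = {0}, rv[3] = {0};
    for (int i = 0; i < n; ++i) {
        const double h[3] = {p[2 * i], p[2 * i + 1], 1.0};
        for (int r = 0; r < 3; ++r) {
            for (int c = 0; c < 3; ++c) N[3 * r + c] += h[r] * h[c];
            ru[r] += h[r] * q[2 * i];           // right-hand side for [a b c]
            rv[r] += h[r] * q[2 * i + 1];       // right-hand side for [d e f]
        }
    }
    return solve3(N, ru, A) && solve3(N, rv, A + 3);
}
```

With exactly three non-collinear pairs the residual is zero and the fit reproduces the six equations above; collinear points make the normal matrix singular, which the function reports by returning false.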
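Application: Image Matching
Image matching based on an affine transform:
Detect feature points in frame $t$ (feature detection).
Compute the correspondences of the feature points in frame $t+1$ (feature tracking).
Estimate the affine transform $A$ from frame $t$ to frame $t+1$ from the feature point pairs.
Warp the image of frame $t$ with $A$, and use the result as the image registered to frame $t+1$:
$A\,p_t \rightarrow p_{t+1}$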
Common Geometric Operations
Horizontal & vertical flip
Resize (zoom in / zoom out / scale)
Rotation
……
Affine transform
Perspective transform
……
Image warping
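Image Warping
Experiment 1.3: Image Warping
Let $[x', y'] = f([x, y])$ be a mapping of pixel coordinates; implement the image warp represented by $f$.
The inverse mapping of $f$ is:
$$[x, y] = f^{-1}([x', y']) = \begin{cases} [x', y'] & \text{if } r > 1 \\ [\cos(\theta)\,x' - \sin(\theta)\,y',\ \sin(\theta)\,x' + \cos(\theta)\,y'] & \text{otherwise} \end{cases}$$
where $r = \sqrt{x'^2 + y'^2}$ and $\theta = (1 - r)^2$.
$[x, y]$ and $[x', y']$ are both center-normalized coordinates; convert to them first.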
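Center-Normalized Coordinates
[Figure: pixel coordinates run from $(0, 0)$ to $(W, H)$; center-normalized coordinates run from $(-1, -1)$ to $(1, 1)$, with $(0, 0)$ at the image center.]
$x' = \frac{x - 0.5W}{0.5W}, \quad y' = \frac{y - 0.5H}{0.5H}$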
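Drawbacks / limitations of inverse lookup?
$D = f(I)$
[Figure: $(x, y)$ in the source image $I$ and $(x', y')$ in the target image $D$.]
How can the forward transform be implemented?
[Figure: forward mapping $f$.]
Interactive image warping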
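Briefly describe the two approaches to geometric image transformation, forward projection and inverse lookup, and state the key problem each one has to solve.
Compared with bilinear interpolation, bicubic interpolation is better at suppressing jagged edges and blockiness. Explain why.
Given N point pairs $(p_i, q_i)$, $i = 1, \ldots, N$, $N > 3$, in two images, briefly describe how to compute the affine transform $A$ between the two images that satisfies the constraints of these N point pairs.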