This tutorial discusses how to use MATLAB for image processing. Some familiarity with
MATLAB is
assumed (you should know how to use matrices and write an Mfile).
It is helpful to have the MATLAB Image Processing Toolbox, but
fortunately, no toolboxes are needed for most operations. Commands
requiring the Image Toolbox are indicated with [Image Toolbox].
Image representation
There are five types of images in MATLAB.
 Grayscale. A grayscale image M pixels tall and
N pixels wide is represented as a matrix of double datatype of size
M×N. Element values (e.g., MyImage(m,n))
denote the pixel grayscale intensities in [0,1] with 0=black and
1=white.
 Truecolor RGB. A truecolor redgreenblue (RGB) image is
represented as a threedimensional M×N×3 double matrix. Each pixel has red, green, blue
components along the third dimension with values in [0,1], for
example, the color components of pixel (m,n) are
MyImage(m,n,1) = red, MyImage(m,n,2) = green,
MyImage(m,n,3) = blue.
 Indexed.
Indexed (paletted) images are represented with an index matrix of size
M×N and a colormap matrix of size
K×3. The colormap holds all colors used in the
image and the index matrix represents the pixels by referring to
colors in the colormap. For example, if the 22nd color is magenta
MyColormap(22,:) = [1,0,1], then MyImage(m,n) = 22
is a magentacolored pixel.
 Binary. A binary image is represented by an M×N logical matrix where pixel values are 1 (true) or 0 (false).
 uint8. This type uses less memory and some operations
compute faster than with double types. For simplicity, this tutorial
does not discuss uint8 further.
Grayscale is usually the preferred format for image processing. In
cases requiring color, an RGB color image can be decomposed and
handled as three separate grayscale images. Indexed images must be
converted to grayscale or RGB for most operations.
Below are some common manipulations and conversions. A few
commands require the Image Toolbox and are indicated
with [Image Toolbox].
% Display a grayscale or binary image
image(MyGray*255);
axis image
colormap(gray(256));
% Display an RGB image (error if any element outside of [0,1])
image(MyRGB);
axis image
% Display an RGB image (clips elements to [0,1])
image(min(max(MyRGB,0),1));
axis image
% Display an indexed image
image(MyIndexed);
axis image
colormap(MyColormap);
% Separate the channels of an RGB image
MyRed = MyRGB(:,:,1);
MyGreen = MyRGB(:,:,2);
MyBlue = MyRGB(:,:,3);
% Put the channels back together
MyRGB = cat(3,MyRed,MyGreen,MyBlue);
% Convert grayscale to RGB
MyRGB = cat(3,MyGray,MyGray,MyGray);
% Convert RGB to grayscale using simple average
MyGray = mean(MyRGB,3);
% Convert RGB to grayscale using NTSC weighting [Image Toolbox]
MyGray = rgb2gray(MyRGB);
% Convert RGB to grayscale using NTSC weighting
MyGray = 0.299*MyRGB(:,:,1) + 0.587*MyRGB(:,:,2) + 0.114*MyRGB(:,:,3);
% Convert indexed image to RGB [Image Toolbox]
MyRGB = ind2rgb(MyIndexed,MyColormap);
% Convert indexed image to RGB
MyRGB = reshape(cat(3,MyColormap(MyIndexed,1),MyColormap(MyIndexed,2),...
MyColormap(MyIndexed,3)),size(MyIndexed,1),size(MyIndexed,2),3);
% Convert an RGB image to indexed using K colors [Image Toolbox]
[MyIndexed,MyColormap] = rgb2ind(MyRGB,K);
% Convert binary to grayscale
MyGray = double(MyBinary);
% Convert grayscale to binary
MyBinary = (MyGray > 0.5);
Reading and writing image files
MATLAB can read and write
images with the imread and imwrite commands. Although a fair
number of file formats are supported, some are not. Use imformats to see what your installation supports:
>> imformats
EXT ISA INFO READ WRITE ALPHA DESCRIPTION

bmp isbmp imbmpinfo readbmp writebmp 0 Windows Bitmap (BMP)
gif isgif imgifinfo readgif writegif 0 Graphics Interchange Format (GIF)
jpg jpeg isjpg imjpginfo readjpg writejpg 0 Joint Photographic Experts Group (JPEG)
pbm ispbm impnminfo readpnm writepnm 0 Portable Bitmap (PBM)
pcx ispcx impcxinfo readpcx writepcx 0 Windows Paintbrush (PCX)
pgm ispgm impnminfo readpnm writepnm 0 Portable Graymap (PGM)
png ispng impnginfo readpng writepng 1 Portable Network Graphics (PNG)
pnm ispnm impnminfo readpnm writepnm 0 Portable Any Map (PNM)
ppm isppm impnminfo readpnm writepnm 0 Portable Pixmap (PPM)
...
When reading images, an unfortunate problem is that imread
returns the image data in uint8 datatype, which must be converted to
double and rescaled before use. So instead of calling imread
directly, I use the following Mfile function to read and convert
images:
function Img = getimage(Filename)
%GETIMAGE Read an image given a filename
% V = GETIMAGE(FILENAME) where FILENAME is an image file. The image is
% returned either as an MxN double matrix for a grayscale image or as an
% MxNx3 double matrix for a color image, with elements in [0,1].
% Pascal Getreuer 20082009
% Read the file
[Img,Map,Alpha] = imread(Filename);
Img = double(Img);
if ~isempty(Map) % Convert indexed image to RGB
Img = Img + 1;
Img = reshape(cat(3,Map(Img,1),Map(Img,2),Map(Img,3)),size(Img,1),size(Img,2),3);
else
Img = Img/255; % Rescale to [0,1]
end
Rightclick and save getimage.m to use
this Mfunction. If image baboon.png is in
the current directory (or somewhere in the MATLAB search path), you can read it with
MyImage = getimage('baboon.png').
You can also use partial paths, for example if the image is
in <current directory>/images/ with
getimage('images/baboon.png').
To write a grayscale or RGB image, use
imwrite(MyImage,'myimage.png');
Take care that MyImage is a double matrix with elements in
[0,1]—if improperly scaled, the saved file will probably be blank.
When writing image files, I highly recommend using the PNG file format.
This format is a reliable choice since it is lossless, supports
truecolor RGB, and compresses pretty well. Use other formats with caution.
Basic operations
Below are some basic operations on a grayscale image u. Commands requiring the Image Toolbox are indicated
with [Image Toolbox].
% Statistics
uMax = max(u(:)); % Compute the maximum value
uMin = min(u(:)); % Minimum
uPower = sum(u(:).^2); % Power
uAvg = mean(u(:)); % Average
uVar = var(u(:)); % Variance
uMed = median(u(:)); % Median
hist(u(:),linspace(0,1,256)); % Plot histogram
% Basic manipulations
uClip = min(max(u,0),1); % Clip elements to [0,1]
uPad = u([1,1:end,end],[1,1:end,end]); % Pad image with onepixel margin
uPad = padarray(u,[k,k],'replicate'); % Pad image with kpixel margin [Image Toolbox]
uCrop = u(RowStart:RowEnd,ColStart:ColEnd); % Crop image
uFlip = flipud(u); % Flip in the up/down direction
uFlip = fliplr(u); % Flip left/right
uResize = imresize(u,ScaleFactor); % Interpolate image [Image Toolbox]
uRot = rot90(u,k); % Rotate by k*90 degrees with integer k
uRot = imrotate(u,Angle); % Rotate by Angle degrees [Image Toolbox]
uc = (u  min(u(:))/(max(u(:))  min(u(:))); % Stretch contrast to [0,1]
uq = round(u*(K1))/(K1); % Quantize to K graylevels {0,1/K,2/K,...,1}
% Simulating noise
uNoisy = u + randn(size(u))*sigma; % Add white Gaussian noise of standard deviation sigma
uNoisy = u; uNoisy(rand(size(u)) < p) = round(rand(size(u))); % Salt and pepper noise
% Debugging
any(~isfinite(u(:))) % Check if any elements are infinite or NaN
nnz(u > 0.5) % Count how many elements satisfy some condition
(Note: For any array, the syntax u(:) means “unroll
u into a column vector.” For example, if u = [1,5;0,2], then u(:) is [1;0;5;2].)
For example, image signal power is used in computing signaltonoise
ratio (SNR) and peak signaltonoise ratio (PSNR). Given clean
image uclean and noisecontaminated image u,
% Compute SNR
snr = 10*log10( sum((uclean(:)  u(:)).^2) / sum(uclean(:).^2) );
% Compute PSNR (where the maximum possible value of uclean is 1)
psnr = 10*log10( mean((uclean(:)  u(:)).^2) );
Be careful with norm: the
behavior is norm(v) on vector v computes sqrt(sum(v.^2)), but
norm(A) on matrix A computes the induced
L^{2} matrix norm,
norm(A) = sqrt(max(eig(A'*A))) gaah!
So norm(A) is certainly not sqrt(sum(A(:).^2)).
It is nevertheless an easy mistake to use
norm(A) where it should have been norm(A(:)).
Linear filters
Linear filtering is the cornerstone technique of signal processing.
To briefly introduce, a linear filter is an operation where at every
pixel x_{m,n} of an image, a linear function is evaluated on the pixel and its
neighbors to compute a new pixel value y_{m,n}.
A linear filter in two dimensions has the general form
y_{m,n} =
∑_{j}∑_{k}
h_{j,k} x_{m−j,n−k}
where x is the input, y is the output, and h
is the filter impulse response. Different choices of
h lead to filters that smooth, sharpen, and detect edges, to
name a few applications. The righthand side of the above equation is
denoted concisely as h∗x and is called the
“convolution of h and x.”
Spatialdomain filtering
Twodimensional linear filtering is implemented in MATLAB with conv2.
Unfortunately, conv2 can only handle filtering near the image
boundaries by zeropadding, which means that filtering results are
usually inappropriate for pixels close to the boundary. To work
around this, we can pad the input image and use the 'valid' option when calling conv2.
The following Mfunction does this.
function x = conv2padded(varargin)
%CONV2PADDED Twodimensional convolution with padding.
% Y = CONV2PADDED(X,H) applies 2D filter H to X with constant
% extension padding.
%
% Y = CONV2PADDED(H1,H2,X) first applies 1D filter H1 along the rows
% and then applies 1D filter H2 along the columns.
%
% If X is a 3D array, filtering is done separately on each channel.
% Pascal Getreuer 2009
if nargin == 2 % Function was called as "conv2padded(x,h)"
x = varargin{1};
h = varargin{2};
top = ceil(size(h,1)/2)1;
bottom = floor(size(h,1)/2);
left = ceil(size(h,2)/2)1;
right = floor(size(h,2)/2);
elseif nargin == 3 % Function was called as "conv2padded(h1,h2,x)"
h1 = varargin{1};
h2 = varargin{2};
x = varargin{3};
top = ceil(length(h1)/2)1;
bottom = floor(length(h1)/2);
left = ceil(length(h2)/2)1;
right = floor(length(h2)/2);
else
error('Wrong number of arguments.');
end
% Pad the input image
xPadded = x([ones(1,top),1:size(x,1),size(x,1)+zeros(1,bottom)],...
[ones(1,left),1:size(x,2),size(x,2)+zeros(1,right)],:);
% Since conv2 cannot handle 3D inputs, do filtering channel by channel
for p = 1:size(x,3)
if nargin == 2
x(:,:,p) = conv2(xPadded(:,:,p),h,'valid'); % Call conv2
else
x(:,:,p) = conv2(h1,h2,xPadded(:,:,p),'valid'); % Call conv2
end
end
Rightclick and save conv2padded.m to use
this Mfunction. Here are some examples:
% A light smoothing filter
h = [0,1,0;
1,4,1;
0,1,0];
h = h/sum(h(:)); % Normalize the filter
uSmooth = conv2padded(u,h);
% A sharpening filter
h = [0,1,0;
1,8,1;
0,1,0];
h = h/sum(h(:)); % Normalize the filter
uSharp = conv2padded(u,h);
% Sobel edge detection
hx = [1,0,1;
2,0,2;
1,0,1];
hy = rot90(hx,1);
u_x = conv2padded(u,hx);
u_y = conv2padded(u,hy);
EdgeStrength = sqrt(u_x.^2 + u_y.^2);
% Moving average
WindowSize = 5;
h1 = ones(WindowSize,1)/WindowSize;
uSmooth = conv2padded(h1,h1,u);
% Gaussian filtering
sigma = 3.5;
FilterRadius = ceil(4*sigma); % Truncate the Gaussian at 4*sigma
h1 = exp((FilterRadius:FilterRadius).^2/(2*sigma^2));
h1 = h1/sum(h1); % Normalize the filter
uSmooth = conv2padded(h1,h1,u);
A 2D filter h is said to be separable if it can be
expressed as the outer product of two 1D filters h1 and
h2, that is, h = h1(:)*h2(:)'. It is faster to pass
h1 and h2 than h, as is done above for the
moving average window and the Gaussian filter. In fact, the Sobel
filters hx and hy are also separable—what are
h1 and h2?
Fourierdomain filtering
Spatialdomain filtering with conv2 is easily
a computaionally expensive operation. For a K×K
filter on an M×N image, conv2 costs
O(MNK^{2}) additions and multiplications, or
O(N^{4}) supposing
M∼N∼K.
For large filters, filtering in the Fourier domain is
faster since the computational cost is reduced to
O(N^{2} log N). Using the
convolutionmultiplication property of the Fourier transform, the
convolution is equivalently computed by
% Compute y = h*x with periodic boundary extension
[k1,k2] = size(h);
hpad = zeros(size(x));
hpad([end+1floor(k1/2):end,1:ceil(k1/2)], ...
[end+1floor(k2/2):end,1:ceil(k2/2)]) = h;
y = real(ifft2(fft2(hpad).*fft2(x)));
The result is
equivalent to conv2padded(x,h) except near the boundary,
where the above computation uses periodic boundary extension.
Fourierbased filtering can also be done with
symmetric boundary extension by reflecting the input in each direction:
% Compute y = h*x with symmetric boundary extension
xSym = [x,fliplr(x)]; % Symmetrize horizontally
xSym = [xSym;flipud(xSym)]; % Symmetrize vertically
[k1,k2] = size(h);
hpad = zeros(size(xSym));
hpad([end+1floor(k1/2):end,1:ceil(k1/2)], ...
[end+1floor(k2/2):end,1:ceil(k2/2)]) = h;
y = real(ifft2(fft2(hpad).*fft2(xSym)));
y = y(1:size(y,1)/2,1:size(y,2)/2);
(Note: An even more efficient method is FFT
overlapadd filtering. The Signal Processing Toolbox
implements FFT overlapadd in onedimension in fftfilt.)
Nonlinear filters
A nonlinear filter is an operation where each filtered pixel
y_{m,n} is a nonlinear function of
x_{m,n} and its neighbors. Here we briefly discuss a
few types of nonlinear filters.
Order statistic filters
If you have the Image Toolbox, order statistic filters can be
performed with ordfilt2 and medfilt2. An order
statistic filter sorts the pixel values over a neighborhood and
selects the kth largest value. The min, max, and median filters
are special cases.
Morphological filters
If you have the Image Toolbox, bwmorph implements various
morphological operations on binary images, like erosion, dilation,
open, close, and skeleton. There are also commands available for
morphology on grayscale images: imerode,
imdilate and imtophat, among others.
Build your own filter
Occasionally we want to use a new filter that MATLAB does not have. The
code below is a template for implementing filters.
[M,N] = size(x);
y = zeros(size(x));
r = 1; % Adjust for desired window size
for n = 1+r:Nr
for m = 1+r:Mr
% Extract a window of size (2r+1)x(2r+1) around (m,n)
w = x(m+(r:r),n+(r:r));
% ... write the filter here ...
y(m,n) = result;
end
end
(Note: A frequent misguided claim is
that loops in MATLAB
are slow and should be avoided. This was once true, back in MATLAB 5 and earlier,
but loops in modern versions are reasonably fast.)
For example, the alphatrimmed mean filter ignores the d/2 lowest and d/2 highest values in the window, and averages the remaining (2r+1)^{2}−d
values. The filter is a balance between a median filter and a mean
filter. The alphatrimmed mean filter can be implemented in the
template as
% The alphatrimmed mean filter
w = sort(w(:));
y(m,n) = mean(w(1+d/2:endd/2)); % Compute the result y(m,n)
As another example, the bilateral filter is
y_{m,n} = 
∑_{j,k}
h_{j,k,m,n} x_{m−j,n−k} 
∑_{j,k} h_{j,k,m,n}

where
h_{j,k,m,n} =
e^{−(j2+k2)/(2σs2)}
e^{−(xm−j,n−k − xm,n)2/(2σd2)}.
The bilateral filter can be implemented as
% The bilateral filter
[k,j] = meshgrid(r:r,r:r);
h = exp( (j.^2 + k.^2)/(2*sigma_s^2) ) .* ...
exp( (w  w(r+1,r+1)).^2/(2*sigma_d^2) );
y(m,n) = h(:)'*w(:) / sum(h(:));
If you don't have the Image Toolbox, the template can be used to
write substitutes for missing filters, though they will not be as
fast as the Image Toolbox implementations.
Filter 
Code 
medfilt2 
y(m,n) = median(w(:)); 
ordfilt2 
w = sort(w(:));
y(m,n) = w(k); % Select the kth largest element 
imdilate 
% Define a structure element as a (2r+1)x(2r+1) array
SE = [0,1,0;1,1,1;0,1,0];
y(m,n) = max(w(SE)); 
Special topics
Up to this point, we have covered those operations that are generally
useful in imaging processing. You must look beyond this tutorial for
the details on the topic of your interest. A large amount of MATLAB
code and material is freely available online for filter design, wavelet
and multiresolution techniques, PDEbased imaging, morphology, and
wherever else researchers have made their code publicly available.
Here I will briefly advertise my code (no toolboxes are necessary to use any of the following, code available under
BSD license):
 Converting between RGB, YUV, HSV, CIE Lab, ...
Convert images between different color representations
 Wavelet CDF 9/7 Implementation
Implementation of the CohenDaubechiesFeauveau 9/7 wavelet. Supports color and arbitrarysize transforms.
 Wavelet Transforms without the Wavelet Toolbox
Selfcontained function for Daubechies, symlets, splines, S+P wavelets, JPEG2000 wavelets, and more.
 tvreg: Variational Imaging Methods
Functions for total variation based denoising, deconvolution, and inpainting, and
ChanVese segmentation.
You may also find these useful: