Classification of Hyperspectral Data from Urban
Areas using Morphological Preprocessing and
Independent Component Analysis
Jon Aevar Palmason1 , Jon Atli Benediktsson1 , Johannes R. Sveinsson1 and Jocelyn Chanussot2
1
Department of Electrical and Computer Engineering
University of Iceland, Hjardarhagi 2-6, 107 Reykjavik, Iceland
e-mail {jaep, benedikt, sveinsso}@hi.is
2
Signal & Images Laboratory - LIS / INPG BP 46 - 38402 St Martin d’Heres - FRANCE
e-mail jocelyn.chanussot@lis.inpg.fr
Abstract— Classification of high-resolution hyperspectral data
is investigated. Previously, in classification of high-resolution
panchromatic data, simple morphological profiles have been constructed with a repeated use of morphological opening and closing
operators with a structuring element of increasing size, starting
with the original panchromatic image. This approach has recently
been extended for hyperspectral data. In the extension, principal
components of the hyperspectral imagery have been computed
in order to produce an extended morphological profile. In this
paper, we investigate the use of independent components instead
of principal components in extended morphological profiles, i.e.,
selected independent components are used as base images for
an extended morphological profile. In the proposed approach,
the extended morphological profiles based on the independent
components are used as inputs to a neural network classifier. In
experiments, a hyperspectral data set from an urban area in
Pavia, Italy, is classified.
I. INTRODUCTION
It is well known that remote sensing images of high
spatial resolution are needed for classification of urban areas.
The commonly available data of high spatial resolution have
been single-band panchromatic data. Unfortunately, using only
single-band high-resolution panchromatic data is usually not
sufficient for accurate classification of the structural information in urban areas. To overcome that problem, Pesaresi and
Benediktsson [2] proposed the use of morphological transformations to build a morphological profile for classification of
such data. In [5], the method in [2] was extended for hyperspectral data with high spatial resolution. The approach in [5]
is based on using several principal components (PCs) from the
hyperspectral data. From each of the PCs, a morphological
profile is built. Then, the profiles are used all together in
one extended morphological profile, which is consequently
classified with a neural network. The approach in [5] was
shown to perform well in terms of accuracies.
Here, an extension to the approach in [5] is proposed
for classification of hyperspectral urban data. The proposed
approach uses Independent Component Analysis (ICA) instead
of Principal Component Analysis (PCA). ICA is a statistical
technique for revealing hidden factors that underlie sets of
random variables, measurements, or signals. ICA is related
0-7803-9050-4/05/$20.00 ©2005 IEEE.
to principal component analysis but is more powerful and
capable of finding the underlying factors or sources when the
principal component approach fails. ICA defines a generative
model for the observed multivariate data, which is typically
given as a large database of samples. In the model, the data
variables are assumed to be linear mixtures of some unknown
latent variables, and the mixing system is also unknown.
The latent variables are assumed non-Gaussian and mutually
independent, and they are called the independent components
of the observed data. These independent components, also
called sources or factors, can be found by ICA [7].
In this paper, the proposed method is tested in experiments
on high resolution hyperspectral remote sensing data from
an urban area. The paper is organized as follows. In Section
II, the mathematical morphology approach to classification of
hyperspectral data from urban areas is briefly reviewed. The
ICA approach is reviewed in Section III. Experimental results
are given in Section IV and conclusions are drawn in Section V.
II. MORPHOLOGICAL PROFILES FOR HYPERSPECTRAL DATA
The fundamental operators in mathematical morphology are
erosion and dilation [1]. When mathematical morphology is
used in image processing, these operators are applied to an
image with a set of known shape, called a structuring element
(SE). The application of the erosion operator to an image
gives an output, which shows where the SE fits the objects in
the image. On the other hand, the application of the dilation
operator to an image gives an output, which shows where the
SE hits the objects in the image. The erosion and dilation
operators are in general dual but non-invertible. All other
morphological operators can be expressed in terms of erosion
and dilation.
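As an illustration of the two fundamental operators, the following is a minimal numpy sketch of flat grayscale erosion and dilation with a small structuring element. It is a brute-force illustration, not the Matlab implementation used in this work, and all function names are illustrative.

```python
import numpy as np

def erode(img, se):
    """Flat grayscale erosion: minimum of the image over the SE neighborhood."""
    r = se.shape[0] // 2
    # pad with +inf so border minima are taken over valid pixels only
    padded = np.pad(img.astype(float), r, constant_values=np.inf)
    ys, xs = np.nonzero(se)  # pixel offsets covered by the SE
    h, w = img.shape
    return np.array([[padded[i + ys, j + xs].min() for j in range(w)]
                     for i in range(h)])

def dilate(img, se):
    """Flat grayscale dilation: maximum of the image over the reflected SE."""
    r = se.shape[0] // 2
    padded = np.pad(img.astype(float), r, constant_values=-np.inf)
    ys, xs = np.nonzero(se[::-1, ::-1])  # reflected SE
    h, w = img.shape
    return np.array([[padded[i + ys, j + xs].max() for j in range(w)]
                     for i in range(h)])

# A single bright pixel illustrates "fit" versus "hit":
img = np.zeros((5, 5)); img[2, 2] = 1.0
se = np.ones((3, 3), dtype=bool)
print(erode(img, se).max())   # 0.0: the 3x3 SE never fits the lone pixel
print(dilate(img, se).sum())  # 9.0: the SE hits the pixel from 9 positions
```

The lone-pixel example shows the erosion output marking where the SE fits (nowhere) and the dilation output marking where it hits (a 3 × 3 neighborhood).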
Two commonly used morphological operators are opening
and closing [1]. The idea behind opening is to dilate an eroded
image in order to recover as much as possible of the original
image. In contrast, the idea behind closing is to erode a
dilated image in order to recover the initial shape of image
structures that have been dilated. Previously, a morphological
profile approach based on a range of different SE sizes for
both opening and closing has been used for classification of
panchromatic remote sensing data from urban areas [2].
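A morphological profile of this kind can be sketched as follows: openings and closings of the base image with disk-shaped SEs of increasing radius are stacked into one multi-band image. This is a hedged numpy sketch assuming flat, symmetric SEs; the names are illustrative.

```python
import numpy as np

def _minmax_filter(img, se, op):
    # brute-force flat erosion (op=np.min) / dilation (op=np.max);
    # SE reflection is omitted since disks are symmetric
    r = se.shape[0] // 2
    pad = np.inf if op is np.min else -np.inf
    padded = np.pad(img.astype(float), r, constant_values=pad)
    ys, xs = np.nonzero(se)
    h, w = img.shape
    return np.array([[op(padded[i + ys, j + xs]) for j in range(w)]
                     for i in range(h)])

def opening(img, se):   # erosion followed by dilation
    return _minmax_filter(_minmax_filter(img, se, np.min), se, np.max)

def closing(img, se):   # dilation followed by erosion
    return _minmax_filter(_minmax_filter(img, se, np.max), se, np.min)

def disk(radius):
    """Binary disk-shaped structuring element."""
    y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    return x**2 + y**2 <= radius**2

def morphological_profile(img, radii):
    """Stack closings (largest SE first), the image, and openings."""
    return np.stack([closing(img, disk(r)) for r in reversed(radii)]
                    + [img.astype(float)]
                    + [opening(img, disk(r)) for r in radii])

img = np.zeros((7, 7)); img[3, 3] = 1.0
mp = morphological_profile(img, radii=[1, 2])  # SE radius grows in steps
print(mp.shape)  # (5, 7, 7): 2 closings + original + 2 openings
```

Structures smaller than the current SE disappear from the corresponding opening level, which is what makes the profile a multi-scale structural descriptor.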
When the morphological profile approach is applied to
hyperspectral data, a characteristic image needs to be extracted
from the data. It was suggested in [3] to use
the first principal component (PC) of the hyperspectral data
for such a purpose. Although that approach seems reasonable
because principal component analysis is optimal for data
representation in the mean square sense, it should not be
forgotten that with only one PC, the hyperspectral data are
reduced from potentially several hundred data channels into
one single data channel. In addition, although the first PC may
represent most of the variation in the image, some important
information may be contained in the other PCs. Therefore,
an extension to this approach was proposed in [5] where
an extended morphological profile was built from several
different PCs. Here we extend that approach by working with
independent components instead of principal components.
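The construction of an extended profile from several base images can be sketched as follows: one profile is built per component image and the profiles are concatenated into one feature vector per pixel. The sketch below uses a toy stand-in profile function so it stays self-contained; in the actual method the levels would be openings and closings of increasing SE size.

```python
import numpy as np

def extended_profile(components, profile_fn, n_levels):
    """Concatenate one morphological profile per component image.

    components : (k, H, W) array of base images (e.g. ICs or PCs)
    profile_fn : builds an (L, H, W) profile from one image
    Returns an (H*W, k*L) feature matrix, one row per pixel.
    """
    profiles = [profile_fn(c, n_levels) for c in components]
    stacked = np.concatenate(profiles, axis=0)       # (k*L, H, W)
    return stacked.reshape(stacked.shape[0], -1).T   # pixels x features

# toy stand-in profile: the image plus n progressively damped versions
def toy_profile(img, n):
    levels = [img]
    for _ in range(n):
        levels.append(levels[-1] * 0.5)
    return np.stack(levels)

comps = np.random.rand(3, 8, 8)   # e.g. 3 independent components
X = extended_profile(comps, toy_profile, n_levels=6)
print(X.shape)  # (64, 21): 3 components x 7 levels per component
```

With 3 components and 7 levels each (3 openings, 3 closings, and the original image) this yields 21 features per pixel, matching the largest profile used in the experiments below.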
III. INDEPENDENT COMPONENTS

The concept of ICA was introduced in the early 1980s, although it was not named until later. ICA belongs to the class of blind signal separation (BSS) methods, which aim to separate data into underlying information components. ICA can also be used for feature extraction.

For demonstration, let us consider the sensed signals xi(t), i = 1, 2, 3, which are all linear mixtures of the source signals si(t), i = 1, 2, 3, in different proportions, such that

x1(t) = a11 s1(t) + a12 s2(t) + a13 s3(t)
x2(t) = a21 s1(t) + a22 s2(t) + a23 s3(t)          (1)
x3(t) = a31 s1(t) + a32 s2(t) + a33 s3(t).

If the mixing parameters aij were known, one could easily solve the equation set by inverting the linear problem. Without any information about either the sources si(t) or the coefficients aij, the problem becomes more complex.

A well-known example in blind signal separation is the cocktail-party problem. Fig. 1 demonstrates the mixing/unmixing process for three audio signals. Mixtures of the original signals in (a) are shown in (b). ICA is applied to recover the signals shown in (c) without prior knowledge of either the source signals or the mixing parameters.

Fig. 1. Amplitudes of three audio signals: a) original sources, b) mixtures of the original sources, and c) unmixed signals.

In general, the set of equations in (1) can be written in vector form as

x = As,          (2)

where A is referred to as the mixing matrix, of size n × m. The n row vectors of x are the sensed signals and the m row vectors of s are the assumed sources.

The number of independent sources that ICA can recover is less than or equal to the number of sensed signals, i.e., m ≤ n. In practice, m < n should be expected.

By definition, the sources in s are assumed to be statistically independent. This is a stronger requirement than being uncorrelated [7]. Based on this assumption, the multivariate probability density function of s can be expressed as

p(s) = ∏_{i=1}^{m} p(si),          (3)

where the p(si) are the probability density functions of the individual source signals.

The unmixing process amounts to finding an m × n matrix W that transforms the recorded signals, such that

u = Wx.          (4)

The row vectors of u are the unmixed signals, namely the sources. In [6], Bell and Sejnowski presented an approach to blind signal deconvolution based on ICA that minimizes the mutual information

I(u1, ..., um) = E[ log ( p(u) / ∏_{i=1}^{m} p(ui) ) ],          (5)

where E is the expectation operator. At the minimum, the ratio becomes one and the logarithm zero.

The unmixing matrix W is optimized with the natural gradient algorithm, an iterative algorithm in which the learning rate controls the convergence speed. To speed up the process and to avoid the effects of the mean and variance, the data are whitened.

Varshney and Arora discuss two ICA feature extraction algorithms in [8]. The first of those algorithms is applied here: independent components are extracted from the most important principal components, retaining a cumulative variance of 99%; the remaining PCs are not used.
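As a concrete illustration of the mixing model x = As and its inversion u = Wx, the following is a minimal numpy sketch of ICA on two synthetic signals. It uses a FastICA-style fixed-point iteration with a tanh nonlinearity rather than the natural-gradient algorithm described above; the signals, seed, and iteration count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
# two non-Gaussian sources: a sine wave and uniform noise
s = np.vstack([np.sin(np.linspace(0, 20, n)),
               rng.uniform(-1, 1, n)])
A = np.array([[1.0, 0.6], [0.4, 1.0]])   # unknown mixing matrix
x = A @ s                                # sensed signals, x = As

# whitening: zero mean, identity covariance
x = x - x.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(x))
z = E @ np.diag(d ** -0.5) @ E.T @ x

# fixed-point iteration with symmetric orthogonalization
W = rng.standard_normal((2, 2))
for _ in range(200):
    g = np.tanh(W @ z)
    gp = 1.0 - g ** 2
    W = (g @ z.T) / n - gp.mean(axis=1, keepdims=True) * W
    u_svd, _, vt = np.linalg.svd(W)      # W <- (W W^T)^(-1/2) W
    W = u_svd @ vt

u_hat = W @ z   # unmixed signals: rows match the sources up to
                # permutation, sign and scale
corr = np.corrcoef(np.vstack([u_hat, s]))[:2, 2:]
print(np.round(np.abs(corr), 2))  # close to a permutation matrix
```

The absolute correlation matrix between the unmixed signals and the true sources is close to a permutation matrix, reflecting the usual ICA ambiguities in ordering, sign and scale.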
IV. EXPERIMENTAL RESULTS
The data used in the experiments were collected in the framework of the HySens project, managed by the Deutsches Zentrum für Luft- und Raumfahrt (DLR, the German Aerospace Center) and sponsored by the European Union. The optical sensor ROSIS 03 (Reflective Optics System Imaging Spectrometer) was used to record four flight lines over the urban area of Pavia, northern Italy. The ROSIS 03 sensor covers the spectral range from 0.43 µm through 0.86 µm. The flight altitude was chosen as the lowest available for the airplane, which resulted in a spatial resolution of 1.3 m per pixel. The data contain 102 features. A three-channel color composite image is shown in Fig. 2(a), with channel 80 displayed as red, channel 45 as green and channel 10 as blue.
The test site covers the Pavia city center (hereafter referred to as Pavia center), with a dense residential area on one side of the river Ticino and open areas on the other side. The Pavia center image was originally 1096 by 1096 pixels. A 381 pixel wide black strip in the left part of the image was removed for processing, resulting in a "two-part" image of 1096 by 715 pixels. Nine classes have been defined: water, trees, meadows, bricks, bare soil, asphalt, bitumen, tiles and shadow. The numbers of training and test samples for each class are listed in Table I, and the available reference data are shown in Fig. 2(b).

TABLE I
PAVIA CENTER. INFORMATION CLASSES AND SAMPLES.

No.  Class       Train    Test
1    Water         824   65147
2    Trees         820    6778
3    Meadows       824    2266
4    Bricks        808    1891
5    Bare soil     820    5764
6    Asphalt       816    8432
7    Bitumen       808    6479
8    Tiles        1260   41566
9    Shadow        476    2387
     Total        7456  140710
Maximum likelihood (ML) classification was applied to the data under the assumption that the data were Gaussian. Classification was done on the data in the full feature space (102 data channels), on Decision Boundary Feature Extraction (DBFE) [4] data with 29 transformed channels (99% criterion), and on Nonparametric Weighted Feature Extraction (NWFE) [4] data with 12 transformed channels (99% criterion). The classification accuracies are given in Table II. From the table it can be seen that excellent overall accuracies are achieved by the ML classifier, both before and after feature extraction. However, the overall accuracy (OA) for the DBFE test data is the highest.
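The Gaussian ML decision rule used here amounts to a quadratic classifier: each class is modeled by its sample mean and covariance, and a pixel is assigned to the class with the highest log-likelihood. A minimal numpy sketch on synthetic data (the data and function names are illustrative, not the MultiSpec implementation):

```python
import numpy as np

def fit_gaussian_ml(X, y):
    """Per-class mean and covariance for a Gaussian ML classifier."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params

def predict_gaussian_ml(X, params):
    """Assign each sample to the class with the highest log-likelihood."""
    classes = sorted(params)
    scores = []
    for c in classes:
        mu, cov = params[c]
        diff = X - mu
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # log-likelihood up to a constant: -(Mahalanobis + log|cov|)/2
        scores.append(-0.5 * (np.einsum('ij,jk,ik->i', diff, inv, diff)
                              + logdet))
    return np.array(classes)[np.argmax(scores, axis=0)]

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 3)), rng.normal(6, 1, (100, 3))])
y = np.array([0] * 100 + [1] * 100)
params = fit_gaussian_ml(X, y)
acc = (predict_gaussian_ml(X, params) == y).mean()
print(acc)  # well-separated classes: accuracy near 1.0
```

Because each class keeps its own covariance, the decision boundaries are quadratic; this is the standard parametric baseline against which the morphological approach is compared.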
TABLE II
PAVIA CENTER. TRAINING AND TEST ACCURACIES (%) USING THE MAXIMUM LIKELIHOOD CLASSIFIER.

Data set       Original data      DBFE (99%)       NWFE (99%)
Features            102               29               12
Class          Train    Test    Train    Test    Train    Test
1 Water        100.0    92.0    100.0    91.5    100.0    90.8
2 Trees         99.4    91.3     98.9    92.0     97.9    90.6
3 Meadows       99.9    96.6     99.5    97.7     98.4    96.6
4 Bricks        98.9    81.8     95.8    86.9     91.8    84.6
5 Bare soil     99.9    95.2     98.4    95.6     95.2    91.4
6 Asphalt       99.4    85.9     97.5    94.4     96.7    95.4
7 Bitumen       99.1    95.6     93.3    96.4     83.8    92.3
8 Tiles         99.1    99.4     99.4    99.3     99.0    99.0
9 Shadow       100.0    79.6     99.8    92.3     99.8    94.7
Ave             99.5    90.8     98.1    94.0     95.9    92.8
OA              99.6    93.8     98.1    94.5     95.8    93.6

An extended morphological profile (MP) was constructed. For classification of the MPs, a neural network with one hidden layer was used. The number of neurons in the hidden layer was set to the geometric mean of the number of inputs and the number of outputs, i.e., the square root of the product of the number of input features and the number of information classes. The input features for the neural network classifiers varied in the experiments. In all the neural network classifications, three independent components (ICs) were used; these three ICs are displayed in Fig. 3. First, only the three components themselves were used as input features. Then, extended morphological profiles were built based on the ICs. A circular structuring element (SE) with a step size increment of 2 gave the best results and is used in the experiments reported here. The input features were varied by increasing the number of openings and closings on each IC. Finally, feature extraction methods were applied to reduce the dimension of the largest extended MPs because of possible redundancies in the MPs. The considered feature extraction approaches were DAFE (Discriminant Analysis Feature Extraction) [4], DBFE and NWFE. MultiSpec [4] was used for the feature extraction, and Matlab for the morphological preprocessing and neural network classification.

The classification accuracies for the morphological profiles are shown in Table III. From the table it can be seen that using the three ICs without morphological processing gives unacceptable results, as was expected. In contrast, classification of extended morphological profiles based on the ICs gives excellent accuracies, especially when 3 openings and 3 closings are used. In that case, the overall test classification accuracies are higher by more than 4% when compared to the accuracies obtained for the ML classification of the data.

Fig. 2. Pavia center, a) three-channel color composite, and b) available reference data.
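The hidden-layer sizing rule described above, the geometric mean of the number of inputs and the number of outputs, is a one-line computation; for the largest extended profile here (21 features, 9 classes) it gives 14 hidden neurons. A sketch (the function name is illustrative):

```python
import math

def hidden_units(n_inputs, n_outputs):
    """Geometric mean of input and output counts, rounded to an integer."""
    return round(math.sqrt(n_inputs * n_outputs))

print(hidden_units(21, 9))  # 14: largest extended profile, 9 classes
print(hidden_units(3, 9))   # 5: the three raw ICs as inputs
```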
Fig. 3. Independent components for the Pavia Center data.

TABLE III
PAVIA CENTER. TRAINING AND TEST ACCURACIES (%) FOR EXTENDED MORPHOLOGICAL PROFILES OF INDEPENDENT COMPONENTS.

ICs            1, 2, 3         1, 2, 3         1, 2, 3         1, 2, 3
#op/cl            0               1               2               3
FE                -               -               -               -
Features          3               9              15              21
Class          Train   Test    Train   Test    Train   Test    Train   Test
1 Water          0.0    0.0    100.0   99.6    100.0   95.6    100.0   99.5
2 Trees          0.0    0.0     91.9   87.4      0.0    0.0     98.8   91.7
3 Meadows        0.0    0.0     95.5   87.9    100.0   92.1     93.9   85.3
4 Bricks         0.0    0.0     98.3   92.3     99.9   97.3     99.8   99.2
5 Bare soil      0.0    0.0     94.6   87.5     99.9   97.7     99.6   98.4
6 Asphalt        0.0    0.0      0.0    0.0    100.0   99.3     99.4   98.6
7 Bitumen        0.0    0.0     98.4   96.9     99.9   98.1     99.3   98.1
8 Tiles        100.0  100.0    100.0   99.9    100.0   99.6    100.0   99.8
9 Shadow         0.0    0.0    100.0   93.1    100.0   96.3    100.0   99.2
Ave             11.1   11.1     86.5   82.7     88.8   86.6     99.0   96.6
OA              16.9   29.6     86.7   92.1     89.0   94.0     99.0   98.8

ICs            1, 2, 3         1, 2, 3         1, 2, 3
#op/cl            3               3               3
FE          DAFE (100%)     DBFE (99%)      NWFE (95%)
Features          8              12              11
Class          Train   Test    Train   Test    Train   Test
1 Water          0.0    0.0    100.0   99.6    100.0   99.6
2 Trees         94.4   79.5     66.0   96.5    100.0   84.8
3 Meadows       72.6   61.4     93.4   78.8    100.0   85.5
4 Bricks        98.9   91.1     57.4   83.8    100.0   98.9
5 Bare soil     95.4   94.2     89.7   95.4     99.9   97.2
6 Asphalt       97.4   97.2     90.8   95.4    100.0   99.2
7 Bitumen       80.6   83.1     85.8   77.4    100.0   98.5
8 Tiles         99.7   98.4     95.2   94.1    100.0   99.2
9 Shadow        93.1   72.0     40.8   64.7    100.0   88.7
Ave             81.3   75.2     86.5   84.0    100.0   94.6
OA              81.8   49.8     86.2   93.7    100.0   98.2
The use of NWFE for feature extraction gave outstanding results when the 95% criterion was used. With 11 features, the classification of the NWFE-transformed extended profile gave test accuracies similar to those obtained with the original extended profile.

Classified images from the experiment on the Pavia center data are shown in Fig. 4. The different characteristics of the results obtained from the ML classification of the DBFE-transformed data and from the neural network classification of the extended MPs can be seen in the figure.

Fig. 4. Pavia center classification results: a) ML with DBFE (99%), b) ICs 1, 2, 3 with 3 openings/closings followed by NWFE (95%).

V. CONCLUSIONS

Classification of ROSIS hyperspectral data from an urban area in Pavia, Italy, has been discussed. A new morphological preprocessing method was proposed for classification of the data. The morphological method is based on using several independent components (ICs) from the hyperspectral data, building a morphological profile for each of the ICs, and using them all together in one extended morphological profile. The extended morphological profiles based on three independent components were then classified with a neural network, with and without feature extraction. In experiments on one data set, the proposed approach was applied with several different feature extraction methods. The proposed approach gave excellent results and outperformed the statistical maximum likelihood classifier by more than 4% in terms of overall test accuracy.

ACKNOWLEDGEMENT

The authors would like to thank Prof. Paolo Gamba and Prof. Fabio Dell'Acqua of the University of Pavia, Italy, for providing the reference data. This research was supported in part by the Research Fund of the University of Iceland and the Icelandic Research Fund.
REFERENCES

[1] P. Soille, Morphological Image Analysis - Principles and Applications, 2nd Edition, Springer Verlag, Berlin, 2003.
[2] M. Pesaresi and J.A. Benediktsson, "A New Approach for the Morphological Segmentation of High-resolution Satellite Imagery," IEEE Transactions on Geoscience and Remote Sensing, vol. 39, no. 2, pp. 309-320, 2001.
[3] F. Dell'Acqua, P. Gamba, A. Ferrari, J.A. Palmason, J.A. Benediktsson and K. Arnason, "Exploiting Spectral and Spatial Information in Hyperspectral Urban Data with High Resolution," IEEE Geoscience and Remote Sensing Letters, vol. 1, pp. 322-326, 2004.
[4] D.A. Landgrebe, Signal Theory Methods in Multispectral Remote Sensing, John Wiley and Sons, Hoboken, New Jersey, 2003.
[5] J.A. Benediktsson, J.A. Palmason and J.R. Sveinsson, "Classification of hyperspectral data from urban areas based on extended morphological profiles," IEEE Transactions on Geoscience and Remote Sensing, vol. 43, no. 3, pp. 480-491, 2005.
[6] A.J. Bell and T.J. Sejnowski, "Blind separation and blind deconvolution: an information-theoretic approach," Proc. IEEE ICASSP-95, pp. 3415-3418, IEEE, New Jersey, 1995.
[7] A. Hyvärinen, J. Karhunen and E. Oja, Independent Component Analysis, John Wiley and Sons, New York, 2001.
[8] P.K. Varshney and M.K. Arora, Advanced Image Processing Techniques for Remotely Sensed Hyperspectral Data, Springer Verlag, Berlin, 2003.