
Classification of Hyperspectral Data from Urban Areas Using Morphological Preprocessing and Independent Component Analysis

Jon Aevar Palmason(1), Jon Atli Benediktsson(1), Johannes R. Sveinsson(1) and Jocelyn Chanussot(2)

(1) Department of Electrical and Computer Engineering, University of Iceland, Hjardarhagi 2-6, 107 Reykjavik, Iceland, e-mail: {jaep, benedikt, sveinsso}@hi.is
(2) Signal & Images Laboratory - LIS / INPG, BP 46 - 38402 St Martin d'Heres, France, e-mail: jocelyn.chanussot@lis.inpg.fr

Abstract— Classification of high-resolution hyperspectral data is investigated. Previously, in classification of high-resolution panchromatic data, simple morphological profiles have been constructed by repeated use of morphological opening and closing operators with a structuring element of increasing size, starting from the original panchromatic image. This approach has recently been extended to hyperspectral data. In the extension, principal components of the hyperspectral imagery are computed in order to produce an extended morphological profile. In this paper, we investigate the use of independent components instead of principal components in extended morphological profiles, i.e., selected independent components are used as base images for an extended morphological profile. In the proposed approach, the extended morphological profiles based on the independent components are used as inputs to a neural network classifier. In experiments, a hyperspectral data set from an urban area in Pavia, Italy, is classified.

I. INTRODUCTION

It is well known that remote sensing images of high spatial resolution are needed for classification of urban areas. The commonly available data of high spatial resolution have been single-band panchromatic data. Unfortunately, using only single-band high-resolution panchromatic data is usually not sufficient for accurate classification of the structural information in urban areas. To overcome that problem, Pesaresi and Benediktsson [2] proposed the use of morphological transformations to build a morphological profile for classification of such data. In [5], the method in [2] was extended to hyperspectral data with high spatial resolution. The approach in [5] is based on using several principal components (PCs) from the hyperspectral data. From each of the PCs, a morphological profile is built. Then, the profiles are used all together in one extended morphological profile, which is subsequently classified with a neural network. The approach in [5] was shown to perform well in terms of accuracies.

Here, an extension to the approach in [5] is proposed for classification of hyperspectral urban data. The proposed approach uses Independent Component Analysis (ICA) instead of Principal Component Analysis (PCA). ICA is a statistical technique for revealing hidden factors that underlie sets of random variables, measurements, or signals. ICA is related to principal component analysis, but it is more powerful and capable of finding the underlying factors or sources when the principal component approach fails. ICA defines a generative model for the observed multivariate data, which is typically given as a large database of samples. In the model, the data variables are assumed to be linear mixtures of some unknown latent variables, and the mixing system is also unknown. The latent variables are assumed non-Gaussian and mutually independent, and they are called the independent components of the observed data. These independent components, also called sources or factors, can be found by ICA [7].
In this paper, the proposed method is tested in experiments on high-resolution hyperspectral remote sensing data from an urban area. The paper is organized as follows. In Section II, the mathematical morphology approach to classification of hyperspectral data from urban areas is briefly reviewed. The ICA approach is reviewed in Section III. Experimental results are given in Section IV, and conclusions are drawn in Section V.

II. MORPHOLOGICAL PROFILES FOR HYPERSPECTRAL DATA

The fundamental operators in mathematical morphology are erosion and dilation [1]. When mathematical morphology is used in image processing, these operators are applied to an image with a set of known shape, called a structuring element (SE). The application of the erosion operator to an image gives an output which shows where the SE fits the objects in the image. On the other hand, the application of the dilation operator to an image gives an output which shows where the SE hits the objects in the image. The erosion and dilation operators are in general dual but non-invertible. All other morphological operators can be expressed in terms of erosion and dilation. Two commonly used morphological operators are opening and closing [1]. The idea behind opening is to dilate an eroded image in order to recover as much as possible of the eroded image. In contrast, the idea behind closing is to erode a dilated image in order to recover the initial shape of image structures that have been dilated. Previously, a morphological profile approach based on a range of different SE sizes for both opening and closing has been used for classification of panchromatic remote sensing data from urban areas [2].

When the morphological profile approach is applied to hyperspectral data, a characteristic image needs to be extracted from the data. It was suggested in [3] to use the first principal component (PC) of the hyperspectral data for that purpose. Although that approach seems reasonable, because principal component analysis is optimal for data representation in the mean square sense, it should not be forgotten that with only one PC the hyperspectral data are reduced from potentially several hundred data channels to a single data channel. In addition, although the first PC may represent most of the variation in the image, some important information may be contained in the other PCs. Therefore, an extension to this approach was proposed in [5], where an extended morphological profile is built from several different PCs. Here we extend that approach by working with independent components instead of principal components.
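As a concrete illustration of this construction, the following Python sketch (using scikit-image) builds an opening/closing profile for one base image with a circular SE of increasing radius and stacks the profiles of several base images into an extended profile. The function names, array shapes, and the default number of openings/closings and SE step are illustrative assumptions, not the exact settings of the experiments in Section IV.

import numpy as np
from skimage.morphology import disk, opening, closing

def morphological_profile(base_image, n_steps=3, step=2):
    # Opening/closing profile of one base image, using disk-shaped SEs
    # whose radius grows by 'step' at every level.
    profile = [base_image]
    for k in range(1, n_steps + 1):
        se = disk(k * step)
        profile.append(opening(base_image, se))
        profile.append(closing(base_image, se))
    return np.stack(profile, axis=0)            # shape: (2*n_steps + 1, rows, cols)

def extended_morphological_profile(base_images, n_steps=3, step=2):
    # Concatenate the profiles of several base images (e.g. PCs or ICs).
    return np.concatenate(
        [morphological_profile(b, n_steps, step) for b in base_images], axis=0)

With three openings and three closings per base image, three base images give 3 x (2 x 3 + 1) = 21 features per pixel, which corresponds to the largest extended profile considered in Section IV.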
III. INDEPENDENT COMPONENTS

The concept of ICA was introduced in the early 1980s, although it was not named until later. ICA belongs to the class of blind signal separation (BSS) methods, which aim to separate data into underlying information components. ICA can also be used for feature extraction. As a demonstration, consider the sensed signals x_i(t), i = 1, 2, 3, which are all linear mixtures of the source signals s_i(t), i = 1, 2, 3, in different proportions:

x_1(t) = a_{11} s_1(t) + a_{12} s_2(t) + a_{13} s_3(t)
x_2(t) = a_{21} s_1(t) + a_{22} s_2(t) + a_{23} s_3(t)    (1)
x_3(t) = a_{31} s_1(t) + a_{32} s_2(t) + a_{33} s_3(t).

If the mixing parameters a_{ij} were known, one could easily solve the equation set by inverting the linear system. Without any information about the sources s_i(t) or the coefficients a_{ij}, the problem becomes more complex. A well-known example in blind signal separation is the cocktail-party problem. Fig. 1 demonstrates the mixing/unmixing process for three audio signals. Mixtures of the original signals in (a) are shown in (b). ICA is applied to recover the signals shown in (c) without prior knowledge of the source signals or the mixing parameters.

Fig. 1. Amplitudes of three audio signals: a) original sources, b) mixtures of the original sources, and c) unmixed signals.

In general, the set of equations in (1) can be written in vector form as

x = As,    (2)

where A is referred to as the mixing matrix, of size n x m. The row vectors of x are the n sensed signals, and the m row vectors of s are the assumed sources. The number of independent sources that ICA can recover is less than or equal to the number of sensed signals, i.e., m <= n. In practice, m < n should be expected. By definition, the sources in s are assumed to be statistically independent, which is a stronger requirement than being uncorrelated [7]. Based on this assumption, the multivariate probability density function of s can be expressed as

p(s) = \prod_{i=1}^{m} p(s_i),    (3)

where the p(s_i) are the probability density functions of the individual source signals. The unmixing process amounts to finding an m x n matrix W that transforms the recorded signals as

u = Wx.    (4)

The row vectors of u are the unmixed signals, namely the estimated sources. In [6], Bell and Sejnowski published an approach to blind signal deconvolution based on ICA that minimizes the mutual information

I(u_1, ..., u_m) = E[ log ( p(u) / \prod_{i=1}^{m} p(u_i) ) ],    (5)

where E is the expectation operator. At the minimum, the ratio becomes one and the logarithm zero. The unmixing matrix W is optimized with the natural gradient algorithm, an iterative procedure in which a learning rate controls the convergence speed. To speed up the process and to avoid the effects of mean and variance, the data are whitened.

Varshney and Arora discuss two ICA feature extraction algorithms in [8]. The first of these algorithms is applied here: independent components are extracted from the most important principal components, retaining an accumulated variance of 99%; the remaining PCs are not used.
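A minimal Python sketch of this feature-extraction step is given below. It uses scikit-learn and substitutes FastICA for the Bell-Sejnowski infomax/natural-gradient algorithm described above, simply because FastICA is readily available; the array name cube, its layout, and the helper function name are illustrative assumptions. The default of three extracted components matches the three ICs used in the experiments of Section IV.

import numpy as np
from sklearn.decomposition import PCA, FastICA

def ica_base_images(cube, n_ics=3, variance_kept=0.99):
    # cube: hyperspectral image of shape (rows, cols, bands).
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)

    # Keep only the leading PCs explaining 99% of the accumulated variance.
    pca = PCA(n_components=variance_kept)
    pcs = pca.fit_transform(X)

    # Unmix the retained PCs into statistically independent components.
    # (FastICA is used here as a stand-in for the infomax algorithm of [6].)
    ica = FastICA(n_components=n_ics, random_state=0)
    ics = ica.fit_transform(pcs)

    # Reshape each component back into an image so that it can serve as a
    # base image for the extended morphological profile of Section II.
    return ics.T.reshape(n_ics, rows, cols)

The resulting IC images play the role of the base images passed to extended_morphological_profile() in the sketch of Section II.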
IV. EXPERIMENTAL RESULTS

The data used in the experiments were collected in the framework of the HySens project, managed by the Deutsches Zentrum für Luft- und Raumfahrt (DLR, the German Aerospace Center) and sponsored by the European Union. The optical sensor ROSIS-03 (Reflective Optics System Imaging Spectrometer) was used to record four flight lines over the urban area of Pavia, northern Italy. The ROSIS-03 sensor covers the spectral range from 0.43 µm to 0.86 µm. The flight altitude was chosen as the lowest available for the airplane, which resulted in a spatial resolution of 1.3 m per pixel. The data contain 102 spectral channels (features). A three-channel color composite is shown in Fig. 2(a), with channel 80 displayed as red, channel 45 as green and channel 10 as blue. The test site covers the Pavia city center (hereafter referred to as Pavia center), with a dense residential area on one side of the river Ticino and open areas on the other side. The Pavia center image was originally 1096 by 1096 pixels. A 381-pixel-wide black strip in the left part of the image was removed before processing, resulting in a "two-part" image of 1096 by 715 pixels.

Nine classes of interest have been defined: water, trees, meadows, bricks, bare soil, asphalt, bitumen, tiles and shadow. The numbers of training and test samples for each class are listed in Table I, and the available reference data are shown in Fig. 2(b).

Fig. 2. Pavia center: a) three-channel color composite, and b) available reference data.

TABLE I
PAVIA CENTER. INFORMATION CLASSES AND SAMPLES.

No. | Class name | Train |   Test
 1  | Water      |   824 |  65147
 2  | Trees      |   820 |   6778
 3  | Meadows    |   824 |   2266
 4  | Bricks     |   808 |   1891
 5  | Bare soil  |   820 |   5764
 6  | Asphalt    |   816 |   8432
 7  | Bitumen    |   808 |   6479
 8  | Tiles      |  1260 |  41566
 9  | Shadow     |   476 |   2387
    | Total      |  7456 | 140710

Maximum likelihood (ML) classification was applied to the data under the assumption that the data are Gaussian. Classification was carried out on the data in the full feature space (102 data channels), on Decision Boundary Feature Extraction (DBFE) [4] data with 29 transformed channels (99% criterion), and on Nonparametric Weighted Feature Extraction (NWFE) [4] data with 12 transformed channels (99% criterion). The classification accuracies are given in Table II. From the table it can be seen that excellent overall accuracies are achieved by the ML classifier, both before and after feature extraction. However, the overall accuracy (OA) for the DBFE test data is the highest.

TABLE II
PAVIA CENTER. TRAINING AND TEST ACCURACIES (%) USING THE MAXIMUM LIKELIHOOD CLASSIFIER.

Data set |  Original data  |   DBFE (99%)   |   NWFE (99%)
Features |       102       |       29       |       12
Class    |  Train |  Test  |  Train |  Test |  Train |  Test
   1     |  100.0 |  92.0  |  100.0 |  91.5 |  100.0 |  90.8
   2     |   91.3 |  99.4  |   98.9 |  92.0 |   90.6 |  97.9
   3     |   96.6 |  99.9  |   97.7 |  99.5 |   96.6 |  98.4
   4     |   81.8 |  98.9  |   86.9 |  95.8 |   84.6 |  91.8
   5     |   95.2 |  99.9  |   95.6 |  98.4 |   95.2 |  91.4
   6     |   99.4 |  85.9  |   97.5 |  94.4 |   96.7 |  95.4
   7     |   95.6 |  99.1  |   93.3 |  96.4 |   92.3 |  83.8
   8     |   99.4 |  99.1  |   99.3 |  99.4 |   99.0 |  99.0
   9     |  100.0 |  79.6  |   92.3 |  99.8 |   99.8 |  94.7
  Ave    |   90.8 |  99.5  |   94.0 |  98.1 |   95.9 |  93.6
  OA     |   99.6 |  93.8  |   94.5 |  98.1 |   92.8 |  95.8

An extended morphological profile (MP) was then constructed. For classification of the MPs, a neural network with one hidden layer was used. The number of neurons in the hidden layer was set to the geometric mean of the number of inputs and outputs, i.e., the square root of the product of the number of input features and the number of information classes. The input features for the neural network classifier varied in the experiments. In all the neural network classifications, three independent components (ICs) were used; these three ICs are displayed in Fig. 3.

Fig. 3. Independent components for the Pavia Center data.

First, only the three ICs themselves were used as input features. Then, extended morphological profiles were built based on the ICs. A circular structuring element (SE) with a step size increment of 2 gave the best results and is used in the experiments reported here. The input features were varied by increasing the number of openings and closings applied to each IC. Finally, feature extraction methods were applied to reduce the dimensionality of the largest extended MPs, because of possible redundancies in the MPs. The considered feature extraction approaches were DAFE (Discriminant Analysis Feature Extraction) [4], DBFE and NWFE. MultiSpec [4] was used for feature extraction, while Matlab was used for the morphological preprocessing and the neural network classification.
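Beyond the hidden-layer sizing rule stated above, the exact network architecture and training procedure are not detailed here, so the following scikit-learn sketch should be read as an illustrative stand-in rather than the network actually used; the helper name and training parameters are assumptions.

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def train_profile_classifier(X_train, y_train, n_classes=9):
    # X_train: (n_samples, n_features) extended-profile feature vectors.
    # Hidden-layer size = geometric mean of input and output dimensions.
    n_hidden = int(round(np.sqrt(X_train.shape[1] * n_classes)))
    clf = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(n_hidden,), max_iter=2000,
                      random_state=0),
    )
    return clf.fit(X_train, y_train)

For the 21-feature extended profile and the nine information classes of Table I, this rule gives round(sqrt(21 * 9)) = 14 hidden neurons.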
The classification accuracies for the morphological profiles are shown in Table III. From the table it can be seen that using the three ICs without morphological processing gives unacceptable results, as was expected. In contrast, classification of extended morphological profiles based on the ICs gives excellent accuracies, especially when 3 openings and 3 closings are used. In that case, the overall test classification accuracies are more than 4% higher than those obtained with the ML classification of the data. The use of NWFE for feature extraction gave outstanding results when the 95% criterion was used: with 11 features, the classification of the NWFE-transformed extended profile gave test accuracies similar to those obtained with the original extended profile. Classified images from the experiment on the Pavia center data are shown in Fig. 4. The different characteristics of the results obtained from ML classification of the DBFE-transformed data and from neural network classification of the extended MPs can be seen in the figure.

Fig. 4. Pavia center classification results: a) ML with DBFE (99%), b) ICs 1, 2, 3 with 3 openings/closings followed by NWFE (95%).

TABLE III
PAVIA CENTER. TRAINING AND TEST ACCURACIES (%) FOR EXTENDED MORPHOLOGICAL PROFILES OF INDEPENDENT COMPONENTS.

ICs      |   1,2,3    |   1,2,3    |   1,2,3    |   1,2,3    |   1,2,3    |   1,2,3    |   1,2,3
#op/cl   |     0      |     1      |     2      |     3      |     3      |     3      |     3
FE       |     -      |     -      |     -      |     -      | DAFE(100%) | DBFE(99%)  | NWFE(95%)
Features |     3      |     9      |    15      |    21      |     8      |    12      |    11
Class    | Train Test | Train Test | Train Test | Train Test | Train Test | Train Test | Train Test
   1     |   0.0   0.0| 100.0  99.6| 100.0  95.6| 100.0  99.5|   0.0   0.0| 100.0  99.6|  99.6 100.0
   2     |   0.0   0.0|  91.9  87.4|   0.0   0.0|  98.8  91.7|  79.5  94.4|  66.0  96.5| 100.0  84.8
   3     |   0.0   0.0|  95.5  87.9| 100.0  92.1|  93.9  85.3|  72.6  61.4|  93.4  78.8| 100.0  85.5
   4     |   0.0   0.0|  98.3  92.3|  99.9  97.3|  99.8  99.2|  98.9  91.1|  57.4  83.8|  98.9 100.0
   5     |   0.0   0.0|  94.6  87.5|  99.9  97.7|  99.6  98.4|  94.2  95.4|  89.7  95.4|  99.9  97.2
   6     |   0.0   0.0|   0.0   0.0| 100.0  99.3|  99.4  98.6|  97.2  97.4|  90.8  95.4| 100.0  99.2
   7     |   0.0   0.0|  98.4  96.9|  99.9  98.1|  99.3  98.1|  80.6  83.1|  85.8  77.4|  98.5 100.0
   8     | 100.0 100.0| 100.0  99.9| 100.0  99.6| 100.0  99.8|  99.7  98.4|  95.2  94.1|  99.2 100.0
   9     |   0.0   0.0| 100.0  93.1| 100.0  96.3| 100.0  99.2|  93.1  72.0|  40.8  64.7| 100.0  88.7
  Ave    |  11.1  11.1|  86.5  82.7|  88.8  86.6|  99.0  96.6|  81.3  75.2|  86.5  84.0| 100.0  94.6
  OA     |  16.9  29.6|  86.7  92.1|  89.0  94.0|  99.0  98.8|  81.8  49.8|  86.2  93.7| 100.0  98.2

V. CONCLUSIONS

Classification of ROSIS hyperspectral data from urban areas in Pavia, Italy, has been discussed. A new morphological preprocessing method was proposed for classification of the data. The morphological method is based on using several independent components (ICs) from the hyperspectral data, building a morphological profile for each of the ICs, and using them all together in one extended morphological profile. The extended morphological profiles based on three independent components were then classified with a neural network, with and without feature extraction. In experiments on one data set, the proposed approach was applied with several different feature extraction methods. The proposed approach gave excellent results and outperformed the statistical maximum likelihood classifier by more than 4% in terms of overall test accuracies.

ACKNOWLEDGEMENT

The authors would like to thank Prof. Paolo Gamba and Prof. Fabio Dell'Acqua of the University of Pavia, Italy, for providing the reference data. This research was supported in part by the Research Fund of the University of Iceland and the Icelandic Research Fund.

REFERENCES

[1] P. Soille, Morphological Image Analysis: Principles and Applications, 2nd ed., Springer-Verlag, Berlin, 2003.
[2] M. Pesaresi and J. A. Benediktsson, "A New Approach for the Morphological Segmentation of High-Resolution Satellite Imagery," IEEE Transactions on Geoscience and Remote Sensing, vol. 39, no. 2, pp. 309-320, 2001.
[3] F. Dell'Acqua, P. Gamba, A. Ferrari, J. A. Palmason, J. A. Benediktsson and K. Arnason, "Exploiting Spectral and Spatial Information in Hyperspectral Urban Data with High Resolution," IEEE Geoscience and Remote Sensing Letters, vol. 1, pp. 322-326, 2004.
[4] D. A. Landgrebe, Signal Theory Methods in Multispectral Remote Sensing, John Wiley and Sons, Hoboken, New Jersey, 2003.
[5] J. A. Benediktsson, J. A. Palmason and J. R. Sveinsson, "Classification of Hyperspectral Data from Urban Areas Based on Extended Morphological Profiles," IEEE Transactions on Geoscience and Remote Sensing, vol. 43, no. 3, pp. 480-491, 2005.
[6] A. J. Bell and T. J. Sejnowski, "Blind Separation and Blind Deconvolution: An Information-Theoretic Approach," Proc. IEEE ICASSP-95, pp. 3415-3418, 1995.
[7] A. Hyvärinen, J. Karhunen and E. Oja, Independent Component Analysis, John Wiley and Sons, New York, 2001.
[8] P. K. Varshney and M. K. Arora, Advanced Image Processing Techniques for Remotely Sensed Hyperspectral Data, Springer-Verlag, Berlin, 2003.