Journal of Magnetic Resonance 173 (2005) 218–228
www.elsevier.com/locate/jmr
The use of multivariate MR imaging intensities versus metabolic
data from MR spectroscopic imaging for brain tumour classification
A. Devos a,*, A.W. Simonetti b, M. van der Graaf c, L. Lukas a, J.A.K. Suykens a,
L. Vanhamme a, L.M.C. Buydens b, A. Heerschap c, S. Van Huffel a
a K.U. Leuven, ESAT-SCD (SISTA), Leuven, Belgium
b Laboratory for Analytical Chemistry, University of Nijmegen, Nijmegen, The Netherlands
c Department of Radiology, University Medical Center Nijmegen, Nijmegen, The Netherlands
Received 21 September 2004; revised 20 December 2004
Available online 22 January 2005
Abstract
This study investigated the value of information from both magnetic resonance imaging and magnetic resonance spectroscopic
imaging (MRSI) to automated discrimination of brain tumours. The influence of imaging intensities and metabolic data was tested
by comparing the use of MR spectra from MRSI, MR imaging intensities, peak integration values obtained from the MR spectra
and a combination of the latter two. Three classification techniques were objectively compared: linear discriminant analysis, least
squares support vector machines (LS-SVM) with a linear kernel as linear techniques and LS-SVM with radial basis function kernel
as a nonlinear technique. Classifiers were evaluated over 100 stratified random splittings of the dataset into training and test sets.
The area under the receiver operating characteristic (ROC) curve (AUC) was used as a global performance measure on test data. In
general, all techniques obtained a high performance when using peak integration values with or without MR imaging intensities. For
example for low- versus high-grade tumours, low- versus high-grade gliomas and gliomas versus meningiomas, the mean test AUC
was higher than 0.91, 0.94, and 0.99, respectively, when both MR imaging intensities and peak integration values were used. The use
of metabolic data from MRSI significantly improved automated classification of brain tumour types compared to the use of MR
imaging intensities solely.
© 2004 Elsevier Inc. All rights reserved.
Keywords: Brain tumours; Classification; Magnetic resonance imaging; Magnetic resonance spectroscopic imaging; Linear discriminant analysis;
Least squares support vector machines
1. Introduction
Magnetic resonance imaging (MRI) is an important
noninvasive tool for identifying the location and size of
brain tumours, because it yields morphological and anatomical information about the brain tissue. However,
conventional MRI has limited specificity in determining the underlying type and grade of brain tumour [1,2]. More recently developed MR
techniques, such as diffusion-weighted MRI, perfusion-weighted MRI, and magnetic resonance spectroscopic
imaging (MRSI), are promising for the
characterization of brain tumours [3,4]. Diffusion-weighted MRI visualizes the tissue structure and is useful
for assessing tumour cellularity, while perfusion-weighted
MRI provides measurements that reflect changes in tumour vasculature and tumour grading. MRSI or multivoxel magnetic resonance spectroscopy (MRS) provides
chemical information about metabolites present in normal and abnormal tissue [5–8]. Therefore, the differentiation of abnormal brain tissues, including brain tumours,
from normal brain forms a potentially major clinical
application of these new techniques. In general, diagnosis
* Corresponding author. Fax: +32 16 321970.
E-mail address: adevos@esat.kuleuven.ac.be (A. Devos).
1090-7807/$ - see front matter © 2004 Elsevier Inc. All rights reserved.
doi:10.1016/j.jmr.2004.12.007
A. Devos et al. / Journal of Magnetic Resonance 173 (2005) 218–228
of brain tumours is based on the microscopic examination
of tissue obtained by a biopsy, which includes risks associated with anesthesia and surgery. It would be very beneficial to the patient if the invasive biopsy could be guided
or even avoided by the use of noninvasive techniques like
diffusion-weighted MRI, perfusion-weighted MRI, and
MRS(I). In this study, we combined the use of conventional MRI intensities and one of the new techniques,
more specifically MRSI.
Several studies [9–18] have shown progress in automated pattern recognition for brain tumour classification
using MRI or MRS(I). However, currently only a few studies (e.g., [14,15]) have used a combination of MRI and
MRSI features for classification of brain tumours. To enhance the diagnostic capabilities in clinical practice, we
investigated whether the combined use of MR imaging
intensities and metabolic data from MRSI could improve
the discrimination between several brain tumour and normal brain tissue types. Although the radiologist also uses
spatial and morphological information present in the MR
images, these features were not taken into account in this
study, as they are difficult to quantify. By comparing the
results obtained, we evaluated the strength of both MR
imaging intensities and metabolic data from MRSI in discriminating brain tissue types.
We considered linear as well as nonlinear classification techniques applied to several input features, such
as short echo time magnitude spectra, imaging intensities, peak integration values obtained from the spectra
and a combination of the latter two. The algorithms
were designed to extract the most important features
which were then used to classify each spectrum into
the corresponding tumour type. As classification is
required to be objective and user-friendly, all techniques
were automated. The purpose of this paper was twofold:
(1) To investigate the discriminatory value of MRI intensities and metabolic data extracted from MRSI for
automated brain tumour diagnosis. This analysis also
provides the typical AUC values achievable for several relevant diagnostic problems of brain tumours.
(2) To apply and compare several classification techniques, including the investigation of the influence
of the input features used.
2. Materials
2.1. Data
Data from 25 patients with a brain tumour and 4 volunteers were selected from the database developed in the
framework of the EU funded INTERPRET project
(IST-1999-10310) [19]. All data were provided by the
acquisition center UMCN (University Medical Center
Nijmegen), Nijmegen (The Netherlands). Each case
was clinically validated. The patients' tumour type was
determined by a central consensus histopathological validation. For one of the 25 patients no consensus was
reached. Therefore, the data from the tumour region
of this patient were not used.
The dataset contained MR images as well as MR spectra, acquired and preprocessed as described in [14]. For
each subject, stacked MR images of cross-sections of
the whole brain at four contrasts were acquired: T1- and
T2-weighted images, a proton density weighted image
and a gadolinium enhanced (Gd-DTPA) T1-weighted image (256 × 256, FOV = 200 mm, slice thickness = 5 mm).
The image values will further be labeled as T1, T2, PD, and
GD. No Gd-DTPA administration was applied to the
healthy volunteers. Besides MR images, also 1H MRSI
data were acquired for each subject, both with and without water suppression using a 16 × 16 2D STEAM 1H
MRSI sequence with acquisition parameters TR = 2000
or 2500 ms, TE = 20 ms, slice thickness 12.5 or 15 mm,
FOV = 200 mm, SW = 1000 Hz, 1024 data points. The
position of the MRSI slice was chosen according to the
slice position of the GD image which showed the largest
GD enhancement.
To ensure that image pixels from subsequent images
originate from the same spatial location, the images
were co-aligned [14]. All MRSI data were semi-automatically preprocessed (cf. [14]), which involved:
(1) Filtering of the k-space data by a Hanning filter of
50% using the LUISE software package (Siemens,
Erlangen, Germany).
(2) Zero filling to 32 × 32, which involved an increase of
the apparent spatial resolution with a factor of 2.
(3) Spatial 2D Fourier transformation to obtain time
domain signals for each voxel.
(4) Correction for eddy current effects in the MR spectra
using a method which prevents the occasional occurrence of eddy current correction induced artefacts
[20]. This process resulted in a frequency alignment
and zero order phasing of the MR spectra.
(5) Removal of the dominating residual water using
HLSVD [21], with 12 singular values and 4.0–6.0 ppm
as residual water region.
(6) Frequency alignment was performed semi-automatically. First, the position of the NAA peak (N-acetylaspartate, 2CH3-group) in the mean spectrum of an
MRSI dataset was set to 2.02 ppm. The obtained shift
was used to reset each spectrum of the dataset in the
time domain automatically.
(7) First order phase correction was also manually performed on the mean spectrum of a dataset. The
obtained first order time instant was used to automatically correct each spectrum in the dataset.
(8) Fourier transformation was applied to the time
domain data to obtain frequency spectra.
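The spatial steps listed above (k-space filtering, zero filling, and the spatial 2D Fourier transformation) and the time-domain frequency alignment can be illustrated with a minimal NumPy sketch. It is not the LUISE implementation used in the study. The array layout (ky, kx, t), the function names, and the full-width separable Hanning window are assumptions made for illustration; the exact definition of the "50%" Hanning filter is not given here, and the sign convention of the frequency shift depends on the acquisition system.

```python
import numpy as np

def spatial_reconstruction(kspace, grid=32):
    """kspace: complex array (ny, nx, nt) with the k-space centre in the middle.

    Returns one time-domain signal per voxel on a grid x grid matrix.
    """
    ny, nx, nt = kspace.shape
    # Assumed apodization: separable Hanning window over both spatial axes.
    window = np.outer(np.hanning(ny), np.hanning(nx))
    filtered = kspace * window[:, :, None]
    # Zero filling: embed the acquired matrix in the centre of a larger grid.
    padded = np.zeros((grid, grid, nt), dtype=complex)
    y0, x0 = (grid - ny) // 2, (grid - nx) // 2
    padded[y0:y0 + ny, x0:x0 + nx, :] = filtered
    # Spatial 2D Fourier transform (centre-origin convention).
    shifted = np.fft.ifftshift(padded, axes=(0, 1))
    return np.fft.fftshift(np.fft.fft2(shifted, axes=(0, 1)), axes=(0, 1))

def shift_frequency(signal, shift_hz, sw_hz=1000.0):
    """Shift a time-domain signal by shift_hz via a complex exponential."""
    t = np.arange(signal.shape[-1]) / sw_hz  # sampling times in seconds
    return signal * np.exp(2j * np.pi * shift_hz * t)
```

In this sketch a shift estimated once on the mean spectrum (so that NAA lies at 2.02 ppm) would be passed as `shift_hz` and applied to every voxel signal of the dataset.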
Table 1
Number of data for each type of brain tissue (brain tumour or healthy tissue)

Label   Pathology                Number of data   Number of subjects
1       Normal from volunteers   142              4
2       Normal from patients     76               4
3       CSF                      100              8
4       Gliomas, grade II        176              10
5       Gliomas, grade III       57               4
6       Gliomas, grade IV        70               7
7       Meningiomas              48               3
Total                            669              29

The first and second columns give the label of the classes (from 1 till 7) and the pathology. The third and fourth columns display the number of data
and subjects for each class. Note that the total number of subjects was not simply the summation of the number of subjects per class, because for
several patients data were available from brain tumour as well as from healthy tissue.
The set of validated data consisted of 10 grade II, 4
grade III, and 7 grade IV gliomas and 3 meningiomas,
which gave us 4 classes of brain tumours. From each patient, data were selected from several voxels, identified
as lying in the tumour area. Besides brain tumour data,
also data from cerebro spinal fluid (CSF) from patients
and normal brain tissue from volunteers and patients
were selected. Data from all subjects with the same
pathology were combined into one class, which resulted
in the following 7 classes of pathologies (Table 1):
(1) normal tissue from volunteers: normal tissue from
healthy persons,
(2) normal tissue from patients: apparently normal
tissue from the contralateral half of the brain of
patients,
(3) cerebro spinal fluid (CSF): CSF from patients,
where the ventricles were clearly visible and the
voxels were located as far from the tumour as possible. Unfortunately, no CSF voxels could be
selected from volunteers, since the MRSI slices
of volunteers did not include the ventricles.
(4) grade II gliomas: diffuse astrocytomas (90 voxels
from 5 patients), oligodendrogliomas (22 voxels
from 2 patients), and mixtures (64 voxels from 3
patients),
(5) grade III gliomas: anaplastic astrocytomas (4 voxels from 1 patient), oligodendrogliomas (25 voxels
from 2 patients), and mixtures (28 voxels from 1
patient),
(6) grade IV gliomas: glioblastomas,
(7) meningiomas.
3. Methods
3.1. Input features
In this paper, an input pattern was either an MR
spectrum, a set of quantified values from the MR spectrum,
a set of imaging intensities or a combination of the
latter two input types. This enabled us to investigate
whether the combination of imaging and spectroscopic
information can improve the performance for pattern
recognition of brain tumours. The following input features were considered:
Water normalized magnitude spectra (see Figs. 1 and
2). The amplitude of the water unsuppressed signal
was estimated as described in [9]. Then each spectral
value in the preprocessed water suppressed spectrum
was divided by the resulting estimate of the intensity
of the water peak. Only the spectral values in the
region of interest (0.5–4.0 ppm) were used as input
features.
Metabolite amplitudes obtained by peak integration.
Short echo time 1H MRSI signals are characterized
by the presence of a partially unknown broad baseline
underlying the resonances of the metabolites of interest, that hinders the assessment of the intensity (e.g.,
by peak integration) of low weight molecules. To
remove this broad baseline a simple baseline correction [9,14,22] was applied as additional preprocessing
step prior to the first order phase correction. This was
performed as described in [14]. Amplitude estimates
were then obtained from the baseline corrected frequency spectra using peak integration within a spectral range of 0.13 ppm. These selected frequency
regions correspond to resonances from metabolites
and lipids that are assumed to be characteristic to distinguish between tumour types [6,7,23–26]. As short
echo time 1H MR in vivo spectra are characterized
by substantial peak overlap and a relatively low spectral resolution at the clinical field strength of 1.5 T, a
particular region might cover resonances of more
than one metabolite. Such regions are [27]: L2 (lipids
at 0.9 ppm; 0.835–0.965 ppm), L1 (lipids at 1.2 ppm) + Lac
(lactate, 3CH3-group) + Ala (alanine, 1CH3-group)
(1.265–1.395 ppm), NAA (2CH3-group; 1.955–2.085 ppm),
Glx (glutamate/glutamine, 3CH2-group; 2.135–2.265 ppm),
Cr (creatine, N(CH3)-group; 2.955–3.095 ppm),
Cho (choline, N(CH3)3-group; 3.135–3.265 ppm),
Tau (taurine, 1CH2-group; 3.375–3.505 ppm),
mI (myo-Inositol, 1CH-, 3CH-, 4CH- and 6CH-group) + Gly (glycine, 2CH2-group)
(3.495–3.625 ppm), Glx + Ala (2CH-groups; 3.685–3.815 ppm),
and Cr (2CH2-group; 3.885–4.015 ppm).
The resulting peak integration values were then water
normalized as described above.
Fig. 1. Mean water normalized magnitude 1H MR spectra (TE = 20 ms) of the considered classes: class 1 (top-left), class 2 (top-right), class 3
(bottom-left), and class 4 (bottom-right) correspond to the normal tissue of volunteers, normal tissue of patients, CSF, and gliomas of grade II. The
solid lines are the means, while the dotted lines are the means plus the standard deviations of each class.
Imaging intensities (see Figs. 3 and 4). For each of the
images (T1, T2, PD, and GD), an image value was
extracted, producing four additional variables. To
obtain the same spatial resolution of both MRI and
MRSI data, the resolution of the MRI data was lowered to that of the MRSI grid by averaging the image
pixels within each spectroscopic voxel. Each intensity
value was divided by the highest intensity in the corresponding downsampled image and scaled to the
same range as the spectral data. No GD information
was available for the volunteers. Instead the T1-image
was used, under the assumption that no GD enhancement would occur in the brain tissue of volunteers.
However, this is an approximation as Gd-DTPA typically causes an increase in intensity in the blood
vessels.
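As an illustration of the feature extraction described in this list, the sketch below computes water normalized peak integration values over the ten ppm windows quoted above and averages image pixels within each spectroscopic voxel. It is a simplified sketch, not the code used in the study. The function names, the region labels, the rectangular-sum integration rule, and the assumption of a uniform ppm axis and of an image side that is a multiple of the MRSI grid size are all illustrative choices. The final scaling of the image values to the range of the spectral data is omitted.

```python
import numpy as np

# Integration windows in ppm as quoted in the text; the labels are illustrative.
REGIONS = {
    "L2": (0.835, 0.965), "L1+Lac+Ala": (1.265, 1.395),
    "NAA": (1.955, 2.085), "Glx": (2.135, 2.265),
    "Cr": (2.955, 3.095), "Cho": (3.135, 3.265),
    "Tau": (3.375, 3.505), "mI+Gly": (3.495, 3.625),
    "Glx+Ala": (3.685, 3.815), "Cr(CH2)": (3.885, 4.015),
}

def peak_integrals(ppm, spectrum, water_amplitude, regions=REGIONS):
    """Integrate a baseline-corrected spectrum per window and water-normalize."""
    ppm = np.asarray(ppm, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    step = abs(ppm[1] - ppm[0])  # assumes a uniformly sampled ppm axis
    features = {}
    for name, (lo, hi) in regions.items():
        mask = (ppm >= lo) & (ppm <= hi)
        features[name] = spectrum[mask].sum() * step / water_amplitude
    return features

def average_to_mrsi_grid(image, grid=32):
    """Average the pixels of a square image inside each spectroscopic voxel."""
    n = image.shape[0] // grid  # pixels per voxel side, e.g. 256 // 32 = 8
    blocks = image.reshape(grid, n, grid, n)
    low = blocks.mean(axis=(1, 3))
    # Divide by the highest intensity of the downsampled image.
    return low / low.max()
```

Concatenating the four downsampled image values (T1, T2, PD, GD) of a voxel with its ten peak integration values gives the 14-dimensional combined input pattern.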
Selection of the data used was based on the visual
inspection of the low and high resolution images and
MR spectra. To obtain data for a specific histopathological class, the following procedure was used for each
patient within this class. The four MR images were plotted together with a segmented image in which voxels
were clustered using a model-based clustering algorithm
[28] as described in [29]. The clustering provides an
objective segmentation based on similarities obtained
from the MR images as well as MR spectroscopic features, and thus is considered to be helpful in the selection of voxels. For each pathology, only voxels were
included in the dataset if their corresponding MR spectra were found to be typical for that pathology by an expert in MR spectroscopy. Since tumours are known to
be heterogeneous, this approach was considered better
than for example taking all spectra from one segment.
Neighbouring voxels were not selected to avoid too
much mutual correlation and as such to ensure that
samples were as independent as possible. Although the
method of class selection is subjective, we think it is
appropriate since in tumour diagnosis a ‘‘ground truth’’
is not available and the number of patients for each specific class is low.
Fig. 2. Mean water normalized magnitude 1H MR spectra
(TE = 20 ms) of the considered classes: class 5 (top), class 6 (middle),
and class 7 (bottom) correspond to gliomas of grade III, grade IV, and
meningiomas. The solid lines are the means, while the dotted lines are
the means plus the standard deviations of each class.
3.2. Experimental approach
Binary classification was performed by linear discriminant analysis (LDA) [30,31], least squares support vector machines (LS-SVM) [32,33] with linear or radial
basis function (RBF) kernel, which were also applied
in two previous extensive studies for classification based
on long [12] and short echo time 1H MRS [9].
To perform meaningful data-analysis in a high
dimensional space, a sufficiently large amount of training data is required. This limitation can be overcome,
e.g., by dimensionality reduction, which decreases the
amount of complexity and risk of overfitting and also
simplifies the calculation. In fact, peak integration
(Section 3.1) is a feature extraction method that reduces the input dimension and therefore uses prior
knowledge about the most discriminatory features in
the spectrum. As the input dimension is already small
when using imaging intensities (dimension 4), peak
integration values (dimension 10) or the combination
of both (dimension 14), dimensionality reduction is
only required when using magnitude spectra (dimension 231).
Especially LDA suffers from this dimensionality
problem, while LS-SVM is able to work without any
prior dimensionality reduction, thanks to the primal-dual aspects of the model, even in case of a relatively
low number of input data. Therefore, the use of principal component analysis (PCA) as a feature extraction technique was only needed prior to LDA. The
231 given spectral variables were reduced by PCA that
retained those components that account for a larger
variance than the average over all individual components [34]. This strategy was different to the one taken
in [9,12], in which only the largest components were
selected that explain 75% of the total variance. In
[9,12] it was unfeasible to retain more components
due to rank deficiency problems related to a too small
number of training samples. Nevertheless, if feasible, it
is more appropriate to retain a certain number of
components, as performed by the strategy taken in
this study.
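The retention rule described above (keep the principal components whose variance exceeds the average component variance) can be sketched as follows. This is a minimal illustration rather than the authors' code. The function name is hypothetical, and the average is taken here over all input variables, which is one reading of "the average over all individual components".

```python
import numpy as np

def pca_above_average(X):
    """Keep the PCs of X (samples x variables) with above-average variance.

    Returns the projected data, the retained loading vectors, and the
    fraction of the total variance explained by the retained components.
    """
    Xc = X - X.mean(axis=0)
    # The singular values of the centred data give the component variances.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2 / (X.shape[0] - 1)
    threshold = var.sum() / X.shape[1]  # average variance per input variable
    keep = var > threshold
    explained = var[keep].sum() / var.sum()
    return Xc @ Vt[keep].T, Vt[keep], explained
```

In a train/test setting the mean and loadings would be estimated on the training set only and then applied to the test set. The values between brackets in the magnitude spectra column of Tables 2–6 correspond to the number of retained components and the explained variance.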
In certain problems, nonlinear techniques can improve
the classification performance [33]. Therefore, in addition
to the use of linear kernels in LS-SVM classifiers, we also
applied LS-SVM classifiers with RBF kernels. All input
patterns were classified using KULeuven's LS-SVMlab
MATLAB/C toolbox [33,35,36] for LS-SVM classification with both linear and RBF kernels. Linear as well as
nonlinear classifiers were applied automatically, including feature selection, (hyper-) parameter estimation,
training and testing.
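For readers unfamiliar with LS-SVM classifiers, the following sketch shows the standard formulation, in which training reduces to solving one linear system in the bias b and the support values alpha. It is not the LS-SVMlab toolbox API. The function names are hypothetical, the RBF parameterization exp(-||x - z||^2 / sigma2) is one common convention, and the hyperparameters gamma and sigma2 are simply passed in rather than tuned.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma2)

def lssvm_train(X, y, gamma, sigma2):
    """Solve the LS-SVM classifier system for labels y in {-1, +1}.

    [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    with Omega_kl = y_k * y_l * K(x_k, x_l).
    """
    n = len(y)
    omega = np.outer(y, y) * rbf_kernel(X, X, sigma2)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]

def lssvm_decision(X_train, y, b, alpha, X_new, sigma2):
    """Latent output; its sign is the predicted class."""
    return rbf_kernel(X_new, X_train, sigma2) @ (alpha * y) + b
```

The continuous latent output, rather than its sign, is what would be thresholded at varying levels to trace an ROC curve. Replacing `rbf_kernel` by the inner product `A @ B.T` gives the linear-kernel variant.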
LS-SVM classifiers require the tuning of a set of
hyperparameters to achieve a high level of performance.
This tuning was performed in the same way as described
in [12]. The experiments consisted of the following steps,
similar as in [9,12]:
(1) divide the dataset in a training set (2/3 of the data)
and a test set (remainder),
(2) train the classifiers using the training set,
(3) evaluate the performance using the test set.
Fig. 3. Boxplots of the imaging intensities of the considered classes: class 1 (top-left), class 2 (top-right), class 3 (bottom-left), and class 4 (bottom-right) correspond to the normal tissue of volunteers, normal tissue of patients, CSF, and gliomas of grade II.
Stratified random sampling was used while dividing
the dataset to preserve the proportion of the classes.
The aforementioned procedure was repeated 100 times
to avoid bias possibly introduced by selection of a specific training and test set. In this way we tried to obtain
a representative test performance. The test performance
was measured by the mean AUC and its pooled standard error calculated from 100 randomizations. As discussed by Obuchowski [37], the area under the receiver
operating characteristic (ROC) curve is a good summary measure of the test accuracy. The results were
tested for significant differences by the z test [38], applied as in [12].
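The evaluation protocol described above (repeated stratified 2/3 versus 1/3 splits, with the test AUC averaged over the runs) can be sketched in plain Python as follows. This is an illustration under assumptions, not the code used in the study. The function names and the `fit` interface are hypothetical, labels are assumed to be coded 0/1, the AUC is computed through its rank (Mann-Whitney) formulation, and the pooled standard error reported in the tables is not computed.

```python
import random
from statistics import mean

def stratified_split(labels, train_fraction=2 / 3, rng=random):
    """Return (train_idx, test_idx) that preserve the class proportions."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, lab in enumerate(labels) if lab == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train += idx[:cut]
        test += idx[cut:]
    return train, test

def auc(scores, labels):
    """Area under the ROC curve: fraction of (positive, negative) pairs
    ranked correctly, with ties counted as one half."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def repeated_evaluation(features, labels, fit, n_runs=100, seed=0):
    """fit(train_x, train_y) must return a scoring function x -> float."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_runs):
        tr, te = stratified_split(labels, rng=rng)
        score = fit([features[i] for i in tr], [labels[i] for i in tr])
        test_scores = [score(features[i]) for i in te]
        aucs.append(auc(test_scores, [labels[i] for i in te]))
    return mean(aucs)
```

For the low- versus high-grade glioma problem (176 versus 127 data), each run of this sketch would use 202 data for training and 101 for testing.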
4. Results
We evaluated the following binary classifications of
brain tissue types:
Healthy versus tumour tissue (classes 1, 2 and 3; 318
data versus 4, 5, 6, and 7; 351 data; Table 2). The
resulting mean AUC was for all techniques and input
patterns higher than 0.95. The performance based on
imaging intensities alone was significantly lower with
respect to using peak integration values (e.g., when
using LDA, p < 0.01) or the combination of imaging
intensities and peak integration values (e.g., when
using LDA, p < 0.001). Also using magnitude spectra
was significantly worse than using the combination of
imaging intensities and peak integration values for
LDA and LS-SVM with a linear kernel (p < 0.05).
In addition, using solely imaging intensities, LDA
and LS-SVM with a linear kernel reached a significantly lower performance than LS-SVM with an
RBF kernel (both p < 0.05).
Low- versus high-grade tumours (classes 4 and 7; 224
data versus 5 and 6; 127 data; Table 3). Classification
based on imaging intensities alone was poor, while
discrimination based on peak integration values was
significantly better, compared to classification using
imaging intensities alone (with p < 10⁻⁸, p < 10⁻⁸, and
p < 0.001, respectively, in case of LDA, LS-SVM lin,
and LS-SVM RBF). Also the performances obtained
when using magnitude spectra and the combination
of imaging intensities and peak integration values
were significantly higher than those obtained using
imaging intensities alone. LS-SVM RBF achieved a
significantly higher performance with respect to
LDA and LS-SVM lin, when using imaging intensities (p < 0.001), peak integration (p < 0.05) and the
combination of imaging intensities and peak integration values (p < 0.001). Based on magnitude spectra,
LS-SVM RBF was significantly better than LDA
using 9 PCs (p < 0.05). Based on imaging intensities,
peak integration, and their combination, LS-SVM
RBF also achieved a significantly higher performance
than LDA and LS-SVM with a linear kernel.
Fig. 4. Boxplots of the imaging intensities of the considered classes:
class 5 (top), class 6 (middle), and class 7 (bottom) correspond to
gliomas of grade III, grade IV, and meningiomas.
Low- versus high-grade gliomas (class 4; 176 data versus 5 and 6; 127 data; Table 4). The classification problem resembles that of low- versus high-grade tumours,
but reached in general a slightly higher performance.
This might be partially due to the lower heterogeneity
in the two compared classes, while in low- versus high-grade tumours also nongliomas (namely meningiomas) were included. Once more, peak integration
and the combination of imaging intensities and peak
integration provided a significantly better performance than using imaging intensities alone. For
example, when comparing the use of peak integration
with imaging intensities we obtained p < 0.001 (LDA,
LS-SVM lin) and p < 0.01 (LS-SVM RBF) and the
significance was even stronger for the combination
of imaging intensities and peak integration values.
For this problem, classification with LS-SVMs using
magnitude spectra also reached a very high mean
AUC, for which p < 0.001 (LDA), p < 0.00001 (LS-SVM lin), and p < 0.01 (LS-SVM RBF) with respect to
imaging intensities. From Figs. 3 and 4 we remark that
the imaging intensities for class 4 indeed highly resemble those of classes 5 and 6, which explains the
low performance based on the imaging intensities.
LS-SVMs achieved a significantly better result than
LDA when using magnitude spectra, while LS-SVM
RBF provided a significantly higher AUC than LDA
and LS-SVM lin when using imaging intensities.
Gliomas versus meningiomas (classes 4, 5, and 6; 303
data versus class 7; 48 data; Table 5). Note that the
same data were used as in low- versus high-grade
tumours, but now gliomas were differentiated from
meningiomas. At least a mean AUC of 0.99 was
reached when based on peak integration or the combination of imaging intensities and peak integration
(AUC > 0.85 when using imaging intensities alone).
For LDA and LS-SVM lin, using peak integration,
combined with imaging intensities or not, gave a significantly better result than using imaging intensities
(p < 0.001 for all cases). However, no significant differences were found for LS-SVM RBF with respect
to the input type used.
Grade II versus grade III gliomas (class 4; 176 data
versus class 5; 57 data; Table 6). With respect to
the significant influence of the input features, we
observed the same as for low- versus high-grade gliomas, except when using LS-SVM RBF. In case of
LS-SVM RBF, only the combination of imaging
intensities and peak integration values gave a significantly higher mean AUC than imaging intensities
alone (p < 0.05). Linear classification techniques gave
a significantly lower performance than LS-SVM RBF
when using imaging intensities (p < 0.05).

Table 2
Average test performance for classification of healthy versus tumour tissue from 100 runs of stratified random splittings

Classifier    MRI               Peak integration   MRI/peak integration   Magnitude spectra
LDA           0.9569 ± 0.0128   0.9912 ± 0.0042    0.9991 ± 0.0009        0.9828 ± 0.0066 (10; 89.7%)
LS-SVM lin    0.9570 ± 0.0128   0.9921 ± 0.0039    0.9991 ± 0.0008        0.9755 ± 0.0097
LS-SVM RBF    0.9858 ± 0.0065   0.9974 ± 0.0036    0.9998 ± 0.0003        0.9851 ± 0.0076

MRI, imaging intensities; peak integration, peak integration values; MRI/peak integration, imaging intensities and peak integration values; and
magnitude spectra, water normalized magnitude spectra. Performance measure for each classifier are the mean AUC and its pooled standard error.
When using magnitude spectra, PCA was applied prior to LDA. The number of principal components that was given as input to LDA and the
amount of variance explained by those components is mentioned between brackets (row LDA, column magnitude spectra).

Table 3
Classification of low- versus high-grade tumours

Classifier    MRI               Peak integration   MRI/peak integration   Magnitude spectra
LDA           0.6239 ± 0.0543   0.9210 ± 0.0243    0.9195 ± 0.0255        0.9193 ± 0.0265 (9; 92.8%)
LS-SVM lin    0.6239 ± 0.0543   0.9328 ± 0.0220    0.9260 ± 0.0240        0.9573 ± 0.0189
LS-SVM RBF    0.8469 ± 0.0385   0.9827 ± 0.0124    0.9918 ± 0.0072        0.9797 ± 0.0161

For further explanation we refer to Table 2.

Table 4
Classification of low- versus high-grade gliomas

Classifier    MRI               Peak integration   MRI/peak integration   Magnitude spectra
LDA           0.7429 ± 0.0517   0.9452 ± 0.0223    0.9563 ± 0.0203        0.9339 ± 0.0240 (9; 93.4%)
LS-SVM lin    0.7431 ± 0.0517   0.9453 ± 0.0222    0.9589 ± 0.0193        0.9809 ± 0.0114
LS-SVM RBF    0.8774 ± 0.0354   0.9774 ± 0.0158    0.9920 ± 0.0084        0.9896 ± 0.0081

For further explanation we refer to Table 2.

Table 5
Classification of gliomas versus meningiomas

Classifier    MRI               Peak integration   MRI/peak integration   Magnitude spectra
LDA           0.8593 ± 0.0406   0.9945 ± 0.0056    0.9961 ± 0.0040        0.9890 ± 0.0083 (9; 92.8%)
LS-SVM lin    0.8590 ± 0.0407   0.9947 ± 0.0055    0.9964 ± 0.0039        0.9889 ± 0.0092
LS-SVM RBF    0.9520 ± 0.0303   0.9985 ± 0.0025    0.9989 ± 0.0019        0.9912 ± 0.0092

For further explanation we refer to Table 2.

Table 6
Classification of grade II versus grade III gliomas

Classifier    MRI               Peak integration   MRI/peak integration   Magnitude spectra
LDA           0.7799 ± 0.0689   0.9278 ± 0.0310    0.9480 ± 0.0313        0.9339 ± 0.0274 (9; 93.1%)
LS-SVM lin    0.7799 ± 0.0688   0.9290 ± 0.0306    0.9488 ± 0.0308        0.9669 ± 0.0196
LS-SVM RBF    0.9018 ± 0.0410   0.9706 ± 0.0217    0.9907 ± 0.0098        0.9861 ± 0.0141

For further explanation we refer to Table 2.
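To give a feeling for how two mean AUC values with their standard errors can be compared, the sketch below implements a simple unpaired z statistic. It is an assumption-laden simplification: it treats the two AUC estimates as independent, whereas the test of [38] as applied in [12] may account for the fact that the classifiers were evaluated on the same data. The p-values it produces are therefore not expected to reproduce the ones reported above. The function name is hypothetical.

```python
import math

def z_test_auc(auc_a, se_a, auc_b, se_b):
    """Unpaired z test for the difference between two AUC estimates.

    Assumes the two estimates are independent (a simplification).
    Returns the z statistic and the two-sided p-value.
    """
    z = (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Two-sided tail probability of the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided
```

A call such as `z_test_auc(0.9210, 0.0243, 0.6239, 0.0543)` would compare the LDA results for peak integration values and imaging intensities in the low- versus high-grade tumour problem under this simplified independence assumption.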
5. Discussion
5.1. Classification techniques
Tables 2–6 show that all classification techniques performed very well, especially based on the peak integration values and the combination with imaging
intensities. In general, LDA is shown to be competitive
with linear LS-SVM classifiers, even when based on
the first principal components of the magnitude spectra.
Using all PCs that explain more variance than the average, LDA based on magnitude spectra reached a similar
performance as based on peak integration or the combination of imaging intensities and peak integration
values.
Nevertheless, in several cases a significant difference
was found, e.g., for the discrimination of low- and
high-grade gliomas, based on magnitude spectra both
kernel-based techniques reached a significantly higher
performance than PCA/LDA. However, not only the
classical linear LDA technique, but also the kernel-based linear LS-SVM did not always reach the performance of the nonlinear LS-SVM. This occurred for
the problem low- versus high-grade tumours based on
peak integration and the combination of imaging intensities and peak integration.
For several problems, the unbalanced situation (e.g.,
gliomas versus meningiomas) or the relatively small
number of data available forms a limitation for training
(e.g., grade II versus grade III gliomas). Therefore, the
discrimination boundary might strongly correlate with
the training set. Especially LDA requires a significant
amount of data to be able to draw a linear separating
line between overlapping classes. Kernel-based techniques are less sensitive to the amount of data and the
input dimension and are able to detect automatically
important characteristics independently of the input
pattern. Hence, these techniques are able to obtain a
high performance even without any prior dimensionality
reduction, although dimensionality reduction may further improve the results.
5.2. Imaging intensities versus metabolic data from MRSI
Although MRI is an established technique for the
characterization of brain tumours, it has a few limitations that produce uncertainty for an accurate assessment of the presence and extent of the tumour. In
practice, the contrast-enhancing lesion on an MR image
is often much smaller than the region of abnormal
metabolism [39,40]. From our results we observed that,
for some problems, the use of imaging intensities alone
reached a significantly lower performance, e.g., for the
discrimination between low- and high-grade tumours.
This is in agreement with the conclusion of [39–41],
that the MR imaging intensities are unable to fully explain the metabolic heterogeneity of brain tumours.
Scaling of the MR data is necessary to correct for effects independent of the tissue characteristics. As a result
of the scaling procedure, MR imaging intensities from
different subjects or acquired under different conditions,
should be more comparable. Although the applied scaling method might not be fully appropriate, the procedure yielded good classification results for healthy
versus tumour tissue and gliomas versus meningiomas.
Scaling with respect to healthy tissue might be an alternative method that possibly could improve results,
but—in contrast to the applied method—would involve
processing that is difficult to automate.
The obtained test performances based on metabolic
data are in agreement with our previous studies on short
and long echo time spectra reported in [9,12]. The results
were also similar to those of other MRS(I)
studies [10,13–15,17], although these authors used other
performance measures than the test AUC. This confirms
the statement that spectroscopy is able to add valuable
information about the metabolic status of brain tumours, which is in agreement with a few clinical combined MRSI/MRI studies [39–41]. As such multivoxel
MRS could be very helpful for the diagnosis of brain tumours in combination with conventional MRI.
However, automated discrimination between different tumour types is still difficult, partially because
only intensity values are used and no anatomical
(e.g., the location and homogeneity of the tumour) or
clinical information (e.g., complaints of the patient) is
included in the input features. To make an accurate
diagnosis, a neuroradiologist, in contrast to most classification studies, exploits such features,
as they can be very specific for certain types of tumour. For example, meningiomas arise from
the meninges and are, strictly speaking, not tumours of the brain tissue itself, while gliomas are intracerebral tumours that start
in glial cells. An MRI classification study [18] also found several other diagnostic factors,
such as age, oedema, blood
supply, calcification and haemorrhage, to be important for predicting the degree of malignancy of brain gliomas. Hence, adding
important anatomical and clinical information as input
features is expected to further improve automated diagnosis of brain tumours.
5.3. Several classification problems
If a classification technique is developed for diagnostic purposes, then the technique should be able to distinguish healthy tissue from tumour tissue. Several
metabolic differences between normal cells and various
tumour types are reflected in MR spectra: the NAA
and Cr levels are lower in tumour spectra, while the
Cho level is higher. The applied techniques were able
to extract and exploit this differing pattern available in
imaging intensities and MR spectra. Healthy brain and
tumour tissue could be distinguished almost perfectly
from each other.
Classification of low- versus high-grade gliomas
reached a high mean AUC (>0.93) using any input (except
imaging intensities or magnitude spectra with a low number of PCs given to LDA). A similar observation could be
made for gliomas versus meningiomas (AUC > 0.98),
low- versus high-grade tumours (AUC > 0.91) and grade
II versus grade III gliomas (AUC > 0.92).
Using the imaging intensities alone for discriminating
gliomas of grade II and grade III, a linear classifier
performed poorly. This might partially be due to the
unbalanced class distribution of the data (176 data points in class
4 and 57 in class 5). A significantly higher performance
was obtained by using peak integration or combining
the imaging intensities and peak integration. Also for
the discrimination between gliomas and meningiomas,
the use of peak integration and the combination of peak
integration and imaging intensities gave a significant
improvement with respect to using imaging intensities
alone. This was not the case for LS-SVM with
an RBF kernel, since its performance with imaging intensities alone was already high.
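The role of class imbalance in such a comparison can be illustrated with a minimal sketch. It is not the evaluation code of this study: the function names, the test fraction of one third and the random seed are assumptions made for illustration, and only the class sizes (176 and 57) are taken from the text. The sketch draws one stratified random split and computes the AUC as the Mann-Whitney statistic, i.e., the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
import random


def auc(scores_pos, scores_neg):
    """AUC as the Mann-Whitney statistic; tied pairs count as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def stratified_split(labels, test_fraction, rng):
    """Return (train_idx, test_idx) with class proportions kept in both parts."""
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(test_fraction * len(idx))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


# Hypothetical labels: only the class sizes (176 vs 57) come from the text.
labels = [0] * 176 + [1] * 57
rng = random.Random(0)  # assumed seed, for reproducibility of the sketch
train_idx, test_idx = stratified_split(labels, test_fraction=1 / 3, rng=rng)
n_test_pos = sum(labels[i] for i in test_idx)

# A classifier that gives every case the same score has a test AUC of 0.5,
# even though always predicting the majority class would look accurate.
constant_auc = auc([0.0] * n_test_pos, [0.0] * (len(test_idx) - n_test_pos))
majority_accuracy = (len(test_idx) - n_test_pos) / len(test_idx)
```

In this sketch, a classifier that ignores its input still reaches a test accuracy of about 0.76 by always predicting the larger class, whereas its AUC stays at 0.5. This is one reason why a rank-based measure such as the AUC is a reasonable choice for unbalanced problems; repeating the split many times and averaging the test AUC would mimic the repeated random splitting described in the abstract.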
6. Conclusions
This study compared several techniques for the classification of brain tumours using
MR imaging intensities and metabolic data from MRSI.
The nonlinear technique, LS-SVM with an RBF kernel,
performed significantly better than the linear techniques
on several specific problems. Although the linear
classifiers also performed well, this
indicates that, for these data, a few diagnostic problems
exhibit nonlinear behaviour.
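For reference, the difference between the linear and the nonlinear technique can be written compactly. The notation below follows the usual LS-SVM formulation [32,33] and is an assumption as far as the symbols are concerned: $x_k$ and $y_k \in \{-1,+1\}$ denote the training inputs and class labels, $\alpha_k$ the support values, $b$ the bias term and $\sigma$ the RBF kernel width.

```latex
\[
  y(x) = \operatorname{sign}\!\left( \sum_{k=1}^{N} \alpha_k \, y_k \, K(x, x_k) + b \right),
\]
\[
  K_{\mathrm{lin}}(x, x_k) = x_k^{T} x,
  \qquad
  K_{\mathrm{RBF}}(x, x_k) = \exp\!\left( -\frac{\lVert x - x_k \rVert_2^{2}}{\sigma^{2}} \right).
\]
```

With the linear kernel the decision boundary is a hyperplane in the input space, whereas the RBF kernel allows a curved boundary, which is what a problem with nonlinear behaviour requires.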
From current studies it is clear that 1H MRSI is an
important adjunct to the clinical imaging techniques
for noninvasive diagnosis of viable tumours. For most
problems, binary classification based on imaging intensities and metabolic information from MRSI is quite
feasible. The combined use of MR imaging intensities
and metabolic information significantly increased the
performance with respect to imaging intensities alone.
With respect to metabolic data alone, the combination of imaging intensities and metabolic data also reached a
higher performance, although not significantly so. The results of this study strengthen the statement that imaging
intensities and metabolic data provide complementary
information for the accurate discrimination between
several brain tissue types. Therefore, we advocate the
integration of MRSI into the standard clinical examination performed for the diagnosis of brain tumours. However, to enhance the quality of automated
diagnosis, it would be beneficial for classification datasets to also include other relevant anatomical and
clinical parameters. This would enable pattern recognition researchers to develop and test classifiers that use
all relevant information.
Acknowledgments
This research work was carried out at the ESAT laboratory and the Interdisciplinary Center of Neural Networks ICNN of the Katholieke Universiteit Leuven, in
the framework of the Belgian Programme on Interuniversity Poles of Attraction, initiated by the Belgian Federal Science Policy Office (IUAP Phase IV-02 and IUAP
Phase V-22), the EU funded projects BIOPATTERN
(EU network of excellence; Contract No. 508803) and
eTUMOUR (FP6 integrated project; Contract No.
503094), the Concerted Action Project MEFISTO and
AMBIORICS of the Flemish Community, the FWO
projects G.0407.02 and G.0269.02 and the IDO/99/03
and IDO/02/009 projects. AD's research was funded by a
Ph.D. grant of the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT-Vlaanderen). LVH was a postdoctoral researcher with
the National Fund for Scientific Research FWO - Flanders. Use of the data provided by the EU funded
INTERPRET project (IST-1999-10310; http://carbon.uab.es/INTERPRET/) is gratefully acknowledged.
References
[1] B.L. Dean, B.P. Drayer, C.R. Bird, R.A. Flom, J.A. Hodak,
S.W. Coons, R.G. Carey, Gliomas: classification with MR
imaging, Radiology 174 (1990) 411–415.
[2] F.I.V. Earnest, P.J. Kelly, B.W. Scheithauer, B.A. Kall, T.L.
Cascino, R.L. Ehman, G.S. Forbes, P.L. Axley, Cerebral astrocytomas: histopathologic correlation of MR and CT contrast
enhancement with stereotactic biopsy, Radiology 166 (1988) 823–
827.
[3] S.J. Nelson, S. Cha, Imaging glioblastoma multiforme, Cancer J.
9 (2) (2003) 134–145.
[4] J. Rees, Advances in magnetic resonance imaging of brain
tumours, Curr. Opin. Neurol. 16 (2003) 643–650.
[5] S.J. Nelson, Multivoxel magnetic resonance spectroscopy of brain
tumors, Mol. Cancer Ther. 2 (2003) 497–507.
[6] X. Leclerc, T.A.G.M. Huisman, A.G. Sorensen, The potential of
proton magnetic resonance spectroscopy (1H) in the diagnosis and
management of patients with brain tumors, Curr. Opin. Oncol. 14
(2002) 292–298.
[7] S.K. Mukherji (Ed.), Clinical Applications of Magnetic Resonance Spectroscopy, Wiley-Liss, 1998.
[8] S.J. Nelson, E. Graves, A. Pirzkall, X. Li, A.A. Chan, D.B.
Vigneron, T.R. McKnight, In vivo molecular imaging for planning radiation therapy of gliomas: an application of 1H MRSI, J.
Magn. Reson. Imaging 16 (2002) 464–476.
[9] A. Devos, L. Lukas, J.A.K. Suykens, L. Vanhamme, A.R. Tate,
F.A. Howe, C. Majós, A. Moreno-Torres, M. van der Graaf, C.
Arús, S. Van Huffel, Classification of brain tumours using short
echo time 1H MR spectra, J. Magn. Reson. 170 (1) (2004) 164–
175.
[10] S. Herminghaus, T. Dierks, U. Pilatus, W. Möller-Hartmann, J.
Wittsack, G. Marquardt, C. Labish, H. Lanfermann, W. Schlote,
F.E. Zanella, Determination of histopathological tumor grade in
neuroepithelial brain tumors by using spectral pattern analysis of
in vivo spectroscopic data, J. Neurosurg. 98 (2003) 74–81.
[11] Y. Huang, P.J.G. Lisboa, W. El-Deredy, Tumour grading from
magnetic resonance spectroscopy: a comparison of feature
extraction with variable selection, Statist. Med. 22 (2003) 147–164.
[12] L. Lukas, A. Devos, J.A.K. Suykens, L. Vanhamme, F.A. Howe,
C. Majós, A. Moreno-Torres, M. van der Graaf, A.R. Tate, C.
Arús, S. Van Huffel, Brain tumour classification based on long
echo proton MRS signals, Artif. Intell. Med. 31 (1) (2004) 73–89.
[13] M.C. Preul, Z. Caramanos, D.L. Collins, J.-G. Villemure, R.
Leblanc, A. Olivier, R. Pokrupa, D.L. Arnold, Accurate, noninvasive diagnosis of human brain tumors by using magnetic
resonance spectroscopy, Nat. Med. 2 (3) (1996) 323–325.
[14] A.W. Simonetti, W.J. Melssen, M. van der Graaf, G.J. Postma,
A. Heerschap, L.M.C. Buydens, A new chemometric approach
for brain tumor classification using magnetic resonance imaging
and spectroscopy, Anal. Chem. 75 (20) (2003) 5352–5361.
[15] F. Szabo De Edelenyi, C. Rubin, F. Estève, S. Grand, M.
Décorps, V. Lefournier, J.-F. Le Bas, C. Rémy, A new approach
for analyzing proton magnetic resonance spectroscopic images of
brain tumors: nosologic images, Nat. Med. 6 (2000) 1287–1289.
[16] A.R. Tate, J.R. Griffiths, I. Martı́nez-Pérez, A. Moreno, I. Barba,
M.E. Cabañas, D. Watson, J. Alonso, F. Bartumeus, F. Isamat,
I. Ferrer, F. Vila, E. Ferrer, A. Capdevila, C. Arús, Towards a
method for automated classification of 1H MRS spectra from
brain tumours, NMR Biomed. 11 (1998) 177–191.
[17] A.R. Tate, C. Majós, A. Moreno, F.A. Howe, J.R. Griffiths, C.
Arús, Automated classification of short echo time in in vivo 1H
brain tumor spectra: a multicenter study, Magn. Reson. Med. 49
(2003) 29–36.
[18] C.-Z. Ye, J. Yang, D.-Y. Geng, Y. Zhou, N.-Y. Chen, Fuzzy
rules to predict degree of malignancy in brain glioma, Med. Biol.
Eng. Comput. 40 (2002) 145–152.
[19] International network for Pattern Recognition of Tumours Using
Magnetic Resonance. Available from: <http://carbon.uab.es/INTERPRET/>.
[20] A.W. Simonetti, W.J. Melssen, M. van der Graaf, A. Heerschap,
L.M.C. Buydens, Automated correction of unwanted phase jumps
in reference signals which corrupt MRSI spectra after eddy
current correction, J. Magn. Reson. 159 (2002) 151–157.
[21] W.W.F. Pijnappel, A. van den Boogaart, R. de Beer, D. van
Ormondt, SVD-based quantification of magnetic resonance
signals, J. Magn. Reson. 97 (1992) 122–134.
[22] I.D. Campbell, C.M. Dobson, R.J.P. Williams, A.V. Xavier,
Resolution enhancement of protein PMR spectra using the
difference between a broadened and a normal spectrum, J. Magn.
Reson. 11 (1973) 172–181.
[23] I.C.P. Smith, L.C. Stewart, Magnetic resonance spectroscopy in
medicine: clinical impact, Prog. Nucl. Magn. Reson. Spectrosc.
40 (2002) 1–34.
[24] F.A. Howe, S.J. Barton, S.A. Cudlip, M. Stubbs, D.E.
Saunders, M. Murphy, P. Wilkins, K.S. Opstad, V.L. Doyle,
M.A. McLean, B.A. Bell, J.R. Griffiths, Metabolic profiles of
human brain tumors using quantitative in vivo 1H magnetic
resonance spectroscopy, Magn. Reson. Med. 49 (2003) 223–
232.
[25] C. Majós, J. Alonso, C. Aguilera, M. Serrallonga, J.J. Acebes, C.
Arús, J. Gili, Adult primitive neuroectodermal tumor: proton MR
spectroscopic findings with possible application for differential
diagnosis, Radiology 225 (2002) 556–566.
[26] M. Murphy, A. Loosemore, A.G. Clifton, F.A. Howe, A.R. Tate,
S.A. Cudlip, P.R. Wilkins, J.R. Griffiths, B.A. Bell, The contribution of proton magnetic resonance spectroscopy (1H MRS) to
clinical brain tumour diagnosis, Br. J. Neurosurg. 16 (4) (2002)
329–334.
[27] V. Govindaraju, K. Young, A.A. Maudsley, Proton NMR
chemical shifts and coupling constants for brain metabolites,
NMR Biomed. 13 (2000) 129–153.
[28] C. Fraley, A.E. Raftery, MCLUST: software for model-based
cluster analysis, J. Classif. 16 (2) (1999) 297–306.
[29] R. Wehrens, A.W. Simonetti, L.M.C. Buydens, Mixture modelling of medical magnetic resonance data, J. Chemometr. 16 (6)
(2002) 274–282.
[30] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, second
ed., Wiley, New York, 2001.
[31] B.D. Ripley, Pattern Recognition and Neural Networks, Cambridge University Press, Cambridge, 1996.
[32] J.A.K. Suykens, J. Vandewalle, Least squares support vector
machine classifiers, Neur. Proc. Lett. 9 (3) (1999) 293–300.
[33] J.A.K. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, J.
Vandewalle, Least Squares Support Vector Machines, World
Scientific Publishing, Singapore, 2002.
[34] A.C. Rencher, Methods of multivariate analysis, Wiley series in
Probability and Mathematical Statistics (1995).
[35] LS-SVMlab MATLAB/C toolbox. Available from: <http://
www.esat.kuleuven.ac.be/sista/lssvmlab>.
[36] K. Pelckmans, J.A.K. Suykens, T. Van Gestel, J. De Brabanter, L.
Lukas, B. Hamers, B. De Moor, J. Vandewalle, LS-SVMlab
Toolbox User's Guide, Internal Report 02-145, ESAT-SISTA,
K.U.Leuven, Leuven, Belgium, 2002.
[37] N.A. Obuchowski, Receiver operating characteristic curves and
their use in radiology, Radiology 229 (2003) 3–8.
[38] J.A. Hanley, B.J. McNeil, A method of comparing the areas
under receiver operating characteristic curves derived from the
same cases, Radiology 148 (1983) 839–843.
[39] D.J. Manton, M. Lowry, C. Rowland-Hill, D. Crooks, B.
Mathew, L.W. Turnbull, Combined proton MR spectroscopy
and dynamic contrast enhanced MR imaging of human intracranial tumours in vivo, NMR Biomed. 13 (2000) 449–459.
[40] S.J. Nelson, D.B. Vigneron, W.P. Dillon, Serial evaluation of
patients with brain tumors using volume MRI and 3D 1H MRSI,
NMR Biomed. 12 (1999) 123–138.
[41] D. Vigneron, A. Bollen, M. McDermott, L. Wald, M. Day, S.
Moyher-Moworolski, R. Henry, S. Chang, M. Berger, W. Dillon,
S. Nelson, Three-dimensional magnetic resonance spectroscopic
imaging of histologically confirmed brain tumors, Mag. Reson.
Imaging 19 (2001) 89–101.