2015 International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, pp. 1-8.
https://doi.org/10.1109/IJCNN.2015.7280593

EEG Signal Analysis for BCI Application using Fuzzy System

Thanh Nguyen, Saeid Nahavandi, Abbas Khosravi, Douglas Creighton, and Imali Hettiarachchi
Centre for Intelligent Systems Research (CISR), Deakin University
Geelong Waurn Ponds Campus, Victoria, 3216, Australia
Email: thanh.nguyen@deakin.edu.au
Tel: +613 52278281. Fax: +613 52271046

Abstract: An approach to EEG signal classification for brain-computer interface (BCI) application using the fuzzy standard additive model is introduced in this paper. The Wilcoxon test is employed to rank wavelet coefficients. Top-ranking wavelet coefficients are used to form a feature set that serves as inputs to the fuzzy classifiers. Experiments are carried out using two benchmark datasets, Ia and Ib, downloaded from the BCI competition II. Prevalent classifiers including feedforward neural network, support vector machine, k-nearest neighbours, ensemble learning Adaboost and adaptive neuro-fuzzy inference system are also implemented for comparison. Experimental results show the dominance of the proposed method over competing approaches.

Keywords: Wavelet transform; fuzzy standard additive model; Wilcoxon test; EEG signal classification; motor imagery data.

I. INTRODUCTION

EEG signal analysis to understand brain electrical activity is an important problem in a BCI system. Constructing a usable and reliable BCI therefore requires accurate and effective classification of multichannel EEG signals. Various techniques have been introduced for EEG signal classification in the literature, from low-cost methods such as LDA [1-4], logistic regression [5-7] and k-nearest neighbour [8-10], to computationally expensive techniques such as support vector machine [11-14], artificial neural networks [15-17], and Adaboost ensemble learning [1, 18]. These methods, however, share a common drawback in handling the nonlinear, noisy and outlier-embedded nature of EEG signal data.

Consequently, fuzzy logic (FL), which is well known as a powerful tool for uncertainty modelling, is used for modelling EEG signals. Five adaptive neuro-fuzzy inference system (ANFIS) classifiers based on inputs derived by wavelet transform (WT) were designed in Güler and Übeyli [19] to classify five classes of EEG signals. The ANFIS was built using the backpropagation gradient descent method integrated with the least squares method. Type-2 FL systems for EEG signal classification based on features extracted by power spectral density estimation with a sliding window strategy were investigated in Herman et al. [20-21]. Type-2 FL has demonstrated greater uncertainty handling capability and flexibility than type-1 FL and thus shows promise for addressing nonstationary and highly variable EEG data.

The use of fuzzy SVM (FSVM) for differentiating EEG-based left and right motor imagery with wavelet features obtained in two sub-bands, the beta and mu rhythms, was proposed in Xu et al. [22]. The FSVM classifier was shown to be an effective method for identifying different mental tasks from EEG signals. In another approach, Yang et al. [23] exploited an ANFIS classifier to distinguish electrical status epilepticus during sleep (ESES) and normal EEG signals. Permutation entropy and sample entropy of the EEG signals are fed into the ANFIS models. ANFIS was highlighted as a potential tool to classify the background EEG from ESES patients and normal control subjects.

EEG signal analysis generally requires a feature extraction step to obtain useful information from data. Time series autoregressive (AR) models, Fourier transform (FT), time-frequency analysis and wavelet transform (WT) are broadly employed for exploring prominent discriminant features [24]. AR, FT and conventional time-frequency techniques commonly assume that the EEG signal is stationary. However, this assumption is often violated in practice.
Consequently, for non-stationary transient signals like EEG, WT is recommended rather than AR and FT. WT provides combined time-frequency information that can enhance the performance of EEG classification [25].

This paper proposes a technique using WT and the Wilcoxon test for EEG feature extraction. As conventional methods such as maximum variance (MV) and the Kolmogorov-Smirnov (KS) test [26] have shortcomings, the use of Wilcoxon statistics for selecting wavelet coefficients has the potential to improve EEG signal classification performance.

The paper is structured as follows. The next section presents the fuzzy standard additive model with tabu search learning (tabu-FSAM) as a classifier. Section III is devoted to the main methodology, which includes feature extraction by WT with the Wilcoxon test. Experimental results are presented in Section IV. Section V conveys conclusions and future research directions.

II. FUZZY SYSTEM WITH TABU SEARCH LEARNING

A. Fuzzy Standard Additive Model (FSAM)

The FSAM system $F: \mathbb{R}^n \to \mathbb{R}$ consists of $m$ if-then fuzzy rules, which have been proved to be able to uniformly approximate continuous and bounded measurable functions in a compact domain [27-30]. Any type of if-part fuzzy sets, denoted by $A_j \subset \mathbb{R}^n$, can be employed based on this approximation theorem. The theorem also facilitates the choice of any then-part fuzzy sets, denoted by $B_j \subset \mathbb{R}$, because the FSAM system utilizes only the centroid $c_j$ and volume $V_j$ of $B_j$ to calculate the output $F(x)$, where $x \in \mathbb{R}^n$ is the input vector:

$$F(x) = \frac{\sum_{j=1}^{m} a_j(x) V_j c_j}{\sum_{j=1}^{m} a_j(x) V_j}$$

Fuzzy rules in the word form "If $X = A_j$ then $Y = B_j$" constitute fuzzy rule patches of the form $A_j \times B_j \subset \mathbb{R}^n \times \mathbb{R}$ to cover the graph of an approximand $f$. The if-part fuzzy set $A_j \subset \mathbb{R}^n$ is characterized by a joint set function $a_j: \mathbb{R}^n \to [0, 1]$ that factors across input components as:

$$a_j(x) = a_j^1(x_1) \cdots a_j^n(x_n)$$
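As an illustration only, the FSAM output can be sketched in a few lines of Python: each rule fires to a degree given by the product of per-input set functions, and the output is the firing-and-volume-weighted average of the then-part centroids. The Gaussian shape of the if-part sets, the `rules` dictionary layout and the function names are assumptions made for this sketch, not the authors' implementation.

```python
import math


def gaussian(x, centre, width):
    """Scalar Gaussian set function with values in (0, 1] (assumed shape)."""
    return math.exp(-((x - centre) / width) ** 2)


def fsam_output(x, rules):
    """Return F(x) as a weighted average of then-part centroids.

    Each rule is a dict with hypothetical keys:
      'centres', 'widths' : per-input parameters of the if-part sets
      'volume'            : then-part volume V_j
      'centroid'          : then-part centroid c_j
    """
    numerator = 0.0
    denominator = 0.0
    for rule in rules:
        # Joint set function a_j(x) factors across the input components.
        firing = 1.0
        for xi, ci, wi in zip(x, rule["centres"], rule["widths"]):
            firing *= gaussian(xi, ci, wi)
        numerator += firing * rule["volume"] * rule["centroid"]
        denominator += firing * rule["volume"]
    if denominator == 0.0:
        raise ValueError("no rule fires for this input")
    return numerator / denominator


# Two toy rules whose centroids encode class "1" and class "2".
rules = [
    {"centres": [0.0, 0.0], "widths": [1.0, 1.0], "volume": 1.0, "centroid": 1.0},
    {"centres": [2.0, 2.0], "widths": [1.0, 1.0], "volume": 1.0, "centroid": 2.0},
]
print(fsam_output([1.0, 1.0], rules))  # input equidistant from both rules
```

An input that lies midway between the two rule centres fires both rules equally, so the output falls halfway between the two centroids; inputs closer to one rule pull the output towards that rule's centroid.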
Then-part fuzzy set $B_j \subset \mathbb{R}$ is described by a membership function $b_j: \mathbb{R} \to [0, 1]$ having a volume (or area) $V_j$ and centroid $c_j$ [29]. The FSAM output $F(x)$ can be represented as a convex sum of centroids of then-part fuzzy sets:

$$F(x) = \sum_{j=1}^{m} p_j(x) c_j$$

where $p_j(x)$ are the convex weights:

$$p_j(x) = \frac{a_j(x) V_j}{\sum_{i=1}^{m} a_i(x) V_i}$$

Fig. 1. (a) A structure of FSAM system [30]. Each input sample is fed into each fuzzy rule that fires to some membership degree to calculate output F(x). (b) Fuzzy rules define fuzzy patches to cover the approximand in the input-output space.

Fig. 1 illustrates the parallel configuration of FSAM, which suffers from an exponential explosion in the number of fuzzy rules needed to cover the function's graph. The number of fuzzy rules an FSAM system normally needs to approximate the function $f: \mathbb{R}^n \to \mathbb{R}$ in a compact domain grows exponentially with the input dimension $n$ [30].

Learning is a vital process of FSAM to construct a knowledge-based system that comprises if-then fuzzy rules. The FSAM learning process conventionally includes two basic steps: unsupervised learning for constructing if-then fuzzy rules and supervised learning for tuning rule parameters [31, 32]. We propose the use of a meta-heuristic learning process, i.e. tabu search, to find the optimal if-then fuzzy rule structure for classification. Details of the learning process are presented in the following subsection.

B. Tabu Search for Fuzzy System Learning

Tabu search is a meta-heuristic algorithm that was introduced by Glover [33]. It has emerged as a competent technique for solving difficult optimization problems. A main feature of tabu search is the employment of special strategies to exploit adaptive memory. Memory-based strategies allow tabu search to penetrate complexities that often confound other solution approaches. They also enable the implementation of search procedures that are able to explore the solution space of objective functions economically and effectively [34, 35].
Similar to ordinary local or neighbourhood search, tabu search begins from an initial point (solution) and proceeds repeatedly from one point to another until a pre-set termination criterion is met. The basic principle is to avoid entrapment in cycles by restricting moves that lead back to previously visited points in the solution space.

We apply tabu search to optimize the rule structure of the FSAM system. Fuzzy rules are coded by binary digits, where 1 means the rule is selected and 0 means the rule is omitted. Tabu search is therefore applied to a problem whose solution is a binary vector. The objective of the optimization is to minimize the error between the FSAM estimated outputs and the real values. Fig. 2 presents the tabu search pseudo code used in our methodology.

III. TABU FUZZY SYSTEM WITH WAVELETS FEATURES FOR EEG SIGNAL CLASSIFICATION

Fuzzy systems in general, and FSAM in particular, often encounter a huge computational challenge if there are many inputs. High-dimensional data would degrade the performance of FSAM. Therefore, a dimension reduction or feature extraction tool can be applied before FSAM is executed. This is particularly important as EEG data are often noisy and high-dimensional.

The approach to EEG signal analysis introduced in this paper is exhibited in Fig. 3. The raw EEG signal data are divided into different channels before processing. For each channel, the WT is executed separately to obtain the information contained in the EEG signals. The wavelet coefficients are then ranked based on the Wilcoxon test statistic to select the most discriminative coefficients of each channel. In the next step, these selected coefficients are combined to produce a feature set that serves as inputs to the tabu-FSAM classifier. The following presents in detail WT and the employment of the Wilcoxon test for wavelet coefficient selection.
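The bit-flip search over binary rule-selection vectors described in Section II.B can be sketched in Python as follows. This is a minimal reading of the tabu search pseudo code of Fig. 2, not the authors' code: the function name `tabu_search`, the `rng` argument and the toy Hamming-distance objective are hypothetical, and in the paper's setting the objective would instead be the error between FSAM outputs and the true class values.

```python
import random


def tabu_search(objective, dim, max_switches, rng=None):
    """Minimise `objective` over binary vectors of length `dim` (dim >= 1)."""
    rng = rng or random.Random()
    point = [rng.randint(0, 1) for _ in range(dim)]  # random starting point
    best = objective(point)
    tabu = set()      # indices that have already been flipped
    switches = 0      # moves made since the last improvement
    while True:       # do-while loop, as in the pseudo code
        # Choose a random index that is not yet in the tabu list.
        index = rng.choice([i for i in range(dim) if i not in tabu])
        tabu.add(index)
        # Neighbouring point: flip the bit at the chosen index.
        new_point = point[:]
        new_point[index] = 1 - new_point[index]
        new_value = objective(new_point)
        if new_value < best:
            point, best = new_point, new_value
            switches = 0
        if len(tabu) == dim:  # tabu list is full
            break
        switches += 1
        if switches >= max_switches:
            break
    return point, best


# Toy objective (an assumption for this sketch): Hamming distance to a target.
target = [1, 0, 1, 1, 0, 0, 1, 0]


def hamming(p):
    return sum(a != b for a, b in zip(p, target))


solution, value = tabu_search(hamming, len(target), max_switches=5,
                              rng=random.Random(0))
print(solution, value)
```

Because the tabu list is never cleared in this reading, each index is flipped at most once, so the search ends after at most `dim` moves. Passing a seeded `random.Random` makes runs repeatable.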
Inputs:
- Objective function: the objective function that takes the binary vector as input
- Dim: the length of the binary solution vector
- Pre-set maximum number of switches: the number of times the algorithm searches for a better point in its direct neighbourhood before the algorithm terminates

Create a random starting Point that is a binary vector
Calculate its objective using the Objective function
Initiate the tabu list
Set the number of switches equal to zero
Do
    Choose a random index that is less than Dim
    Choose a different index if it is already present in the tabu list
    Add the index to the tabu list
    Create New Point by changing the value of the current Point at the index from 0 to 1 or vice versa
    Calculate New Point's objective
    If Objective of New Point is less than that of the current Point
        Replace current Point by New Point: Point = New Point
        Reset the number of switches to zero
    End If
    If tabu list is full (its length equal to Dim)
        Break
    End If
    Increase number of switches by 1
While number of switches is less than its pre-set maximum
Return Point that is the binary solution

Fig. 2. Tabu search pseudo code

Fig. 3. Tabu FSAM classifier with wavelets features

A. WT for Feature Extraction

WT represents a signal in a time-frequency fashion [36]. WT eliminates the requirement of signal stationarity that often applies to conventional methods. Once the wavelet (the mother wavelet) $\psi(x)$ is fixed, translations and dilations of the mother wavelet can be formed: $\psi\left(\frac{x - b}{a}\right)$, $(a, b) \in \mathbb{R}^{+} \times \mathbb{R}$. It is convenient to take special values for $a$ and $b$ such as $a = 2^{-j}$ and $b = 2^{-j}k$, where $j$ and $k$ are integers. One of the simplest wavelets is the Haar wavelet. Haar functions can uniformly approximate any continuous function. Dilations and translations of the function $\psi$, which is $\psi_{jk}(x) = \text{const} \cdot \psi(2^{j}x - k)$, characterize an orthogonal basis in $L^2(\mathbb{R})$.
Therefore any element in $L^2(\mathbb{R})$ may be defined as a linear combination of these basis functions [37].

WT has been employed for a number of problems such as those of medical data analysis, e.g. see [38-41]. We employ the four-level Haar wavelets for feature extraction applied to processing EEG signal data. Haar wavelets are employed because of their compact support and orthogonality, by which the discriminant features of data samples can be characterized by a few representative wavelet coefficients [26].

Once the transformation is completed, a procedure to select significant wavelet coefficients that best discriminate the different classes is performed. A conventional approach to this procedure is the maximum variance (MV) criterion. MV selects the coefficients (features) that have the greatest variance.

Quiroga et al. [26] argued that coefficients with the largest variance do not necessarily show the best discrimination among classes. Accordingly, selected coefficients should have the largest deviation from normality for the best discrimination. To this end, Quiroga et al. [26] suggested using the Lilliefors modification of the Kolmogorov-Smirnov (KS) statistical test. Given the dataset $x$, the data cumulative distribution function $F(x)$ is compared with that of a Gaussian distribution $G(x)$. The test statistic is then measured by $\max(|F(x) - G(x)|)$ [26].

Nevertheless, the KS test follows an unsupervised strategy that does not emphasize the difference or the discrimination of the classes. It is important to note that even in a single class, features may still present a large deviation from normality. If this occurs, the KS test may nominate these features although they do not reflect the difference among the classes. Thus, the information used by the KS test may not be appropriate to guarantee good discrimination properties of a feature passing the test.
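To make the four-level Haar decomposition mentioned above concrete, the sketch below computes an orthonormal Haar transform of a single-channel signal in plain Python. It is only an illustration under stated assumptions: the function names are hypothetical, the signal length is assumed to be a multiple of 2^levels, and the paper does not say which implementation or boundary handling the authors used.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels=4):
    """Return (A_levels, [D1, ..., D_levels]) for a 1-D signal."""
    if len(signal) == 0 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a positive multiple of 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        # Split the current approximation into a coarser one plus a detail band.
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Toy single-channel "trial" of 32 samples (synthetic, for illustration only).
trial = [math.sin(0.3 * i) + 0.1 * i for i in range(32)]
a4, (d1, d2, d3, d4) = haar_decompose(trial, levels=4)
print(len(a4), len(d1), len(d2), len(d3), len(d4))
```

Because the transform is orthonormal, the total energy of the coefficients equals the energy of the signal, which gives a simple sanity check. The concatenation of A4 and D4 to D1 forms the pool of coefficients from which features are later selected.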
We introduce a method using the Wilcoxon test to select elite wavelet coefficients for classification. Unlike the MV or KS test, the Wilcoxon method provides information about the equality of population locations of the classes. It is a supervised approach that takes into account class labels to separate features of different classes. The following subsection reviews the background of the Wilcoxon method.

B. Wilcoxon Method

The Wilcoxon rank sum test is a nonparametric test that evaluates the equality of population locations (medians). The null hypothesis is that the two populations have identical distribution functions, whereas the alternative hypothesis is that the two distributions differ in their medians. The normality assumption regarding the differences between the two samples is not required. That is why this test is used instead of the two-sample t-test in many applications where the normality assumption is in question. The main phases of the Wilcoxon ranking test are summarized below [42, 43]:

1) Assemble all observations of the two-class populations and rank them in ascending order.
2) The Wilcoxon statistic is calculated as the sum of all the ranks corresponding to the observations from the smaller group.
3) The hypothesis decision is made based on the p-value, which is found in the table of the Wilcoxon rank sum distribution or by using statistical packages.

In the application of the Wilcoxon test for wavelet coefficient selection, the absolute values of the standardized Wilcoxon statistics are employed to rank coefficients. Note that the Haar wavelets are orthogonal. This ensures that the higher ranking coefficients are more prominent [44].

IV. EXPERIMENTAL RESULTS

Experiments in this study are carried out using the two widely used Ia and Ib datasets, which are downloaded from the BCI Competition II. The data were generated by Birbaumer et al. [45].
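Before turning to the datasets, the coefficient ranking of Section III.B can be sketched as follows: for every wavelet coefficient, the rank sum of one class is standardized and the absolute value is used as the ranking score. This is a hedged illustration, not the authors' code: the helper names are hypothetical, ties receive average ranks, and the standardization uses the usual large-sample normal approximation without a tie correction. The absolute standardized statistic is the same whichever of the two groups is summed, so the sketch simply sums the ranks of class "1".

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def wilcoxon_z(group_a, group_b):
    """Standardized rank sum of group_a (normal approximation, no tie correction)."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])
    mean = n1 * (n1 + n2 + 1) / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (w - mean) / std


def rank_coefficients(features, labels, top_k):
    """features: one list of coefficients per trial; labels: 1 or 2 per trial."""
    n_coeff = len(features[0])
    scores = []
    for c in range(n_coeff):
        class1 = [row[c] for row, y in zip(features, labels) if y == 1]
        class2 = [row[c] for row, y in zip(features, labels) if y == 2]
        scores.append(abs(wilcoxon_z(class1, class2)))
    order = sorted(range(n_coeff), key=lambda c: scores[c], reverse=True)
    return order[:top_k], scores


# Synthetic toy data: only coefficient 1 separates the two classes cleanly.
features = [
    [0.10, 1.0, 5.0], [0.40, 1.2, 3.0], [0.20, 1.1, 4.0],
    [0.30, 5.0, 4.5], [0.15, 5.2, 3.5], [0.35, 5.1, 2.5],
]
labels = [1, 1, 1, 2, 2, 2]
selected, scores = rank_coefficients(features, labels, top_k=1)
print(selected, [round(s, 3) for s in scores])
```

In the pipeline described in the paper, this ranking would be applied per channel to the Haar coefficients, and the top-ranked coefficient of each channel would become one input feature of the classifier.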
In the Ia dataset, the number of training samples is 268, where 135 trials are of class “1” and 133 trials are of class “2”, which correspond to moving a cursor up and down. The number of testing trials of this dataset is 293. In the Ib dataset, the number of training samples is 200, where classes “1” and “2” both have 100 trials. The number of testing trials in this dataset is 180.

Fig. 4a and 4b exhibit the noisy, outlier-embedded, non-stationary and multidimensional characteristics of the EEG signals recorded in the Ia and Ib datasets. The first step is to separate the EEG signals into various channels for ease of processing. Data of individual channels show different spectra with rather clear separation points. More specifically, Fig. 4a graphically indicates 6 channels in the Ia dataset, whereas there are visibly 7 distinct channels in the Ib dataset depicted by Fig. 4b.

Fig. 4. Plots of trials of different channels of the (a) Ia dataset and (b) Ib dataset

After the raw EEG data are separated into different channels, the Haar wavelet transform at level 4 is implemented for each channel. Then some filter approaches are applied to remove coefficients with low absolute values, little variation, small ranges or low entropy. These coefficients are generally not of interest because they have a low potential to discriminate the classes. Taking these features into account may introduce noise into the process.

Fig. 5a and 5b display the distributions of wavelet coefficients obtained by employing WT on channel 1 of the Ia dataset and channel 6 of the Ib dataset respectively. The original signal is a sum of the coarse approximation component A4 and four detail components D1-D4. Each component corresponds to a particular frequency bandwidth.
The blue triangular marks indicate the most discriminative coefficients selected through the statistical test based on the Wilcoxon method. Alternatively, the blue diamond marks specify the coefficient chosen by the KS test. The feature set in the Ia dataset consists of 6 features corresponding to 6 channels. Likewise, there are 7 features resulting from 7 channels of the Ib dataset.

Fig. 5. Wavelet coefficients of (a) channel 1 of the Ia dataset and (b) channel 6 of the Ib dataset

Distributions of the first feature (i.e., wavelet coefficient of channel 1) of the feature set for the Ia and Ib datasets are illustrated in Fig. 6a and 6b respectively. It can be seen that the two classes overlap and are only vaguely distinguishable in both datasets. Modelling these data requires powerful uncertainty handling tools, of which fuzzy logic, e.g. FSAM, is an example.

Fig. 6. Distribution of the first feature (Channel 1) of the (a) Ia dataset, (b) Ib dataset

Once the feature set has been generated, one is able to establish FSAM systems for classification. The number of inputs is equal to the number of features of the training feature set. The centres of the antecedent Gaussian fuzzy sets are set equal to the values of the features. In the consequent part, the centres of the fuzzy sets are assigned either “1” or “2” depending on whether the samples represent class “1” or “2”.

Accuracy, F1 score statistics (F-measure), Gini coefficient and mutual information are the metrics used to evaluate classification performance in the experiments. The F-measure is a single measurement of a classification method’s usefulness.
The F-measure takes into account both the “Precision” and “Recall” of the classification procedure to calculate the evaluating score as the harmonic mean of “Precision” and “Recall”, expressed by:

$$F\text{-measure} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

The higher the F-measure, the greater the predictive power of the classification technique. A score of 1 (or 100%) means the classification procedure is perfect.

The Gini coefficient (index) is an empirical measure of the performance of classification methods based on the area under a receiver operating characteristic curve (AUC). It is a linear rescaling of AUC: $\text{Gini} = 2 \times \text{AUC} - 1$. The greater the Gini index, the better the performance of the classifier.

The mutual information (MI) between estimated and true labels is calculated by:

$$I(\hat{Y}, Y) = \sum_{\hat{y}} \sum_{y} p(\hat{y}, y) \log \frac{p(\hat{y}, y)}{p(\hat{y})\, p(y)}$$

where $p(\hat{y}, y)$ is the joint probability distribution of estimated and true class labels $\hat{Y}$ and $Y$, and $p(\hat{y})$ and $p(y)$ are the marginal probability distributions of $\hat{Y}$ and $Y$ correspondingly [46].

For comparison, the following procedures are also executed: feedforward neural network (FFNN), support vector machine (SVM), k-nearest neighbours (kNN), ensemble learning Adaboost and adaptive neuro-fuzzy inference system (ANFIS). Tables 1 and 2 present results of the proposed tabu-FSAM method and the comparable techniques deployed on the Ia and Ib datasets respectively. For nondeterministic classifiers (i.e. FFNN, ANFIS, tabu-FSAM), average results over 30 independent trials are presented. For these methods, standard deviations are also displayed adjacent to the means. Note that results are all measured in percentage.

We also report here the best results of the competition on the two datasets, which can be seen at http://www.bbci.de/competition/ii/results/index.html. The winners of the Ia and Ib datasets respectively were Mensh et al. [25] and Bostanov [47]. Mensh et al.
[25] used gamma-band power of EEG signals for BCIs due to its correlation with high-level mental states. Although most BCIs are deployed using the mu and beta rhythms, the authors recognized that most of the meaningful frequency information for EEG signal analysis in the Ia dataset is in the gamma band, with basically none below 24 Hz. Discriminant analysis was utilized efficiently on this dataset despite its linearity limitation. Bostanov [47], in another approach, used the continuous WT and Student’s two-sample t-statistic for EEG data feature extraction. The method performs fully automated recognition and quantification of event-related brain potential components in the time-scale plane. Classical linear discriminant analysis is then employed for the classification.

Mensh et al. [25] obtained the greatest accuracy on the Ia dataset at 88.7%. However, with the same method, the authors were only able to obtain 43.9% accuracy on the Ib dataset. On the other hand, the method of Bostanov [47] achieved the best performance on the Ib dataset with an accuracy of 54.4%. This method, however, produced an accuracy of only 82.6% on the Ia dataset. These statistics reveal that neither of the two competition-winning methods performs effectively on both the Ia and Ib datasets. In contrast, the proposed tabu-FSAM clearly outperforms both of the competition-winning methods on both datasets. Tabu-FSAM obtains 90.20% and 57.28% accuracy on the Ia and Ib datasets respectively.

Table 1. Performance on the Ia dataset

Method                                                  F-measure        Gini index       MI               Accuracy
The best result of competition by Mensh et al. (2004)   -                -                -                88.70
The accuracy obtained by Bostanov (2004)                -                -                -                82.60
SVM                                                     85.71            68.67            40.42            84.30
kNN                                                     81.57            67.83            41.34            83.96
Adaboost                                                83.97            65.92            35.19            82.94
FFNN                                                    80.17 (± 5.65)   62.49 (± 9.90)   32.47 (± 9.81)   81.26 (± 4.95)
ANFIS                                                   86.91 (± 1.86)   73.88 (± 3.97)   44.35 (± 5.08)   86.94 (± 1.99)
Tabu-FSAM                                               90.07 (± 0.42)   80.40 (± 0.81)   53.82 (± 1.27)   90.20 (± 0.40)

Table 2. Performance on the Ib dataset

Method                                                  F-measure        Gini index       MI               Accuracy
The best result of competition by Bostanov (2004)       -                -                -                54.40
The accuracy obtained by Mensh et al. (2004)            -                -                -                43.90
SVM                                                     54.84            6.67             0.32             53.33
kNN                                                     56.22            10.00            0.72             55.00
Adaboost                                                55.14            7.78             0.44             53.89
FFNN                                                    55.42 (± 3.64)   8.07 (± 5.60)    0.70 (± 0.66)    54.04 (± 2.80)
ANFIS                                                   57.49 (± 1.71)   10.70 (± 3.16)   0.91 (± 0.48)    55.35 (± 1.58)
Tabu-FSAM                                               57.53 (± 1.17)   14.56 (± 2.13)   1.57 (± 0.45)    57.28 (± 1.07)

Fig. 7. Accuracy obtained by features selected by KS test and Wilcoxon test in (a) Ia dataset, (b) Ib dataset

The comparisons among the classifiers also highlight the advantage of the proposed tabu-FSAM over the competing classifiers. The dominance of tabu-FSAM is shown not only in accuracy but also in the other performance measures, i.e. F-measure, Gini index and MI. Among nondeterministic classifiers, tabu-FSAM also yields more stable results, with smaller standard deviations than FFNN and ANFIS (see Tables 1 and 2).

Fig. 7 presents the accuracy comparisons when running the classifiers on features selected by the KS test and the Wilcoxon test. Noticeably, the Wilcoxon features yield better performance than the KS features across all classifiers. This is understandable, as the KS test selects features without reference to the class labels (i.e., an unsupervised approach). In contrast, the Wilcoxon method takes into account the class labels in examining the features and also provides information about the equality of population locations of the classes, making it a more effective feature selection method.

V. CONCLUSIONS

A method for EEG data classification using tabu-FSAM has been introduced in this paper.
The noisy, nonlinear and outlier-embedded nature of EEG signals is modelled proficiently by FSAM, whose if-then fuzzy rule structure is optimized by the tabu search algorithm. The paper also presents an approach to supervised EEG signal feature extraction based on WT decomposition and the Wilcoxon statistics. Well-known methods for selecting discriminative wavelet coefficients, such as the MV and KS tests, examine features in an unsupervised manner without reference to the class labels. This does not guarantee the separability of the feature set. The method proposed in this study uses the Wilcoxon test, which follows a supervised strategy. The Wilcoxon test separates data samples according to the class labels and selects features by evaluating the population locations of the classes.

Experimental results on two benchmark datasets downloaded from the BCI competition II demonstrate the superiority of the Wilcoxon feature selection over the KS test approach. The tabu-FSAM designed for classification also clearly outperforms the other comparable classifiers, including FFNN, SVM, kNN, Adaboost and ANFIS. More noticeably, the tabu-FSAM in combination with wavelets outperforms the two winning methods of the BCI competition II on both the Ia and Ib benchmark datasets, by 1.50 and 2.88 percentage points of accuracy respectively.

As feature extraction plays an essential role in determining the classification performance, future research would investigate alternative feature extraction methods. Wavelet packet transform (WPT) is one example, as it offers a broader range of possibilities for signal analysis than WT. WPT allows wavelet detail components to be decomposed further into their own approximation and detail components. The higher frequency components, which may store important information of the signal, can therefore be examined in the WPT.
On the other hand, BCI behaviours may involve multiple actions that cause the classification task to be more complicated. As the present study focuses on the capability of tabu-FSAM for a binary classification, designing fuzzy systems for multi-class problems would be worth a further exploration. ACKNOWLEDGMENT This research is supported by the Australian Research Council (Discovery Grant DP120102112) and the Centre for Intelligent Systems Research (CISR) at Deakin University. [1] [2] [3] [4] [5] REFERENCES Sabeti, M., Katebi, S. D., Boostani, R., & Price, G. W. (2011). A new approach for EEG signal classification of schizophrenic and control participants. Expert Systems with Applications, 38(3), 2063-2071. Vidaurre, C., Kawanabe, M., von Bunau, P., Blankertz, B., & Muller, K. R. (2011). Toward unsupervised adaptation of LDA for brain– computer interfaces. IEEE Transactions on Biomedical Engineering, 58(3), 587-597. Zhang, R., Xu, P., Guo, L., Zhang, Y., Li, P., & Yao, D. (2013). Zscore linear discriminant analysis for EEG based brain-computer interfaces. PloS One, 8(9), e74433. Chen, M., Fang, Y., & Zheng, X. (2014). Phase space reconstruction for improving the classification of single trial EEG. Biomedical Signal Processing and Control, 11, 10-16. Hosseinifard, B., Moradi, M. H., & Rostami, R. (2013). Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal. Computer Methods and Programs in Biomedicine, 109(3), 339-345. 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, pp. 1-8. [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] Prasad, P., Halahalli, H., John, J., & Majumdar, K. (2014). Single-trial EEG classification using logistic regression based on ensemble synchronization. IEEE Journal of Biomedical and Health Informatics, 18(3), 2014. Li, Y., & Wen, P. P. (2014). 
Modified CC-LR algorithm with three diverse feature sets for motor imagery tasks classification in EEG based brain computer interface. Computer Methods and Programs in Biomedicine, 113(3), pp. 767-780. Wang, D., Miao, D., & Xie, C. (2011). Best basis-based wavelet packet entropy feature extraction and hierarchical EEG classification for epileptic detection. Expert Systems with Applications, 38(11), 14314-14320. Guo, L., Rivero, D., Dorado, J., Munteanu, C. R., & Pazos, A. (2011). Automatic feature extraction using genetic programming: An application to epileptic EEG classification. Expert Systems with Applications, 38(8), 10425-10436. Acharya, U. R., Molinari, F., Sree, S. V., Chattopadhyay, S., Ng, K. H., & Suri, J. S. (2012). Automated diagnosis of epileptic EEG using entropies. Biomedical Signal Processing and Control, 7(4), 401-408. Garrett, D., Peterson, D. A., Anderson, C. W., & Thaut, M. H. (2003). Comparison of linear, nonlinear, and feature selection methods for EEG signal classification. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(2), 141-144. Li, Y., & Wen, P. P. (2011). Clustering technique-based least square support vector machine for EEG signal classification. Computer Methods and Programs in Biomedicine, 104(3), 358-372. Vatankhah, M., Asadpour, V., & Fazel-Rezai, R. (2013). Perceptual pain classification using ANFIS adapted RBF kernel support vector machine for therapeutic usage. Applied Soft Computing, 13(5), 25372546. Joshi, V., Pachori, R. B., & Vijesh, A. (2014). Classification of ictal and seizure-free EEG signals using fractional linear prediction. Biomedical Signal Processing and Control, 9, 1-5. Ting, W., Guo-zheng, Y., Bang-hua, Y., & Hong, S. (2008). EEG feature extraction based on wavelet packet decomposition for brain computer interface. Measurement, 41(6), 618-625. Guo, L., Rivero, D., Dorado, J., Rabunal, J. R., & Pazos, A. (2010). 
Automatic epileptic seizure detection in EEGs based on line length feature and artificial neural networks. Journal of Neuroscience Methods, 191(1), 101-109. Özbay, Y., Ceylan, R., & Karlik, B. (2011). Integration of type-2 fuzzy clustering and wavelet transform in a neural network based ECG classifier. Expert Systems with Applications, 38(1), 1004-1010. Ahangi, A., Karamnejad, M., Mohammadi, N., Ebrahimpour, R., & Bagheri, N. (2013). Multiple classifier system for EEG signal classification with application to brain–computer interfaces. Neural Computing and Applications, 23(5), 1319-1327. Güler, I., & Übeyli, E. D. (2005). Adaptive neuro-fuzzy inference system for classification of EEG signals using wavelet coefficients. Journal of Neuroscience Methods, 148(2), 113-121. Herman, P., Prasad, G., & McGinnity, T. M. (2008). Design and online evaluation of type-2 fuzzy logic system-based framework for handling uncertainties in BCI classification. In 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), pp. 4242-4245. Herman, P., Prasad, G., & McGinnity, T. M. (2008). Designing a robust type-2 fuzzy logic classifier for non-stationary systems with application in brain-computer interfacing. In IEEE International Conference on Systems, Man and Cybernetics (SMC), pp. 1343-1349. Xu, Q., Zhou, H., Wang, Y., & Huang, J. (2009). Fuzzy support vector machine for classification of EEG signals using wavelet-based features. Medical Engineering & Physics, 31(7), 858-865. Yang, Z., Wang, Y., & Ouyang, G. (2014). Adaptive neuro-fuzzy inference system for classification of background EEG signals from ESES patients and controls. The Scientific World Journal, 2014, 1-8. McFarland, D. J., Anderson, C. W., Muller, K., Schlogl, A., & Krusienski, D. J. (2006). BCI meeting 2005-workshop on BCI signal processing: feature extraction and translation. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2), 135. Mensh, B. 
D., Werfel, J., & Seung, H. S. (2004). BCI competition 2003-data set Ia: combining gamma-band power with slow cortical potentials to improve single-trial classification of electroencephalographic signals. IEEE Transactions on Biomedical Engineering, 51(6), 1052-1056.
[26] Quiroga, R. Q., Nadasdy, Z., & Ben-Shaul, Y. (2004). Unsupervised spike detection and sorting with wavelets and superparamagnetic clustering. Neural Computation, 16(8), 1661-1687.
[27] Kosko, B. (1994). Fuzzy systems as universal approximators. IEEE Transactions on Computers, 43(11), 1329-1333.
[28] Kosko, B. (1996). Fuzzy Engineering, Prentice Hall.
[29] Nguyen, T., Khosravi, A., Creighton, D., & Nahavandi, S. (2015). Classification of healthcare data using genetic fuzzy logic system and wavelets. Expert Systems with Applications, 42(4), 2184-2197.
[30] Mitaim, S., & Kosko, B. (1996). What is the best shape for a fuzzy set in function approximation? Proc. 5th IEEE Int. Conf. Fuzzy Systems (FUZZ-IEEE-96), 2, 1237-1243.
[31] Dickerson, J. A., & Kosko, B. (1996). Fuzzy function approximation with ellipsoidal rules. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 26(4), 542-560.
[32] Mitaim, S., & Kosko, B. (2001). The shape of fuzzy sets in adaptive function approximation. IEEE Transactions on Fuzzy Systems, 9(4), 637-656.
[33] Glover, F. (1986). Future paths for integer programming and links to artificial intelligence. Computers and Operations Research, 13, 533-549.
[34] Glover, F. (1989). Tabu search, Part I. ORSA Journal on Computing, 1(3), 190-206.
[35] Glover, F., & Laguna, M. (1997). Tabu Search, Kluwer Academic Publishers, Boston.
[36] DeVore, R. A., & Lucier, B. J. (1992). Wavelets. Acta Numerica, 1(1), 1-56.
[37] Nguyen, T., Khosravi, A., Creighton, D., & Nahavandi, S. (2015).
EEG signal classification for BCI applications by wavelets and interval type-2 fuzzy logic systems. Expert Systems with Applications, 42(9), 4370-4380.
[38] Li, D., Pedrycz, W., & Pizzi, N. J. (2005). Fuzzy wavelet packet based feature extraction method and its application to biomedical signal classification. IEEE Transactions on Biomedical Engineering, 52(6), 1132-1139.
[39] Bozhokin, S. V., & Suslova, I. B. (2014). Analysis of non-stationary HRV as a frequency modulated signal by double continuous wavelet transformation method. Biomedical Signal Processing and Control, 10, 34-40.
[40] Tan, Y., Li, G., Duan, H., & Li, C. (2014). Enhancement of medical image details via wavelet homomorphic filtering transform. Journal of Intelligent Systems, 23(1), 83-94.
[41] Torbati, N., Ayatollahi, A., & Kermani, A. (2014). An efficient neural network based method for medical image segmentation. Computers in Biology and Medicine, 44, 76-87.
[42] Deng, L., Pei, J., Ma, J., & Lee, D. L. (2004). A rank sum test method for informative gene discovery. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 410-419.
[43] Lehmann, E. L., & D'Abrera, H. J. (2006). Nonparametrics: Statistical Methods Based on Ranks. New York: Springer.
[44] Nguyen, T., Khosravi, A., Creighton, D., & Nahavandi, S. (2015). EEG data classification using wavelet features selected by Wilcoxon statistics. Neural Computing and Applications, In press, doi: 10.1007/s00521-014-1802-y.
[45] Birbaumer, N., Flor, H., Ghanayim, N., Hinterberger, T., Iverson, I., Taub, E., Kotchoubey, B., Kübler, A., & Perelmouter, J. (1999). A brain-controlled spelling device for the completely paralyzed. Nature, 398, 297-298.
[46] Piedra-Fernández, J. A., Cantón-Garbín, M., & Wang, J. Z. (2010). Feature selection in AVHRR ocean satellite images by means of filter methods. IEEE Transactions on Geoscience and Remote Sensing, 48(12), 4193-4203.
[47] Bostanov, V. (2004).
BCI competition 2003-data sets Ib and IIb: feature extraction from event-related brain potentials with the continuous wavelet transform and the t-value scalogram. IEEE Transactions on Biomedical Engineering, 51(6), 1057-1061.