P300 electroencephalogram signal identification method based on improved convolutional neural network
Technical Field
The invention relates to an electroencephalogram signal identification method, in particular to a P300 electroencephalogram signal identification method based on an improved convolutional neural network.
Background
A convolutional neural network is a deep learning model; the classic LeNet-5 model, shown in figure 1, has achieved satisfactory results in fields such as pattern recognition and speech processing. Compared with traditional machine learning algorithms, a convolutional neural network does not require manual feature extraction: it automatically extracts and abstracts features during training while simultaneously performing pattern classification. A model trained by a convolutional neural network is invariant to distortions such as scaling, translation and rotation, and generalizes well. Its distinctive weight sharing and local receptive fields greatly reduce the number of network parameters, which helps prevent overfitting and lowers the complexity of the network model.
As a feed-forward deep network, the convolutional neural network requires little data processing: the collected data can be fed to the network after only simple preprocessing, and a good model is obtained through network learning. In recent years, convolutional neural networks have been regarded as efficient deep neural networks, have been studied intensively by researchers in many fields, and have shown excellent results in many research areas. A 2011 article by Cecotti et al. provided an early example of combining deep learning with electroencephalogram signal analysis: a convolutional neural network trained with the error back-propagation algorithm exploits concepts such as local receptive fields and weight sharing to capture the specific temporal and spatial characteristics of electroencephalogram signals and learn the internal patterns in the data. Moreover, the down-sampling operation in the convolutional neural network structure is insensitive to time shifts of the input, so the network copes well even when small electrode deviations or acquisition delays occur during signal acquisition.
The tight connections between layers and the use of spatial information make the convolutional neural network particularly suitable for processing and understanding images. However, because electroencephalogram signals combine temporal and spatial features, the convolution kernels in the convolutional layers must be deliberately set as vectors rather than the matrices used in general image recognition, so that each convolution extracts only spatial features or only temporal features and the two kinds of information are not mixed after the convolution operation.
Disclosure of Invention
The invention aims to provide a P300 electroencephalogram signal identification method based on an improved convolutional neural network, combining a dropout operation with a classical convolutional neural network structure. The method can effectively identify P300 electroencephalogram signals, with a highest identification accuracy of 96.69%, and the added dropout operation effectively prevents the convolutional neural network from overfitting.
The technical scheme provided by the invention is as follows: a P300 electroencephalogram signal identification method based on an improved convolutional neural network comprises the following steps:
step 1, collecting a P300 electroencephalogram signal by using electroencephalogram collection equipment;
step 2, selecting the electroencephalogram signals of the 16 channels numbered 32, 34, 36, 41, 9, 11, 13, 42, 47, 49, 51, 53, 55, 56, 60 and 62, and preprocessing the acquired electroencephalogram signals, including down-sampling, noise reduction and resampling;
step 3, reconstructing the original five-dimensional samples into ordinary two-dimensional matrices, performing 15-fold superposition averaging to increase the signal-to-noise ratio, setting the label of each P300 sample to 1 and the label of each noise sample to 0;
step 4, constructing a new convolutional neural network structure comprising an input layer, a convolutional layer, a down-sampling layer, a fully connected layer and an output layer, wherein the convolution kernel size is 1×16, the down-sampling window size is 2×1 and the stride is 2;
step 5, training a network, namely sending the preprocessed data into a convolutional neural network, and determining network parameters according to experience and multiple experiments to obtain an improved convolutional neural network model for recognizing the P300 electroencephalogram signal;
In step 2, z-score standardization is first applied to the original signal, and the signal is then band-pass filtered from 0.5 Hz to 30 Hz with a sixth-order Butterworth filter. Finally, the data within 625 ms after each stimulus, i.e. the first 150 sampling points, are selected and down-sampled by a factor of 3.
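The preprocessing described above can be sketched as follows. This is a minimal illustration using NumPy and SciPy, not the patent's own code; the epoch shape and the per-channel standardization are assumptions for the example:

```python
import numpy as np
from scipy.signal import butter, filtfilt, decimate

def preprocess(raw, fs=240):
    """Preprocess one stimulus epoch of raw EEG: z-score standardization,
    0.5-30 Hz band-pass (6th-order Butterworth), keep the first 625 ms
    (150 points at 240 Hz), then down-sample by a factor of 3.

    raw: array of shape (n_samples, n_channels), sampled at 240 Hz.
    """
    # z-score standardization (per channel)
    z = (raw - raw.mean(axis=0)) / raw.std(axis=0)
    # sixth-order Butterworth band-pass, 0.5-30 Hz
    b, a = butter(6, [0.5, 30.0], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, z, axis=0)
    # first 625 ms after the stimulus = 150 sampling points at 240 Hz
    windowed = filtered[:150]
    # down-sample by a sampling factor of 3 -> 50 points per channel
    return decimate(windowed, 3, axis=0)

epoch = np.random.randn(240, 16)   # 1 s of hypothetical 16-channel data
x = preprocess(epoch)
print(x.shape)                     # (50, 16)
```

After this step each sample has 50 time points per channel, matching the 50×16 input dimension used later in the network.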
In step 5, the training process is similar to the traditional back-propagation algorithm and comprises two stages: a forward-propagation stage and a back-propagation stage. The specific training process is as follows:
in the forward propagation stage, input data sequentially pass through a convolutional layer, a downsampling layer and a full-connection layer, and network output of each node is calculated respectively.
The convolution layer computes:

$$x_j^l = f\Big(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l\Big) \qquad (1)$$

where $x_j^l$ denotes the jth feature map in the lth layer, $M_j$ represents the set of input feature maps, $k_{ij}^l$ represents the convolution kernel of the lth layer, $b_j^l$ represents the bias set for the lth layer, and f is the activation function; here the ReLU function is used, whose operation formula is:
f(x)=max(0,x). (2)
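Equations (1) and (2) amount to the following forward computation. This is a NumPy sketch with the shapes used later in the network (20 kernels of size 1×16 on a 50×16 sample); the random weights are placeholders, and the observation that a 1×16 kernel reduces to a per-time-step weighting of the 16 channels is an implementation assumption:

```python
import numpy as np

def relu(x):
    """Equation (2): f(x) = max(0, x)."""
    return np.maximum(0.0, x)

def conv_layer(x, kernels, biases):
    """Equation (1) for a 1x16 spatial kernel: each kernel weights the
    16 channels at every time step, producing one 50x1 feature map.

    x: input sample, shape (50, 16); kernels: (20, 16); biases: (20,).
    """
    # (50, 16) @ (16, 20) -> (50, 20): twenty 50x1 feature maps
    return relu(x @ kernels.T + biases)

x = np.random.randn(50, 16)
maps = conv_layer(x, np.random.randn(20, 16) * 0.1, np.zeros(20))
print(maps.shape)   # (50, 20)
```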
The down-sampling layer uses mean pooling, with the operation formula:

$$x_j^l = f\Big(\beta_j^l \,\mathrm{down}\big(x_j^{l-1}\big) + b_j^l\Big) \qquad (3)$$

where $\beta_j^l$ represents the weight parameter of the lth layer and down(·) represents the down-sampling operation function.
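The down(·) operation with a 2×1 window and stride 2 halves the length of each feature map. A minimal NumPy sketch (for brevity the weight β and bias b of equation (3) are taken as 1 and 0):

```python
import numpy as np

def mean_pool(feature_maps, size=2):
    """Mean pooling with window `size` and stride `size` along time.

    feature_maps: shape (time, n_maps); time must be divisible by size.
    """
    t, n = feature_maps.shape
    # group consecutive pairs of time steps, then average within each pair
    return feature_maps.reshape(t // size, size, n).mean(axis=1)

maps = np.arange(12.0).reshape(6, 2)   # 6 time steps, 2 feature maps
pooled = mean_pool(maps)               # shape (3, 2)
print(pooled)
```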
The fully connected layer uses the softmax function for classification, with the formula:

$$p(y = j \mid x) = \frac{e^{x_j}}{\sum_{k=1}^{C} e^{x_k}} \qquad (4)$$

where p represents the probability of classifying sample x into the jth class, the numerator contains the value of the jth dimension of the vector, and the denominator is the sum over all vector values.
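Equation (4) can be sketched directly in NumPy; subtracting the maximum before exponentiating is a standard numerical-stability trick not spelled out in the text:

```python
import numpy as np

def softmax(v):
    """p(j) = exp(v_j) / sum_k exp(v_k), stabilized by max-subtraction."""
    e = np.exp(v - v.max())
    return e / e.sum()

# hypothetical two-class output: index 1 = 'P300', index 0 = 'noise'
p = softmax(np.array([2.0, 1.0]))
print(p)
```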
In the back-propagation stage, after the classification output is obtained by the forward calculation, the error between the output value and the actual value is computed. The error is then propagated back from the output layer to optimize the network weight parameters, using equation (5) as the loss function:

$$E = \frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{C}\big(t_k^n - y_k^n\big)^2 \qquad (5)$$

where N represents the number of input samples, C represents the number of sample classes, $t_k^n$ represents the kth dimension of the label of the nth sample, and $y_k^n$ represents the kth dimension of the network output for the nth sample.
The forward-propagation and back-propagation processes are repeated in the CNN until the loss function of the network reaches a minimum. During training, gradient descent is used to find the weight parameters that minimize the loss function, with the calculation formula:

$$w^l \leftarrow w^l - a\Big(\frac{\partial E}{\partial w^l} + \lambda w^l\Big), \qquad b^l \leftarrow b^l - a\,\frac{\partial E}{\partial b^l} \qquad (6)$$

where a represents the learning rate during training, λ represents the weight decay coefficient, and $w^l$ and $b^l$ represent the weights and biases, respectively.
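The update of equation (6), with the decay term λ applied to the weights but (conventionally) not to the biases, can be sketched as:

```python
import numpy as np

def sgd_step(w, b, grad_w, grad_b, a=0.01, lam=1e-4):
    """Equation (6): gradient descent with weight decay.

    The lambda*w term shrinks the weights toward zero each step;
    the bias is updated by its plain gradient.
    """
    w_new = w - a * (grad_w + lam * w)
    b_new = b - a * grad_b
    return w_new, b_new

w, b = np.ones(3), np.zeros(3)
w2, b2 = sgd_step(w, b, grad_w=np.ones(3), grad_b=np.ones(3))
print(w2, b2)
```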
The invention has the following beneficial effects:
The P300 electroencephalogram signal is identified by drawing on the distinctive weight-sharing and local-receptive-field ideas of the convolutional neural network. The results show that the improved convolutional neural network is feasible for detecting the P300 signal and effectively improves the identification rate of the electroencephalogram signal. This provides a successful application of a deep learning model to a brain-computer interface system and a new approach to feature extraction and classification of electroencephalogram signals.
Drawings
FIG. 1 is a classic LeNet-5 block diagram;
FIG. 2 is a diagram of a modified convolutional neural network architecture;
FIG. 3 is a graph showing the results of the experiment.
The specific implementation mode is as follows:
the present invention will be further described with reference to the following specific examples. The following description is exemplary and explanatory only and is not restrictive of the invention in any way.
As shown in fig. 2, the steps of the present invention are as follows:
step 1, collecting P300 electroencephalogram signals by using electroencephalogram collection equipment. The acquired electroencephalogram signal is from an international 10-20 lead system electrode cap worn by 2 subjects, the number of channels is 64, and the sampling frequency is 240 Hz;
Step 2, selecting the electroencephalogram signals of the 16 channels numbered 32, 34, 36, 41, 9, 11, 13, 42, 47, 49, 51, 53, 55, 56, 60 and 62, and preprocessing the acquired electroencephalogram signals, including down-sampling, noise reduction and resampling. The sampling frequency of the preprocessed signal is reduced to 80 Hz, the main frequency range is 0.5 Hz to 30 Hz, and the number of sampling points per sample is reduced from 150 to 50;
Step 3, reconstructing the preprocessed electroencephalogram signals from the 12×15×50×16×85 five-dimensional tensor into 15300 two-dimensional samples of size 50×16, setting the label of each sample containing the P300 signal to 1 and the label of each sample not containing the P300 signal to 0;
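The reshaping in step 3 can be sketched with NumPy. The axis order of the five-dimensional tensor (12 stimuli × 15 repetitions × 50 sampling points × 16 channels × 85 characters) is taken from the text; the random data stands in for real recordings, and note that 12 × 15 × 85 = 15300:

```python
import numpy as np

# five-dimensional tensor: 12 stimuli x 15 repetitions x 50 points x 16 channels x 85 characters
raw = np.random.randn(12, 15, 50, 16, 85)

# move the (points, channels) axes to the end, then flatten the rest:
# 12 * 15 * 85 = 15300 two-dimensional samples of size 50 x 16
samples = raw.transpose(0, 1, 4, 2, 3).reshape(-1, 50, 16)
print(samples.shape)   # (15300, 50, 16)
```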
and step 4, the new convolutional neural network structure is composed of 5 layers. The specific description is as follows:
The first layer is the input layer. The input data are the preprocessed P300 signals; each input sample has dimension 50×16, where 50 is the number of sampling points per channel and 16 is the number of selected channels.
The second layer is the convolutional layer, in which the input samples are spatially filtered and convolved. The layer uses 20 convolution kernels of size 1×16, a setting determined by experiment. Neurons are activated with the ReLU function, yielding 20 feature maps of size 50×1.
The third layer is the down-sampling layer, whose main function is to reduce the feature dimension. Mean pooling with a stride of 2 is adopted, each neuron being connected to a 2×1 region of the corresponding feature map in the convolutional layer, yielding 20 feature maps of size 25×1.
The fourth layer is the fully connected layer, which introduces a dropout operation that randomly switches off some neurons with probability 0.6 during training to prevent the CNN model from overfitting. The fully connected layer converts the feature signals of all neuron nodes into a vector signal, and a softmax classifier determines the category of the input sample.
The fifth layer is the output layer. Since the purpose here is to identify whether the input sample is a P300 signal, there are two possible classification outputs: '1' denotes a P300 signal and '0' denotes a noise signal.
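The five layers above can be sketched end-to-end as a NumPy forward pass. The random weights are placeholders, and the inverted-dropout mask (applied at training time only, switch-off probability 0.6 as in the text) is an implementation assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# placeholder parameters for the sketch
W_conv = rng.normal(0, 0.1, (20, 16))   # 20 kernels of size 1x16
b_conv = np.zeros(20)
W_fc = rng.normal(0, 0.1, (500, 2))     # 20 maps x 25 points -> 2 classes
b_fc = np.zeros(2)

def forward(x, train=False):
    """x: one preprocessed sample, shape (50, 16). Returns 2 class probabilities."""
    # layer 2: 20 kernels of size 1x16 + ReLU -> 20 feature maps of 50x1
    conv = np.maximum(0.0, x @ W_conv.T + b_conv)       # (50, 20)
    # layer 3: 2x1 mean pooling, stride 2 -> 20 feature maps of 25x1
    pooled = conv.reshape(25, 2, 20).mean(axis=1)       # (25, 20)
    # layer 4: fully connected, with dropout during training
    flat = pooled.reshape(-1)                           # (500,)
    if train:
        flat = flat * (rng.random(flat.size) >= 0.6)    # switch off with prob. 0.6
    logits = flat @ W_fc + b_fc                         # (2,)
    # layer 5: softmax output -> '1' = P300 signal, '0' = noise
    e = np.exp(logits - logits.max())
    return e / e.sum()

p = forward(rng.normal(size=(50, 16)))
print(p.shape, p.sum())
```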
Step 5: the training process is similar to the traditional back-propagation algorithm and comprises two stages: a forward-propagation stage and a back-propagation stage. The specific training process is as follows:
in the forward propagation stage, input data sequentially pass through a convolutional layer, a downsampling layer and a full-connection layer, and network output of each node is calculated respectively.
The convolution layer computes:

$$x_j^l = f\Big(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^l + b_j^l\Big) \qquad (1)$$

where $x_j^l$ denotes the jth feature map in the lth layer, $M_j$ represents the set of input feature maps, $k_{ij}^l$ represents the convolution kernel of the lth layer, $b_j^l$ represents the bias set for the lth layer, and f is the activation function; here the ReLU function is used, whose operation formula is:
f(x)=max(0,x). (2)
The down-sampling layer uses mean pooling, with the operation formula:

$$x_j^l = f\Big(\beta_j^l \,\mathrm{down}\big(x_j^{l-1}\big) + b_j^l\Big) \qquad (3)$$

where $\beta_j^l$ represents the weight parameter of the lth layer and down(·) represents the down-sampling operation function.
The fully connected layer uses the softmax function for classification, with the formula:

$$p(y = j \mid x) = \frac{e^{x_j}}{\sum_{k=1}^{C} e^{x_k}} \qquad (4)$$

where p represents the probability of classifying sample x into the jth class.
In the back-propagation stage, after the classification output is obtained by the forward calculation, the error between the output value and the actual value is computed. The error is then propagated back from the output layer to optimize the network weight parameters, using equation (5) as the loss function:

$$E = \frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{C}\big(t_k^n - y_k^n\big)^2 \qquad (5)$$

where N represents the number of input samples, C represents the number of sample classes, $t_k^n$ represents the kth dimension of the label of the nth sample, and $y_k^n$ represents the kth dimension of the network output for the nth sample.
The forward-propagation and back-propagation processes are repeated in the CNN until the loss function of the network reaches a minimum. During training, gradient descent is used to find the weight parameters that minimize the loss function, with the calculation formula:

$$w^l \leftarrow w^l - a\Big(\frac{\partial E}{\partial w^l} + \lambda w^l\Big), \qquad b^l \leftarrow b^l - a\,\frac{\partial E}{\partial b^l} \qquad (6)$$

where a represents the learning rate during training, λ represents the weight decay coefficient, and $w^l$ and $b^l$ represent the weights and biases, respectively.
The experimental data come from the P300 speller experiment of the third international BCI competition in 2004. Both subjects performed an 85-character experiment; for each subject, the samples generated from the first 55 characters were used as the training set and the samples generated from the last 30 characters as the test set. The identification results of the improved convolutional neural network on the P300 electroencephalogram signals of the two subjects are shown in FIG. 3. The results demonstrate the superiority of the improved CNN model in P300 signal identification and also indicate that individual differences do exist in the P300 signal among different subjects. In addition, the improved CNN model achieves a good classification effect on P300 signals in a single experiment.