Deep network image classification method based on improved PCA (principal component analysis)
Technical Field
The invention belongs to the technical field of image processing and pattern recognition, and particularly relates to a deep network image classification method based on improved PCA.
Background
With the arrival of the cloud era, big data attracts more and more attention. Images, as the main expression form of data information, have become an important means for people to acquire information owing to their rich content and intuitive presentation, and their quantity is growing at an astonishing speed. However, as image data increase, the disorder of image information becomes increasingly prominent. How to automatically identify, retrieve and classify massive image data using artificial intelligence technology has become a research focus in the field of computer vision recognition.
Traditional image classification methods, such as the Scale-Invariant Feature Transform (SIFT) and the Histogram of Oriented Gradients (HOG), have shallow structures and a small amount of computation, and can complete model training and analysis without requiring a large number of images. However, traditional models cannot acquire higher-level semantic features and depth features from the original image, and image features are not easy to extract when the data are large. With the rise of deep networks, many excellent image classification methods based on deep learning have emerged, for example AlexNet, VGGNet, GoogLeNet and ResNet. Deep learning recognition methods can obtain deeper image features, so the image feature expression is richer and the image feature extraction more accurate, yielding excellent classification results; the classification effect of some deep networks even exceeds human accuracy.
Image classification is widely applied in many fields; its applications in biometric recognition, intelligent transportation, medical auxiliary diagnosis and the like bring great convenience to our lives. However, deep learning methods still require a large number of images as a basis and suffer from a large amount of computation, long model training time, high requirements on the hardware environment and a long classification process. As these problems are solved, deep learning will play a greater role in the image classification field.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a deep network image classification method based on improved PCA, which comprises the following steps:
Step 1: inputting m images in the CIFAR-100 image dataset into a deep convolutional neural network model, carrying out grayscale and filtering preprocessing, and eliminating noise interference to obtain the original features of each image;
Step 2: constructing a feature extraction module in the deep convolutional neural network model, and extracting the image features of each image by using the constructed feature extraction module;
Step 3: carrying out improved PCA dimension reduction processing on the image features of each image, where the improved PCA dimension reduction processing means that the image features of each image are first preliminarily screened through the image information entropy, and PCA dimension reduction processing is then carried out on the preliminarily screened image features, specifically expressed as follows:
Step 3.1: calculating the image information entropy H of the extracted image features by using formula (1), and preliminarily screening the image features according to an information entropy threshold:

H = -Σ_{s=1}^{n} p_s log2 p_s    (1)

where H denotes the image information entropy of the image feature, p_s denotes the probability corresponding to the s-th gray value in each image, and n denotes the total number of gray values of each input image;
Step 3.2: carrying out PCA dimension reduction processing on the image features of each image obtained by preliminary screening;
Step 4: inputting the image features Y subjected to the dimension reduction processing into a Softmax classifier to complete the classification processing, and outputting the image classification result as the real classification result;
Step 5: selecting the cross entropy loss function as the loss function of the deep convolutional neural network model during training, and calculating the difference H(p, q) between the predicted classification result and the real result according to the cross entropy loss function, which is expressed as:

H(p, q) = -Σ_m p(m) log q(m)    (2)

where p(m) denotes the predicted classification result and q(m) denotes the real classification result;
Step 6: if the difference H(p, q) between the predicted classification result and the real result is larger than the expected difference H'(p, q), back-propagating the cross entropy loss function through the back propagation algorithm of the deep convolutional network and continuously adjusting the network weights w_uv, until the difference H(p, q) between the predicted classification result and the real result is less than or equal to the expected difference H'(p, q), or a preset number of training iterations M is reached, where u denotes the u-th neuron of the previous layer and v denotes the v-th neuron of the next layer.
The feature extraction module in step 2 is specifically expressed as follows: the deep convolutional neural network model is designed with 13 layers according to the CIFAR-100 image dataset, comprising layers L2~L14, where L1 denotes the input layer; the numbers of convolution kernels of layers L2~L14 are 64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512 and 512 in sequence; the convolution kernel size of layers L2~L14 is 3 × 3, the pooling mode is Maxpool, and the activation function is ReLU.
Step 3.2 is specifically expressed as:
3.2.1) centering the image feature matrix of each image obtained by preliminary screening;
3.2.2) calculating the covariances between the different dimensions of the centered image features, and forming a covariance matrix of dimension m;
3.2.3) calculating the eigenvalues of each covariance matrix and their corresponding eigenvectors;
3.2.4) first determining the value of the contribution rate f according to the actual situation, and then determining the K eigenvalues to be retained by using formula (3):

Σ_{j=1}^{K} λ_j / Σ_{i=1}^{m} λ_i ≥ f    (3)

where λ_j denotes the j-th retained eigenvalue of each covariance matrix, K denotes the total number of eigenvalues to be retained in each covariance matrix, λ_i denotes the i-th eigenvalue of each covariance matrix, and m denotes the total number of eigenvalues in each covariance matrix;
3.2.5) forming a transformation basis P from the eigenvectors corresponding to the K retained eigenvalues, and completing the dimension reduction processing with formula (4) according to the transformation basis P;
Y = P^T X    (4)
where Y denotes the image feature after the dimension reduction processing, P^T denotes the transpose of the transformation basis P, and X denotes the centered image feature.
The invention has the beneficial effects that:
1) The invention introduces PCA dimension reduction on the basis of the deep convolutional neural network, reducing the feature dimension.
2) The invention further introduces the image information entropy, effectively reducing the huge amount of computation and hardware occupation caused by calculating the covariance matrix in the PCA dimension reduction process.
3) The invention uses Softmax to replace the fully connected layers of a traditional deep network, effectively improving the classification speed.
Drawings
FIG. 1 is a schematic structural diagram of a deep convolutional neural network model based on improved PCA in the present invention.
FIG. 2 is a comparison graph of image classification time before and after the improved PCA dimensionality reduction process of the present invention.
FIG. 3 is a comparison graph of classification accuracy of 4 experimental images before and after the improved PCA dimensionality reduction process in the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific examples.
As shown in FIG. 1, which is a structural schematic diagram of the built deep convolutional neural network model based on improved PCA, the deep network image classification method based on improved PCA first extracts features of an input image using a deep convolutional neural network; it then screens the features by calculating the image information entropy and setting an image information entropy threshold, performs dimension reduction on the screened features through PCA to further simplify the image features and improve the feature quality, and finally sends the reduced features to a Softmax classifier to realize classification. The method specifically comprises the following steps:
Step 1: 5000 images in the CIFAR-100 image dataset are input into the deep convolutional neural network model, and grayscale and filtering preprocessing is carried out to eliminate noise interference and obtain the original features of each image; a different classification label is attached to each image and used as the predicted classification result of the original features;
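A minimal preprocessing sketch in Python follows. The use of OpenCV and of a 3 × 3 Gaussian filter are assumptions for illustration; the embodiment specifies only that grayscale and filtering preprocessing are applied.

```python
import cv2
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Grayscale conversion followed by noise-suppressing filtering.

    The filter type and kernel size are illustrative assumptions; the
    embodiment only states that grayscale and filtering preprocessing
    are applied to the CIFAR-100 images.
    """
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)   # grayscale preprocessing
    denoised = cv2.GaussianBlur(gray, (3, 3), 0)     # filtering to remove noise
    return denoised
```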
Step 2: constructing a feature extraction module in the deep convolutional neural network model, and extracting the image features of each image by using the constructed feature extraction module;
As shown in FIG. 1, the feature extraction module is specifically expressed as follows: the deep convolutional neural network model is designed with 13 layers according to the CIFAR-100 image dataset, comprising layers L2~L14, where L1 denotes the input layer; the numbers of convolution kernels of layers L2~L14 are 64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512 and 512 in sequence; the convolution kernel size of layers L2~L14 is 3 × 3, the pooling mode is Maxpool, and the activation function is ReLU.
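A sketch of this 13-layer feature extraction module in PyTorch follows. The channel counts, 3 × 3 kernels, Maxpool and ReLU come from the embodiment; the pooling positions and the padding of 1 are assumptions, following the VGG-style pattern that the channel counts suggest.

```python
import torch.nn as nn

# Channel counts of the 13 convolutional layers L2-L14 listed above.
CHANNELS = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]
# Assumed pooling positions (after conv layers 2, 4, 7, 10 and 13); the
# embodiment fixes the pooling mode (Maxpool) but not where pooling occurs.
POOL_AFTER = {2, 4, 7, 10, 13}

def build_feature_extractor(in_channels: int = 3) -> nn.Sequential:
    layers = []
    c_in = in_channels
    for idx, c_out in enumerate(CHANNELS, start=1):
        layers.append(nn.Conv2d(c_in, c_out, kernel_size=3, padding=1))
        layers.append(nn.ReLU(inplace=True))        # activation function: ReLU
        if idx in POOL_AFTER:
            layers.append(nn.MaxPool2d(2, 2))       # pooling mode: Maxpool
        c_in = c_out
    return nn.Sequential(*layers)
```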
Step 3: carrying out improved PCA dimension reduction processing on the image features of each image, where the improved PCA dimension reduction processing means that the image features of each image are first preliminarily screened through the image information entropy, and PCA dimension reduction processing is then carried out on the preliminarily screened image features, specifically expressed as follows:
Step 3.1: calculating the image information entropy H of the extracted image features by using formula (1), and preliminarily screening the image features according to an information entropy threshold, which is set to 0.15 in this embodiment:

H = -Σ_{s=1}^{n} p_s log2 p_s    (1)

where H denotes the image information entropy of the image feature, p_s denotes the probability corresponding to the s-th gray value in each image, and n denotes the total number of gray values of each input image;
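The following Python sketch computes formula (1) and applies the screening of step 3.1. Rescaling each feature map to 8-bit gray levels before building the histogram is an assumption made so that p_s can be estimated from a gray-value histogram.

```python
import numpy as np

def image_entropy(feat: np.ndarray) -> float:
    """Image information entropy H = -sum_s p_s * log2(p_s), formula (1)."""
    f = feat.astype(np.float64) - feat.min()
    if f.max() > 0:
        f = f / f.max()
    gray = (f * 255).astype(np.uint8)            # assumed 8-bit gray levels
    hist = np.bincount(gray.ravel(), minlength=256)
    p = hist / hist.sum()                        # probability of each gray value
    p = p[p > 0]                                 # skip zero-probability values
    return float(-(p * np.log2(p)).sum())

def screen_features(feature_maps, threshold=0.15):
    """Step 3.1: keep only features whose entropy exceeds the threshold."""
    return [f for f in feature_maps if image_entropy(f) > threshold]
```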
Step 3.2: carrying out PCA dimension reduction processing on the image features of each image obtained by preliminary screening, specifically expressed as follows:
3.2.1) centering the image feature matrix of each image obtained by preliminary screening;
3.2.2) calculating the covariances between the different dimensions of the centered image features, and forming a covariance matrix of dimension m;
3.2.3) calculating the eigenvalues of each covariance matrix and their corresponding eigenvectors;
3.2.4) determining the contribution rate f, which is set to 90% in this embodiment, and then determining the K eigenvalues to be retained by using formula (3):

Σ_{j=1}^{K} λ_j / Σ_{i=1}^{m} λ_i ≥ f    (3)

where λ_j denotes the j-th retained eigenvalue of each covariance matrix, K denotes the total number of eigenvalues to be retained in each covariance matrix, λ_i denotes the i-th eigenvalue of each covariance matrix, and m denotes the total number of eigenvalues in each covariance matrix;
3.2.5) forming a transformation basis P from the eigenvectors corresponding to the K retained eigenvalues, and completing the dimension reduction processing with formula (4) according to the transformation basis P;
Y = P^T X    (4)
where Y denotes the image feature after the dimension reduction processing, P^T denotes the transpose of the transformation basis P, and X denotes the centered image feature.
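A NumPy sketch of steps 3.2.1–3.2.5 follows. The feature matrix is assumed to hold one feature dimension per row and one sample per column (shape m × N), so that formula (4), Y = P^T X, applies directly.

```python
import numpy as np

def improved_pca(X: np.ndarray, f: float = 0.90) -> np.ndarray:
    """PCA dimension reduction of steps 3.2.1-3.2.5 (layout assumed m x N)."""
    # 3.2.1) centering: subtract the mean of each dimension
    Xc = X - X.mean(axis=1, keepdims=True)
    # 3.2.2) covariance matrix of dimension m (rows are variables)
    C = np.cov(Xc)
    # 3.2.3) eigenvalues and eigenvectors (eigh, since C is symmetric)
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]            # sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # 3.2.4) smallest K whose cumulative contribution rate reaches f, formula (3)
    ratio = np.cumsum(eigvals) / eigvals.sum()
    K = int(np.searchsorted(ratio, f) + 1)
    # 3.2.5) transformation basis P and projection Y = P^T X, formula (4)
    P = eigvecs[:, :K]
    return P.T @ Xc
```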
Step 4: inputting the image features Y subjected to the dimension reduction processing into a Softmax classifier to complete the classification processing, and outputting the image classification result as the real classification result;
Step 5: selecting the cross entropy loss function as the loss function of the deep convolutional neural network model during training, and calculating the difference H(p, q) between the predicted classification result and the real result according to the cross entropy loss function, which is expressed as:

H(p, q) = -Σ_m p(m) log q(m)    (2)

where p(m) denotes the predicted classification result and q(m) denotes the real classification result;
Step 6: if the difference H(p, q) between the predicted classification result and the real result is larger than the expected difference H'(p, q), back-propagating the cross entropy loss function through the back propagation algorithm of the deep convolutional network and continuously adjusting the network weights w_uv, until the difference H(p, q) between the predicted classification result and the real result is less than or equal to the expected difference H'(p, q), or a preset number of training iterations M is reached, where u denotes the u-th neuron of the previous layer and v denotes the v-th neuron of the next layer.
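A hedged PyTorch sketch of the training loop of steps 4–6 follows. The optimizer, learning rate and the expected-difference threshold h_expected are illustrative assumptions; the embodiment fixes only the cross entropy loss and the two stopping criteria.

```python
import torch
import torch.nn as nn

def train(model: nn.Module, loader, M: int = 100, h_expected: float = 0.01):
    """Train until H(p, q) <= H'(p, q) or M training iterations are reached."""
    criterion = nn.CrossEntropyLoss()    # Softmax + cross entropy, formula (2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # assumed optimizer
    for epoch in range(M):
        epoch_loss = 0.0
        for images, labels in loader:
            optimizer.zero_grad()
            logits = model(images)            # predicted classification result
            loss = criterion(logits, labels)  # difference H(p, q)
            loss.backward()                   # back propagation
            optimizer.step()                  # adjust the network weights w_uv
            epoch_loss += loss.item()
        if epoch_loss / len(loader) <= h_expected:
            break                             # H(p, q) <= H'(p, q)
```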
As shown in FIG. 2 and FIG. 3, the technical scheme can effectively reduce the feature dimension and the amount of computation in the feature extraction stage and improve the image classification effect; both the image classification time and the accuracy are obviously improved after the improved PCA dimension reduction processing.