Deep network image classification method based on improved PCA (principal component analysis)
Technical Field
The invention belongs to the technical field of image processing and pattern recognition, and particularly relates to a deep network image classification method based on improved PCA.
Background
With the arrival of the cloud era, big data attracts more and more attention. Images, as the main expression form of data information, have become an important means for people to acquire information owing to their rich content and intuitive presentation, and their quantity is growing at an astonishing speed. However, as image data increase, the disorder of image information becomes increasingly prominent. How to automatically identify, retrieve and classify massive image data using artificial intelligence technology has become a research focus in the field of computer vision recognition.
Traditional image classification methods, such as the Scale-Invariant Feature Transform (SIFT) and the Histogram of Oriented Gradients (HOG), have shallow structures and a small amount of computation, and can complete model training and analysis without requiring a large number of images. However, traditional models cannot acquire higher-level semantic features and depth features from the original image, and image features are not easy to extract when the data are large. With the rise of deep networks, many excellent image classification methods based on deep learning have emerged, for example AlexNet, VGGNet, GoogLeNet and ResNet. Deep learning recognition methods can obtain deeper image features, so the image feature expression is richer and the image feature extraction more accurate, yielding excellent classification results; the classification effect of some deep networks even exceeds human accuracy.
Image classification is widely applied in many fields; its applications in biometric recognition, intelligent transportation, medical auxiliary diagnosis and the like bring great convenience to our lives. However, deep learning methods still require a large number of images as a basis and suffer from a large amount of computation, long model training time, high requirements on the hardware environment and a long classification process. As these problems are solved, deep learning will play a greater role in the image classification field.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a deep network image classification method based on improved PCA, which comprises the following steps:
Step 1: inputting m images in the CIFAR-100 image dataset into a deep convolutional neural network model, carrying out grayscale and filtering preprocessing, and eliminating noise interference to obtain the original features of each image;
Step 2: constructing a feature extraction module in the deep convolutional neural network model, and extracting the image features of each image by using the constructed feature extraction module;
Step 3: carrying out improved PCA dimension reduction processing on the image features of each image, where the improved PCA dimension reduction processing means that the image features of each image are first preliminarily screened through the image information entropy, and PCA dimension reduction processing is then carried out on the preliminarily screened image features, specifically expressed as follows:
Step 3.1: calculating the image information entropy H of the extracted image features by using formula (1), and preliminarily screening the image features according to an information entropy threshold:

H = -Σ_{s=1}^{n} p_s log2 p_s    (1)

where H denotes the image information entropy of the image feature, p_s denotes the probability corresponding to the s-th gray value in each image, and n denotes the total number of gray values of each input image;
Step 3.2: carrying out PCA dimension reduction processing on the image features of each image obtained by preliminary screening;
Step 4: inputting the image features Y subjected to the dimension reduction processing into a Softmax classifier to complete the classification processing, and outputting the image classification result as the real classification result;
Step 5: selecting the cross entropy loss function as the loss function of the deep convolutional neural network model during training, and calculating the difference H(p, q) between the predicted classification result and the real result according to the cross entropy loss function, which is expressed as:

H(p, q) = -Σ_m p(m) log q(m)    (2)

where p(m) denotes the predicted classification result and q(m) denotes the real classification result;
Step 6: if the difference H(p, q) between the predicted classification result and the real result is larger than the expected difference H'(p, q), back-propagating the cross entropy loss function through the back propagation algorithm of the deep convolutional network and continuously adjusting the network weights w_uv, until the difference H(p, q) between the predicted classification result and the real result is less than or equal to the expected difference H'(p, q), or a preset number of training iterations M is reached, where u denotes the u-th neuron of the previous layer and v denotes the v-th neuron of the next layer.
The feature extraction module in step 2 is specifically expressed as follows: the deep convolutional neural network model is designed with 13 layers according to the CIFAR-100 image dataset, comprising layers L2~L14, where L1 denotes the input layer; the numbers of convolution kernels of layers L2~L14 are 64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512 and 512 in sequence; the convolution kernel size of layers L2~L14 is 3 × 3, the pooling mode is Maxpool, and the activation function is ReLU.
Step 3.2 is specifically expressed as:
3.2.1) centering the image feature matrix of each image obtained by preliminary screening;
3.2.2) calculating the covariances between the different dimensions of the centered image features, and forming a covariance matrix of dimension m;
3.2.3) calculating the eigenvalues of each covariance matrix and their corresponding eigenvectors;
3.2.4) first determining the value of the contribution rate f according to the actual situation, and then determining the K eigenvalues to be retained by using formula (3):

Σ_{j=1}^{K} λ_j / Σ_{i=1}^{m} λ_i ≥ f    (3)

where λ_j denotes the j-th retained eigenvalue of each covariance matrix, K denotes the total number of eigenvalues to be retained in each covariance matrix, λ_i denotes the i-th eigenvalue of each covariance matrix, and m denotes the total number of eigenvalues in each covariance matrix;
3.2.5) forming a transformation basis P from the eigenvectors corresponding to the K retained eigenvalues, and completing the dimension reduction processing with formula (4) according to the transformation basis P;
Y = P^T X    (4)
where Y denotes the image feature after the dimension reduction processing, P^T denotes the transpose of the transformation basis P, and X denotes the centered image feature.
The invention has the beneficial effects that:
1) The invention introduces PCA dimension reduction on the basis of the deep convolutional neural network, reducing the feature dimension.
2) The invention further introduces the image information entropy, effectively reducing the huge amount of computation and hardware occupation caused by calculating the covariance matrix in the PCA dimension reduction process.
3) The invention uses Softmax to replace the fully connected layers of a traditional deep network, effectively improving the classification speed.
Drawings
FIG. 1 is a schematic structural diagram of a deep convolutional neural network model based on improved PCA in the present invention.
FIG. 2 is a comparison graph of image classification time before and after the improved PCA dimensionality reduction process of the present invention.
FIG. 3 is a comparison graph of classification accuracy of 4 experimental images before and after the improved PCA dimensionality reduction process in the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific examples.
As shown in FIG. 1, which is a structural schematic diagram of the built deep convolutional neural network model based on improved PCA, the deep network image classification method based on improved PCA first extracts features of an input image using a deep convolutional neural network; it then screens the features by calculating the image information entropy and setting an image information entropy threshold, performs dimension reduction on the screened features through PCA to further simplify the image features and improve the feature quality, and finally sends the reduced features to a Softmax classifier to realize classification. The method specifically comprises the following steps:
Step 1: 5000 images in the CIFAR-100 image dataset are input into the deep convolutional neural network model, and grayscale and filtering preprocessing is carried out to eliminate noise interference and obtain the original features of each image; a different classification label is attached to each image and used as the predicted classification result of the original features;
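A minimal preprocessing sketch in Python follows. The use of OpenCV and of a 3 × 3 Gaussian filter are assumptions for illustration; the embodiment specifies only that grayscale and filtering preprocessing are applied.

```python
import cv2
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Grayscale conversion followed by noise-suppressing filtering.

    The filter type and kernel size are illustrative assumptions; the
    embodiment only states that grayscale and filtering preprocessing
    are applied to the CIFAR-100 images.
    """
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)   # grayscale preprocessing
    denoised = cv2.GaussianBlur(gray, (3, 3), 0)     # filtering to remove noise
    return denoised
```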
Step 2: constructing a feature extraction module in the deep convolutional neural network model, and extracting the image features of each image by using the constructed feature extraction module;
As shown in FIG. 1, the feature extraction module is specifically expressed as follows: the deep convolutional neural network model is designed with 13 layers according to the CIFAR-100 image dataset, comprising layers L2~L14, where L1 denotes the input layer; the numbers of convolution kernels of layers L2~L14 are 64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512 and 512 in sequence; the convolution kernel size of layers L2~L14 is 3 × 3, the pooling mode is Maxpool, and the activation function is ReLU.
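A sketch of this 13-layer feature extraction module in PyTorch follows. The channel counts, 3 × 3 kernels, Maxpool and ReLU come from the embodiment; the pooling positions and the padding of 1 are assumptions, following the VGG-style pattern that the channel counts suggest.

```python
import torch.nn as nn

# Channel counts of the 13 convolutional layers L2-L14 listed above.
CHANNELS = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]
# Assumed pooling positions (after conv layers 2, 4, 7, 10 and 13); the
# embodiment fixes the pooling mode (Maxpool) but not where pooling occurs.
POOL_AFTER = {2, 4, 7, 10, 13}

def build_feature_extractor(in_channels: int = 3) -> nn.Sequential:
    layers = []
    c_in = in_channels
    for idx, c_out in enumerate(CHANNELS, start=1):
        layers.append(nn.Conv2d(c_in, c_out, kernel_size=3, padding=1))
        layers.append(nn.ReLU(inplace=True))        # activation function: ReLU
        if idx in POOL_AFTER:
            layers.append(nn.MaxPool2d(2, 2))       # pooling mode: Maxpool
        c_in = c_out
    return nn.Sequential(*layers)
```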
Step 3: carrying out improved PCA dimension reduction processing on the image features of each image, where the improved PCA dimension reduction processing means that the image features of each image are first preliminarily screened through the image information entropy, and PCA dimension reduction processing is then carried out on the preliminarily screened image features, specifically expressed as follows:
Step 3.1: calculating the image information entropy H of the extracted image features by using formula (1), and preliminarily screening the image features according to an information entropy threshold, which is set to 0.15 in this embodiment:

H = -Σ_{s=1}^{n} p_s log2 p_s    (1)

where H denotes the image information entropy of the image feature, p_s denotes the probability corresponding to the s-th gray value in each image, and n denotes the total number of gray values of each input image;
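The following Python sketch computes formula (1) and applies the screening of step 3.1. Rescaling each feature map to 8-bit gray levels before building the histogram is an assumption made so that p_s can be estimated from a gray-value histogram.

```python
import numpy as np

def image_entropy(feat: np.ndarray) -> float:
    """Image information entropy H = -sum_s p_s * log2(p_s), formula (1)."""
    f = feat.astype(np.float64) - feat.min()
    if f.max() > 0:
        f = f / f.max()
    gray = (f * 255).astype(np.uint8)            # assumed 8-bit gray levels
    hist = np.bincount(gray.ravel(), minlength=256)
    p = hist / hist.sum()                        # probability of each gray value
    p = p[p > 0]                                 # skip zero-probability values
    return float(-(p * np.log2(p)).sum())

def screen_features(feature_maps, threshold=0.15):
    """Step 3.1: keep only features whose entropy exceeds the threshold."""
    return [f for f in feature_maps if image_entropy(f) > threshold]
```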
Step 3.2: carrying out PCA dimension reduction processing on the image features of each image obtained by preliminary screening, specifically expressed as follows:
3.2.1) centering the image feature matrix of each image obtained by preliminary screening;
3.2.2) calculating the covariances between the different dimensions of the centered image features, and forming a covariance matrix of dimension m;
3.2.3) calculating the eigenvalues of each covariance matrix and their corresponding eigenvectors;
3.2.4) determining the contribution rate f, which is set to 90% in this embodiment, and then determining the K eigenvalues to be retained by using formula (3):

Σ_{j=1}^{K} λ_j / Σ_{i=1}^{m} λ_i ≥ f    (3)

where λ_j denotes the j-th retained eigenvalue of each covariance matrix, K denotes the total number of eigenvalues to be retained in each covariance matrix, λ_i denotes the i-th eigenvalue of each covariance matrix, and m denotes the total number of eigenvalues in each covariance matrix;
3.2.5) forming a transformation basis P from the eigenvectors corresponding to the K retained eigenvalues, and completing the dimension reduction processing with formula (4) according to the transformation basis P;
Y = P^T X    (4)
where Y denotes the image feature after the dimension reduction processing, P^T denotes the transpose of the transformation basis P, and X denotes the centered image feature.
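A NumPy sketch of steps 3.2.1–3.2.5 follows. The feature matrix is assumed to hold one feature dimension per row and one sample per column (shape m × N), so that formula (4), Y = P^T X, applies directly.

```python
import numpy as np

def improved_pca(X: np.ndarray, f: float = 0.90) -> np.ndarray:
    """PCA dimension reduction of steps 3.2.1-3.2.5 (layout assumed m x N)."""
    # 3.2.1) centering: subtract the mean of each dimension
    Xc = X - X.mean(axis=1, keepdims=True)
    # 3.2.2) covariance matrix of dimension m (rows are variables)
    C = np.cov(Xc)
    # 3.2.3) eigenvalues and eigenvectors (eigh, since C is symmetric)
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]            # sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # 3.2.4) smallest K whose cumulative contribution rate reaches f, formula (3)
    ratio = np.cumsum(eigvals) / eigvals.sum()
    K = int(np.searchsorted(ratio, f) + 1)
    # 3.2.5) transformation basis P and projection Y = P^T X, formula (4)
    P = eigvecs[:, :K]
    return P.T @ Xc
```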
Step 4: inputting the image features Y subjected to the dimension reduction processing into a Softmax classifier to complete the classification processing, and outputting the image classification result as the real classification result;
Step 5: selecting the cross entropy loss function as the loss function of the deep convolutional neural network model during training, and calculating the difference H(p, q) between the predicted classification result and the real result according to the cross entropy loss function, which is expressed as:

H(p, q) = -Σ_m p(m) log q(m)    (2)

where p(m) denotes the predicted classification result and q(m) denotes the real classification result;
Step 6: if the difference H(p, q) between the predicted classification result and the real result is larger than the expected difference H'(p, q), back-propagating the cross entropy loss function through the back propagation algorithm of the deep convolutional network and continuously adjusting the network weights w_uv, until the difference H(p, q) between the predicted classification result and the real result is less than or equal to the expected difference H'(p, q), or a preset number of training iterations M is reached, where u denotes the u-th neuron of the previous layer and v denotes the v-th neuron of the next layer.
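A hedged PyTorch sketch of the training loop of steps 4–6 follows. The optimizer, learning rate and the expected-difference threshold h_expected are illustrative assumptions; the embodiment fixes only the cross entropy loss and the two stopping criteria.

```python
import torch
import torch.nn as nn

def train(model: nn.Module, loader, M: int = 100, h_expected: float = 0.01):
    """Train until H(p, q) <= H'(p, q) or M training iterations are reached."""
    criterion = nn.CrossEntropyLoss()    # Softmax + cross entropy, formula (2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # assumed optimizer
    for epoch in range(M):
        epoch_loss = 0.0
        for images, labels in loader:
            optimizer.zero_grad()
            logits = model(images)            # predicted classification result
            loss = criterion(logits, labels)  # difference H(p, q)
            loss.backward()                   # back propagation
            optimizer.step()                  # adjust the network weights w_uv
            epoch_loss += loss.item()
        if epoch_loss / len(loader) <= h_expected:
            break                             # H(p, q) <= H'(p, q)
```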
As shown in FIG. 2 and FIG. 3, the technical scheme can effectively reduce the feature dimension and the amount of computation in the feature extraction stage and improve the image classification effect; both the image classification time and the accuracy are obviously improved after the improved PCA dimension reduction processing.