
CN112396123A - Image recognition method, system, terminal and medium based on convolutional neural network - Google Patents


Info

Publication number
CN112396123A
Authority
CN
China
Prior art keywords
neural network
convolutional neural
image
image recognition
path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011382932.9A
Other languages
Chinese (zh)
Inventor
方堃
杨杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiao Tong University
Original Assignee
Shanghai Jiao Tong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiao Tong University
Priority to CN202011382932.9A
Publication of CN112396123A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image recognition method, system, terminal and medium based on a convolutional neural network. The method comprises: training, with training images, a convolutional neural network model that performs an image recognition task; and inputting an image to be recognized into the convolutional neural network model and outputting an image recognition result. The convolutional neural network model comprises a convolutional neural network in which an orthogonal multipath block is embedded; the orthogonal multipath block structure comprises a plurality of paths, and the parameters on the paths are orthogonal to each other, which increases the robustness of the convolutional neural network. The method addresses the fragile robustness of common neural networks in image recognition tasks and achieves high model robustness while maintaining high image recognition accuracy.

Description

Image recognition method, system, terminal and medium based on convolutional neural network
Technical Field
The invention belongs to the technical field of image processing and pattern recognition, and particularly relates to an image recognition method, an image recognition system, a terminal and a medium based on a convolutional neural network.
Background
In the field of image processing and pattern recognition, one of the most common tasks is image recognition. In a classical image recognition dataset such as CIFAR10, the images fall into 10 categories: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships and trucks. Larger datasets such as ImageNet contain on the order of 15 million images in roughly 20,000 categories. The image recognition task is essentially a classification task: researchers need to find an effective classifier that assigns each image to the true category to which it belongs. Early work on image recognition relied on simple classical image processing methods such as Gaussian blur and feature pyramid extraction, often combined with one another and with prior knowledge, and in the end yielded only recognition methods of limited performance.
In recent years, with the advent of large-scale datasets and advances in the computing power of graphics processing units, neural network models, thanks to their powerful learning capabilities, have been applied more and more widely across research fields including computer vision, natural language processing and recommendation systems. Since the introduction of neural network models, image recognition has also developed rapidly. The network structures used for image recognition have evolved from the earliest multi-layer perceptron (MLP), to the cascaded convolutional neural network (CNN), to the residual network (ResNet) with residual connections; network depth has grown from shallow 5-layer networks to residual networks as deep as 152 layers; and on CIFAR10 and ImageNet, researchers keep setting new recognition accuracy records with novel structures and deeper networks.
At present, in engineering practice, training a well-performing image classifier based on a convolutional neural network is not difficult. However, researchers have found that the generalization performance of neural networks is very fragile in certain situations. Take image recognition as an example. A fully trained network generalizes well: it achieves a high recognition rate on the training data and good recognition accuracy on unseen test data. Yet if carefully designed modifications are applied to images from the training or test data, for example a small amount of noise or changes at the level of individual pixels, the modified images remain visually indistinguishable from the originals, so a human still recognizes and classifies them correctly, while the neural network assigns them wrong classes with a very high degree of confidence. These modified images are called adversarial examples, and the process of generating them is called an adversarial attack. The question of how well a neural network recognizes adversarial examples has given rise to research on neural network robustness, which also helps to explore the nature of neural networks and is therefore of great significance.
Disclosure of Invention
Aiming at the problem that the robustness of convolutional neural networks is generally fragile in image recognition tasks, the invention provides an image recognition method, a system, a terminal and a medium based on the convolutional neural network.
In a first aspect of the present invention, an image recognition method based on a convolutional neural network is provided, including:
training a convolutional neural network model for executing an image recognition task by adopting a training image;
inputting an image to be identified into the convolutional neural network model, and outputting an image identification result;
the convolutional neural network model comprises a convolutional neural network, an orthogonal multipath block is embedded in the convolutional neural network, the orthogonal multipath block structure comprises a plurality of paths, and parameters on each path are orthogonal to each other, so that the robustness of the convolutional neural network is improved.
Optionally, training the convolutional neural network model that performs the image recognition task includes:
S11, acquiring a batch of training images with category labels;
S12, initializing a convolutional neural network and embedding an orthogonal multipath block in it to increase the robustness of the convolutional neural network;
S13, randomly selecting a mini-batch of images from all the images in S11 and inputting it into the convolutional neural network, wherein each path in the orthogonal multipath block outputs a predicted category for each image;
S14, for each path, calculating the difference between the predicted categories it outputs and the true categories of the images, and taking a weighted average of the differences calculated for all paths;
S15, updating the network parameters by gradient descent according to the calculated average difference;
S16, repeating steps S13 to S15 until the average difference converges or a preset, sufficiently large number of repetitions is reached, and then stopping training, thereby obtaining a trained neural network model.
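As an illustration of step S14, the following minimal sketch computes a weighted average of per-path losses. It assumes softmax cross-entropy as the measure of "difference", which the text does not specify; the function names and the `weights` argument are hypothetical.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample; logits is a list of K scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def weighted_path_loss(path_logits, label, weights=None):
    """Weighted average of the per-path losses (step S14).

    path_logits: one list of K logits per path.
    weights: hypothetical per-path weights; uniform if omitted.
    """
    n = len(path_logits)
    if weights is None:
        weights = [1.0] * n
    losses = [cross_entropy(z, label) for z in path_logits]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)

# Three paths scoring one image over K = 3 classes; the true class is 0.
paths = [[2.0, 0.5, 0.1], [1.5, 1.4, 0.2], [0.3, 2.2, 0.1]]
loss = weighted_path_loss(paths, label=0)
```

The resulting scalar is the quantity that step S15 would differentiate with respect to the network parameters.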
Optionally, the orthogonal multipath block is embedded in any position of the convolutional neural network, specifically determined according to specific service use requirements.
Optionally, the orthogonal multi-path block is embedded in the last linear layer of the convolutional neural network, each path in the block is a linear layer, linear layer parameters on the paths are orthogonal to each other, and the linear layers share the previous layer of the network.
Optionally, the orthogonal multipath block is embedded in a convolutional layer of the convolutional neural network; each path in the block is then a convolutional layer, the convolutional layer parameters on the paths are orthogonal to each other, and the convolutional layers share the rest of the network.
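A quick way to check whether the parameters on the paths are mutually orthogonal is sketched below. It assumes that each path's weights (a linear-layer matrix or a convolution kernel) are flattened into one vector and that orthogonality means a zero inner product between those vectors; the helper names are hypothetical.

```python
def flatten(param):
    """Flatten an arbitrarily nested list of numbers into one flat list."""
    if isinstance(param, (int, float)):
        return [float(param)]
    out = []
    for item in param:
        out.extend(flatten(item))
    return out

def inner(u, v):
    """Inner product of two equal-length flat vectors."""
    return sum(a * b for a, b in zip(u, v))

def max_abs_inner_product(path_params):
    """Largest |<w_i, w_j>| over all pairs of distinct paths."""
    flat = [flatten(p) for p in path_params]
    worst = 0.0
    for i in range(len(flat)):
        for j in range(i + 1, len(flat)):
            worst = max(worst, abs(inner(flat[i], flat[j])))
    return worst

# Two 2x2 weight matrices that are orthogonal as flattened vectors.
w1 = [[1.0, 0.0], [0.0, 1.0]]
w2 = [[0.0, 1.0], [-1.0, 0.0]]
violation = max_abs_inner_product([w1, w2])
```

A value close to zero indicates that the orthogonality constraint holds for every pair of paths.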
Optionally, the inputting the image to be recognized into the convolutional neural network model, and outputting an image recognition result, includes:
S21, deploying the convolutional neural network model to a business machine;
S22, inputting the image to be recognized into the convolutional neural network model, wherein each path in the convolutional neural network model outputs a prediction result for the image;
S23, taking the prediction result that occurs most often among the prediction results of the paths as the final prediction result for the image.
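The voting rule in S23 can be sketched as follows. The tie-breaking behaviour (the class seen first wins) is an assumption of this sketch, since the text does not say how ties are resolved.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the class predicted by the most paths (the mode)."""
    # Counter.most_common orders equal counts by first occurrence,
    # so a tie goes to the tied class that appears first in the list.
    return Counter(predictions).most_common(1)[0][0]

# Five paths vote on one image; class 3 is predicted three times.
final_label = majority_vote([3, 5, 3, 7, 3])
```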
Optionally, the method further comprises: before training and recognition, performing preprocessing and/or image enhancement operations on the training images and the images to be recognized, wherein:
the preprocessing comprises scaling the images to the same size and normalizing the image pixel values;
the image enhancement operation comprises padding the image edges with zero-valued pixels and then cropping the image, and randomly flipping the image horizontally.
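The preprocessing and enhancement operations can be sketched on a toy grayscale image stored as a list of rows. Dividing by 255, the square image shape, and the function names are assumptions made for illustration only.

```python
import random

def normalize(image, max_value=255.0):
    """Scale pixel values into [0, 1]; dividing by 255 is an assumed convention."""
    return [[p / max_value for p in row] for row in image]

def zero_pad(image, pad):
    """Surround a 2-D image (list of rows) with `pad` zero-valued pixels."""
    width = len(image[0]) + 2 * pad
    border = [[0] * width for _ in range(2 * pad)]
    middle = [[0] * pad + list(row) + [0] * pad for row in image]
    return border[:pad] + middle + border[pad:]

def random_crop(image, size, rng):
    """Cut a size x size window at a uniformly random position."""
    top = rng.randint(0, len(image) - size)
    left = rng.randint(0, len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]]

def random_hflip(image, rng, p=0.5):
    """Mirror the image left-to-right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [row[:] for row in image]

def augment(image, pad, rng):
    """Pad, crop back to the original (square) size, then possibly flip."""
    size = len(image)
    return random_hflip(random_crop(zero_pad(image, pad), size, rng), rng)

rng = random.Random(0)
image = [[10, 20], [30, 40]]
augmented = augment(image, pad=1, rng=rng)
```

Only `normalize` (and resizing, omitted here) would be applied to images at recognition time; the random operations belong to training.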
In a second aspect of the present invention, there is provided an image recognition system based on a convolutional neural network, comprising:
a training module for training a convolutional neural network model for executing an image recognition task by using a training image;
the recognition module inputs the image to be recognized into the convolutional neural network model and outputs an image recognition result;
the convolutional neural network model comprises a convolutional neural network, an orthogonal multipath block is embedded in the convolutional neural network, the orthogonal multipath block structure comprises a plurality of paths, and parameters on each path are orthogonal to each other, so that the robustness of the convolutional neural network is improved.
In a third aspect of the present invention, there is provided an electronic terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the computer program to perform the image recognition method.
In a fourth aspect of the invention, a computer-readable storage medium is provided, on which a computer program is stored, which program, when being executed by a processor, is adapted to carry out the above-mentioned image recognition method.
Compared with the prior art, the embodiment of the invention has at least one of the following beneficial effects:
the embodiment of the invention solves the problem that the robustness of the current common neural network under the image recognition task is very fragile, and has very high model robustness while maintaining the high accuracy of the image recognition.
According to the embodiment of the invention, the orthogonal constraint is applied to the parameters on each path in the orthogonal multi-path block, so that the rest part in the neural network can be simultaneously adapted to the mutually orthogonal paths, the convolutional neural network can learn more stable characteristics, the image after malicious modification can still be kept at a higher identification accuracy rate, and the robustness of the network is enhanced.
The embodiment of the invention researches the influence of orthogonal multipath blocks placed at different positions in a convolutional neural network on the network robustness, the network robustness characteristics corresponding to the orthogonal multipath blocks at different positions are different, and the characteristics can guide the specific deployment and application of a convolutional neural network model under the requirements of different service scenes for image recognition.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a training process according to an embodiment of the present invention.
FIG. 3 is a flow chart of a testing process according to an embodiment of the present invention.
Fig. 4 is a partial comparison diagram of a conventional network and a network in which orthogonal multipath blocks are embedded in the present invention.
Fig. 5a, 5b, and 5c are schematic diagrams illustrating the embedding of orthogonal multipath blocks at different positions in a neural network according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating a method implementation of an embodiment of the invention.
Fig. 7 is a schematic diagram of a deployment manner of a specific application scenario according to an embodiment of the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that persons skilled in the art can make variations and modifications without departing from the spirit of the invention, and all of these fall within the scope of the present invention.
Referring to fig. 1, in the image recognition method in the embodiment of the present invention, a convolutional neural network model is used as an image classifier, an image to be classified is input, and a category of the image is output. Specifically, an image recognition method based on a convolutional neural network includes:
S100, training a convolutional neural network model for executing an image recognition task by adopting a training image;
S200, inputting the image to be identified into a convolutional neural network model, and outputting an image identification result;
the convolutional neural network model comprises a convolutional neural network, an orthogonal multi-path block is embedded in the convolutional neural network, the orthogonal multi-path block structure comprises a plurality of paths, and parameters on each path are orthogonal to each other, so that the robustness of the convolutional neural network is improved.
The embodiment of the invention solves the problem that the robustness of the current common neural network under the image recognition task is very fragile, and has very high model robustness while maintaining the high accuracy of the image recognition.
In another preferred embodiment, the image recognition method based on the convolutional neural network comprises a training phase and a testing phase. Firstly, training a convolutional neural network on image data with labels, greatly enhancing the robustness of the network by embedding an orthogonal multipath block structure in the network, then deploying the trained network into actual services, and executing an image recognition task on images needing to be classified.
Specifically, referring to fig. 2, the training phase in the preferred embodiment may include the following steps:
step one, acquiring a batch of training image data with category labels;
after the training image data is obtained, preprocessing and image enhancement operations can be applied to it, wherein the preprocessing comprises scaling the images to the same size and normalizing the image pixel values, and the image enhancement comprises padding the image edges with zero-valued pixels, cropping, and randomly flipping the image horizontally;
step two, initializing a convolutional neural network, and embedding an orthogonal multi-path block structure in the convolutional neural network according to specific service use requirements;
step three, randomly taking a mini-batch of images from all the image data and inputting it into the convolutional neural network, wherein each path in the orthogonal multi-path block in the network outputs a predicted category for each image;
step four, for each path, calculating the difference between the predicted categories it outputs and the true categories of the images, and taking a weighted average of the differences calculated for all paths;
step five, updating the network parameters by gradient descent according to the calculated average difference;
step six, repeating steps three to five until the average difference converges or a preset, sufficiently large number of repetitions is reached, and then stopping training, thereby obtaining a trained convolutional neural network model.
Referring to fig. 3, the testing phase in the preferred embodiment includes the following steps:
step one, deploying the convolutional neural network model obtained in the training stage to a business machine;
step two, acquiring each image whose category needs to be identified; these images may be subjected to the same preprocessing operations as in the first step of the training stage;
step three, loading the convolutional neural network model trained in the training stage and inputting the preprocessed image to be recognized into it, wherein each path in the convolutional neural network model outputs a prediction result for the image;
step four, taking the mode of the path prediction results, namely the prediction result that occurs most often, as the final prediction result for the image.
Referring to fig. 3, based on the above embodiment, preferably, the convolutional neural network structure including an Orthogonal Multi-Path block (OMP block) in the third step of the training stage is specifically: an orthogonal multipath block is embedded in a classical convolutional neural network structure, parameters on each path in the block are constrained to be orthogonal, and the orthogonal multipath block can be embedded at any position in the network. For example, if the orthogonal multi-path block is embedded in the last linear layer of the network, each path in the block is a linear layer, the linear layer parameters on the paths are orthogonal to each other, and the linear layers share the previous layer of the network; if orthogonal multipaths are embedded in convolutional layers of the network, each path in the block is a convolutional layer, convolutional layer parameters on the paths are orthogonal to each other, and the convolutional layers share the rest of the network. These specific locations are selected according to actual business requirements, determining the final neural network model structure. According to the embodiment of the invention, the orthogonal constraint is applied to the parameters on each path in the orthogonal multi-path block, so that the rest part in the neural network can be simultaneously adapted to the mutually orthogonal paths, the convolutional neural network can learn more stable characteristics, the image after malicious modification can still be kept at a higher identification accuracy rate, and the robustness of the network is enhanced.
Referring to fig. 4, a partial comparison of a conventional network and a network with embedded orthogonal multipath blocks, where the orthogonal multipath blocks comprise multiple paths and the parameters on each path are constrained to be orthogonal to each other.
Fig. 5a, 5b, 5c show the network structure after embedding orthogonal multipath blocks at three different locations of the convolutional neural network. The orthogonal multipath block is embedded in the first layer of convolution, the middle layer of convolution and the last linear layer of the convolutional neural network from top to bottom in sequence.
In another preferred embodiment, the detailed description is based on the case in which the orthogonal multipath block is placed at the last layer of the network; placing the block at other positions uses a similar training method, which is not described again here. First, some notation: let the last linear classification layer be denoted $g(\cdot)$ and the remaining part of the network $h(\cdot)$, so that the entire network can be written as $g(h(\cdot)): \mathbb{R}^d \to \mathbb{R}^K$, where $d$ and $K$ are the dimensions of the network input and output, respectively.
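The composition of a shared part h with several last-layer paths g_i can be sketched with a toy network. Here h is assumed to be a single linear map followed by ReLU, and each g_i a bias-free linear layer; the real h would be the full convolutional backbone, and all names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def relu(v):
    return [max(0.0, a) for a in v]

def h(x, backbone_weight):
    """Shared part of the network: a toy stand-in for the backbone."""
    return relu(matvec(backbone_weight, x))

def forward_all_paths(x, backbone_weight, head_weights):
    """Return g_i(h(x)) for every path i; each result has K entries."""
    features = h(x, backbone_weight)  # computed once, shared by all paths
    return [matvec(w, features) for w in head_weights]

# d = 3 inputs, 2 hidden features, K = 2 classes, L = 2 paths.
x = [1.0, 2.0, 3.0]
backbone = [[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]]
heads = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [-1.0, 0.0]]]
outputs = forward_all_paths(x, backbone, heads)
```

Because the features are computed once, adding paths only adds the cost of the extra last layers.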
The embodiment of the invention researches the influence of orthogonal multipath blocks placed at different positions in a convolutional neural network on the network robustness, the network robustness characteristics corresponding to the orthogonal multipath blocks at different positions are different, and the characteristics can guide the specific deployment and application of a convolutional neural network model under the requirements of different service scenes for image recognition.
Referring to FIG. 6, a flowchart of an embodiment is shown, which includes data preparation, model training, and model testing (deployment). The data preparation mainly refers to the collection, labeling, preprocessing and data enhancement of training data, model training is to obtain a convolutional neural network model for image recognition, and model testing is the actual deployment application of the convolutional neural network. Fig. 7 is a schematic diagram of a deployment in a specific application scenario. Specifically, the model training and model testing in this embodiment are described in detail below.
In this embodiment, the model training includes:
S101, taking a batch of image samples from the training image set each time, denoted (x, y);
S102, inputting the batch of images into the convolutional neural network, with the orthogonal multipath block placed at the last layer of the network, carrying out forward propagation of the model, and then calculating the loss function required for training as follows:
$$l_c = \sum_{i=1}^{L} \mathcal{L}\big(g_i(h(x)),\, y\big)$$
$$l_o = \sum_{1 \le i < j \le L} \langle w_i, w_j \rangle^2$$
$$\mathrm{loss} = l_c + \lambda \cdot l_o$$
where $\mathcal{L}(\cdot,\cdot)$ is a loss function measuring the difference between the class that the network predicts for image x on the i-th path, $g_i(h(x))$, and the true class y; $L$ is the number of paths; $w_i$ denotes the parameters on the i-th path; and $\lambda$ is the coefficient that weights $l_o$ against $l_c$. $l_c$ is the sum of the loss functions of the network corresponding to each path; in practical applications other weighted averages can be adopted, and the method is not limited to the simple summation in the formula. $l_o$ is the sum of the squares of the inner products of the parameters on any two paths. Parameters being orthogonal means that their inner product is 0, so using $l_o$ as an objective to be minimized is equivalent to constraining the parameters on any two paths to be orthogonal.
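The orthogonality term can be made concrete with a short sketch. It assumes each path's parameters are given as a flat vector w_i and that the sum runs over unordered pairs i < j; under that assumption the gradient with respect to w_i is the sum over j ≠ i of 2⟨w_i, w_j⟩ w_j. The function names are hypothetical.

```python
def inner(u, v):
    """Inner product of two equal-length flat vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthogonality_penalty(paths):
    """l_o: sum over pairs i < j of <w_i, w_j>^2."""
    total = 0.0
    for i in range(len(paths)):
        for j in range(i + 1, len(paths)):
            total += inner(paths[i], paths[j]) ** 2
    return total

def orthogonality_gradient(paths):
    """Gradient of l_o: for each i, the sum over j != i of 2 <w_i, w_j> w_j."""
    grads = []
    for i, w_i in enumerate(paths):
        g = [0.0] * len(w_i)
        for j, w_j in enumerate(paths):
            if j != i:
                c = 2.0 * inner(w_i, w_j)
                g = [a + c * b for a, b in zip(g, w_j)]
        grads.append(g)
    return grads

def total_loss(l_c, paths, lam):
    """loss = l_c + lambda * l_o."""
    return l_c + lam * orthogonality_penalty(paths)

# Two non-orthogonal paths: <w_1, w_2> = 1, so l_o = 1.
paths = [[1.0, 1.0], [1.0, 0.0]]
l_o = orthogonality_penalty(paths)
grads = orthogonality_gradient(paths)
```

The penalty is zero exactly when every pair of paths has a zero inner product, which is why minimizing it pushes the paths toward orthogonality.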
S103, calculating the gradient of the loss function with respect to the parameters according to the stochastic gradient descent algorithm, and updating the parameters:
$$\theta \leftarrow \theta - \eta \cdot \nabla_{\theta}\, \mathrm{loss}$$
where θ represents all parameters in the network and η represents the learning rate in the stochastic gradient descent algorithm.
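S104, if adversarial training is needed, generating a batch of corresponding adversarial examples $(x_{adv}, y)$ based on the current network;
S105, calculating the loss function on the adversarial examples as follows:
$$l_{c\_adv} = \sum_{i=1}^{L} \mathcal{L}\big(g_i(h(x_{adv})),\, y\big)$$
$$l_o = \sum_{1 \le i < j \le L} \langle w_i, w_j \rangle^2$$
$$\mathrm{loss}_{adv} = l_{c\_adv} + \lambda \cdot l_o$$
where $l_o$ is the same sum of squared inner products of the parameters $w_i$, $w_j$ on any two paths as in S102;
S106, calculating the gradient of the loss function on the adversarial examples with respect to the parameters according to the stochastic gradient descent algorithm, and updating the parameters again:
$$\theta \leftarrow \theta - \eta \cdot \nabla_{\theta}\, \mathrm{loss}_{adv}$$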
and S107, repeating S101-S106 for a plurality of times until a trained convolutional neural network model M is obtained.
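The control flow of S102 to S106 for one batch can be sketched as below. The text does not name the attack used to generate adversarial examples, so a sign-gradient (FGSM-style) perturbation is assumed here, and a one-parameter least-squares model stands in for the network; `grad_fn` and `attack_fn` are hypothetical callables, and in the real method `grad_fn` would return the gradient of the full loss including the orthogonality term.

```python
def sgd_step(theta, grad, lr):
    """theta <- theta - lr * grad, elementwise on flat lists."""
    return [t - lr * g for t, g in zip(theta, grad)]

def train_step(theta, batch, grad_fn, attack_fn, lr, adversarial=True):
    """One pass through S102-S106 for a single batch."""
    theta = sgd_step(theta, grad_fn(theta, batch), lr)          # S102-S103
    if adversarial:
        adv_batch = attack_fn(theta, batch)                     # S104
        theta = sgd_step(theta, grad_fn(theta, adv_batch), lr)  # S105-S106
    return theta

def sign(v):
    return (v > 0) - (v < 0)

def grad_fn(theta, batch):
    """Gradient of the mean squared error of the toy model theta[0] * x."""
    xs, ys = batch
    g = sum(2.0 * (theta[0] * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    return [g]

def attack_fn(theta, batch, eps=0.1):
    """Move each input by eps along the sign of the loss gradient w.r.t. x."""
    xs, ys = batch
    adv = [x + eps * sign(2.0 * (theta[0] * x - y) * theta[0])
           for x, y in zip(xs, ys)]
    return adv, ys

theta = [0.0]
batch = ([1.0], [1.0])
for _ in range(3):  # S107: repeat over batches
    theta = train_step(theta, batch, grad_fn, attack_fn, lr=0.5)
```

Note that the adversarial batch is generated from the parameters as they stand after the clean update, matching the order of S103 and S104.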
After the convolutional neural network model M is obtained, the next model test, that is, the test process, is performed. As shown in fig. 3:
S201, deploying the trained convolutional neural network model M to an image recognition service platform;
S202, when an image whose category needs to be identified is received, first applying the same preprocessing as in the first step of the training stage, without the image enhancement operation;
S203, loading the model M and inputting the preprocessed image to be recognized, $x_{new}$, into the model M, wherein each path in M outputs a prediction result $y_i = g_i(h(x_{new})),\ i = 1, \ldots, L$;
S204, taking the category that occurs most often among $y_1, y_2, \ldots, y_L$ as the final recognition result for the image to be recognized.
According to the embodiment of the invention, the orthogonal multi-path block structure is embedded in the network, so that the robustness of the network can be greatly enhanced, then the trained network is deployed in actual services, and the image recognition task is executed on the image to be classified, so that the problem that the robustness of the current common neural network under the image recognition task is very fragile is solved, and the high accuracy of image recognition can be maintained and the model robustness is very high.
Based on the image recognition method, in another embodiment of the present invention, there is provided an image recognition system based on a convolutional neural network, the system including:
a training module for training a convolutional neural network model for executing an image recognition task by using a training image; the convolutional neural network model comprises a convolutional neural network, an orthogonal multi-path block is embedded in the convolutional neural network, the orthogonal multi-path block structure comprises a plurality of paths, and parameters on each path are orthogonal to each other, so that the robustness of the convolutional neural network is improved;
and the identification module inputs the image to be identified into the convolutional neural network model and outputs an image identification result.
In the above embodiments of the present invention, the convolutional neural network model achieves high recognition accuracy on the image data used for training and, at the same time, excellent recognition performance on test image data that it has never seen.
The implementation of the modules in the above embodiments may specifically refer to the corresponding steps in the above embodiments of the image recognition method, and is not described herein again.
In another embodiment of the present invention, an electronic terminal is further provided, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, and the processor executes the computer program to perform the image recognition method.
In another embodiment of the present invention, there is also provided a computer-readable storage medium having stored thereon a computer program for executing the above-mentioned image recognition method when executed by a processor.
It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, and the like in the system, and those skilled in the art may refer to the technical solution of the system to implement the step flow of the method, that is, the embodiment in the system may be understood as a preferred example for implementing the method, and details are not described herein.
Those skilled in the art will appreciate that, in addition to implementing the system and its various devices provided by the present invention in purely computer readable program code means, the method steps can be fully programmed to implement the same functions by implementing the system and its various devices in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Therefore, the system and various devices thereof provided by the present invention can be regarded as a hardware component, and the devices included in the system and various devices thereof for realizing various functions can also be regarded as structures in the hardware component; means for performing the functions may also be regarded as structures within both software modules and hardware components for performing the methods.
While the present invention has been described in detail with reference to the preferred embodiments, it should be understood that the above description should not be taken as limiting the invention. Various modifications and alterations to this invention will become apparent to those skilled in the art upon reading the foregoing description. Accordingly, the scope of the invention should be determined from the following claims.

Claims (10)

1. An image recognition method based on a convolutional neural network, characterized by comprising:
training, with training images, a convolutional neural network model that performs an image recognition task;
inputting an image to be recognized into the convolutional neural network model and outputting an image recognition result;
wherein the convolutional neural network model comprises a convolutional neural network in which an orthogonal multi-path block is embedded, the orthogonal multi-path block comprises multiple paths, and the parameters on the paths are mutually orthogonal, which increases the robustness of the convolutional neural network.

2. The image recognition method based on a convolutional neural network according to claim 1, characterized in that training the convolutional neural network model that performs the image recognition task comprises:
S11, obtaining a set of training images with category labels;
S12, initializing a convolutional neural network and embedding an orthogonal multi-path block in it to increase the robustness of the convolutional neural network;
S13, randomly drawing a mini-batch of images from all the images of S11 and inputting it into the convolutional neural network, each path in the orthogonal multi-path block outputting a predicted image category for each image;
S14, for each path, computing the difference between the image categories it predicts and the true categories of the batch, and taking a weighted average of the differences computed over all paths;
S15, updating the network parameters by gradient descent according to the computed average difference;
S16, repeating S13 to S15 until the average difference converges, or until a preset, sufficiently large number of repetitions is reached, at which point training stops, thereby obtaining a trained neural network model.

3. The image recognition method based on a convolutional neural network according to claim 2, characterized in that the orthogonal multi-path block may be embedded at any position in the convolutional neural network, the specific position being determined by the requirements of the actual application.

4. The image recognition method based on a convolutional neural network according to claim 3, characterized in that, when the orthogonal multi-path block is embedded at the last linear layer of the convolutional neural network, each path in the block is a linear layer, the parameters of the linear layers on these paths are mutually orthogonal, and these linear layers share the preceding layers of the network.

5. The image recognition method based on a convolutional neural network according to claim 3, characterized in that, when the orthogonal multi-path block is embedded at a convolutional layer of the convolutional neural network, each path in the block is a convolutional layer, the parameters of the convolutional layers on these paths are mutually orthogonal, and these convolutional layers share the rest of the network.

6. The image recognition method based on a convolutional neural network according to claim 1, characterized in that inputting the image to be recognized into the convolutional neural network model and outputting the image recognition result comprises:
S21, deploying the convolutional neural network model on a production machine;
S22, inputting the image to be recognized into the convolutional neural network model, each path in the model outputting a prediction result for the image;
S23, taking the prediction result that occurs most often among the prediction results of the paths as the final prediction result for the image.

7. The image recognition method based on a convolutional neural network according to claim 1, characterized by further comprising: before training and recognition, performing preprocessing and/or image enhancement operations on the training images and the image to be recognized, wherein
the preprocessing comprises scaling the images to the same size and normalizing the image pixel values;
the image enhancement operations comprise padding the image edges with zero-valued pixels and then cropping, and randomly flipping the image horizontally.

8. An image recognition system based on a convolutional neural network, characterized by comprising:
a training module, which trains, with training images, a convolutional neural network model that performs an image recognition task;
a recognition module, which inputs an image to be recognized into the convolutional neural network model and outputs an image recognition result;
wherein the convolutional neural network model comprises a convolutional neural network in which an orthogonal multi-path block is embedded, the orthogonal multi-path block comprises multiple paths, and the parameters on the paths are mutually orthogonal, which increases the robustness of the convolutional neural network.

9. An electronic terminal, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, performs the method of any one of claims 1-7.

10. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, performs the method of any one of claims 1-7.
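
As a reading aid only, and not part of the claims, the weighted averaging of step S14, the mutual orthogonality of the path parameters, and the majority vote of step S23 can be sketched in NumPy. The sketch rests on several assumptions that the claims do not state: the "difference" is taken to be cross-entropy, orthogonality is measured by the inner products between the flattened parameters of different paths, and voting ties go to the lowest class index. All function names are hypothetical.

```python
import numpy as np


def weighted_path_loss(logits, labels, path_coeffs):
    """Step S14 sketch: per-path loss, then a weighted average over paths.

    logits: array of shape (paths, batch, classes), one prediction per path.
    labels: integer array of shape (batch,), the true categories.
    path_coeffs: one non-negative weight per path (assumed, not from the claims).
    Cross-entropy is an assumed choice of "difference".
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    batch = labels.shape[0]
    # Log-probability of the true class for every path and every sample.
    picked = log_probs[:, np.arange(batch), labels]  # shape (paths, batch)
    per_path = -picked.mean(axis=1)                  # shape (paths,)
    coeffs = np.asarray(path_coeffs, dtype=float)
    return float((coeffs * per_path).sum() / coeffs.sum())


def orthogonality_penalty(path_weights):
    """Sum of squared inner products between parameters of different paths.

    path_weights: array of shape (paths, ...), one parameter tensor per path.
    The value is zero exactly when the flattened parameters are mutually
    orthogonal; using it as a penalty term is an assumption of this sketch.
    """
    flat = path_weights.reshape(path_weights.shape[0], -1)
    gram = flat @ flat.T
    off_diag = gram - np.diag(np.diag(gram))
    return float((off_diag ** 2).sum())


def majority_vote(logits):
    """Step S23 sketch: the category predicted by the most paths wins."""
    votes = logits.argmax(axis=-1)  # shape (paths, batch)
    num_classes = logits.shape[-1]
    counts = np.stack([
        np.bincount(votes[:, i], minlength=num_classes)
        for i in range(votes.shape[1])
    ])                              # shape (batch, classes)
    # argmax returns the first maximum, so ties go to the lowest class index.
    return counts.argmax(axis=1)
```

In a training loop along the lines of S13 to S16, the value returned by `weighted_path_loss`, possibly combined with `orthogonality_penalty`, would be the quantity minimized by gradient descent. How the orthogonality constraint is actually enforced is not specified in the claims, so this combination is only one plausible reading.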
CN202011382932.9A 2020-11-30 2020-11-30 Image recognition method, system, terminal and medium based on convolutional neural network Pending CN112396123A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011382932.9A CN112396123A (en) 2020-11-30 2020-11-30 Image recognition method, system, terminal and medium based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011382932.9A CN112396123A (en) 2020-11-30 2020-11-30 Image recognition method, system, terminal and medium based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN112396123A true CN112396123A (en) 2021-02-23

Family

ID=74603971

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011382932.9A Pending CN112396123A (en) 2020-11-30 2020-11-30 Image recognition method, system, terminal and medium based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN112396123A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114861893A (en) * 2022-07-07 2022-08-05 西南石油大学 A method, system and terminal for generating adversarial samples with multi-pass aggregation
CN116630687A (en) * 2023-04-20 2023-08-22 南方科技大学 Multi-layer perceptron model image recognition method and related equipment with cascaded band mixing
CN116843605A (en) * 2022-12-09 2023-10-03 慧之安信息技术股份有限公司 A fruit and vegetable defect detection method and system based on AI algorithm

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100067795A1 (en) * 2008-05-27 2010-03-18 Falk Eberhard Method and apparatus for pattern processing
CN102298849A (en) * 2011-06-28 2011-12-28 天津成科传动机电技术股份有限公司 Wireless vehicle driving supervising device
CN108229379A (en) * 2017-12-29 2018-06-29 广东欧珀移动通信有限公司 Image recognition method and device, computer equipment and storage medium
CN110110591A (en) * 2015-06-16 2019-08-09 眼验股份有限公司 System and method for counterfeit detection and liveness analysis
CN110163234A (en) * 2018-10-10 2019-08-23 腾讯科技(深圳)有限公司 A kind of model training method, device and storage medium
CN110288573A (en) * 2019-06-13 2019-09-27 天津大学 A kind of automatic detection method of mammalian livestock disease
CN110472676A (en) * 2019-08-05 2019-11-19 首都医科大学附属北京朝阳医院 Stomach morning cancerous tissue image classification system based on deep neural network
CN110574050A (en) * 2017-05-31 2019-12-13 英特尔公司 Gradient-based training engine for quaternion-based machine learning systems
CN111191704A (en) * 2019-12-24 2020-05-22 天津师范大学 A ground-based cloud classification method based on task graph convolutional network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
KUN FANG, YINGWEN WU, TAO LI ET AL.: "Learn Robust Features via Orthogonal Multi-Path", arXiv:2010.12190v1 *

Similar Documents

Publication Publication Date Title
CN108701210B (en) Method and system for CNN network adaptation and object online tracking
US20210295089A1 (en) Neural network for automatically tagging input image, computer-implemented method for automatically tagging input image, apparatus for automatically tagging input image, and computer-program product
WO2020177189A1 (en) Image refined shadow area segmentation system, method and apparatus
CN111480169B (en) Method, system and apparatus for pattern recognition
CN107636691A (en) Method and apparatus for recognizing text in image
CN112927209B (en) A CNN-based saliency detection system and method
CN109840531A (en) The method and apparatus of training multi-tag disaggregated model
JP2020502665A (en) Convert source domain image to target domain image
CN112396123A (en) Image recognition method, system, terminal and medium based on convolutional neural network
CN111488880A (en) Method apparatus for improving segmentation performance using edge loss to detect events
Khaw et al. High‐density impulse noise detection and removal using deep convolutional neural network with particle swarm optimisation
RU2665273C2 (en) Trained visual markers and the method of their production
Sharma et al. Automatic identification of bird species using audio/video processing
CN111626184A (en) Crowd density estimation method and system
CN114299358B (en) Image quality assessment method, device, electronic device and machine-readable storage medium
CN116704431A (en) On-line Monitoring System and Method for Water Pollution
CN112418261A (en) A multi-attribute classification method of human images based on a priori prototype attention mechanism
Fan et al. Classification of imbalanced data using deep learning with adding noise
CN115393675B (en) Adversarial robustness evaluation method and related device for deep learning models
CN117218434A (en) Concrete structure surface defect classification method and system based on hybrid neural network
Sarigül et al. Comparison of different deep structures for fish classification
CN116844032A (en) A method, device, equipment and medium for target detection and recognition in a marine environment
Beijing et al. A quaternion two‐stream R‐CNN network for pixel‐level color image splicing localization
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN112861601B (en) Method and related device for generating adversarial samples

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210223