CN106339719A

CN106339719A - Image identification method and image identification device

Info

Publication number: CN106339719A
Application number: CN201610703925.1A
Authority: CN
Inventors: 杜康华; 王崇; 任文越
Original assignee: Weibo Internet Technology China Co Ltd
Current assignee: Weibo Internet Technology China Co Ltd
Priority date: 2016-08-22
Filing date: 2016-08-22
Publication date: 2017-01-18

Abstract

The present application discloses an image recognition method and device. First, the image region of each sample image that matches a specified hue is normalized to obtain the images of a first image set. The first image set is then used to train an image classifier, the image to be recognized is input into the trained image classifier, and a recognition result for the image to be recognized is obtained. Because the images in the first image set are obtained by processing the sample images, the image region matching the preset image category occupies a relatively larger proportion of each image in the first image set, so less information from that region is lost when the image is scaled. The method provided by this application can therefore effectively reduce the number of images required in the first image set and, while reducing cost, improve the training efficiency of the image classifier.

Description

A method and device for image recognition

Technical field

The present application relates to the field of information technology, and in particular to an image recognition method and device.

Background

With the development of the information society and the growth of online social activity, people increasingly prefer to use images, which are not restricted by region or language, instead of text as the main medium for conveying meaning online. As a result, the number of images on the network is increasing rapidly, and how to make use of this mass of images has become a topic of great interest in recent years.

Because images differ from text, their content cannot be retrieved or classified directly by keyword. The first problem to solve in making use of images is therefore the recognition of image content, that is, image recognition technology.

Existing image recognition technology mainly uses machine learning. Specifically, images are first classified manually to determine image sets, each composed of images with a different kind of content (for example, an image set of landscape images, an image set of face images, an image set of pornographic images, and so on). Then, for the image set of each kind of content, the features common to the images in that set (usually feature vectors) are extracted, and a feature model of the image set is obtained through training. Finally, image recognition is performed on a received image to be recognized according to the feature models corresponding to the various image sets, and the category of the image to be recognized is determined.

Compared with image recognition based on manually designed and extracted feature vectors, a feature model obtained through machine learning and training avoids the influence of human subjectivity and can be continuously optimized through training, so the accuracy of image recognition is higher.

However, for a machine learning method to achieve high recognition accuracy, a large number of images is first needed to learn and train the feature models corresponding to image sets with different content. If too few images are used for learning and training, the accuracy of the resulting feature model decreases, which affects the robustness of image recognition. If too many images are used for training, the resources consumed by the machine learning method increase, which affects the efficiency of machine learning.

Second, training a feature model imposes a uniform requirement on the size of the training images (for example, a uniform resolution of 100×100), so the training images must also be resized (by enlarging, shrinking, stretching, and similar operations). As shown in FIG. 1, this causes a loss of the features contained in the image and thus affects the accuracy of machine learning (that is, the accuracy of the final feature model), so that even more training images are needed to ensure the accuracy of machine learning.

FIG. 1 is a schematic diagram of the loss of features contained in an image caused by scaling a high-resolution image.

The left side shows the image at its original size, and the right side shows the image after its size has been reduced. To make the loss of features visible, the reduced image has been enlarged back to the original size. The leaf-vein texture has become blurred; if the leaf-vein texture is the feature to be extracted, the features of the reduced image have already suffered a loss.
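The loss that FIG. 1 illustrates can be reproduced numerically. The following is a minimal sketch and not part of the application: it assumes a grayscale image stored as nested lists, block averaging for the reduction, and pixel repetition for the enlargement. The function names and the 8×8 striped test pattern (a stand-in for fine leaf-vein texture) are hypothetical.

```python
def downscale(pixels, factor):
    """Shrink a square grayscale image by averaging factor x factor blocks."""
    size = len(pixels) // factor
    return [
        [
            sum(
                pixels[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ) / factor ** 2
            for c in range(size)
        ]
        for r in range(size)
    ]


def upscale(pixels, factor):
    """Enlarge an image by repeating every pixel factor times in each direction."""
    height = len(pixels) * factor
    width = len(pixels[0]) * factor
    return [
        [pixels[r // factor][c // factor] for c in range(width)]
        for r in range(height)
    ]


def mean_abs_error(a, b):
    """Average absolute difference between two images of equal size."""
    total = sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


# Hypothetical fine texture: vertical stripes one pixel wide.
original = [[255 if c % 2 == 0 else 0 for c in range(8)] for _ in range(8)]

# Reduce, then enlarge back to the original size, as done for FIG. 1.
restored = upscale(downscale(original, 2), 2)
loss = mean_abs_error(original, restored)
```

In this sketch the stripes are averaged into a uniform gray, so the texture cannot be recovered by enlarging the reduced image again.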

Because of the above problems, existing image recognition technology requires a large number of training images, which makes image recognition costly.

Summary of the invention

An embodiment of the present application provides an image recognition method to solve the problem in the prior art that, when machine learning is used for image recognition, a large number of training images is required, which increases the cost of image recognition.

An embodiment of the present application provides an image recognition device to solve the problem in the prior art that, when machine learning is used for image recognition, a large number of training images is required, which increases the cost of image recognition.

The embodiments of the present application adopt the following technical solutions:

An image recognition method, comprising:

determining an image to be recognized;

inputting the image to be recognized into a pre-trained image classifier to obtain a recognition result output by the image classifier for the image to be recognized, wherein the images in the first image set used to train the image classifier are obtained by normalizing the image regions in sample images that match a specified hue;

the specified hue being determined according to the hue of images of a preset image category.

An image recognition device, comprising:

a determination module, which determines an image to be recognized;

a recognition module, which inputs the image to be recognized into a pre-trained image classification module to obtain a recognition result output by the image classifier for the image to be recognized, wherein the images in the first image set used to train the image classification module are obtained by normalizing the image regions in sample images that match a specified hue;

At least one of the above technical solutions adopted in the embodiments of the present application can achieve the following beneficial effects:

First, the image region of each sample image that matches the specified hue is normalized to obtain the images of the first image set. The image classifier is then trained with the first image set to obtain a trained image classifier. When image recognition is to be performed on an image to be recognized, that image is input into the trained image classifier to obtain the recognition result output by the image classifier for the image. Because the images in the first image set are obtained by processing the sample images, the image region matching the specified hue occupies a relatively larger proportion of each image in the first image set. Even if the image must be scaled, fewer features of the region matching the specified hue are lost, which increases the accuracy of the training result of the image classifier. The method provided by this application can therefore effectively reduce the number of images required in the first image set without affecting the training effect, reducing cost while also improving the training efficiency of the image classifier.
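The effect of the increased proportion can be made concrete with a small calculation. The sketch below is illustrative only. It reuses the 100×100 training size and the 1000×1000 image size mentioned elsewhere in this description, and it assumes a hypothetical square 300×300 region matching the specified hue and a normalization that keeps only that region before scaling. The function name is also an assumption.

```python
def region_side_after_resize(image_side, region_side, target_side):
    """Side length, in pixels, that a square region keeps after the whole
    square image is resized to target_side x target_side."""
    return region_side * target_side / image_side


# Whole 1000x1000 image scaled to 100x100: the 300x300 region shrinks tenfold.
without_normalization = region_side_after_resize(1000, 300, 100)

# Region kept on its own first, so the 300x300 region is the whole image.
with_normalization = region_side_after_resize(300, 300, 100)
```

Under these assumptions the region of interest is represented by 30×30 pixels when the whole image is scaled, but by all 100×100 pixels when the region is isolated first.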

Description of the drawings

The drawings described here are provided for further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the present application and do not unduly limit it. In the drawings:

FIG. 1 is a schematic diagram of the loss of features contained in an image caused by scaling a high-resolution image;

FIG. 2 shows the image recognition process provided by an embodiment of the present application;

FIG. 3 shows the process of training the convolutional neural network model provided by an embodiment of the present application;

FIG. 4 is a schematic diagram, provided by an embodiment of the present application, of determining the image region in a sample image that matches the specified hue as an intermediate image;

FIG. 5 is a schematic structural diagram of the convolutional neural network model to be trained, provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an image recognition device provided by an embodiment of the present application.

Detailed description

To make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described clearly and completely below with reference to specific embodiments and the corresponding drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application, without creative effort, fall within the scope of protection of this application.

As described above, machine learning methods require a large number of training images, and when the training images must be brought to a uniform size, even more images are needed. The number of training images required in the prior art is therefore greatly increased.

Furthermore, a machine learning method in effect trains a feature model for each class of image, so in the prior art the training images must be classified in advance. Only then can the feature model be adjusted according to the pre-classified images until a feature model whose recognition accuracy meets the requirements is obtained (that is, the process of training the feature model). Classifying the training images in advance usually relies on manual work: each image in the training image set must be classified by a person according to its content.

However, because the prior art requires many training images, a large amount of manual image classification is needed, which increases operating costs.

Moreover, manual classification relies mainly on subjective human judgment. For images that could be assigned to several categories at once, different people may classify the same image differently. Training the feature model with such ambiguous images may harm the training result. Because the prior art requires many training images, either more manual effort is needed to screen such images out of the training image set, or the number of training images must be increased further.

Based on the above, the embodiments of the present application provide a technical solution for image recognition that reduces the number of training images without affecting the training of the feature model. The technical solutions provided by the embodiments of the present application are described in detail below with reference to the drawings.

FIG. 2 shows the image recognition process provided by an embodiment of the present application, which specifically includes the following steps:

S101: Determine an image to be recognized.

In the prior art, training a feature model by machine learning usually consumes considerable resources, so the training is generally performed by a server, while the image recognition process can be performed by a terminal or by the server using the trained feature model. In this application, the case in which a server performs the image recognition process is used as an example.

Thus, in the embodiments of the present application, the server can determine an image to be recognized for subsequent image recognition. The image to be recognized may be determined by the server from locally stored images, or it may be an image received by the server. For example, when a user publishes an image through a terminal, the terminal must transmit the image to the server, which then publishes it online. In that case the server can, on receiving the image, determine it to be the image to be recognized and carry out the subsequent image recognition process.

Further, because society is now highly informatized and many images are produced at every moment, the number of images the server receives at any moment is also very large. In this application, the server may therefore randomly sample the received images, determine the sampled images to be images to be recognized, and carry out the subsequent image recognition process. In addition, after receiving an image the server can store it locally or in the database in which the server stores its data. When its load is low, the server can therefore also determine stored images that have not yet undergone image recognition to be images to be recognized and carry out the subsequent image recognition process; this is the case in which the server determines the image to be recognized from locally stored images. In this way, the various image resources received and stored by the server can be put to better use.
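The random sampling of received images described above can be sketched as follows. This is only an illustration under assumptions: the application does not specify the sampling procedure, so uniform sampling without replacement is assumed, and the function name, the image identifiers, and the sample size are hypothetical.

```python
import random


def pick_images_to_recognize(received_ids, sample_size, seed=None):
    """Randomly choose which received images are treated as images to be
    recognized. Sampling is uniform and without replacement (an assumption)."""
    rng = random.Random(seed)
    count = min(sample_size, len(received_ids))
    return rng.sample(received_ids, count)


# Hypothetical batch of 100 received image identifiers; 10 are sampled.
received = [f"img_{i}" for i in range(100)]
to_recognize = pick_images_to_recognize(received, 10, seed=0)
```

Capping the sample size at the number of received images keeps the call valid when fewer images arrive than the server intends to sample.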

Of course, the above methods of determining the image to be recognized are only embodiments provided by this application. In actual use, any of the methods of determining an image to be recognized that are used in the prior art may also be employed; the methods are not limited to those provided in the embodiments, and this application does not specifically limit how the image to be recognized is determined.

Furthermore, image recognition is performed on the image to be recognized mainly for two purposes: risk control and the use of image resources. Risk control has the greater effect on network security and is therefore relatively more important, so the embodiments of the present application are described below using image recognition for risk control as an example. Specifically, the subsequent description uses the example of recognizing whether an image has pornographic content.

It should be noted that in this application the image recognition process may also be performed by a terminal, which may be a device such as a mobile phone, a personal computer, or a tablet computer. When the image recognition process is performed by a server, the server may be a single device or a network composed of multiple devices, that is, a distributed server. For convenience, the following description uses the case in which a server performs the image recognition process.

For example, suppose server A receives an image sent by a terminal and determines that the image is an image to be recognized, where the image to be recognized has a resolution of 1000×1000.

S102: Input the image to be recognized into a pre-trained image classifier, and obtain the recognition result output by the image classifier for the image to be recognized.

In the embodiments of the present application, after determining the image to be recognized, the server can input it into the trained image classifier and determine the recognition result for the image from the output of the image classifier. The images in the first image set used to train the image classifier are obtained by normalizing the image regions of sample images that match the specified hue.

Specifically, in this application the image classifier may include a convolutional neural network model. When the image classifier is a convolutional neural network model, the server can input the image to be recognized into the trained convolutional neural network model to obtain the image recognition result for the image.

As described above, the prior art requires many training images and incurs high training costs. In the image recognition process provided by this embodiment, the image classifier used for image recognition needs relatively few training images, and the recognition accuracy of the trained image classifier is not affected, so the problems of the prior art can be avoided. In the following, this application uses a convolutional neural network model as the image classifier for illustration.

Specifically, in this application the convolutional neural network model is trained mainly through the process shown in FIG. 3.

FIG. 3 shows the process of training the convolutional neural network model provided by an embodiment of the present application, which includes:

S1021: Determine a second image set composed of sample images.

In the embodiments of the present application, the server may first determine the initial sample images for training the convolutional neural network model and take the image set formed by these sample images as the second image set. Taking a convolutional neural network model used to recognize pornographic images as an example, the second image set may be composed of sample images with three kinds of image content: sample images of pornographic content, sample images of non-pornographic person content, and sample images of non-person content.

Depending on whether their content is pornographic, person-content images can be divided into pornographic-content images and non-pornographic person-content images. In other words, both are images of people as far as image content is concerned, and they differ only in whether the content is pornographic. This difference is exactly what the convolutional neural network model to be trained is expected to learn, so that the trained model can recognize whether image content is pornographic. In this application, the sample images in the second image set may therefore include sample images of pornographic content and sample images of non-pornographic person content.

Further, if the convolutional neural network model were trained only with person-content images, the trained model would achieve high recognition accuracy only for images to be recognized that contain people, while its accuracy for images without people would be unpredictable. In this application, the sample images in the second image set may therefore also include sample images of non-person content.

Further, in this application the number of sample images in the second image set may be limited so that it does not exceed a preset number. For example, if the preset number is 3000, the server may determine 3000 sample images.

Of course, the specific number of sample images in the second image set can be determined as needed. The embodiment provides only one option, which does not limit the present application. It should be noted, however, that if too many sample images are selected for the second image set (for example, 10,000 or 100,000), the method provided by the embodiments can hardly reduce the number of training sample images; nor does this application need a large number of sample images.

Continuing the above example, suppose that to train the convolutional neural network model for recognizing pornographic images, server A first determines 3000 sample images, which include sample images of pornographic content, sample images of non-pornographic person content, and sample images of non-person content.

S1022: Classify the second image set.

In the embodiments of the present application, as in the prior art, the server needs to classify the sample images in the second image set and add an identifier to each sample image according to its classification result.

Specifically, because in this application the second image set may contain sample images with three kinds of image content, the server can divide the sample images in the second image set into three classes by content. If the trained convolutional neural network model is expected to recognize pornographic images, the sample images can be divided into three classes: sample images of pornographic content, sample images of non-pornographic person content, and sample images of non-person content. The server can also add a different identifier to each class of sample image according to the classification results, so that when the convolutional neural network model is subsequently trained, the model can determine the content of an input image from its identifier and perform the corresponding operations (for example, calculating the error value, calculating the accuracy, and adjusting the parameters backward).

Further, because the embodiments of the present application limit the number of training sample images, and unlike the prior art, the numbers of sample images for the three kinds of image content in the second image set need to be kept equal. If these numbers are unequal and differ greatly, the convolutional neural network model may learn the features of one kind of image content incompletely, which lowers its recognition accuracy. For example, suppose the second image set contains 3000 sample images, with 1500, 1200, and 300 sample images for the three kinds of image content. The 300 sample images all belong to the same kind of image content; suppose further that this kind of content has four features a, b, c, and d. Because 300 sample images are few, the probability that they cover all of features a, b, c, and d is small. In other words, features of that kind of content may be missing from the samples, so its features are more likely to be learned incompletely, which can easily lower the recognition accuracy of the convolutional neural network model. In this application, continuing the example, the numbers of sample images for the three kinds of image content in the second image set may therefore be 1000, 1000, and 1000, so that the convolutional neural network model can fully learn the features of all three kinds of image content during training.
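The equal split of 1000, 1000, and 1000 can be sketched as drawing the same number of sample images from each class. This is a minimal sketch under assumptions: the application does not say how the samples are chosen, so random selection from larger candidate pools is assumed, and the function name, class keys, file names, and pool sizes below are hypothetical.

```python
import random


def build_balanced_set(samples_by_class, per_class, seed=None):
    """Draw the same number of sample images from every class so that no
    class is under-represented in the resulting image set."""
    rng = random.Random(seed)
    balanced = {}
    for label, samples in samples_by_class.items():
        if len(samples) < per_class:
            raise ValueError(f"class {label!r} has only {len(samples)} samples")
        balanced[label] = rng.sample(samples, per_class)
    return balanced


# Hypothetical candidate pools of different sizes for the three classes.
pools = {
    "pornographic": [f"p{i}.jpg" for i in range(1500)],
    "non_pornographic_person": [f"n{i}.jpg" for i in range(1200)],
    "non_person": [f"o{i}.jpg" for i in range(1000)],
}

# Preset total of 3000 sample images, split equally over the three classes.
second_image_set = build_balanced_set(pools, per_class=3000 // 3, seed=0)
class_sizes = {label: len(images) for label, images in second_image_set.items()}
```

Raising an error when a pool is too small makes an imbalance such as 1500, 1200, and 300 visible before training rather than after.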

更进一步地，在对各样本图像添加标识时，该标识可以以统一的规则添加至该样本图像的文件名中，如，在文件名后添加3位数字标识，并以符号“-”作为与原文件名的分隔符，或者在文件名前添加3位英文字母标识等等，具体如何添加标识本申请并不做具体限定。Furthermore, when adding a logo to each sample image, the logo can be added to the file name of the sample image according to a unified rule, for example, add a 3-digit logo after the file name, and use the symbol "-" as the The delimiter of the original file name, or adding a 3-digit English letter logo before the file name, etc., how to add the logo is not specifically limited in this application.

Continuing the above example, suppose that in step S1021 the second image set determined by server A contains 1000 sample images of pornographic content, 1000 sample images of non-pornographic person content, and 1000 sample images of non-person content. Server A may then classify the sample images in the second image set according to their content and, according to the classification result, add a different identifier to each of the three classes of sample images, as shown in Table 1.

| Sample image category | Identifier added to the sample image |
| --- | --- |
| Sample image of pornographic content | 001 |
| Sample image of non-pornographic person content | 002 |
| Sample image of non-person content | 003 |

Table 1

The identifier may be added to the file name of the sample image. For example, if a sample image is named 92e8647ajw1exg20dc07hx6x.jpg and its content is pornographic content, the file name of the sample image after the identifier is added becomes 92e8647ajw1exg20dc07hx6x-001.jpg.
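
The file-naming rule in this example can be sketched as follows. The helper names `add_label_to_filename` and `read_label_from_filename` are hypothetical, and the sketch assumes the suffix form of the rule (identifier appended after a "-" separator, before the extension).

```python
import os


def add_label_to_filename(filename, label, sep="-"):
    """Insert `label` between the file stem and its extension."""
    stem, ext = os.path.splitext(filename)
    return f"{stem}{sep}{label}{ext}"


def read_label_from_filename(filename, sep="-"):
    """Recover the label; assumes the file name was labelled as above."""
    stem, _ = os.path.splitext(filename)
    return stem.rsplit(sep, 1)[1]


labelled = add_label_to_filename("92e8647ajw1exg20dc07hx6x.jpg", "001")
print(labelled)                            # 92e8647ajw1exg20dc07hx6x-001.jpg
print(read_label_from_filename(labelled))  # 001
```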

S1023: Normalize the image region of each sample image that matches the specified hue, to obtain the images in the first image set used for training.

The specified hue is determined from the hue of images of the preset image category, by means of the prior art or manual experience.

In the embodiments of the present application, after the server has determined the second image set and has classified and added identifiers to the sample images in it, the sizes of those sample images do not yet meet the size requirement for input images used in training, so they cannot yet be used for training. The server therefore also needs to process each sample image in the second image set to obtain the images of the first image set, which meet the input size requirement and can be used for training.

Specifically, when the prior art unifies the sizes of the sample images, it only stretches and scales each sample image, as shown in FIG. 1, which may cause loss of the features contained in the sample images. In the present application, therefore, the server may first determine, for each sample image in the second image set, the hue-saturation-value (Hue Saturation Value, HSV) color model of the sample image, that is, the hue, saturation, and value of every pixel of the sample image. Because an image usually represents the value of each pixel by the three primary colors red, green, and blue through the red-green-blue (Red Green Blue, RGB) color model, the server may, in order to obtain the HSV color model of the sample image, convert the sample image from the RGB color model to the HSV color model by the following formulas.

$\begin{cases} V = \dfrac{R + G + B}{3} \\ S = 1 - \dfrac{3 \times \min(R, G, B)}{R + G + B} \\ H = \arccos\left\{ \dfrac{[(R - G) + (R - B)] / 2}{\sqrt{(R - G)^{2} + (R - B)(G - B)}} \right\} \end{cases}$

Here, min(R, G, B) denotes, for each pixel in the sample image, the minimum of the pixel's three values R, G, and B.
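
The per-pixel conversion can be sketched in Python as below. This is only an illustration under stated assumptions: the function name `rgb_to_hsv_pixel` is hypothetical; the hue is returned in degrees, which the text does not state; and the handling of black pixels (R + G + B = 0) and gray pixels (R = G = B, where the denominator of the hue formula is zero) is a choice made here, not something the text specifies.

```python
import math


def rgb_to_hsv_pixel(r, g, b):
    """Convert one pixel using the V, S, H formulas given above.

    Returns (h, s, v) with h in degrees (assumption).
    """
    total = r + g + b
    v = total / 3
    # Assumption: saturation of a black pixel is taken as 0.
    s = 0.0 if total == 0 else 1 - 3 * min(r, g, b) / total

    numerator = ((r - g) + (r - b)) / 2
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        # Assumption: gray pixels (R == G == B) get hue 0.
        h = 0.0
    else:
        # Clamp against floating-point drift before taking arccos.
        ratio = max(-1.0, min(1.0, numerator / denominator))
        h = math.degrees(math.acos(ratio))  # arccos yields a value in [0, 180]
    return h, s, v


print(rgb_to_hsv_pixel(255, 0, 0))  # (0.0, 1.0, 85.0)
```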

Second, according to the HSV color model of the sample image, the image region of the sample image that matches the hue corresponding to the preset image category is determined as an intermediate image. When the preset image category is the pornographic content image category, the hue corresponding to that category may be set to H∈[0, 116] based on manual experience. Determining the image region that matches the hue corresponding to the pornographic content image category then means determining the image region covered by the pixels of the sample image whose hue value lies in the range 0 to 116, and taking that image region as the intermediate image. When determining this image region, the server may first determine the coordinates of the pixels whose hue value lies in the range 0 to 116, and then determine the intermediate image corresponding to the sample image from the maximum x-coordinate, the minimum x-coordinate, the maximum y-coordinate, and the minimum y-coordinate of those pixels, as shown in FIG. 4.

FIG. 4 is a schematic diagram, provided by an embodiment of the present application, of determining the image region in the sample image that matches the specified hue as the intermediate image.

In FIG. 4, the largest rectangular frame is the boundary of the sample image, the gray area consists of the pixels whose hue value lies in the range 0 to 116, and the smallest, dashed rectangular frame is the determined intermediate image. The boundary of the intermediate image is determined by the maximum x-coordinate, the minimum x-coordinate, the maximum y-coordinate, and the minimum y-coordinate of the pixels whose hue value lies in the range 0 to 116.

It should be noted that determining the intermediate image from the sample image can be regarded as the server cropping the sample image according to the image region that matches the specified hue. That is, as shown in FIG. 4, the server crops the region of the smallest dashed rectangular frame out of the sample image as the intermediate image.
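
The cropping operation can be sketched as follows, assuming the intermediate image is the axis-aligned rectangle spanned by the minimum and maximum x and y coordinates of the pixels whose hue lies in the range 0 to 116. Images are represented as plain nested lists for illustration, and the function names `matching_bounding_box` and `crop` are hypothetical.

```python
def matching_bounding_box(hue, lo=0, hi=116):
    """Return (x_min, y_min, x_max, y_max) of pixels with lo <= hue <= hi.

    `hue` is a list of rows of hue values. Returns None if no pixel matches.
    """
    xs, ys = [], []
    for y, row in enumerate(hue):
        for x, h in enumerate(row):
            if lo <= h <= hi:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def crop(image, box):
    """Cut the rectangle `box` (inclusive bounds) out of `image`."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max + 1] for row in image[y_min:y_max + 1]]


hue_map = [
    [200, 200, 200, 200, 200],
    [200,  50, 200, 200, 200],
    [200, 200,  30, 100, 200],
    [200, 200, 200, 200, 200],
]
box = matching_bounding_box(hue_map)
print(box)                 # (1, 1, 3, 2)
print(crop(hue_map, box))  # [[50, 200, 200], [200, 30, 100]]
```

Note that the cropped rectangle may still contain non-matching pixels (the value 200 above), which is consistent with FIG. 4, where the dashed frame encloses the gray area rather than coinciding with it.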

Finally, after the intermediate image corresponding to the sample image has been determined, its image size may still not meet the size requirement for input images used in training. The server may therefore normalize each intermediate image and take the normalized intermediate images as the images of the first image set; that is, the images of the first image set used for training are obtained by normalizing the intermediate images. The normalization includes scaling and stretching the intermediate image, by the same method as the prior art, so that its image size conforms to a preset image size. For example, if the preset image size is a resolution of 256×256 and the image size of the intermediate image is a resolution of 300×400, the server may scale and stretch the intermediate image to a resolution of 256×256.
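
The text does not specify which scaling method is used. Purely as an illustration, the sketch below uses nearest-neighbor sampling as a stand-in to stretch an image to a fixed size without preserving its aspect ratio; the function name `resize_nearest` is hypothetical, and a real implementation would more likely rely on an image library.

```python
def resize_nearest(image, out_w, out_h):
    """Stretch `image` (a list of rows) to out_w x out_h by nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# A 300x400 intermediate image (300 wide, 400 high) normalized to 256x256.
intermediate = [[0] * 300 for _ in range(400)]
normalized = resize_nearest(intermediate, 256, 256)
print(len(normalized[0]), len(normalized))  # 256 256
```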

It should be noted that, in the present application, the pornographic content image category is the image category corresponding to sample images of pornographic content, that is, the sample images to which the identifier 001 has been added.

Each image in the first image set is obtained from one sample image in the second image set, so the images in the first image set correspond one-to-one to the sample images in the second image set. Through step S1023, the server first crops out of each sample image the image region that matches the specified hue corresponding to the preset image category (for example, the pornographic content image category). As a result, for each image in the first image set, the region matching the specified hue occupies a larger proportion of the image than it does in the corresponding sample image. As shown in FIG. 4, the gray area occupies a larger proportion of the smallest dashed rectangular frame than of the largest rectangular frame. Therefore, even when the server stretches and scales the intermediate image in the subsequent normalization, the loss of features contained in the images used for training is reduced. This avoids the drawback of the prior art, in which normalization causes loss of image features and the number of training images has to be increased to compensate, so that the server can achieve a good training effect even with a small number of training images.

Further, because the server extracts the intermediate image from the sample image, the server can be regarded as having removed a large amount of interfering image area. As shown in FIG. 4, the image input during training is the smallest dashed rectangular frame, which, compared with the largest rectangular frame, excludes a large amount of useless background (that is, interfering image area), namely the part by which the largest rectangular frame exceeds the smallest dashed rectangular frame. This reduces the influence on training of image regions that do not match the specified hue.

For example, suppose the pornographic content images used for training all have a white background. Without the operation of determining intermediate images described in the present application, training might yield a strong association between a white background and pornographic content images. The background color of an image, however, has no direct bearing on whether the image is pornographic, so such an association would lower the image recognition accuracy of the trained convolutional neural network model. When the server instead determines each intermediate image according to the image region matching the specified hue, the white background area in the images used for training is greatly reduced, so that the white background is no longer a dominant feature and no longer distorts the training of the convolutional neural network model, and the trained model achieves a higher image recognition accuracy.

S1024: Train the convolutional neural network model to be trained according to the first image set, to obtain the trained convolutional neural network model.

In the embodiments of the present application, once the first image set has been determined, the server can train the convolutional neural network model to be trained according to the first image set, to obtain the trained convolutional neural network model.

Specifically, the convolutional neural network model is trained as follows:

First, the server determines the initialization parameters of each layer of the convolutional neural network model to be trained, which form the initialized model. The initialization parameters are usually determined randomly, but they may also be set manually based on experience; the present application does not limit this.

Second, the server repeats the following steps until the error value output by the convolutional neural network model to be trained reaches a first threshold and the image recognition accuracy reaches a second threshold:

The images in the first image set are input in turn into the convolutional neural network model to be trained, which propagates the features of each input training image forward to the output layer; the output error value and the image recognition accuracy are calculated; and the parameters of each layer of the model are adjusted backward from the output layer according to the error value.

Thus, when the server determines that the calculated error value has reached the first threshold and the image recognition accuracy has reached the second threshold, it determines that training of the convolutional neural network model has finished, and the trained convolutional neural network model is obtained.
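
The stopping rule of this training procedure can be sketched as a small control loop. The sketch assumes that "reaching" the first threshold means the error value is at or below it and that "reaching" the second threshold means the accuracy is at or above it. The callable `run_epoch` is hypothetical: it stands for one pass over the first image set (forward propagation, calculation of the error value and accuracy, and backward adjustment of parameters). The `max_epochs` guard is an addition for safety and is not part of the described method.

```python
def train_until(run_epoch, error_threshold, accuracy_threshold, max_epochs=1000):
    """Repeat `run_epoch` until both thresholds are met.

    `run_epoch()` must return (error_value, accuracy) for one pass over
    the training set. Returns (epochs_used, error_value, accuracy).
    """
    for epoch in range(1, max_epochs + 1):
        error, accuracy = run_epoch()
        if error <= error_threshold and accuracy >= accuracy_threshold:
            return epoch, error, accuracy
    raise RuntimeError("thresholds not reached within max_epochs")


# Stub standing in for a real training pass: replays a fixed history.
history = iter([(0.9, 0.5), (0.4, 0.8), (0.1, 0.95)])
print(train_until(lambda: next(history), 0.2, 0.9))  # (3, 0.1, 0.95)
```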

With the image recognition method shown in FIG. 2, the images in the first image set used by the server to train the image classifier are obtained by normalizing the image regions of the sample images that match the specified hue. Therefore, even though the images in the first image set have been normalized (that is, scaled and stretched), the loss of features contained in them is greatly reduced, so that an image classifier with a high image recognition accuracy can be trained even when the first image set contains few images. Because fewer images are needed in the first image set, the cost of image recognition is also reduced.

In addition, in step S101 the present application does not specifically limit the size of the image to be recognized. However, an image with a small image size (for example, a resolution of 5×5) contains too little information to be usable, so in the present application the server may also select, according to image size, only images whose image size is larger than a threshold value as images to be recognized.
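
A size filter of this kind might look like the sketch below. The text gives no threshold value and does not say how "image size" is compared, so the sketch assumes that both width and height must exceed a hypothetical minimum of 32 pixels; the function name `select_recognizable` and the record layout are likewise illustrative.

```python
def select_recognizable(images, min_width=32, min_height=32):
    """Keep only images strictly larger than the (hypothetical) threshold.

    `images` is a list of (name, width, height) tuples.
    """
    return [
        (name, w, h)
        for name, w, h in images
        if w > min_width and h > min_height
    ]


candidates = [("tiny.jpg", 5, 5), ("photo.jpg", 300, 400), ("strip.jpg", 500, 10)]
print(select_recognizable(candidates))  # [('photo.jpg', 300, 400)]
```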

Further, the structure of the convolutional neural network model in step S102, specifically in step S1024, may be as shown in FIG. 5.

FIG. 5 is a schematic structural diagram of the convolutional neural network model to be trained provided by an embodiment of the present application.

It should be noted that FIG. 5 shows only one activation layer, but in practice the data output by every convolutional layer must be activated by an activation layer before it enters the next layer. As shown by the arrangement of the first convolutional layer, the activation layer, and the first pooling layer in FIG. 5, the data input to the first convolutional layer (that is, an image in the first image set) is, after being output by the first convolutional layer, input to the activation layer for activation, and the output of the activation layer is then input to the first pooling layer. Likewise, the data output by the first to fourth convolutional dimensionality-reduction layers and by the first to sixth convolutional feature-extraction layers is first passed through the corresponding activation layer before being input to the subsequent layer. To simplify the structure of the model, FIG. 5 does not show all of the activation layers.

In addition, as can be seen in FIG. 5, the input layer is used to input the images in the first image set in turn into the layers of the convolutional neural network model to be trained, and each convolutional dimensionality-reduction layer is used to reduce the parameter dimensionality of the data input to it and to output the result to the next layer. For example, suppose the input data is 128 feature maps with a resolution of 32×32. If features are extracted by convolving this input directly with 32 convolution kernels of size 3×3, 128×32×3×3 parameters need to be configured. If a convolutional dimensionality-reduction layer is used instead, parameter dimensionality reduction can first be performed with 32 convolution kernels of size 1×1; this layer needs 128×32×1×1 parameters and outputs 32 feature maps. When features are then extracted with 32 convolution kernels of size 3×3, the parameters to be configured for that layer drop to 32×32×3×3. The comparison is shown in Table 2.

| Convolutional layer structure | Parameters to be configured |
| --- | --- |
| Direct feature extraction | 128×32×3×3 |
| Dimensionality reduction followed by feature extraction | 128×32×1×1 + 32×32×3×3 |

Table 2
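
The two parameter counts in Table 2 can be evaluated directly. The sketch below counts only convolution weights (input channels × output channels × kernel height × kernel width), as the table does, and ignores bias terms; the function name `conv_weights` is hypothetical.

```python
def conv_weights(in_channels, out_channels, kernel):
    """Number of weights in a square-kernel convolutional layer (no biases)."""
    return in_channels * out_channels * kernel * kernel


direct = conv_weights(128, 32, 3)
reduced = conv_weights(128, 32, 1) + conv_weights(32, 32, 3)

print(direct)   # 36864
print(reduced)  # 13312
```

In this example the 1×1 reduction followed by 3×3 extraction needs roughly 36% of the weights of the direct 3×3 convolution.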

As the structural diagram in FIG. 5 shows, each convolutional layer used for parameter dimensionality reduction is adjacent to a convolutional layer used for feature extraction, which lowers the total number of parameters required by the convolutional neural network model to be trained and can improve training efficiency.

In addition, FIG. 5 shows the first to fourth pooling layers, which reduce the information redundancy produced when the input data (for example, feature maps) undergoes convolution, so as to improve the running efficiency and robustness of the convolutional neural network model to be trained.

Further, the loss layer calculates the error value of the image recognition results output by the convolutional neural network model to be trained after the whole first image set has been input into it, and the accuracy layer calculates the accuracy of those image recognition results. Both layers need the identifiers added to the input images in order to calculate the error value and the accuracy. For example, to calculate the accuracy, the image recognition result of the model for each input image is compared with the identifier of that input image: if they agree, the result is correct; otherwise it is wrong. The proportion of correctly recognized images among all input images is the image recognition accuracy of the convolutional neural network model to be trained.
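
The accuracy calculation described here amounts to comparing each predicted identifier with the identifier attached to the input image and taking the fraction that agree. A minimal sketch follows; the function name `recognition_accuracy` is hypothetical, and the identifiers are the ones from Table 1.

```python
def recognition_accuracy(predicted, labels):
    """Fraction of images whose predicted identifier equals its true identifier."""
    if len(predicted) != len(labels) or not labels:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(p == t for p, t in zip(predicted, labels))
    return correct / len(labels)


labels = ["001", "002", "003", "001"]
predicted = ["001", "002", "001", "001"]
print(recognition_accuracy(predicted, labels))  # 0.75
```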

Furthermore, according to the error value, the loss layer adjusts the parameters of each layer of the convolutional neural network model to be trained backward, using the gradient descent method, as in the prior art, together with the learning rate (that is, the step size).

Further, the convolutional neural network model to be trained may judge whether the error value calculated by the loss layer has reached the first threshold and whether the image recognition accuracy calculated by the accuracy layer has reached the second threshold. If both have, training of the convolutional neural network model is determined to be complete and the trained convolutional neural network model is obtained. If at least one has not, the parameters of each layer are adjusted backward through the loss layer, and the images in the first image set are again input into the model, until the error value calculated by the loss layer reaches the first threshold and the image recognition accuracy calculated by the accuracy layer reaches the second threshold. The first threshold and the second threshold may both be set as required; the present application does not specifically limit them.

Further, as shown in FIG. 5, the convolutional neural network model to be trained adjusts the learning rate by the following formula:

lr = base_lr × γ^(floor(iter/stepsize)), where lr is the learning rate (that is, the step size) used in each backpropagation, base_lr is the initial learning rate parameter, stepsize and γ are constants, and iter is the number of iterations.
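
A minimal sketch of this schedule follows, assuming γ is applied as a power of floor(iter/stepsize), so that the learning rate is multiplied by γ once every stepsize iterations (a step decay). The function name `step_lr` and the numeric values are illustrative only.

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    """Step-decay learning rate: base_lr * gamma ** floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)


# Illustrative values: the rate stays constant within each block of 1000 iterations.
for it in (0, 999, 1000, 2500):
    print(it, step_lr(0.01, 0.1, 1000, it))
```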

In addition, in the present application, after the server obtains the image recognition result of the image to be recognized through step S102, the server may also use the image recognition result to increase the number of images of the preset image category in the first image set. When the image recognition result of the image to be recognized is the preset image category, the server determines the intermediate image corresponding to the image to be recognized according to its image region matching the specified hue, normalizes that intermediate image, and adds the normalized intermediate image to the first image set.

Of course, the present application does not specifically limit how the image recognition result is used once it has been determined; the above is only one implementation and does not limit the present application.

Further, the present application does not limit the parameter structure of the layers of the convolutional neural network model, such as the convolution kernel size of a convolutional layer, the number of channels of a convolutional layer, the number of channels of a pooling layer, or the pooling stride, nor does it limit the specific pooling method. That is, the parameter structure of each layer of the convolutional neural network model may be set as required, and the present application does not limit this.

It should be noted that the steps of the method provided by the embodiment of the present application shown in FIG. 1 may all be executed by the same device, or the method may be executed by different devices. For example, steps S1021 and S1022 may be executed by device 1 and step S1023 by device 2; or step S1021 may be executed by device 1 and steps S1022 and S1023 by device 2; and so on.

Based on the image recognition process shown in FIG. 2, an embodiment of the present application also provides a corresponding image recognition device, as shown in FIG. 6.

FIG. 6 is a schematic structural diagram of an image recognition device provided by an embodiment of the present application, which includes:

a determining module 201, which determines the image to be recognized;

a recognition module 202, which inputs the image to be recognized into a pre-trained image classifier and obtains the recognition result output by the image classifier for the image to be recognized, wherein the images in the first image set used to train the image classifier are obtained by normalizing the image regions of sample images that match a specified hue.

The device further includes:

an image set determination module 203, which determines a second image set composed of sample images; for each sample image, determines, according to the HSV color model of the sample image, the image region in the sample image that matches the specified hue as an intermediate image; normalizes all intermediate images; and takes the set of all normalized intermediate images as the first image set used to train the image classifier.

The image classifier includes a convolutional neural network model.

The device further includes a training module 204, which trains the convolutional neural network model as follows: it determines the initialization parameters of each layer of the convolutional neural network model to be trained, and repeats the following steps until the error value output by the convolutional neural network model to be trained reaches a first threshold and the image recognition accuracy reaches a second threshold, at which point training of the convolutional neural network model is complete: the images in the first image set are input in turn into the convolutional neural network model to be trained, which propagates the features of each input training image forward to the output layer; the output error value and the image recognition accuracy are calculated; and the parameters of each layer of the convolutional neural network model to be trained are adjusted backward from the output layer according to the error value.

The convolutional neural network model contains at least one convolutional layer used to reduce the parameter dimensionality of the data input to that convolutional layer; for each such convolutional layer, the convolutional layer adjacent to it is used to extract features from the data input to that adjacent layer.

When the recognition module 202 determines that the image recognition result of the image to be recognized is the preset image category, the image set determination module 203 normalizes the image region in the image to be recognized that matches the specified hue, adds the result to the first image set, and retrains the image classifier using the first image set to which the image to be recognized has been added.

所述预设图像类别为色情图像类别，所述色情图像类别对应的所述指定色调的色调值范围为0至116。The preset image category is a pornographic image category, and the hue value range of the specified hue corresponding to the pornographic image category is 0 to 116.

Specifically, the image recognition device shown in FIG. 6 may be located in a single device or in a system composed of multiple devices.

Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing equipment produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps are performed on the computer or other programmable equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

Memory may include non-permanent storage in computer-readable media, in the form of random access memory (RAM) and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

It should also be noted that the terms "comprise", "include", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device comprising that element.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.

The above descriptions are merely embodiments of the present application and are not intended to limit the present application. Various modifications and changes to the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.

Claims

1. An image recognition method, characterized by comprising:

determining an image to be recognized; and

inputting the image to be recognized into a pre-trained image classifier to obtain a recognition result output by the image classifier for the image to be recognized, wherein the images in a first image set used to train the image classifier are obtained by normalizing image regions in sample images that match a specified hue;

wherein the specified hue is determined according to the hue of images of a preset image category.

2. The method according to claim 1, wherein the first image set is obtained by:

determining a second image set consisting of sample images;

for each sample image in the second image set, determining, according to the hue-saturation-brightness (HSV) color model of the sample image, an image region in the sample image that matches the specified hue; and

determining an intermediate image corresponding to the sample image according to the image region in the sample image that matches the specified hue;

normalizing all intermediate images; and

using the set of all normalized intermediate images as the first image set.

3. The method of claim 1, wherein the image classifier comprises a convolutional neural network model.

4. The method according to claim 3, wherein the convolutional neural network model is trained by:

determining the initialization parameters of each layer of the convolutional neural network model to be trained; and

cyclically executing the following steps until the error value output by the convolutional neural network model to be trained reaches a first threshold and the image recognition accuracy reaches a second threshold, whereupon training of the convolutional neural network model is complete:

inputting each image in the first image set in turn into the convolutional neural network model to be trained, so that the features of the input images are propagated forward through the model to the output layer; calculating the output error value and the image recognition accuracy; and, starting from the initialization parameters, adjusting the parameters of each layer of the convolutional neural network model to be trained backward from the output layer according to the error value.

5. The method according to claim 4, wherein the convolutional neural network model comprises at least one convolutional layer for performing parameter dimensionality reduction on the data input to the convolutional layer; and

for each convolutional layer, the convolutional layer adjacent to the current convolutional layer is used to perform feature extraction on the data input to this convolutional layer.

6. The method of claim 1, further comprising:

when it is determined that the recognition result for the image to be recognized is that the image belongs to the preset image category, normalizing the image region in the image to be recognized that matches the specified hue and adding the result to the first image set; and

retraining the image classifier using the first image set to which the image to be recognized has been added.

7. The method according to claim 1, wherein the preset image category is a pornographic image category; and

the hue values of the specified hue corresponding to the pornographic image category range from 0 to 116.

8. An image recognition device, characterized in that it comprises:

a determination module, configured to determine an image to be recognized;

a recognition module, configured to input the image to be recognized into a pre-trained image classifier to obtain a recognition result output by the image classifier for the image to be recognized, wherein the images in a first image set used to train the image classifier are obtained by normalizing image regions in sample images that match a specified hue;

9. The device according to claim 8, further comprising:

an image set determination module, configured to: determine a second image set composed of sample images; for each sample image, determine, according to the hue-saturation-brightness (HSV) color model of the sample image, the image region in the sample image that matches the specified hue, and determine an intermediate image corresponding to the sample image according to that image region; normalize all intermediate images; and use the set of all normalized intermediate images as the first image set used to train the image classifier.

10. The device according to claim 8, wherein the image classifier comprises a convolutional neural network model.