CN111862061A

CN111862061A - Methods, systems, devices and media for assessing the aesthetic quality of pictures

Info

Publication number: CN111862061A
Application number: CN202010730785.3A
Authority: CN
Inventors: 梅陈; 蔡曙光; 申思; 肖铨武
Original assignee: Ctrip Travel Network Technology Shanghai Co Ltd
Current assignee: Ctrip Travel Network Technology Shanghai Co Ltd
Priority date: 2020-07-27
Filing date: 2020-07-27
Publication date: 2020-10-30

Abstract

The invention discloses a method, system, device and medium for evaluating the aesthetic quality of pictures. The method comprises the following steps: integrating and labeling a picture data set; extracting deep learning features of a picture according to the picture data set; extracting text-overlay features and collage features of the picture; performing semantic segmentation on the picture and extracting centroid features and angle features from the segmentation result; obtaining the color saturation of the picture; training on the deep learning features, text-overlay features, collage features, centroid features, angle features and color saturation as training features to obtain a prediction model; and evaluating the aesthetic quality of the picture according to the prediction model. The invention improves the accuracy of aesthetic quality evaluation.

Description

Methods, systems, devices and media for assessing the aesthetic quality of pictures

Technical Field

The invention belongs to the technical field of picture aesthetic quality assessment, and in particular relates to a method, system, device and medium for evaluating the aesthetic quality of pictures.

Background

With advances in technology, taking photos has become something ordinary people can do anytime and anywhere, and the Internet allows photos to spread in digital form to every corner of the world. Selecting good-looking photos and filtering out unattractive ones is therefore a useful application, especially in the travel photography interface of an app (application). By filtering out unattractive pictures and placing good-looking ones on the home page, the app can show visitors attractive scenery or food and thereby draw them to those places to spend. The correctness of the pictures pushed and the efficiency of picture filtering determine how efficiently the app works.

The study of picture aesthetics is commonly known in computer vision and machine learning as computational aesthetics. Aesthetic judgment sounds subjective, and having a computer do something that even humans are unsure about may seem unrealistic, yet a good deal of research has addressed this problem and achieved good results.

Computational aesthetics means having a computer judge the aesthetics of a picture; the output can be a score or a classification of the picture as good-looking or not. A related topic is image quality assessment, which in a broad sense is also part of computational aesthetics. Image quality assessment mainly evaluates factors such as noise and saturation, whereas computational aesthetic scores are compared against the existing scores in a data set. Photography emphasizes symmetry and the golden ratio rule: placing the subject at a golden section point of the frame produces photos that match people's sense of beauty. In 2010, Pere Obrador's research extracted features such as the centroid of the picture and built a model whose highest accuracy reached only about 70%. As convolutional neural networks became popular, aesthetic scoring gradually began to break through this bottleneck. In 2017, a joint study by the University at Buffalo and Tianjin University published a paper titled A-Lamp, which extracted features from patches randomly cropped from the picture and built its model on the VGG convolutional network, reaching a highest accuracy of about 81%. In 2018, Hossein Talebi built a model called NIMA (Neural Image Assessment), which unusually started from the data set: it used the AVA data set, in which each picture is scored on a scale of 1 to 10 by more than 250 annotators, so that the scores of each picture can be represented as a histogram. NIMA improved the loss function while still using a convolutional neural network, and its highest accuracy reached about 82%.

Summary of the Invention

The technical problem to be solved by the invention is the low accuracy of picture aesthetic quality evaluation in the prior art; to overcome this defect, the invention provides a method, system, device and medium for evaluating the aesthetic quality of pictures.

The invention solves the above technical problem through the following technical solutions:

The invention provides a method for evaluating the aesthetic quality of a picture, comprising the following steps:

integrating and labeling a picture data set;

extracting deep learning features of the picture according to the picture data set;

extracting text-overlay features (text superimposed on the picture) and collage features (several pictures stitched into one) of the picture;

performing semantic segmentation on the picture, and extracting centroid features and angle features from the segmentation result;

obtaining the color saturation of the picture;

training on the deep learning features, text-overlay features, collage features, centroid features, angle features and color saturation as training features to obtain a prediction model;

evaluating the aesthetic quality of the picture according to the prediction model.

Preferably, the step of extracting the deep learning features of the picture according to the picture data set comprises:

extracting the deep learning features based on a Resnet model, an xception model and a VGG model;

and the step of extracting the text-overlay features and collage features of the picture comprises:

extracting the text-overlay and collage features based on a yolo model and a resnet model.

Preferably, performing semantic segmentation on the picture comprises:

performing semantic segmentation on the picture based on a pspnet model.

Preferably, the deep learning features further include the mean and deviation of the prediction scores of the picture extracted based on a NIMA model.

The invention also provides a system for evaluating the aesthetic quality of a picture, comprising a data set generation unit, a feature extraction unit, a training unit and an evaluation unit;

the data set generation unit is used to integrate and label a picture data set;

the feature extraction unit is used to extract deep learning features of the picture according to the picture data set;

the feature extraction unit is also used to extract text-overlay features and collage features of the picture;

the feature extraction unit is also used to perform semantic segmentation on the picture and to extract centroid features and angle features from the segmentation result;

the feature extraction unit is also used to obtain the color saturation of the picture;

the training unit is used to train on the deep learning features, text-overlay features, collage features, centroid features, angle features and color saturation as training features to obtain a prediction model;

the evaluation unit is used to evaluate the aesthetic quality of the picture according to the prediction model.

Preferably, the feature extraction unit is also used to extract the deep learning features based on a Resnet model, an xception model and a VGG model;

the feature extraction unit is also used to extract the text-overlay and collage features based on a yolo model and a resnet model.

Preferably, the feature extraction unit is also used to perform semantic segmentation on the picture based on a pspnet model.

Preferably, the deep learning features further include the mean and deviation of the prediction scores of the picture extracted based on a NIMA model.

The invention also provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, the method for evaluating the aesthetic quality of a picture of the invention is implemented.

The invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the method for evaluating the aesthetic quality of a picture of the invention are implemented.

The positive effect of the invention is that it improves the accuracy of aesthetic quality evaluation.

Brief Description of the Drawings

FIG. 1 is a flowchart of the method for evaluating the aesthetic quality of a picture according to Embodiment 1 of the invention.

FIG. 2 is a schematic structural diagram of the system for evaluating the aesthetic quality of a picture according to Embodiment 2 of the invention.

FIG. 3 is a schematic structural diagram of the electronic device according to Embodiment 3 of the invention.

Detailed Description

The invention is further described below by way of embodiments, but is not thereby limited to the scope of the described embodiments.

Embodiment 1

This embodiment provides a method for evaluating the aesthetic quality of a picture. Referring to FIG. 1, the method comprises the following steps:

Step S101: integrate and label the picture data set.

Step S102: extract deep learning features of the picture according to the picture data set.

Step S103: extract text-overlay features and collage features.

Step S104: perform semantic segmentation on the picture, and extract centroid features and angle features from the segmentation result.

Step S105: obtain the color saturation of the picture.

Step S106: train on the deep learning features, text-overlay features, collage features, centroid features, angle features and color saturation as training features to obtain a prediction model.

Step S107: evaluate the aesthetic quality of the picture according to the prediction model.

In a specific implementation, in step S101, the picture data set is integrated and labeled. In an optional implementation, the picture data set comes from a computer network. The pictures in the data set are travel photos.

As an optional implementation, in step S102, models such as Resnet, xception and VGG are used to extract the deep learning features.

As an optional implementation, in step S103, yolo and resnet are used to extract the text-overlay and collage features.

As an optional implementation, in step S104, pspnet is used for semantic segmentation, and features such as the centroid and angle are extracted from the segmentation result using mathematical formulas. The extracted centroid position is used to calculate the distance to the golden section point; a small distance indicates that the picture conforms to the golden ratio rule of aesthetics and is considered to be of higher quality. The importance of the centroid features is compared, and features with low importance or with many null values are deleted to simplify model training.
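The patent does not give the formulas used in step S104. As a minimal sketch, assuming a binary segmentation mask for one object and the usual four golden section points of the frame, the centroid and its normalized distance to the nearest golden section point could be computed as follows. The function names, the pixel-coordinate convention and the normalization by the frame diagonal are illustrative assumptions, not taken from the source.

```python
import math

# Golden ratio conjugate, about 0.618 (assumed definition of the golden section points).
GOLDEN = (math.sqrt(5) - 1) / 2


def mask_centroid(mask):
    """Return the (x, y) centroid of the nonzero pixels of a 2-D mask, or None if it is empty."""
    total = sum_x = sum_y = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                total += 1
                sum_x += x
                sum_y += y
    if total == 0:
        return None
    return sum_x / total, sum_y / total


def golden_points(width, height):
    """Four golden section points of a width x height frame, in pixel coordinates."""
    xs = ((1 - GOLDEN) * (width - 1), GOLDEN * (width - 1))
    ys = ((1 - GOLDEN) * (height - 1), GOLDEN * (height - 1))
    return [(x, y) for x in xs for y in ys]


def golden_distance(mask):
    """Distance from the mask centroid to the nearest golden section point.

    The result is divided by the frame diagonal (an assumption) so that
    pictures of different sizes are comparable. Returns None for an empty
    mask, which corresponds to a null feature value.
    """
    height, width = len(mask), len(mask[0])
    centroid = mask_centroid(mask)
    if centroid is None:
        return None
    cx, cy = centroid
    diagonal = math.hypot(width - 1, height - 1)
    if diagonal == 0:
        return 0.0
    nearest = min(math.hypot(cx - gx, cy - gy) for gx, gy in golden_points(width, height))
    return nearest / diagonal
```

Under these assumptions a smaller return value means the object sits closer to a golden section point, which the description treats as a sign of higher quality.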

In an optional implementation, the original picture is processed directly and its color saturation is calculated using a mathematical formula; pictures with high color saturation are considered to be of higher quality.
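The source does not state which saturation formula is used. One common choice, assumed here purely for illustration, is the mean of the S channel after converting each RGB pixel to HSV. The sketch below uses only Python's standard `colorsys` module; the function name `mean_saturation` is hypothetical.

```python
import colorsys


def mean_saturation(pixels):
    """Average HSV saturation (0..1) over an iterable of 8-bit (r, g, b) pixels.

    Assumption: saturation is the HSV 'S' channel; the patent only says
    that a mathematical formula is applied to the original picture.
    """
    total = 0.0
    count = 0
    for r, g, b in pixels:
        _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        total += s
        count += 1
    return total / count if count else 0.0
```

For example, a pure red pixel has saturation 1 and a gray or white pixel has saturation 0, so a picture made of equal parts red and white averages 0.5.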

In an optional implementation, the deep learning features further include the mean and deviation of the prediction scores of the picture extracted based on the NIMA model. NIMA is a deep learning model from recent years that performs well at evaluating picture quality.
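As described in the background, NIMA-style models work with a distribution of scores over the buckets 1 to 10. A minimal sketch of turning such a distribution into the two features named above follows, assuming that "deviation" means the standard deviation of the distribution; the function name and the renormalization step are illustrative assumptions.

```python
import math


def score_mean_and_std(probabilities):
    """Mean and standard deviation of a score distribution over buckets 1..N.

    `probabilities[i]` is the predicted mass for score i + 1. The input is
    renormalized so that it sums to 1 (an assumption for robustness).
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("distribution must have positive mass")
    probs = [p / total for p in probabilities]
    mean = sum(score * p for score, p in enumerate(probs, start=1))
    variance = sum(p * (score - mean) ** 2 for score, p in enumerate(probs, start=1))
    return mean, math.sqrt(variance)
```

A distribution concentrated on a single score gives a deviation of zero, while a flat distribution over ten buckets gives a mean of 5.5; both values would be appended to the training features.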

In step S106, Resnet is used to train on the features and perform feature screening to obtain the optimal prediction model, and the parameters are tuned multiple times to obtain the optimal parameters. In a specific implementation, the penultimate-layer features of the pre-trained model are extracted by fine-tuning (finetune). The models are tested, and the best-performing one is selected as the prediction model.
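The screening procedure itself is not specified in the source. The sketch below illustrates only the null-value part of it, under stated assumptions: each picture is a dictionary of named feature values, a feature is dropped when its fraction of null values exceeds a threshold, and remaining nulls are filled with 0.0. The function name, the threshold and the fill value are all hypothetical; importance-based screening is not shown.

```python
def drop_sparse_features(rows, max_null_fraction=0.5):
    """Drop features with too many null values and build fixed-length vectors.

    `rows` is a list of dicts mapping feature name to a number or None.
    Returns (kept_names, vectors). Remaining None values are replaced by
    0.0, which is an illustrative assumption, not the patent's method.
    """
    names = sorted({name for row in rows for name in row})
    kept = []
    for name in names:
        nulls = sum(1 for row in rows if row.get(name) is None)
        if rows and nulls / len(rows) <= max_null_fraction:
            kept.append(name)
    vectors = [
        [row.get(name) if row.get(name) is not None else 0.0 for name in kept]
        for row in rows
    ]
    return kept, vectors
```

The resulting vectors could then be concatenated with the deep learning features before training; how the features are actually combined is left open by the description.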

In step S107, the aesthetic quality of the picture is evaluated according to the prediction model.

In the prior art, travel photos differ greatly in type and quality from the pictures in open-source data sets (for example AVA), so directly transferring a model built on an open-source data set does not work well. The method of this embodiment manually labels travel photos, retrains the model, and transfers the results of the model trained on the open-source data set as features for the training model.

In the prior art, when deep learning picture features are extracted offline, manually constructed data sets are small and cannot provide effective features. The method of this embodiment extracts the penultimate-layer features of the pre-trained model by fine-tuning.

In the prior art, objects in a picture vary in color, and it is difficult to compute an object's centroid directly from pixels. The method of this embodiment calls the pspnet model for semantic segmentation so that each object in the picture has a uniform color, and then computes the centroid position and angle of the object as features for the training model.

In the prior art, travel photos include categories such as pictures with overlaid text and collages, which require special handling. The method of this embodiment builds text-overlay and collage models to discriminate these categories and uses the resulting discriminant scores as features for the training model.

The method of this embodiment combines two key elements, accurate extraction of traditional features and deep convolutional neural networks, and uses a database for model training, aiming to break through the existing bottleneck and improve the model's classification accuracy. The accuracy of the method of this embodiment reaches up to 92.4%.

Embodiment 2

This embodiment provides a system for evaluating the aesthetic quality of a picture. Referring to FIG. 2, the system includes a data set generation unit 201, a feature extraction unit 202, a training unit 203 and an evaluation unit 204.

The data set generation unit 201 is used to integrate and label the picture data set;

the feature extraction unit 202 is used to extract deep learning features of the picture according to the picture data set;

the feature extraction unit 202 is also used to extract text-overlay features and collage features of the picture;

the feature extraction unit 202 is also used to perform semantic segmentation on the picture and to extract centroid features and angle features from the segmentation result;

the feature extraction unit 202 is also used to obtain the color saturation of the picture;

the training unit 203 is used to train on the deep learning features, text-overlay features, collage features, centroid features, angle features and color saturation as training features to obtain a prediction model;

the evaluation unit 204 is used to evaluate the aesthetic quality of the picture according to the prediction model.

In a specific implementation, the data set generation unit 201 integrates and labels the picture data set. In an optional implementation, the picture data set comes from a computer network. The pictures in the data set are travel photos.

As an optional implementation, the feature extraction unit 202 extracts the deep learning features based on models such as Resnet, xception and VGG.

As an optional implementation, the feature extraction unit 202 extracts the text-overlay and collage features based on yolo and resnet.

As an optional implementation, the feature extraction unit 202 performs semantic segmentation based on pspnet, and extracts features such as the centroid and angle from the segmentation result using mathematical formulas. The extracted centroid position is used to calculate the distance to the golden section point; a small distance indicates that the picture conforms to the golden ratio rule of aesthetics and is considered to be of higher quality. The importance of the centroid features is compared, and features with low importance or with many null values are deleted to simplify model training.

The training unit 203 uses Resnet to train on the features and perform feature screening to obtain the optimal prediction model, and tunes the parameters multiple times to obtain the optimal parameters. In a specific implementation, the penultimate-layer features of the pre-trained model are extracted by fine-tuning (finetune). The models are tested, and the best-performing one is selected as the prediction model.

The evaluation unit 204 evaluates the aesthetic quality of the picture according to the prediction model.

The system of this embodiment combines two key elements, accurate extraction of traditional features and deep convolutional neural networks, and uses a database for model training, aiming to break through the existing bottleneck and improve the model's classification accuracy. The accuracy of the system of this embodiment reaches up to 92.4%.

Embodiment 3

FIG. 3 is a schematic structural diagram of an electronic device provided by this embodiment. The electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the method for evaluating the aesthetic quality of a picture of Embodiment 1 is implemented. The electronic device 30 shown in FIG. 3 is only an example and should not limit the function or scope of use of the embodiments of the invention.

The electronic device 30 may take the form of a general-purpose computing device, for example a server device. The components of the electronic device 30 may include, but are not limited to: the at least one processor 31, the at least one memory 32, and a bus 33 connecting the different system components (including the memory 32 and the processor 31).

The bus 33 includes a data bus, an address bus and a control bus.

The memory 32 may include volatile memory, such as random access memory (RAM) 321 and/or cache memory 322, and may further include read-only memory (ROM) 323.

The memory 32 may also include a program/utility 325 having a set of (at least one) program modules 324. Such program modules 324 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.

The processor 31 runs the computer program stored in the memory 32 to execute various functional applications and data processing, such as the method for evaluating the aesthetic quality of a picture of Embodiment 1 of the invention.

The electronic device 30 may also communicate with one or more external devices 34 (for example a keyboard or a pointing device). Such communication may take place through an input/output (I/O) interface 35. The electronic device 30 may also communicate with one or more networks (for example a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 36. As shown in the figure, the network adapter 36 communicates with the other modules of the electronic device 30 via the bus 33. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 30, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, and data backup storage systems.

It should be noted that although several units/modules or sub-units/modules of the electronic device are mentioned in the detailed description above, this division is merely exemplary and not mandatory. According to embodiments of the invention, the features and functions of two or more units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided among multiple units/modules.

Embodiment 4

This embodiment provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the method for evaluating the aesthetic quality of a picture of Embodiment 1 are implemented.

More specifically, the readable storage medium may include, but is not limited to: a portable disk, a hard disk, random access memory, read-only memory, erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the above.

In a possible implementation, the invention may also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to execute the steps of the method for evaluating the aesthetic quality of a picture of Embodiment 1.

The program code for carrying out the invention may be written in any combination of one or more programming languages, and may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on a remote device.

Although specific embodiments of the invention are described above, those skilled in the art should understand that these are only examples, and that the scope of protection of the invention is defined by the appended claims. Those skilled in the art may make various changes or modifications to these embodiments without departing from the principle and essence of the invention, and all such changes and modifications fall within the scope of protection of the invention.

Claims

1. A method for evaluating the aesthetic quality of a picture, characterized by comprising the following steps:

integrating and labeling a picture data set;

extracting deep learning features of the picture according to the picture data set;

extracting a text-overlay feature and a collage feature of the picture;

performing semantic segmentation on the picture, and extracting a centroid feature and an angle feature according to a result of the semantic segmentation;

acquiring the color saturation of the picture;

training on the deep learning features, the text-overlay feature, the collage feature, the centroid feature, the angle feature and the color saturation as training features to obtain a prediction model;

and evaluating the aesthetic quality of the picture according to the prediction model.

2. The method of claim 1, wherein the step of extracting the deep learning features of the picture according to the picture data set comprises:

extracting the deep learning features based on a Resnet model, an xception model and a VGG model;

and the step of extracting the text-overlay feature and the collage feature of the picture comprises:

extracting the text-overlay feature and the collage feature based on a yolo model and a resnet model.

3. The method of claim 1, wherein semantically segmenting the picture comprises:

performing semantic segmentation on the picture based on a pspnet model.

4. The method for evaluating the aesthetic quality of a picture according to claim 1, wherein the deep learning features further comprise a prediction score mean and a deviation of the picture extracted based on a NIMA model.

5. A system for evaluating the aesthetic quality of a picture, characterized by comprising a data set generation unit, a feature extraction unit, a training unit and an evaluation unit;

the data set generation unit is used for integrating and labeling a picture data set;

the feature extraction unit is used for extracting deep learning features of the picture according to the picture data set;

the feature extraction unit is also used for extracting a text-overlay feature and a collage feature of the picture;

the feature extraction unit is also used for performing semantic segmentation on the picture and extracting a centroid feature and an angle feature according to the result of the semantic segmentation;

the feature extraction unit is further used for acquiring the color saturation of the picture;

the training unit is used for training on the deep learning features, the text-overlay feature, the collage feature, the centroid feature, the angle feature and the color saturation as training features to obtain a prediction model;

the evaluation unit is used for evaluating the aesthetic quality of the picture according to the prediction model.

6. The system for evaluating the aesthetic quality of a picture according to claim 5, wherein the feature extraction unit is further configured to extract the deep learning features based on a Resnet model, an xception model, and a VGG model;

the feature extraction unit is further used for extracting the text-overlay feature and the collage feature based on a yolo model and a resnet model.

7. The system for evaluating the aesthetic quality of a picture according to claim 5, wherein the feature extraction unit is further configured to semantically segment the picture based on a pspnet model.

8. The system for evaluating the aesthetic quality of a picture according to claim 5, wherein the deep learning features further comprise a prediction score mean and deviation of the picture extracted based on a NIMA model.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method for assessing the aesthetic quality of a picture according to any one of claims 1 to 4 when executing the computer program.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for assessing the aesthetic quality of a picture according to any one of claims 1 to 4.