CN116258697B

CN116258697B - Device and method for automatic classification of children's skin disease images based on coarse annotation

Info

Publication number: CN116258697B
Application number: CN202310150365.1A
Authority: CN
Inventors: 俞刚; 李竞; 郑惠文; 沈忱; 齐国强
Original assignee: Zhejiang University ZJU
Current assignee: Zhejiang University ZJU
Priority date: 2023-02-22
Filing date: 2023-02-22
Publication date: 2023-11-24
Anticipated expiration: 2043-02-22
Also published as: CN116258697A

Abstract

The invention discloses an automatic classification device and method for children's skin disease images based on coarse annotation. After coarse annotation of the lesion area of the acquired child skin disease image, the coarse annotation of the lesion area is preprocessed to establish a mask mask. Label images; build a classification model including U‑Net, texture feature extraction module, color feature extraction module, shape feature extraction module, first correlation analysis module, second correlation analysis module, and feature fusion and classification module, using mask masks Annotated images are used for supervised learning of the classification model to optimize the parameters of the classification model; the parameter-optimized classification model is used to automatically classify children's skin disease images. The device and method build a model that can accurately and automatically classify children's skin diseases based on coarse annotations, and improve the classification accuracy of children's skin disease images.

Description

Device and method for automatic classification of children's skin disease images based on coarse annotation

技术领域Technical field

本发明属于图像分类技术领域，具体涉及一种基于粗标注的儿童皮肤病图像自动分类装置和方法。The invention belongs to the technical field of image classification, and specifically relates to an automatic classification device and method for children's skin disease images based on coarse annotation.

背景技术Background technique

皮肤病是儿童发病率最高的疾病之一。目前，在儿童皮肤病中发病率最高也是最常见的皮肤疾病包括：特应性皮炎(Atopic Dermatitis,AD)、荨麻疹(Urticaria)、血管瘤(Hemangioma)、尿布皮炎(Diaper Dermatitis)等。该四种疾病严重影响患儿的生活质量，危害生理和心理健康，甚至危及生命，造成沉重的经济和社会负担。Skin diseases are among the most common diseases in children. At present, the skin diseases with the highest incidence rate and the most common among children's skin diseases include: atopic dermatitis (AD), urticaria (Urticaria), hemangioma, diaper dermatitis (Diaper Dermatitis), etc. These four diseases seriously affect the quality of life of children, harm their physical and mental health, and even endanger their lives, causing heavy economic and social burdens.

目前大多数皮肤病患儿由基层非皮肤科医生进行首诊，受限于基层医生经验、能力及医院医疗设施不完备，导致儿童皮肤病基层误诊率高。短期内难以通过传统临床方法解决这一难题。因此，探索新的方法，准确地对儿童皮肤病进行快速诊断，迅速提高广大基层医生的首诊正确率，成为亟待解决的重要问题。At present, most children with skin diseases are first diagnosed by non-dermatologists at the grassroots level. However, limited by the experience and ability of grassroots doctors and the incomplete medical facilities of hospitals, the misdiagnosis rate of children's skin diseases at the grassroots level is high. It is difficult to solve this problem through traditional clinical methods in the short term. Therefore, exploring new methods to accurately and quickly diagnose children's skin diseases and quickly improve the accuracy of first diagnosis by grassroots doctors has become an important issue that needs to be solved urgently.

儿童皮肤与成人皮肤存在着显著区别，主要包括：1.皮肤结构和屏障差异：儿童角质层平均厚度(约7μm)比成人角质层(约10μm)薄30％，表皮层薄20％，具有更小体积的角质细胞和更薄的角质层，皮肤屏障功能尚未成熟，水分流失速度快，抵抗力弱，对外界刺激反应强；2.皮肤组成差异：儿童皮肤角质细胞中天然保湿因子的浓度明显比成人低且皮脂水平较低，脂质总含量和分泌皮脂的量都比成人少，因此儿童皮肤更容易无缺水干燥。同时，儿童皮肤中黑色素的含量比成人低，更容易受到太阳的伤害；3.微生物屏障弱：儿童皮肤pH值接近中性(6.6-7.7)；微生物群还不稳定，容易受影响。根据以上特征，可以对儿童皮肤图像进行更多的探究。There are significant differences between children's skin and adult skin, mainly including: 1. Skin structure and barrier differences: the average thickness of children's stratum corneum (about 7 μm) is 30% thinner than that of adults (about 10 μm), and the epidermis is 20% thinner. With small-volume keratinocytes and thinner stratum corneum, the skin barrier function is immature, the water loss rate is fast, the resistance is weak, and the reaction to external stimuli is strong; 2. Differences in skin composition: The concentration of natural moisturizing factors in children's skin keratinocytes is obvious It is lower than adults and has lower sebum levels. The total lipid content and the amount of sebum secreted are less than those of adults, so children's skin is more likely to be dry and dehydrated. At the same time, the melanin content in children's skin is lower than that of adults, making them more susceptible to sun damage; 3. Weak microbial barrier: the pH value of children's skin is close to neutral (6.6-7.7); the microbiome is not yet stable and easily affected. Based on the above characteristics, more exploration of children's skin images can be conducted.

目前基于深度学习的皮肤病图像识别模型已在被广泛研究和测试，但识别能力有限，且大多数研究方法都还集中在处理成人的皮肤病问题，并不涉及到儿童皮肤并问题。此外，基于深度学习的研究方法需要大量人力进行详尽标注，这对训练数据的精准程度提出了很高的要求，但皮肤病问题具有破损性或非连续的标注特征，故需要详尽标注，需要花费大量人力物力。且此类还存在边缘不清晰的情况，人为标注存在识别不完善的可能。由此，提出一种利用粗糙标注就能达到精细标注的儿童皮肤病图像自动分类方法就尤为重要。Currently, skin disease image recognition models based on deep learning have been widely researched and tested, but the recognition capabilities are limited, and most research methods are still focused on dealing with adult skin disease problems and do not involve children's skin problems. In addition, research methods based on deep learning require a large amount of manpower for detailed annotation, which places high requirements on the accuracy of training data. However, skin disease problems have damaging or discontinuous annotation characteristics, so detailed annotation is required, which is costly. A lot of manpower and material resources. In addition, there are still situations where the edges are not clear, and artificial annotation may lead to imperfect recognition. Therefore, it is particularly important to propose an automatic classification method for children's skin disease images that can achieve fine annotation using rough annotation.

发明内容Contents of the invention

鉴于上述，本发明的目的是提供一种基于粗标注的儿童皮肤病图像自动分类装置和方法，基于粗标注构建能够顾准确自动分类儿童皮肤病的模型，并提升儿童皮肤病图像的分类准确性。In view of the above, the purpose of the present invention is to provide an automatic classification device and method for children's skin disease images based on rough annotation, build a model that can accurately and automatically classify children's skin diseases based on rough annotation, and improve the classification accuracy of children's skin disease images. .

为实现上述发明目的，实施例提供的基于粗标注的儿童皮肤病图像自动分类方法，包括以下步骤：In order to achieve the above-mentioned object of the invention, the automatic classification method of children's skin disease images based on coarse annotation provided by the embodiment includes the following steps:

对获取的儿童皮肤病图像进行病灶区域的粗标注后，对粗标注的病灶区域进行预处理以建立掩码蒙版标注图像；After coarse annotation of the lesion area on the acquired children's skin disease images, preprocess the coarse annotation of the lesion area to establish a mask annotation image;

构建包括U-Net、纹理特征提取模块、颜色特征提取模块、形状特征提取模块、第一相关分析模块、第二相关分析模块以及特征融合和分类模块的分类模型，其中，U-Net、纹理特征提取模块、颜色特征提取模块、形状颜色特征提取模块分别用于从掩码蒙版标注图像中提取深度特征、纹理特征、颜色特征以及形状特征，第一相关分析模块用于对纹理特征和颜色特征进行一次特征融合，第二相关分析模块用于对形状特征和一次特征融合结果进行二次特征融合，特征融合和分类模块用于对二次特征融合结果与深度特征的拼接结果进行特征融合后再进行儿童皮肤病分类；Construct a classification model including U-Net, texture feature extraction module, color feature extraction module, shape feature extraction module, first correlation analysis module, second correlation analysis module, and feature fusion and classification module. Among them, U-Net, texture feature The extraction module, color feature extraction module, and shape color feature extraction module are respectively used to extract depth features, texture features, color features, and shape features from the mask annotation image. The first correlation analysis module is used to extract texture features and color features. Perform primary feature fusion, the second correlation analysis module is used to perform secondary feature fusion on the shape feature and the primary feature fusion result, and the feature fusion and classification module is used to perform feature fusion on the splicing result of the secondary feature fusion result and the depth feature. perform classification of childhood dermatoses;

利用掩码蒙版标注图像对分类模型进行监督学习，以优化分类模型的参数；Use mask annotation images to perform supervised learning on the classification model to optimize the parameters of the classification model;

利用参数优化的分类模型进行儿童皮肤病图像自动分类。Automatic classification of children's skin disease images using a parameter-optimized classification model.

在一个可选的实施例中，所述对粗标注的病灶区域进行预处理以建立掩码蒙版标注图像，包括：In an optional embodiment, preprocessing the coarsely annotated lesion area to create a mask annotation image includes:

利用基于儿童皮肤与成人皮肤在RGB图像上的通道差异，采用CLAHE算法对儿童皮肤病图像处理，得到预处理图像；Using the channel differences between children's skin and adult skin on RGB images, the CLAHE algorithm is used to process children's skin disease images to obtain preprocessed images;

利用SLIC算法对预处理图像中粗标注的病灶区域进行超像素分割后，对超像素块内每个像素点的三通道数值相加获得单像素点色谱值；After using the SLIC algorithm to perform superpixel segmentation on the roughly labeled lesion area in the preprocessed image, the three-channel values of each pixel in the superpixel block are added to obtain the single-pixel chromatographic value;

利用聚类算法将单像素点色谱值分为两类，即将病灶区域内的像素分为0或1的像素点，得到二值图像；The clustering algorithm is used to divide the chromatographic value of a single pixel into two categories, that is, the pixels in the lesion area are divided into 0 or 1 pixels to obtain a binary image;

基于二值图像搜索所有边界，并计算每个边界内的区域面积后再填充最大连通域得到精细颗粒度的掩码蒙版标注图像。Search all boundaries based on binary images, calculate the area within each boundary, and then fill the maximum connected domain to obtain a fine-grained mask annotation image.

在一个可选的实施例中，所述聚类算法包括K-means聚类算法，即采用K-means聚类算法将单像素点色谱值分为两类。In an optional embodiment, the clustering algorithm includes a K-means clustering algorithm, that is, the K-means clustering algorithm is used to divide the chromatographic value of a single pixel into two categories.

在一个可选的实施例中，在所述纹理特征提取模块中，采用基于贝叶斯分类器的颜色直方图特征提取方式提取颜色特征，包括：In an optional embodiment, in the texture feature extraction module, the color histogram feature extraction method based on Bayesian classifier is used to extract color features, including:

基于掩码蒙版标注图像中每个像素点的HIS颜色模型值得到X^HS、X^HI和X^SI三个向量，并通过贝叶斯分类器将三个向量归类到9个颜色类别直方图，统计每个颜色类别直方图中像素点个数，以得到颜色特征。Based on the HIS color model value of each pixel in the mask mask annotation image, three vectors X ^HS , X ^HI and X ^SI are obtained, and the three vectors are classified into 9 color category histograms through the Bayesian classifier , count the number of pixels in the histogram of each color category to obtain color features.

在一个可选的实施例中，在所述纹理特征提取模块中，采用ULBP算法提取掩码蒙版标注图像的纹理特征。In an optional embodiment, in the texture feature extraction module, the ULBP algorithm is used to extract texture features of the mask annotation image.

在一个可选的实施例中，在所述形状特征提取模块中，对掩码蒙版标注图像提取基于区域边缘的形状特征，包括形状参数、弯曲能量、矩形度、圆形度。In an optional embodiment, in the shape feature extraction module, shape features based on regional edges are extracted from the mask annotation image, including shape parameters, bending energy, rectangularity, and circularity.

在一个可选的实施例中，所述第一相关分析模块和所述第二相关分析模块采用CCA算法进行特征融合。In an optional embodiment, the first correlation analysis module and the second correlation analysis module use a CCA algorithm to perform feature fusion.

在一个可选的实施例中，所述特征融合和分类模块包括至少一个卷积层和全连接层，通过卷积层实现特征融合，通过全连接层实现皮肤病分类。In an optional embodiment, the feature fusion and classification module includes at least one convolutional layer and a fully connected layer. Feature fusion is implemented through the convolutional layer, and skin disease classification is implemented through the fully connected layer.

在一个可选的实施例中，利用掩码蒙版标注图像对分类模型进行监督学习时，以分类模型输出的儿童皮肤病分类结果和分类真值标签的交叉熵作为损失函来更新模型参数。In an optional embodiment, when using mask annotation images to perform supervised learning on a classification model, the cross-entropy of the children's skin disease classification results output by the classification model and the true classification label is used as the loss function to update the model parameters.

为实现上述发明目的，实施例提供了一种基于粗标注的儿童皮肤病图像自动分类装置，包括预处理单元、模型构建单元、参数优化单元以及应用单元；In order to achieve the above object of the invention, the embodiment provides an automatic classification device for children's skin disease images based on coarse annotation, including a preprocessing unit, a model building unit, a parameter optimization unit and an application unit;

所述预处理单元用于对获取的儿童皮肤病图像进行病灶区域的粗标注后，对粗标注的病灶区域进行预处理以建立掩码蒙版标注图像；The preprocessing unit is used to perform rough annotation of the lesion area on the acquired children's skin disease images, and then preprocess the roughly annotated lesion area to establish a mask annotation image;

所述模型构建单元用于构建包括U-Net、纹理特征提取模块、颜色特征提取模块、形状特征提取模块、第一相关分析模块、第二相关分析模块以及特征融合和分类模块的分类模型，其中，U-Net、纹理特征提取模块、颜色特征提取模块、形状颜色特征提取模块分别用于从掩码蒙版标注图像中提取深度特征、纹理特征、颜色特征以及形状特征，第一相关分析模块用于对纹理特征和颜色特征进行一次特征融合，第二相关分析模块用于对形状特征和一次特征融合结果进行二次特征融合，特征融合和分类模块用于对二次特征融合结果与深度特征的拼接结果进行特征融合后再进行儿童皮肤病分类；The model building unit is used to build a classification model including U-Net, texture feature extraction module, color feature extraction module, shape feature extraction module, first correlation analysis module, second correlation analysis module and feature fusion and classification module, where , U-Net, texture feature extraction module, color feature extraction module, and shape color feature extraction module are respectively used to extract depth features, texture features, color features, and shape features from the mask mask annotation image. The first correlation analysis module uses The second correlation analysis module is used for primary feature fusion of texture features and color features. The second correlation analysis module is used for secondary feature fusion of shape features and primary feature fusion results. The feature fusion and classification module is used for secondary feature fusion results and depth features. The splicing results are used for feature fusion before classifying children's skin diseases;

所述参数优化单元用于利用掩码蒙版标注图像对分类模型进行监督学习，以优化分类模型的参数；The parameter optimization unit is used to perform supervised learning on the classification model using mask annotation images to optimize the parameters of the classification model;

所述应用单元用于利用参数优化的分类模型进行儿童皮肤病图像自动分类。The application unit is used to automatically classify children's skin disease images using a parameter-optimized classification model.

与现有技术相比，本发明具有的有益效果至少包括：Compared with the prior art, the beneficial effects of the present invention include at least:

通过对粗标注的病灶区域进行预处理以建立精细颗粒度的掩码蒙版标注图像，这样只需对较大病灶区域进行粗标注，就能有效的让分类模型获得足够的语义信息，无需大量的精细标注就能实现准确的儿童皮肤病图像自动分类，解决了皮肤病由于边界不清导致的难以精细标注的难题。再者，通过分类模型中结合纹理特征、颜色特征以及形状特征进行儿童皮肤病分类，更能提升儿童皮肤病图像的分类准确性，可为医生提供辅助诊断的能力。By preprocessing the coarsely annotated lesion area to establish a fine-grained mask annotation image, only coarse annotation of the larger lesion area can effectively allow the classification model to obtain sufficient semantic information without the need for a large amount of The precise annotation can achieve accurate automatic classification of children's skin disease images, solving the problem of difficult precise annotation of skin diseases due to unclear boundaries. Furthermore, by combining texture features, color features and shape features in the classification model to classify children's skin diseases, it can improve the classification accuracy of children's skin disease images and provide doctors with the ability to assist in diagnosis.

附图说明Description of drawings

为了更清楚地说明本发明实施例或现有技术中的技术方案，下面将对实施例或现有技术描述中所需要使用的附图做简单地介绍，显而易见地，下面描述中的附图仅仅是本发明的一些实施例，对于本领域普通技术人员来讲，在不付出创造性劳动前提下，还可以根据这些附图获得其他附图。In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings in the following description are only These are some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without exerting creative efforts.

图1是实施例提供的基于粗标注的儿童皮肤病图像自动分类方法的流程图；Figure 1 is a flow chart of the automatic classification method of children's skin disease images based on coarse annotation provided by the embodiment;

图2是实施例提供的分类模型的结构示意图；Figure 2 is a schematic structural diagram of the classification model provided by the embodiment;

图3是实施例提供的儿童皮肤病颜色通道直方图的示例图；Figure 3 is an example diagram of a color channel histogram of children's skin diseases provided by the embodiment;

图4是实施例提供的儿童皮肤病图像各类预处理方法对比图；Figure 4 is a comparison chart of various preprocessing methods for children's skin disease images provided by the embodiment;

图5是实施例提供的儿童皮肤病图像粗标注区域超像素分割的示例图；Figure 5 is an example diagram of super-pixel segmentation of coarse annotation areas of children's skin disease images provided by the embodiment;

图6是实施例提供的基于粗标注的儿童皮肤病图像自动分类装置的结构示意图。Figure 6 is a schematic structural diagram of an automatic classification device for children's skin disease images based on coarse annotation provided by the embodiment.

具体实施方式Detailed ways

为使本发明的目的、技术方案及优点更加清楚明白，以下结合附图及实施例对本发明进行进一步的详细说明。应当理解，此处所描述的具体实施方式仅仅用以解释本发明，并不限定本发明的保护范围。In order to make the purpose, technical solutions and advantages of the present invention more clear, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention and do not limit the scope of the present invention.

为了克服现有技术中需要大量精准儿童皮肤病标注而造成极大代价的不足，同时也为了实现对儿童皮肤病图像的准确分类。实施例提供了一种基于粗标注的儿童皮肤病图像自动分类方法和装置，利用儿童皮肤不同于成人皮肤的图像特征因素进行初步预处理，再利用图形学方法，把粗标注与实际图像像素进行匹配，获得具有高精度的语义分割图像，并建立掩码(mask)蒙版标注图像。利用皮肤病图像多呈片状、点状的图像特征，采用基于图像特征融合的深度学习训练方法，使用Unet作为主干网络，再利用传统图像处理方法，对mask蒙版标注图像的颜色特征、纹理特征以及形状特征进行典型相关分析CCA(CanonicalCorrelation Analysis)，得到图像融合特征并加入深度学习网络，建立完整的分类网络。利用基于粗标注构建的mask蒙版标注图像进行分类模型的训练即可以获得较高训练精度，可以大大节省医生进行精细标注时间和精力的同时，保证了分类模型的有效识别能力。In order to overcome the shortcomings in the existing technology that require a large number of accurate children's skin disease annotations, which results in great cost, and at the same time, in order to achieve accurate classification of children's skin disease images. The embodiment provides a method and device for automatically classifying children's skin disease images based on rough annotation. The image feature factors of children's skin that are different from adult skin are used for preliminary preprocessing, and then the graphics method is used to compare the rough annotation with the actual image pixels. Match, obtain semantic segmentation images with high accuracy, and establish mask annotation images. Taking advantage of the image features of skin disease images, which are mostly flaky and dot-like, a deep learning training method based on image feature fusion is used, using Unet as the backbone network, and then using traditional image processing methods to label the color features and texture of the image on the mask. Features and shape features are subjected to canonical correlation analysis (CCA) to obtain image fusion features and added to the deep learning network to establish a complete classification network. Using mask annotation images constructed based on coarse annotation to train the classification model can achieve higher training accuracy, which can greatly save doctors' time and energy for fine annotation while ensuring the effective recognition ability of the classification model.

如图1所示，实施例提供的基于粗标注的儿童皮肤病图像自动分类方法，包括以下步骤：As shown in Figure 1, the automatic classification method of children's skin disease images based on coarse annotation provided by the embodiment includes the following steps:

S110，对获取的儿童皮肤病图像进行病灶区域的粗标注后，对粗标注的病灶区域进行预处理以建立掩码蒙版标注图像。S110. After rough annotation of the lesion area on the acquired children's skin disease image, preprocess the rough annotation of the lesion area to establish a mask annotation image.

粗标注是指医生对较大病灶区域进行矩形框拖选标注，该粗标注框选的病灶区域比较粗糙，可能包含非病灶区域，也可能病灶区域框选不全面。病灶区域的粗标注后，对粗标注的病灶区域进行预处理以建立掩码蒙版标注图像，具体包括：Coarse annotation means that the doctor drags and selects a rectangular frame to annotate a larger lesion area. The lesion area selected by this rough annotation is relatively rough and may include non-lesion areas, or the lesion area may not be comprehensively selected. After the rough annotation of the lesion area, preprocess the rough annotation of the lesion area to establish a mask annotation image, including:

(a)利用基于儿童皮肤与成人皮肤在RGB图像上的通道差异，在通道上采用限制对比度自适应直方图均衡化(Contrast Limited Adaptive Histogram Equalization，CLAHE)算法对儿童皮肤病图像处理，得到预处理图像。(a) Using the channel difference between children's skin and adult skin on the RGB image, using the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm on the channel to process children's skin disease images to obtain preprocessing image.

通过试验研究发现，由于儿童皮肤黑色素较少，其正常皮肤图片的RGB色谱，在G通道上的数值会比明显比其他两通道的数值要高，如图3所示，这会造成在后续超像素分割时，由于与病灶区域的RGB三通道像素总和的差异较大，从而丢失病灶边缘信息。此外，也是由于皮肤病图像一般都是由普通相机拍摄，场景与光线变化差异较大，使用CLAHE算法可以增加后续病灶区域处理的对比度，也可以对图像中的对比度放大进行限制，从而减少噪声的放大问题，操作的对比情况如图4所示。Through experimental research, it was found that because children's skin has less melanin, the value of the RGB spectrum of normal skin pictures on the G channel will be significantly higher than the values on the other two channels, as shown in Figure 3, which will cause subsequent ultrasonic During pixel segmentation, the edge information of the lesion is lost due to the large difference from the sum of RGB three-channel pixels in the lesion area. In addition, because skin disease images are generally taken by ordinary cameras, the scene and light changes are greatly different. Using the CLAHE algorithm can increase the contrast of subsequent lesion area processing, and can also limit the contrast amplification in the image, thereby reducing noise. Zooming in on the problem, the comparison of operations is shown in Figure 4.

(b)利用SLIC(simple linear iterative clustering)算法对预处理图像中粗标注的病灶区域进行超像素分割(例如将病灶区域分割成未200单位的超像素块)后，根据粗标注的标注数据，对超像素块内每个像素点的三通道数值相加获得单像素点色谱值。(b) After using the SLIC (simple linear iterative clustering) algorithm to perform superpixel segmentation on the coarsely annotated lesion area in the preprocessed image (for example, dividing the lesion area into superpixel blocks of less than 200 units), based on the coarsely annotated annotation data, The three-channel values of each pixel in the superpixel block are added to obtain the single-pixel chromatographic value.

本发明采用了超像素分割方法对粗标注的图像进行细化处理。超像素是指由一系列位置相邻且颜色、亮度、纹理等特征相似的像素点组成的小区域。这些小区域大多保留了进一步进行图像分割的有效信息，且一般不会破坏图像中物体的边界信息，用少量的超像素代替大量像素表达图像特征，降低了图像处理的复杂度。超像素分割SLIC算法生成的超像素块相对紧凑，领域特征容易表达，同时其需要设置调整的参数少，操作简单，速度快，对于图像的紧凑度、轮廓保持拥有很好的效果，另外该SLIC算法兼容灰度图和彩色图的分割。一般来说，超像素分割方法是利用精细颗粒图进行模糊化处理的一种手段，但本发明尝试利用其原理进行反向的细化操作，尝试把粗标注框内的图像作为整块像素，利用超像素分割方法进行高单元数量的像素分割，实现对粗糙边界标注的精细分割，为粗标注转化为精细标注奠定细颗粒度的实现基础。粗标注区域的超像素分割结果可如图5所示。The present invention adopts a superpixel segmentation method to refine the coarsely labeled images. A superpixel refers to a small area composed of a series of adjacent pixels with similar characteristics such as color, brightness, texture, etc. Most of these small areas retain effective information for further image segmentation, and generally do not destroy the boundary information of objects in the image. A small number of superpixels are used instead of a large number of pixels to express image features, which reduces the complexity of image processing. The superpixel blocks generated by the superpixel segmentation SLIC algorithm are relatively compact, and domain features are easy to express. At the same time, it requires few parameters to be set and adjusted, is simple to operate, and is fast. It has a good effect on image compactness and contour maintenance. In addition, SLIC The algorithm is compatible with segmentation of grayscale images and color images. Generally speaking, the superpixel segmentation method is a means of blurring using fine grain images, but this invention attempts to use its principle to perform a reverse refinement operation, trying to treat the image within the coarse annotation frame as a whole block of pixels. The super-pixel segmentation method is used to perform pixel segmentation with a high number of units to achieve fine segmentation of rough boundary annotations, laying the foundation for fine-grained implementation of converting coarse annotations into fine annotations. The superpixel segmentation results of the coarsely labeled area can be shown in Figure 5.

(c)利用K-means等聚类算法将单像素点色谱值分为两类，即将病灶区域内的像素分为0或1的像素点，得到二值图像。(c) Use clustering algorithms such as K-means to divide the chromatographic value of a single pixel into two categories, that is, divide the pixels in the lesion area into 0 or 1 pixels to obtain a binary image.

(d)基于二值图像搜索所有边界，并计算每个边界内的区域面积后再填充最大连通域得到精细颗粒度的掩码蒙版标注图像。(d) Search all boundaries based on binary images, calculate the area within each boundary, and then fill in the maximum connected domain to obtain a fine-grained mask annotation image.

为了更好的处理精细标注的边界信息，本发明通过使用聚类算法以及寻找连通域的方法进一步处理粗标注的图像。通过使用K-means聚类算法，通过对每个像素点的三通道数值相加获得单像素点色谱值，将超像素分割得到的超像素块聚类分成2个大类，即将粗标注框内的数据分为0/1的像素点。二值化后的粗标注数据，由于存在图像峰值以及临界条件等问题，会存在非连续性分布的情况。因此，本发明进一步利用寻找最大连通域的方法，利用findContours函数找到二值图像中的所有边界，然后通过contourArea函数计算每个边界内的面积，最后通过fillConvexPoly函数填充最大连通域。通过以上的几个步骤实现基于粗标注图像，根据病灶边界图像自动细化的目标。In order to better process finely labeled boundary information, the present invention further processes coarsely labeled images by using a clustering algorithm and a method of finding connected domains. By using the K-means clustering algorithm, the chromatographic value of a single pixel is obtained by adding the three-channel values of each pixel, and the superpixel blocks obtained by superpixel segmentation are clustered into two major categories, that is, within the rough label box The data is divided into 0/1 pixels. Due to problems such as image peaks and critical conditions, the binarized rough annotation data will have discontinuous distribution. Therefore, the present invention further uses the method of finding the maximum connected domain, using the findContours function to find all boundaries in the binary image, then calculating the area within each boundary through the contourArea function, and finally filling the maximum connected domain through the fillConvexPoly function. Through the above steps, the goal of automatically refining the lesion boundary image based on the coarse annotation image is achieved.

S120，构建分类模型。S120, build a classification model.

实施例中，分类模型基于U-Net网络进行设计。为了矫正粗标注精细化后造成的误差，在U-Net网络基础上增加图像特征融合方式，即将图像特征作为训练参数之一，建立了如图2所示的整体分类模型。In the embodiment, the classification model is designed based on the U-Net network. In order to correct the errors caused by the refinement of rough annotations, an image feature fusion method is added based on the U-Net network, that is, image features are used as one of the training parameters, and an overall classification model as shown in Figure 2 is established.

如图2所示，分类模型包括U-Net、纹理特征提取模块、颜色特征提取模块、形状特征提取模块、第一相关分析模块、第二相关分析模块以及特征融合和分类模块。其中，U-Net、纹理特征提取模块、颜色特征提取模块、形状颜色特征提取模块分别用于从掩码蒙版标注图像中提取深度特征、纹理特征、颜色特征以及形状特征，第一相关分析模块用于对纹理特征和颜色特征进行一次特征融合，第二相关分析模块用于对形状特征和一次特征融合结果进行二次特征融合，特征融合和分类模块用于对二次特征融合结果与深度特征的拼接结果进行特征融合后再进行儿童皮肤病分类。As shown in Figure 2, the classification model includes U-Net, texture feature extraction module, color feature extraction module, shape feature extraction module, first correlation analysis module, second correlation analysis module, and feature fusion and classification module. Among them, U-Net, texture feature extraction module, color feature extraction module, and shape color feature extraction module are respectively used to extract depth features, texture features, color features, and shape features from mask mask annotation images. The first correlation analysis module It is used for primary feature fusion of texture features and color features. The second correlation analysis module is used for secondary feature fusion of shape features and primary feature fusion results. The feature fusion and classification module is used for secondary feature fusion results and depth features. The splicing results are feature fused and then classified into children's skin diseases.

在纹理特征提取模块中，采用基于贝叶斯分类器的颜色直方图特征提取方式提取颜色特征，包括：In the texture feature extraction module, the color histogram feature extraction method based on Bayesian classifier is used to extract color features, including:

扫描掩码蒙版标注图像中的每个像素点，基于掩码蒙版标注图像中每个像素点的HIS颜色模型值(h,s,i)得到X^HS、X^HI和X^SI三个向量，其中，X^HS＝[h,s]，X^HI＝[h,i]，X^SI＝[s,i]，并通过贝叶斯分类器将三个向量归类到相应的皮肤病颜色中，即根据贝叶斯分类器将三个向量归类到9个颜色类别直方图，统计每个颜色类别直方图中像素点个数，以得到颜色特征。其中，9个颜色类别分别包括了白色、红色、浅棕色、深棕色、浅蓝灰色、深蓝灰色、黑色以及未定义色。Scan each pixel in the mask annotated image, and obtain three vectors X ^HS , X ^HI and X ^SI based on the HIS color model value (h, s, i) of each pixel in the mask annotated image. , where, X ^HS = [h ^{, s], X HI} ⁼ [h, i], , that is, classify the three vectors into 9 color category histograms according to the Bayesian classifier, and count the number of pixels in each color category histogram to obtain the color features. Among them, the 9 color categories include white, red, light brown, dark brown, light blue gray, dark blue gray, black and undefined color.

在纹理特征提取模块中，采用ULBP(Uniform Local Binary Pattern)算法提取掩码蒙版标注图像的纹理特征，其中，ULBP算法定义为：In the texture feature extraction module, the ULBP (Uniform Local Binary Pattern) algorithm is used to extract the texture features of the mask annotation image. The ULBP algorithm is defined as:

其中，g_c是中心像素点的灰度值，R是领域半径，g_i是g_c领域内的第i个像素点的灰度值，P是中心像素点的邻域内像素点个数，统一化算子U来表示局部二进制模式中0和1的转换次数。上标riu2反映了旋转不变“均匀”模式的U值最多为2。Among them, g _c is the gray value of the central pixel, R is the radius of the area, g _i is the gray value of the i-th pixel in the area of g _c , and P is the number of pixels in the neighborhood of the central pixel, uniform The operator U is used to represent the number of conversions between 0 and 1 in the local binary pattern. The superscript riu2 reflects that the rotationally invariant "uniform" mode has a U value of at most 2.

ULBP算法是一种度量像素点与邻域像素点之间关系的算法，通过分析像素点与邻域像素点之间的关系来提取纹理特征。而这些局部纹理的分布可以形成图像的整体纹理，最后统计每个邻域结构即可得到图像的纹理特征向量。The ULBP algorithm is an algorithm that measures the relationship between pixels and neighboring pixels. It extracts texture features by analyzing the relationship between pixels and neighboring pixels. The distribution of these local textures can form the overall texture of the image, and finally the texture feature vector of the image can be obtained by counting each neighborhood structure.

在形状特征提取模块中，对掩码蒙版标注图像提取基于区域边缘的形状特征，包括形状参数、弯曲能量、矩形度、圆形度。In the shape feature extraction module, shape features based on regional edges are extracted from the mask annotation image, including shape parameters, bending energy, rectangularity, and circularity.

其中，形状参数F的表达式为：Among them, the expression of the shape parameter F is:

其中，C为区域边界周长，A为区域面积。Among them, C is the perimeter of the region boundary, and A is the area of the region.

弯曲能量B的表达式为：The expression of bending energy B is:

其中，p为弧长参数，P为曲线总长度，k(p)为曲率函数。Among them, p is the arc length parameter, P is the total length of the curve, and k(p) is the curvature function.

矩形度R是区域面积与最小外接矩形面积的比值，表示对外接矩形的饱和程度，其表达式为：Rectangularity R is the ratio of the area of the region to the area of the smallest circumscribed rectangle, indicating the degree of saturation of the circumscribed rectangle. Its expression is:

其中，A为区域面积，A_box为最小外接矩形的面积。Among them, A is the area of the region, and A _box is the area of the smallest circumscribed rectangle.

圆形度C用于度量区域的圆形程度，其表达式为：Circularity C is used to measure the circularity of an area, and its expression is:

其中，P为区域边界的周长，A为区域面积。Among them, P is the perimeter of the region boundary, and A is the area of the region.

通过以上方法，提取了颜色特征、纹理特征以及形状特征等图像特征。为了统一特征维度，在第一相关分析模块、第二相关分析模块中通过采用典型相关分析法CCA(Canonical Correlation Analysis)进行特征融合。CCA是一种用于分析两个随机向量相互关系的统计分析方法，其可以去除特征之间的冗余信息，能很好的对特征进行融合。Through the above method, image features such as color features, texture features, and shape features are extracted. In order to unify the feature dimensions, the canonical correlation analysis method CCA (Canonical Correlation Analysis) is used to perform feature fusion in the first correlation analysis module and the second correlation analysis module. CCA is a statistical analysis method used to analyze the relationship between two random vectors. It can remove redundant information between features and fuse features well.

在特征融合和分类模块连接在第二相关分析模块和U-Net的输出端，包括至少一个卷积层和全连接层，例如3个卷积层和1个全连接层，通过卷积层实现特征融合，通过全连接层实现皮肤病分类。The feature fusion and classification module is connected to the output of the second correlation analysis module and U-Net, including at least one convolutional layer and a fully connected layer, such as 3 convolutional layers and 1 fully connected layer, implemented through the convolutional layer Feature fusion realizes skin disease classification through fully connected layers.

S130，利用掩码蒙版标注图像对分类模型进行监督学习，以优化分类模型的参数。S130: Use the mask annotation image to perform supervised learning on the classification model to optimize the parameters of the classification model.

实施例中，利用掩码蒙版标注图像对分类模型进行监督学习时，以分类模型输出的儿童皮肤病分类结果和分类真值标签的交叉熵函数作为损失函来更新模型参数，该模型参数包括U-Net的网络参数、特征融合和分类模块的网络参数，通过参数优化后，得到了一个可靠的分类模型，能够给儿童皮肤病图像进行准确的分类。In the embodiment, when the mask annotated image is used to perform supervised learning on the classification model, the cross-entropy function of the children's skin disease classification results output by the classification model and the classification true value label is used as a loss function to update the model parameters. The model parameters include After parameter optimization of U-Net's network parameters, feature fusion and classification modules, a reliable classification model was obtained, which can accurately classify children's skin disease images.

S140，利用参数优化的分类模型进行儿童皮肤病图像自动分类。S140, use the parameter-optimized classification model to automatically classify children's skin disease images.

实施例中，在利用参数优化的分类模型进行儿童皮肤病图像自动分类时，将获得的待分类儿童皮肤病图像进行病灶区域的粗标注后，对粗标注的病灶区域进行预处理以建立掩码蒙版标注图像，将掩码蒙版标注图像输入至参数优化的分类模型中，经过前向推理计算得到儿童皮肤病图像分类结果。In the embodiment, when a parameter-optimized classification model is used to automatically classify children's skin disease images, the obtained children's skin disease images to be classified are subjected to rough annotation of the lesion area, and then the rough annotation of the lesion area is preprocessed to establish a mask. The mask annotated image is input into the parameter-optimized classification model, and the classification results of children's skin disease images are obtained through forward inference calculation.

基于同样的发明构思，实施例还提供了一种基于粗标注的儿童皮肤病图像自动分类装置，如图6所示，包括预处理单元、模型构建单元、参数优化单元以及应用单元；其中，预处理单元用于对获取的儿童皮肤病图像进行病灶区域的粗标注后，对粗标注的病灶区域进行预处理以建立掩码蒙版标注图像；模型构建单元用于构建分类模型；参数优化单元用于利用掩码蒙版标注图像对分类模型进行监督学习，以优化分类模型的参数；应用单元用于利用参数优化的分类模型进行儿童皮肤病图像自动分类。Based on the same inventive concept, the embodiment also provides an automatic classification device for children's skin disease images based on coarse annotation, as shown in Figure 6, including a preprocessing unit, a model building unit, a parameter optimization unit and an application unit; wherein, the preprocessing unit The processing unit is used to perform rough annotation of the lesion area on the acquired children's skin disease images, and then pre-processes the rough annotation of the lesion area to establish a mask annotation image; the model construction unit is used to build a classification model; the parameter optimization unit is used to It uses mask mask annotation images to perform supervised learning on the classification model to optimize the parameters of the classification model; the application unit is used to use the parameter-optimized classification model to automatically classify children's skin disease images.

需要说明的是，上述实施例提供的儿童皮肤病图像自动分类装置在进行儿童皮肤病图像自动分类时，应以上述各功单元的划分进行举例说明，可以根据需要将上述功能分配由不同的功能单元完成，即在终端或服务器的内部结构划分成不同的功能单元，以完成以上描述的全部或者部分功能。另外，上述实施例提供的儿童皮肤病图像自动分类装置与儿童皮肤病图像自动分类方法实施例属于同一构思，其具体实现过程详见儿童皮肤病图像自动分类方法实施例，这里不再赘述。It should be noted that when the automatic classification device for children's skin disease images provided in the above embodiments performs automatic classification of children's skin disease images, the above-mentioned division of each functional unit should be used as an example. The above functions can be allocated to different functions as needed. Unit completion means that the internal structure of the terminal or server is divided into different functional units to complete all or part of the functions described above. In addition, the device for automatically classifying children's skin disease images provided in the above embodiments and the embodiment of the method for automatically classifying children's skin disease images belong to the same concept. For details of the implementation process, see the embodiment of the method for automatically classifying children's skin disease images, which will not be described again here.

以上所述的具体实施方式对本发明的技术方案和有益效果进行了详细说明，应理解的是以上所述仅为本发明的最优选实施例，并不用于限制本发明，凡在本发明的原则范围内所做的任何修改、补充和等同替换等，均应包含在本发明的保护范围之内。The above-described specific embodiments describe in detail the technical solutions and beneficial effects of the present invention. It should be understood that the above are only the most preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, additions, equivalent substitutions, etc. made within the scope of the invention shall be included in the protection scope of the present invention.

Claims

1. An automatic classification method for children's skin disease images based on coarse annotation, which is characterized by including the following steps:

After rough annotation of the lesion area on the acquired children's skin disease images, pre-process the rough annotation of the lesion area to establish a mask annotation image, including: using the channel difference between children's skin and adult skin on the RGB image, The CLAHE algorithm is used to process children's skin disease images to obtain a preprocessed image; the SLIC algorithm is used to perform superpixel segmentation on the roughly labeled lesion area in the preprocessed image, and then the three-channel values of each pixel in the superpixel block are added to obtain Single pixel chromatographic value; use a clustering algorithm to divide the single pixel chromatographic value into two categories, that is, divide the pixels in the lesion area into 0 or 1 pixels to obtain a binary image; search all boundaries based on the binary image, and Calculate the area within each boundary and then fill the maximum connected domain to obtain a fine-grained mask annotation image;

Construct a classification model including U-Net, texture feature extraction module, color feature extraction module, shape feature extraction module, first correlation analysis module, second correlation analysis module, and feature fusion and classification module. Among them, U-Net, texture feature The extraction module, color feature extraction module, and shape color feature extraction module are respectively used to extract depth features, texture features, color features, and shape features from the mask annotation image. The first correlation analysis module is used to extract texture features and color features. Perform primary feature fusion, the second correlation analysis module is used to perform secondary feature fusion on the shape feature and the primary feature fusion result, and the feature fusion and classification module is used to perform feature fusion on the splicing result of the secondary feature fusion result and the depth feature. Classify children's skin diseases; wherein, in the texture feature extraction module, a color histogram feature extraction method based on a Bayesian classifier is used to extract color features, including: labeling each pixel in the image based on a mask mask The HIS color model value obtains three vectors X ^HS , X ^HI ^and , to obtain color features; in the texture feature extraction module, the ULBP algorithm is used to extract the texture features of the mask mask annotation image; in the shape feature extraction module, the mask mask annotation image is extracted based on regional edges Shape features include shape parameters, bending energy, rectangularity, and circularity; the first correlation analysis module and the second correlation analysis module use the CCA algorithm to perform feature fusion;

Use mask annotation images to perform supervised learning on the classification model to optimize the parameters of the classification model;

Automatic classification of children's skin disease images using a parameter-optimized classification model.

2. The automatic classification method of children's skin disease images based on rough annotation according to claim 1, characterized in that the clustering algorithm includes a K-means clustering algorithm, that is, the K-means clustering algorithm is used to classify single pixels. Chromatographic values are divided into two categories.

3. The automatic classification method of children's skin disease images based on coarse annotation according to claim 1, characterized in that the feature fusion and classification module includes at least one convolution layer and a fully connected layer, and feature fusion is achieved through the convolution layer. , achieving skin disease classification through fully connected layers.

4. The automatic classification method of children's skin disease images based on coarse annotation according to claim 1, characterized in that when using mask mask annotation images to perform supervised learning on the classification model, the children's skin disease classification results output by the classification model are and the cross-entropy of the classification ground truth labels are used as the loss function to update the model parameters.

5. An automatic classification device for children's skin disease images based on coarse annotation, characterized by including a preprocessing unit, a model building unit, a parameter optimization unit and an application unit;

The preprocessing unit is used to perform rough annotation of the lesion area on the acquired children's skin disease images, and then preprocess the roughly annotated lesion area to establish a mask annotation image, including: using the method based on children's skin and adult skin. For channel differences in RGB images, the CLAHE algorithm is used to process children's skin disease images to obtain preprocessed images; after using the SLIC algorithm to perform superpixel segmentation on the roughly labeled lesion areas in the preprocessed image, each pixel in the superpixel block is The three-channel values are added to obtain the single-pixel chromatographic value; the clustering algorithm is used to divide the single-pixel chromatographic value into two categories, that is, the pixels in the lesion area are divided into 0 or 1 pixels to obtain a binary image; based on the binary The value image searches all boundaries, calculates the area within each boundary, and then fills the maximum connected domain to obtain a fine-grained mask annotation image;

The model building unit is used to build a classification model including U-Net, texture feature extraction module, color feature extraction module, shape feature extraction module, first correlation analysis module, second correlation analysis module and feature fusion and classification module, where , U-Net, texture feature extraction module, color feature extraction module, and shape color feature extraction module are respectively used to extract depth features, texture features, color features, and shape features from the mask mask annotation image. The first correlation analysis module uses The second correlation analysis module is used for primary feature fusion of texture features and color features. The second correlation analysis module is used for secondary feature fusion of shape features and primary feature fusion results. The feature fusion and classification module is used for secondary feature fusion results and depth features. After feature fusion of the splicing results, children's skin diseases are classified; wherein, in the texture feature extraction module, the color histogram feature extraction method based on Bayesian classifier is used to extract color features, including: based on mask mask annotation The HIS color model value of each pixel in the image is obtained as three vectors: X ^HS , X ^HI ^and The number of pixels in the histogram is used to obtain color features; in the texture feature extraction module, the ULBP algorithm is used to extract the texture features of the mask mask annotated image; in the shape feature extraction module, the mask mask is The annotated image extracts shape features based on regional edges, including shape parameters, bending energy, rectangularity, and circularity; the first correlation analysis module and the second correlation analysis module use the CCA algorithm to perform feature fusion;

The parameter optimization unit is used to perform supervised learning on the classification model using mask annotation images to optimize the parameters of the classification model;

The application unit is used to automatically classify children's skin disease images using a parameter-optimized classification model.