CN106845551B - A kind of histopathological image recognition method - Google Patents

Info

Publication number
CN106845551B
Authority
CN
China
Prior art keywords
disease
dictionary
free
samples
diseased
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710059300.0A
Other languages
Chinese (zh)
Other versions
CN106845551A (en)
Inventor
汤红忠
李骁
王翔
毛丽珍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiangtan University
Original Assignee
Xiangtan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiangtan University
Priority to CN201710059300.0A
Publication of CN106845551A
Application granted
Publication of CN106845551B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03: Recognition of patterns in medical or anatomical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a histopathological image recognition method comprising the following steps: selecting disease-free and diseased training samples and disease-free and diseased test samples; building a disease-free dictionary learning model and a diseased dictionary learning model from the disease-free and diseased training samples, alternately and iteratively optimizing the two objective functions until the maximum number of iterations is reached, and thereby learning a disease-free dictionary and a diseased dictionary; sparsely representing the test sample with the disease-free dictionary and the diseased dictionary, and computing the sparse reconstruction error vectors of the test sample under each dictionary; and obtaining a classification statistic from the sparse reconstruction error vectors and determining the class of the test sample by comparing that statistic with a threshold. The invention provides a new model and method for applying dictionary learning to histopathological image classification; the learned class-labeled dictionaries give good sparse reconstruction and intra-class robustness for samples of the same class, and good inter-class discrimination for samples of different classes.

Description

A histopathological image recognition method

Technical field

The invention relates to a histopathological image recognition method.

Background

With the development of computer-aided diagnosis, research on "digital pathology" has attracted growing attention from researchers. How to accurately and automatically extract the discriminative features hidden in an image, so as to provide the information needed for subsequent histopathological image analysis or classification and thereby give disease grades and classifications quickly and accurately, has become one of the most challenging research topics in digital pathology.

Traditional feature extraction methods fall into two main categories. The first comprises features tied to a specific domain or task, such as the size and morphology of biological cells, the grayscale or color information of the image, and texture. The second is based mainly on spatial structure and multi-scale features, such as morphological features, graph-based methods, scale-invariant features, and wavelet features. These traditional methods mostly yield pixel-level or hand-crafted features, which generally suit only specific data; their range of application is limited, and the features are highly redundant and weakly discriminative.

In recent years, sparse representation has received great attention for its strong performance on many computer vision problems. Its basic idea is to represent an original signal as a sparse signal over an overcomplete dictionary. Sparse representation has been very successful in image denoising and restoration, face recognition, image classification, and other fields. As the technology has developed, how to learn a dictionary suited to a specific problem (such as image classification), that is, a theoretical framework for dictionary learning, has become the focus of researchers' attention.

The key to dictionary learning is whether the constructed dictionary has good reconstructive and discriminative properties. For this type of problem, Zhang et al. proposed the Discriminative K-SVD (DK-SVD) dictionary learning method. Jiang et al. proposed a dictionary learning method based on Label Consistent K-SVD (LC-KSVD). Yang et al. used the Fisher criterion to propose Fisher Discrimination Dictionary Learning (FDDL), which indirectly improves the discriminative performance of the dictionary by constraining the sparse representation coefficients. Vu et al. proposed Discriminative Feature-oriented Dictionary Learning (DFDL) and applied it to histopathological image classification. These methods achieve very good results in image classification.

However, different types of histopathological images present different features, and within the same type the cell morphology and geometric structure vary considerably and the pathological features are diverse. As a result, the feature differences between pathological image samples of the same class can exceed those between samples of different classes, so the diseased and disease-free dictionaries learned by the above methods are highly similar, their discrimination between disease-free and diseased samples remains low, and their classification performance still needs improvement.

Summary of the invention

To solve the above technical problem, the invention provides a histopathological image recognition method with high accuracy and high robustness.

The technical solution of the invention is a histopathological image recognition method comprising the following steps:

Step 1: from the disease-free and diseased images of a given tissue, select a number of image patches as disease-free and diseased training samples and as disease-free and diseased test samples.

Step 2, learn the disease-free dictionary: combine the disease-free and diseased training samples to build a disease-free dictionary learning model, minimize its objective function by two-step alternating iteration, and obtain the disease-free dictionary.

Step 3, learn the diseased dictionary: combine the diseased and disease-free training samples to build a diseased dictionary learning model, minimize its objective function by two-step alternating iteration, and obtain the diseased dictionary.

Step 4: check whether the maximum number of iterations has been reached; if so, go to Step 5; otherwise, return to Step 2.

Step 5, obtain the reconstruction error vectors of the test sample: sparsely represent the test sample with the learned disease-free and diseased dictionaries, then compute the sparse reconstruction error vector of the test sample under each dictionary.

Step 6, obtain the classification result of the test sample: derive a classification statistic from the sparse reconstruction error vectors, then determine the class of the test sample by comparing the statistic with a threshold.
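
The control flow of Steps 2 to 4 can be sketched as a simple loop. This is only an illustrative skeleton: the function name `learn_dictionaries` and the two callables `update_healthy` and `update_diseased` are hypothetical stand-ins for the Step 2 and Step 3 optimizations, which are not implemented here.

```python
from typing import Callable, Tuple

import numpy as np


def learn_dictionaries(
    D: np.ndarray,
    D_bar: np.ndarray,
    update_healthy: Callable[[np.ndarray, np.ndarray], np.ndarray],
    update_diseased: Callable[[np.ndarray, np.ndarray], np.ndarray],
    max_iter: int,
) -> Tuple[np.ndarray, np.ndarray]:
    """Alternate Step 2 and Step 3 until max_iter is reached (Step 4).

    update_healthy(D, D_bar) and update_diseased(D_bar, D) are hypothetical
    placeholders that each return an updated dictionary.
    """
    for _ in range(max_iter):
        D = update_healthy(D, D_bar)        # Step 2: disease-free dictionary
        D_bar = update_diseased(D_bar, D)   # Step 3: diseased dictionary
    return D, D_bar                         # handed on to Step 5
```

Passing the current counterpart dictionary into each update reflects that the two models are optimized alternately; whether the patent's updates use the other dictionary in exactly this way depends on the unrecovered formulas.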

In the above method, Step 1 is specifically as follows: select an equal number of image patches from the disease-free and diseased images of a given tissue; split each patch into its R, G, and B channels; convert the pixel values of each channel into a column vector and concatenate the three vectors into one feature vector; finally, arrange the feature vectors side by side as columns to form the disease-free and diseased training samples $Y$ and $\bar{Y}$. The test samples are obtained in the same way.

In the above method, Step 2 is specifically as follows:

2-1: Randomly select $n$ column vectors from the disease-free training samples and from the diseased training samples as the initial disease-free dictionary $D$ and the initial diseased dictionary $\bar{D}$, respectively.

2-2: Build the disease-free dictionary learning model:

[Equation image: objective function of the disease-free dictionary learning model]

Here, argmin denotes the values of the variables at which the objective function attains its minimum; $Y$ and $\bar{Y}$ are the disease-free and diseased training samples; $X$ and $\bar{X}$ are their sparse representation coefficients; $N$ and $\bar{N}$ are the numbers of disease-free and diseased image feature vectors; $L_1$ is the coding sparsity of the disease-free and diseased samples under the disease-free dictionary; and $\rho > 0$ is a regularization parameter. In the objective, one term is the sparse reconstruction error of the disease-free training samples under the disease-free dictionary, another is the reconstruction error of the diseased training samples under the disease-free dictionary, the subscript $F$ denotes the Frobenius norm, and $\Psi(D)$ is the Fisher-criterion constraint term on the disease-free dictionary, given by:

[Equation image: expression for $\Psi(D)$]

where $m$ is the mean of all atoms in the disease-free dictionary $D$, $M$ is the matrix formed from that mean $m$, $\bar{m}$ is the mean of all atoms in the diseased dictionary $\bar{D}$, and $\alpha, \beta > 0$ are the penalty coefficients for the intra-class and inter-class distances, respectively.
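
For orientation, one form of the objective that is consistent with the term-by-term description above is sketched below. The original formulas are images that were not recovered, so the normalization by $N$ and $\bar{N}$, the sign of the second term, the form of $\Psi(D)$, and the overbar notation are all assumptions, not the patent's literal equations.

```latex
\begin{aligned}
(D^{*}, X^{*}, \bar{X}^{*}) &= \arg\min_{D,\,X,\,\bar{X}}\;
  \frac{1}{N}\,\lVert Y - DX \rVert_F^{2}
  \;-\; \frac{\rho}{\bar{N}}\,\lVert \bar{Y} - D\bar{X} \rVert_F^{2}
  \;+\; \Psi(D) \\
&\quad \text{s.t. } \lVert x_i \rVert_0 \le L_1,\quad \lVert \bar{x}_j \rVert_0 \le L_1, \\
\Psi(D) &= \alpha\,\lVert D - M \rVert_F^{2} \;-\; \beta\,\lVert m - \bar{m} \rVert_2^{2}.
\end{aligned}
```

Under this assumed form, a small first term means $D$ reconstructs disease-free samples well, the subtracted second term rewards poor reconstruction of diseased samples, and $\Psi(D)$ pulls the atoms of $D$ toward their own mean while pushing that mean away from the mean of $\bar{D}$.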

2-3: Fix the disease-free dictionary $D$ and update the sparse coding coefficients. The objective function becomes:

[Equation image: sparse coding objective with $D$ fixed]

Let the training samples be [equation image: combined training sample matrix] and the coding coefficient matrix be [equation image: combined coefficient matrix]; $L_1$ is the coding sparsity of the disease-free and diseased samples under the disease-free dictionary, and the optimal sparse solution is [equation image: optimal sparse solution]. The problem is then solved in two steps, the sparse representation of the disease-free training samples under $D$ and the sparse representation of the diseased training samples under $D$, which can be written in the unified simplified form:

[Equation image: unified sparse coding problem]

The OMP algorithm in the SPAMS toolbox is used to solve for the sparse codes of the training samples under the disease-free dictionary $D$.
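
To make the sparse coding step concrete, here is a minimal NumPy sketch of orthogonal matching pursuit (OMP). The patent uses the OMP routine from the SPAMS toolbox; this stand-alone version is only an illustration of the same greedy idea, and the function names `omp` and `sparse_code` are hypothetical. It assumes the dictionary atoms are roughly comparable in norm (OMP is usually run on unit-norm atoms).

```python
import numpy as np


def omp(D: np.ndarray, y: np.ndarray, sparsity: int) -> np.ndarray:
    """Greedy OMP: approximate y using at most `sparsity` columns of D."""
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []                       # indices of the atoms chosen so far
    coef = np.zeros(0)
    x = np.zeros(D.shape[1])
    for _ in range(sparsity):
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0    # never pick the same atom twice
        k = int(np.argmax(correlations))
        if correlations[k] < 1e-12:        # residual already explained
            break
        support.append(k)
        # Least-squares refit on the selected atoms, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        x[support] = coef
    return x


def sparse_code(D: np.ndarray, Y: np.ndarray, sparsity: int) -> np.ndarray:
    """Code every column of Y against D; returns an (atoms x samples) matrix."""
    return np.column_stack([omp(D, Y[:, j], sparsity) for j in range(Y.shape[1])])
```

In the setting of step 2-3, `sparse_code` would be called once with the disease-free samples and once with the diseased samples, both against the same fixed dictionary and with the sparsity level $L_1$.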

2-4: Fix the sparse coding coefficients and update the disease-free dictionary $D$. The objective function becomes:

[Equation image: dictionary update objective with the coefficients fixed]

which simplifies to:

[Equation image: simplified objective in trace form]

where tr denotes the trace of a matrix, and

[Equation image: accompanying definitions]

The optimal disease-free dictionary $D$ is obtained by coordinate gradient descent.

In the above method, Step 3 is specifically as follows:

3-1: Randomly select $n$ column vectors from the disease-free training samples and from the diseased training samples as the initial disease-free dictionary $D$ and the initial diseased dictionary $\bar{D}$, respectively.

3-2: Build the diseased dictionary learning model:

[Equation image: objective function of the diseased dictionary learning model]

Here, $Y$ and $\bar{Y}$ are the disease-free and diseased training samples; $X$ and $\bar{X}$ are their sparse representation coefficients; $N$ and $\bar{N}$ are the numbers of disease-free and diseased image feature vectors; $L_2$ is the coding sparsity of the disease-free and diseased samples under the diseased dictionary; and $\rho > 0$ is a regularization parameter. In the objective, one term is the sparse reconstruction error of the diseased samples under the diseased dictionary, another is the reconstruction error of the disease-free samples under the diseased dictionary, and $\Psi(\bar{D})$ is the Fisher-criterion constraint term on the diseased dictionary, given by:

[Equation image: expression for $\Psi(\bar{D})$]

where $m$ is the mean of all atoms in the disease-free dictionary $D$, $\bar{m}$ is the mean of all atoms in the diseased dictionary $\bar{D}$, and $M$ is the matrix formed from the mean $\bar{m}$ of the atoms of $\bar{D}$.

3-3: Fix the diseased dictionary $\bar{D}$ and update the sparse coding coefficients. The objective function becomes:

[Equation image: sparse coding objective with $\bar{D}$ fixed]

Let the training samples be [equation image: combined training sample matrix] and the coding coefficient matrix be [equation image: combined coefficient matrix]; $L_2$ is the coding sparsity of the disease-free and diseased samples under the diseased dictionary, and the optimal sparse solution is [equation image: optimal sparse solution]. The problem is then solved in two steps, the sparse representation of the disease-free training samples under $\bar{D}$ and the sparse representation of the diseased training samples under $\bar{D}$, which can be written in the unified simplified form:

[Equation image: unified sparse coding problem]

The OMP algorithm in the SPAMS toolbox is used to solve for the sparse codes of the training samples under the diseased dictionary $\bar{D}$.

3-4: Fix the sparse coding coefficients and update the diseased dictionary $\bar{D}$. The objective function becomes:

[Equation image: dictionary update objective with the coefficients fixed]

which simplifies to:

[Equation image: simplified objective]

where

[Equation image: accompanying definitions]

The optimal diseased dictionary $\bar{D}$ is obtained by coordinate gradient descent.

In the above method, Step 5 is specifically as follows:

5-1: Divide the test image into patches and treat each patch as a column vector $h$; randomly take $u$ patches to form the matrix $H$ as the test sample, and use [equation image: sparse coding problem] to obtain the sparse codes of the test sample $H$ under the class-labeled dictionaries.

5-2: Compute the sparse reconstruction error vectors of the test sample under the disease-free dictionary $D$ and the diseased dictionary $\bar{D}$, namely $\delta_1 = \operatorname{diag}\big((H - DX)(H - DX)^{T}\big)$ and, analogously, $\delta_2 = \operatorname{diag}\big((H - \bar{D}\bar{X})(H - \bar{D}\bar{X})^{T}\big)$, where $\operatorname{diag}(\cdot)$ denotes the elements on the main diagonal of a matrix.
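
The error vector in step 5-2 can be computed without forming the full matrix product. The sketch below implements the expression exactly as written, $\operatorname{diag}\big((H - DX)(H - DX)^{T}\big)$; the function name `reconstruction_error_vector` is a hypothetical choice for illustration.

```python
import numpy as np


def reconstruction_error_vector(H: np.ndarray, D: np.ndarray, X: np.ndarray) -> np.ndarray:
    """Return diag((H - D X)(H - D X)^T) without building the square matrix.

    Entry i of diag(R R^T) is the sum of squares of row i of R = H - D X.
    """
    R = H - D @ X
    return np.sum(R * R, axis=1)
```

Note that, as written, the vector has one entry per row of $H$. If one error per test patch (per column of $H$) is intended, the analogous quantity is $\operatorname{diag}(R^{T}R)$, obtained by summing over `axis=0` instead; the recovered text does not settle which is meant.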

In the above method, Step 6 is specifically as follows:

6-1: Define the vector $C$ as [equation image: definition of $C$], where $N_t$ is the number of test samples.

6-2: Obtain the classification statistic $S$ from the vector $C$:

[Equation image: definition of $S$]

When the classification statistic $S$ is greater than or equal to the threshold $Th$, the test sample is disease-free; otherwise, when $S$ is less than $Th$, the test sample is diseased.
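
The decision rule can be illustrated as follows. The definitions of $C$ and $S$ are images that were not recovered, so this sketch assumes that $C$ marks each entry where the disease-free dictionary reconstructs at least as well as the diseased one, and that $S$ is the fraction of such entries. Both choices, and the names `classify`, `delta1`, and `delta2`, are assumptions made for illustration; only the final comparison against the threshold comes from the text.

```python
import numpy as np


def classify(delta1, delta2, threshold: float):
    """Compare a statistic S with a threshold, as in Step 6.

    ASSUMPTION: C[i] = 1 when delta1[i] <= delta2[i], else 0, and
    S = mean(C). The patent's actual definitions of C and S are not
    available in the recovered text.
    """
    C = (np.asarray(delta1) <= np.asarray(delta2)).astype(float)
    S = float(C.mean())
    # From the text: S >= Th means disease-free, S < Th means diseased.
    label = "disease-free" if S >= threshold else "diseased"
    return label, S
```

For example, with errors `[1, 1, 5, 1]` under the disease-free dictionary and `[2, 2, 1, 2]` under the diseased one, three of four entries favor the disease-free dictionary, so the assumed statistic is 0.75.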

The beneficial effects of the invention are as follows. The method first randomly selects a number of image patches from a histopathological image dataset as training and test samples. It then feeds the training samples of the different classes into the model, solves the model by alternating iteration while continually optimizing the objective function, and learns class-labeled dictionaries. Finally, it sparsely represents the test-set matrix using the learned class-labeled dictionaries and determines the class of the test-set matrix by comparing the reconstruction error vectors against a threshold. The invention proposes a new model and method for applying dictionary learning to histopathological image classification. The learned class-labeled dictionaries give good sparse reconstruction and intra-class robustness for samples of the same class and good inter-class discrimination for samples of different classes, and can effectively improve histopathological image classification performance.

Description of the drawings

FIG. 1 is a flow chart of the invention.

FIG. 2 shows histopathological images of lung, spleen, and kidney from the ADL database, where (a) shows, from left to right, disease-free images of lung, spleen, and kidney, and (b) shows, from left to right, diseased images of lung, spleen, and kidney.

FIG. 3 shows histopathological images of adenosis and phyllodes carcinoma from the BreaKHis database, where (a) is a histopathological image of adenosis and (b) is a histopathological image of phyllodes carcinoma.

Detailed description

The invention is further described below with reference to the drawings and embodiments.

As shown in FIG. 1, the invention comprises the following steps:

Step 1: from the disease-free and diseased images of a given tissue, select a number of image patches as disease-free and diseased training samples and as disease-free and diseased test samples. Specifically:

Randomly select 40 images from each of the disease-free and diseased image sets of a given tissue, and randomly extract 250 patches of size 20×20 from each image, giving 10,000 color patches for each class. Split each color patch into its R, G, and B channels, convert the pixel values of each channel into a column vector, and concatenate the three vectors into one feature vector of 20 × 20 × 3 = 1200 entries. Arranging the feature vectors side by side as columns gives the training samples $Y, \bar{Y} \in \mathbb{R}^{1200 \times 10000}$, where $\mathbb{R}^{1200 \times 10000}$ denotes the size of the matrix. From the remaining images of the same tissue, randomly select 110 disease-free and 110 diseased images as the test set.
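
The patch-to-feature conversion described above can be sketched in NumPy as follows. The function name `extract_patch_features` and its parameters are hypothetical, and the image is assumed to be an array of shape (height, width, 3) in R, G, B channel order.

```python
import numpy as np


def extract_patch_features(image: np.ndarray, n_patches: int, patch: int = 20, rng=None) -> np.ndarray:
    """Sample random patch x patch RGB blocks and return them as columns.

    Each column is the R channel flattened, followed by G, then B, so a
    20 x 20 patch yields a 1200-entry feature vector.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, c = image.shape
    columns = []
    for _ in range(n_patches):
        r = rng.integers(0, h - patch + 1)   # top-left corner, upper bound exclusive
        s = rng.integers(0, w - patch + 1)
        block = image[r:r + patch, s:s + patch, :]
        columns.append(np.concatenate([block[:, :, ch].reshape(-1) for ch in range(c)]))
    return np.stack(columns, axis=1).astype(float)
```

Calling this with `n_patches=250` on each of 40 images and stacking the results with `np.hstack` gives a 1200 × 10000 matrix, matching the sizes stated above.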

Step 2, learn the disease-free dictionary: combine the disease-free and diseased training samples to build a disease-free dictionary learning model, minimize its objective function by two-step alternating iteration, and obtain the disease-free dictionary. Specifically:

2-1: Randomly select $n$ column vectors from the disease-free training samples and from the diseased training samples as the initial disease-free dictionary $D$ and the initial diseased dictionary $\bar{D}$, respectively.
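
Initialization by random column selection might look like the sketch below. The function name `init_dictionary` is hypothetical, and sampling without replacement is an assumption; the text only says that $n$ columns are chosen at random.

```python
import numpy as np


def init_dictionary(Y: np.ndarray, n_atoms: int, rng=None) -> np.ndarray:
    """Pick n_atoms distinct columns of Y at random as the initial atoms."""
    rng = np.random.default_rng() if rng is None else rng
    # ASSUMPTION: columns are drawn without replacement.
    idx = rng.choice(Y.shape[1], size=n_atoms, replace=False)
    return Y[:, idx].astype(float).copy()
```

The same function would be called once on the disease-free samples and once on the diseased samples to obtain the two initial dictionaries.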

2-2: Build the disease-free dictionary learning model:

[Equation image: objective function of the disease-free dictionary learning model]

Here, argmin denotes the values of the variables at which the objective function attains its minimum; $Y$ and $\bar{Y}$ are the disease-free and diseased training samples; $X$ and $\bar{X}$ are their sparse representation coefficients; $N$ and $\bar{N}$ are the numbers of disease-free and diseased image feature vectors; $L_1$ is the coding sparsity of the disease-free and diseased samples under the disease-free dictionary; and $\rho > 0$ is a regularization parameter. In the objective, one term is the sparse reconstruction error of the disease-free training samples under the disease-free dictionary, another is the reconstruction error of the diseased training samples under the disease-free dictionary, the subscript $F$ denotes the Frobenius norm, and $\Psi(D)$ is the Fisher-criterion constraint term on the disease-free dictionary, given by:

[Equation image: expression for $\Psi(D)$]

where $m$ is the mean of all atoms in the disease-free dictionary $D$, $M$ is the matrix formed from that mean $m$, $\bar{m}$ is the mean of all atoms in the diseased dictionary $\bar{D}$, and $\alpha, \beta > 0$ are the penalty coefficients for the intra-class and inter-class distances, respectively. The purpose of the model is to minimize the first and third terms while maximizing the second. The learned class-labeled dictionary then reconstructs samples of its own class well and reconstructs samples of the other class poorly or not at all, and the learned dictionaries are strongly distinguishable from each other, which yields discriminative features and supports better classification.

2-3: Fix the disease-free dictionary $D$ and update the sparse coding coefficients. The objective function becomes:

[Equation image: sparse coding objective with $D$ fixed]

Let the training samples be [equation image: combined training sample matrix] and the coding coefficient matrix be [equation image: combined coefficient matrix]; $L_1$ is the coding sparsity of the disease-free and diseased samples under the disease-free dictionary, and the optimal sparse solution is [equation image: optimal sparse solution]. The problem is then solved in two steps, the sparse representation of the disease-free training samples under $D$ and the sparse representation of the diseased training samples under $D$, which can be written in the unified simplified form:

[Equation image: unified sparse coding problem]

The OMP algorithm in the SPAMS toolbox is used to solve for the sparse codes of the training samples under the disease-free dictionary $D$.

2-4: Fix the sparse coding coefficients and update the disease-free dictionary $D$. The objective function becomes:

[Equation image: dictionary update objective with the coefficients fixed]

which simplifies to:

[Equation image: simplified objective in trace form]

where tr denotes the trace of a matrix, and

[Equation image: accompanying definitions]

The above function is convex, and the optimal disease-free dictionary $D$ is obtained by coordinate gradient descent.

Step 3: Optimize and learn the diseased dictionary. Combine the diseased and disease-free training samples to build a diseased dictionary learning model, and learn the diseased dictionary by minimizing the objective function with a two-step alternating iterative optimization. The specific steps are:

3-1: Randomly select n column vectors from the disease-free and the diseased training samples, respectively, as the initial disease-free dictionary D and diseased dictionary
Figure BDA0001218063270000101
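The initialization in 3-1 can be sketched as follows. This is a minimal illustration: the helper name `init_dictionary` is hypothetical, and both sampling without replacement and the unit-norm scaling of the atoms are assumptions that the text does not state.

```python
import numpy as np

def init_dictionary(samples, n_atoms, rng):
    """Use n_atoms randomly chosen columns of `samples` as initial atoms."""
    # Assumption: draw distinct columns (sampling without replacement).
    idx = rng.choice(samples.shape[1], size=n_atoms, replace=False)
    D = samples[:, idx].astype(float)
    # Assumption: scale each atom to unit length; guard against all-zero columns.
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0
    return D / norms
```

The same helper would be called twice, once on the disease-free sample matrix and once on the diseased one, to obtain the two initial dictionaries.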

3-2: Establish the diseased dictionary learning model:

Figure BDA0001218063270000102

where Y and
Figure BDA0001218063270000103
denote the disease-free and diseased training samples, respectively; X and
Figure BDA0001218063270000104
denote the sparse representation coefficients of the disease-free and diseased training samples, respectively; N and
Figure BDA0001218063270000105
denote the numbers of disease-free and diseased image feature vectors, respectively; L2 is the coding sparsity of the disease-free and diseased samples under the diseased dictionary; and ρ is the regularization parameter, with ρ > 0. In the model,
Figure BDA0001218063270000106
is the sparse reconstruction error of the diseased samples under the diseased dictionary,
Figure BDA0001218063270000107
is the reconstruction error of the disease-free samples under the diseased dictionary, and
Figure BDA0001218063270000108
is the Fisher criterion constraint term of the diseased dictionary, given by:
Figure BDA0001218063270000109
where m is the mean of all atoms in the disease-free dictionary D,
Figure BDA00012180632700001010
is the mean of all atoms in the diseased dictionary
Figure BDA00012180632700001011
and M is the matrix formed, for the diseased dictionary
Figure BDA00012180632700001012
from the mean of all its atoms,
Figure BDA00012180632700001013
The model minimizes the first and third terms while maximizing the second, so that the learned class-labeled dictionary reconstructs samples of its own class well and reconstructs samples of the other class poorly or not at all. The learned dictionaries are therefore strongly discriminative, yielding discriminative features that support better classification.

3-3: Fix the diseased dictionary
Figure BDA00012180632700001014
and update the sparse coding coefficients. The objective function at this step is:

Figure BDA00012180632700001015

Let the training samples be
Figure BDA00012180632700001016
and the coding coefficient matrix be
Figure BDA00012180632700001017
L2 is the coding sparsity of the disease-free and diseased samples under the diseased dictionary, and the optimal sparse solution is
Figure BDA00012180632700001018
Solving the objective function then splits into two steps, the sparse representation of the disease-free training samples under the diseased dictionary
Figure BDA00012180632700001019
and the sparse representation of the diseased training samples under the diseased dictionary
Figure BDA00012180632700001020
which take the following unified, simplified form:

Figure BDA0001218063270000111

The OMP algorithm in the SPAMS toolbox is used to solve, under the diseased dictionary
Figure BDA0001218063270000112
for the sparse solutions of the training samples:
Figure BDA0001218063270000113

3-4: Fix the sparse coding coefficients and update the diseased dictionary
Figure BDA0001218063270000114
The objective function at this step is:

Figure BDA0001218063270000115

Simplifying gives:

Figure BDA0001218063270000116

where

Figure BDA0001218063270000117

The optimal diseased dictionary
Figure BDA0001218063270000118
is obtained by coordinate gradient descent.

3-5: Return to Step 2. The optimization of the disease-free dictionary and the optimization of the diseased dictionary alternate until the maximum number of iterations is reached.

Step 4: Check whether the maximum number of iterations has been reached. If so, proceed to Step 5; otherwise, return to Step 2.
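The control flow of Steps 2 to 4 can be sketched as a plain loop. The two update callables are hypothetical placeholders for the disease-free and diseased dictionary learning steps, and their internals (sparse coding followed by a dictionary update) are not shown here.

```python
def alternate_training(update_disease_free, update_diseased, D, D_bar, max_iter):
    """Alternate the two dictionary learning steps for a fixed number of rounds.

    update_disease_free(D, D_bar) -> new disease-free dictionary (Step 2)
    update_diseased(D, D_bar)     -> new diseased dictionary     (Step 3)
    Both callables are placeholders supplied by the caller.
    """
    for _ in range(max_iter):
        D = update_disease_free(D, D_bar)   # Step 2 uses the current diseased dictionary
        D_bar = update_diseased(D, D_bar)   # Step 3 uses the freshly updated D
    # Step 4: the loop ends once the maximum number of iterations is reached.
    return D, D_bar
```

The stopping rule in this sketch is the iteration count alone, as in Step 4; no convergence test is assumed.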

Step 5: Obtain the reconstruction error vectors of the test sample. Using the learned disease-free and diseased dictionaries, compute the sparse representation of the test sample, then compute its sparse reconstruction error vector under each of the two dictionaries. The specific steps are:

5-1: Divide the test sample image into blocks and treat each block as a column vector h. Randomly select 250 blocks to form the matrix H, which serves as the test sample. Using
Figure BDA0001218063270000119
obtain, for the test sample H under the class-labeled dictionary
Figure BDA00012180632700001110
the sparse coding
Figure BDA00012180632700001111
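Building the test matrix H in 5-1 can be sketched as below. The per-channel vectorization follows the feature construction described for the training samples (R, G, and B pixel values flattened and concatenated); the function name `image_to_patch_matrix`, the non-overlapping tiling, and sampling without replacement are assumptions made for illustration.

```python
import numpy as np

def image_to_patch_matrix(image, patch, n_patches, rng):
    """Turn an (height, width, 3) image into a (3*patch*patch, n_patches) matrix.

    Each column is one block: its R, G, B channels flattened and concatenated.
    """
    height, width, _ = image.shape
    columns = []
    # Assumption: tile the image with non-overlapping patch x patch blocks.
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            block = image[top:top + patch, left:left + patch, :]
            columns.append(np.concatenate([block[:, :, c].ravel() for c in range(3)]))
    all_blocks = np.column_stack(columns).astype(float)
    # Randomly keep n_patches blocks (the text uses 250).
    pick = rng.choice(all_blocks.shape[1], size=n_patches, replace=False)
    return all_blocks[:, pick]
```

With this layout, `n_patches=250` reproduces the block count given in 5-1, provided the image yields at least 250 blocks at the chosen patch size.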

5-2: Compute the sparse reconstruction error vectors of the test sample under the disease-free dictionary D and the diseased dictionary
Figure BDA00012180632700001112
that is, δ1 = diag((H-DX)(H-DX)^T) and
Figure BDA00012180632700001113
where diag(·) denotes the elements on the main diagonal of a matrix.
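The expression for δ1 can be evaluated without forming the full matrix product, because the main diagonal of R R^T (with R = H - DX) is the vector of row-wise sums of squares of R. A minimal NumPy sketch, with a hypothetical function name:

```python
import numpy as np

def reconstruction_error_vector(H, D, X):
    """Return diag((H - D X)(H - D X)^T) as a 1-D array."""
    R = H - D @ X                       # residual of the sparse reconstruction
    # Entry i is sum_j R[i, j]**2, which equals (R @ R.T)[i, i].
    return np.einsum("ij,ij->i", R, R)
```

The same function applies to the diseased dictionary by passing that dictionary and its coefficient matrix. Note that, taken literally, the formula yields one entry per row of H; summing the squared residual over rows instead (one entry per column of H) would correspond to the diagonal of (H - DX)^T (H - DX), which is a different quantity and not the formula stated above.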

Step 6: Obtain the classification result of the test sample. Compute a classification statistic from the sparse reconstruction error vectors, then determine the class of the test sample by comparing the statistic with a threshold. The specific steps are:

6-1: Define the vector
Figure BDA0001218063270000121
where Nt is the number of test samples;

6-2: Obtain the classification statistic S from the vector C:

Figure BDA0001218063270000122

When the classification statistic S is greater than or equal to the threshold Th, the test sample is disease-free; otherwise, when S is less than Th, the test sample is diseased.
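The decision rule can be illustrated as below. Because the definitions of the vector C and the statistic S appear only as figures, this sketch rests on explicit assumptions: C marks each entry where the disease-free error does not exceed the diseased error, S is the fraction of marked entries, and both error vectors have Nt entries. Only the final comparison with Th is taken from the text; the function name is hypothetical.

```python
def classify(delta_free, delta_diseased, threshold):
    """Return (S, label) from two equally long error vectors."""
    # Assumed form of C: 1 where the disease-free dictionary reconstructs
    # at least as well as the diseased dictionary, else 0.
    votes = [1 if e1 <= e2 else 0 for e1, e2 in zip(delta_free, delta_diseased)]
    # Assumed form of S: the share of entries voting "disease-free".
    statistic = sum(votes) / len(votes)
    # From the text: S >= Th means disease-free, S < Th means diseased.
    label = "disease-free" if statistic >= threshold else "diseased"
    return statistic, label
```

For example, under these assumptions the error vectors `[0.1, 0.5, 0.2, 0.9]` and `[0.3, 0.4, 0.6, 1.0]` give three of four votes for the disease-free dictionary, so the sample is labeled disease-free at a threshold of 0.5 and diseased at a threshold of 0.8.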

Table 1 compares the classification results of the present invention and other methods on lung images from the ADL database.

Table 1

Figure BDA0001218063270000123

Table 2 compares the classification results of the present invention and other methods on spleen images from the ADL database.

Table 2

Figure BDA0001218063270000131

Table 3 compares the classification results of the present invention and other methods on kidney images from the ADL database.

Table 3

Figure BDA0001218063270000132

Tables 1, 2, and 3 show that the proposed model diagnoses disease in these three organs clearly better than the other methods, with higher correct-classification rates on both disease-free and diseased samples. The improvement is most pronounced for the lung results in Table 1, where the classification accuracy of the proposed method is 2 to 3% higher than that of DFDL. As Figure 2 shows, disease-free lung images contain large alveoli, whereas in diseased lung images the alveoli are smaller, blue-purple inflammatory cells are spread throughout, and the texture is more complex; the difference between disease-free and diseased lung images is clearly greater than for spleen and kidney images. Disease-free and diseased spleen images are highly similar in texture and structure but differ considerably in color, so the two classes are the second most distinguishable and the classification performance ranks second. Disease-free and diseased kidney images are highly similar in texture, structure, and color, so they are the least distinguishable and the classification performance is the weakest. The experimental results in the tables are fully consistent with Figure 1, which again demonstrates the effectiveness of the proposed model.

To verify the generality of the discriminative feature learning framework for histopathological images constructed by the present invention, the proposed model is also applied to diagnosing disease types in the BreaKHis dataset.

Table 4 compares the classification results of the present invention and other methods on the BreaKHis database.

Table 4

Figure BDA0001218063270000141

Table 4 gives the classification results of the different methods on the BreaKHis database. The experimental results show that the proposed model also achieves good disease classification performance on the two types of benign breast cancer images in Figure 3. This result indicates that the present invention effectively improves the reconstruction ability and robustness of the sparse representation of same-class samples under the class-labeled dictionary, and also addresses the poor discrimination of samples from other classes.

Claims (4)

1. A histopathological image recognition method, comprising the following steps:
step one, selecting a plurality of image blocks from disease-free images and diseased images of a given tissue as disease-free training samples and diseased training samples, and as disease-free test samples and diseased test samples;
step two, optimizing and learning the disease-free dictionary: establishing a disease-free dictionary learning model by combining the disease-free training samples and the diseased training samples, and learning the disease-free dictionary by minimizing the objective function with a two-step alternating iterative optimization;
the specific steps of step two being:
2-1: randomly selecting n column vectors from the disease-free and the diseased training samples, respectively, as the initial disease-free dictionary D and diseased dictionary
Figure FDA0002551522900000011
2-2: establishing the disease-free dictionary learning model, as follows:
Figure FDA0002551522900000012
wherein argmin denotes the value of the variables at which the objective function is minimized; Y and
Figure FDA0002551522900000013
denote the disease-free and diseased training samples, respectively; X and
Figure FDA0002551522900000019
denote the sparse representation coefficients of the disease-free and diseased training samples, respectively; N and its diseased-sample counterpart denote the numbers of disease-free and diseased image feature vectors, respectively; L1 is the coding sparsity of the disease-free and diseased samples under the disease-free dictionary; ρ is the regularization parameter, with ρ > 0; in the model,
Figure FDA0002551522900000014
is the sparse reconstruction error of the disease-free training samples under the disease-free dictionary,
Figure FDA0002551522900000015
is the reconstruction error of the diseased training samples under the disease-free dictionary, wherein the subscript F denotes the Frobenius norm, and ψ(D) is the Fisher criterion constraint term of the disease-free dictionary, given by:
Figure FDA0002551522900000016
wherein m is the mean of all atoms in the disease-free dictionary D, M is the matrix formed from the mean m of the atoms in the disease-free dictionary D,
Figure FDA0002551522900000017
is the mean of all atoms in the diseased dictionary
Figure FDA0002551522900000018
and α and β denote the penalty coefficients of the intra-class spacing and the inter-class spacing, respectively, with α, β > 0;
2-3: fixing the disease-free dictionary D and updating the sparse coding coefficients, the objective function at this step being:
Figure FDA0002551522900000021
letting the training samples be
Figure FDA0002551522900000022
and the coding coefficient matrix be
Figure FDA0002551522900000023
L1 being the coding sparsity of the disease-free and diseased samples under the disease-free dictionary, and the optimal sparse solution being
Figure FDA0002551522900000024
the solution of the objective function then splits into two steps, the sparse representation of the disease-free training samples under the disease-free dictionary D and the sparse representation of the diseased training samples under the disease-free dictionary D, which take the following unified, simplified form:
Figure FDA0002551522900000025
solving for the sparse solutions of the training samples under the disease-free dictionary D with the OMP algorithm in the SPAMS toolbox:
Figure FDA0002551522900000026
2-4: fixing the sparse coding coefficients and updating the disease-free dictionary D, the objective function at this step being:
Figure FDA0002551522900000027
which simplifies to:
Figure FDA0002551522900000028
where tr denotes the trace of a matrix, and
Figure FDA0002551522900000029
obtaining the optimal disease-free dictionary D by coordinate gradient descent;
step three, optimizing and learning the diseased dictionary: establishing a diseased dictionary learning model by combining the diseased training samples and the disease-free training samples, and learning the diseased dictionary by minimizing the objective function with a two-step alternating iterative optimization;
the specific steps of step three being:
3-1: randomly selecting n column vectors from the disease-free and the diseased training samples, respectively, as the initial disease-free dictionary D and diseased dictionary
Figure FDA00025515229000000210
3-2: establishing the diseased dictionary learning model, as follows:
Figure FDA0002551522900000031
wherein Y and
Figure FDA0002551522900000032
denote the disease-free and diseased training samples, respectively; X and
Figure FDA0002551522900000033
denote the sparse representation coefficients of the disease-free and diseased training samples, respectively; N and its diseased-sample counterpart denote the numbers of disease-free and diseased image feature vectors, respectively; L2 is the coding sparsity of the disease-free and diseased samples under the diseased dictionary; ρ is the regularization parameter, with ρ > 0; in the model,
Figure FDA0002551522900000034
is the sparse reconstruction error of the diseased samples under the diseased dictionary,
Figure FDA0002551522900000035
is the reconstruction error of the disease-free samples under the diseased dictionary, and
Figure FDA0002551522900000036
is the Fisher criterion constraint term of the diseased dictionary, given by:
Figure FDA0002551522900000037
where m is the mean of all atoms in the disease-free dictionary D,
Figure FDA0002551522900000038
is the mean of all atoms in the diseased dictionary
Figure FDA0002551522900000039
and
Figure FDA00025515229000000310
is the matrix formed, for the diseased dictionary
Figure FDA00025515229000000311
from the mean of all its atoms,
Figure FDA00025515229000000312
3-3: fixing the diseased dictionary
Figure FDA00025515229000000313
and updating the sparse coding coefficients, the objective function at this step being:
Figure FDA00025515229000000314
letting the training samples be
Figure FDA00025515229000000315
and the coding coefficient matrix be
Figure FDA00025515229000000316
L2 being the coding sparsity of the disease-free and diseased samples under the diseased dictionary, and the optimal sparse solution being
Figure FDA00025515229000000317
the solution of the objective function then splits into two steps, the sparse representation of the disease-free training samples under the diseased dictionary
Figure FDA00025515229000000318
and the sparse representation of the diseased training samples under the diseased dictionary
Figure FDA00025515229000000319
which take the following unified, simplified form:
Figure FDA00025515229000000320
solving, with the OMP algorithm in the SPAMS toolbox and under the diseased dictionary
Figure FDA00025515229000000321
for the sparse solutions of the training samples:
Figure FDA00025515229000000322
3-4: fixing the sparse coding coefficients and updating the diseased dictionary
Figure FDA00025515229000000323
the objective function at this step being:
Figure FDA00025515229000000324
which simplifies to:
Figure FDA00025515229000000325
wherein
Figure FDA0002551522900000041
obtaining the optimal diseased dictionary
Figure FDA0002551522900000042
by coordinate gradient descent;
step four, determining whether the maximum number of iterations has been reached; if so, proceeding to step five; if not, returning to step two;
step five, obtaining the reconstruction error vectors of the test sample: sparsely representing the test sample with the learned disease-free dictionary and diseased dictionary, and then computing the sparse reconstruction error vectors of the test sample under the disease-free dictionary and the diseased dictionary, respectively;
step six, obtaining the classification result of the test sample: computing a classification statistic from the sparse reconstruction error vectors, and then determining the class of the test sample by comparing the classification statistic with a threshold.
2. The histopathological image recognition method according to claim 1, wherein step one comprises: selecting the same number of image blocks from the diseased and the disease-free images of a given tissue; splitting each image block into its R, G, and B channels; converting the pixel values of each channel into a column vector and concatenating the three vectors into one feature vector; and finally arranging the feature vectors side by side as the disease-free and diseased training samples Y and
Figure FDA0002551522900000043
the test samples being obtained in the same manner.
3. The histopathological image recognition method according to claim 2, wherein the specific steps of step five are:
5-1: dividing the test sample image into blocks, treating each block as a column vector h, randomly selecting u blocks to form the matrix H as the test sample, and using
Figure FDA0002551522900000044
to obtain, for the test sample H under the class-labeled dictionary
Figure FDA0002551522900000045
the sparse coding
Figure FDA0002551522900000046
5-2: computing the sparse reconstruction error vectors of the test sample under the disease-free dictionary D and the diseased dictionary
Figure FDA0002551522900000047
that is, δ1 = diag((H-DX)(H-DX)^T) and
Figure FDA0002551522900000048
where diag(·) denotes the elements on the main diagonal of a matrix.
4. The histopathological image recognition method according to claim 3, wherein the specific steps of step six are:
6-1: defining the vector
Figure FDA0002551522900000051
where Nt is the number of test samples;
6-2: obtaining the classification statistic S from the vector C:
Figure FDA0002551522900000052
when the classification statistic S is greater than or equal to the threshold Th, the test sample is a disease-free sample; otherwise, when the classification statistic S is less than the threshold Th, the test sample is a diseased sample.
CN201710059300.0A 2017-01-24 2017-01-24 A kind of histopathological image recognition method Active CN106845551B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710059300.0A CN106845551B (en) 2017-01-24 2017-01-24 A kind of histopathological image recognition method

Publications (2)

Publication Number Publication Date
CN106845551A CN106845551A (en) 2017-06-13
CN106845551B true CN106845551B (en) 2020-08-11

Family

ID=59122438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710059300.0A Active CN106845551B (en) 2017-01-24 2017-01-24 A kind of histopathological image recognition method

Country Status (1)

Country Link
CN (1) CN106845551B (en)

Also Published As

Publication number Publication date
CN106845551A (en) 2017-06-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant