The invention content is as follows:
in order to overcome the defects of the prior art, the invention aims to provide a semi-supervised pathological image segmentation and classification method which can effectively solve the problem of image segmentation under weak annotation, modify the network structure so that segmentation and classification are executed simultaneously, and alleviate the difficulty of image annotation.
The technical scheme of the invention is as follows: the pathological image segmentation and classification method based on semi-supervised learning comprises the following steps:
(1) extracting multi-scale information from image data in an adaptive manner by using a U-shaped network structure based on Swin-Transformer Blocks;
(2) in the up-sampling process, using densely connected bilinear interpolation, so that the problems of gradient vanishing and overfitting are alleviated while the loss of decoder fineness is reduced;
(3) performing weakly supervised segmentation using point annotations of the cell image: in the coarse segmentation stage, adopting negative-boundary sparse supervision derived from the annotation points and geometric constraints, combined with a point-to-region spatially expanded Voronoi partitioning strategy; in the fine segmentation stage, further adjusting the nuclear contour through a contour-sensitive constraint function that exploits edge prior knowledge in the original image;
(4) in the network, outputting the segmentation and classification results simultaneously by modifying the final linear mapping layer.
According to the invention, the Swin-Transformer-based U-shaped network structure effectively extracts multi-scale information from an image, and the dense connections adopted in the up-sampling process reduce the loss of decoder fineness while alleviating gradient vanishing and overfitting. The network structure is modified so that it can perform classification and segmentation simultaneously. Two-stage segmentation is performed using the more readily obtainable cell point annotations: in the first stage, coarse segmentation uses negative-boundary sparse supervision of the annotation points and geometric constraints, combined with a point-to-region spatially expanded Voronoi partitioning strategy; in the second stage, a contour-sensitivity constraint function is provided, and the nuclear contour is further adjusted using edge prior knowledge from the original image.
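The idea of step (4), producing segmentation and classification from one network by modifying the final linear mapping layer, can be sketched as follows. This is a minimal numpy illustration under assumptions of our own (head shapes, global average pooling before the classification head), not the patent's actual implementation:

```python
import numpy as np

def dual_head(features, w_seg, w_cls):
    """Apply two linear mapping heads to one shared decoder feature map.

    features: (H, W, C) decoder output.
    w_seg:    (C, n_seg_classes) per-pixel segmentation head.
    w_cls:    (C, n_img_classes) image-level classification head,
              applied after global average pooling (an assumed choice).
    """
    seg_logits = features @ w_seg          # (H, W, n_seg_classes)
    pooled = features.mean(axis=(0, 1))    # (C,) global average pool
    cls_logits = pooled @ w_cls            # (n_img_classes,)
    return seg_logits, cls_logits

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 8, 16))
seg, cls = dual_head(feats,
                     rng.standard_normal((16, 2)),
                     rng.standard_normal((16, 3)))
print(seg.shape, cls.shape)  # (8, 8, 2) (3,)
```

Both heads share every upstream feature, so one forward pass yields both outputs.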
There is also provided a pathological image segmentation and classification device based on semi-supervised learning, comprising:
the extraction module is used for enabling a network to adaptively extract multi-scale information from the image based on a U-shaped network structure of Swin-Transformer Blocks;
the up-sampling module adopts a dense connection structure based on bilinear interpolation, reduces the loss of decoder fineness, and alleviates the problems of gradient vanishing and overfitting;
the weak supervision module is used for carrying out rough segmentation by using weak annotation of the cell image based on annotation points, geometric constraints and boundary constraints in combination with a Voronoi division strategy from points to regions, and adjusting the nuclear contour by using image edge prior information in a fine segmentation stage through a contour sensitive constraint function;
a result output module configured to change the last fully connected (linear mapping) layer of the network to output both the segmentation results and the classification results.
Detailed Description
As shown in fig. 1, the method for pathological image segmentation and classification based on semi-supervised learning includes the following steps:
(1) extracting multi-scale information from image data in an adaptive manner by using a U-shaped network structure based on Swin-Transformer Blocks;
(2) in the up-sampling process, using densely connected bilinear interpolation, so that the problems of gradient vanishing and overfitting are alleviated while the loss of decoder fineness is reduced;
(3) performing weakly supervised segmentation using point annotations of the cell image: in the coarse segmentation stage, adopting negative-boundary sparse supervision derived from the annotation points and geometric constraints, combined with a point-to-region spatially expanded Voronoi partitioning strategy; in the fine segmentation stage, further adjusting the nuclear contour through a contour-sensitive constraint function that exploits edge prior knowledge in the original image;
(4) in the network, outputting the segmentation and classification results simultaneously by modifying the final linear mapping layer.
Preferably, in the step (1), the Swin-Transformer Blocks perform adaptive feature extraction based on a shifted-window multi-head self-attention module, a multi-layer perceptron, and residual connections. When the Swin-Transformer Blocks perform feature extraction at the l-th layer, the computation is formula (1):

ẑ^l = W-MSA(LN(z^{l-1})) + z^{l-1}
z^l = MLP(LN(ẑ^l)) + ẑ^l
ẑ^{l+1} = SW-MSA(LN(z^l)) + z^l
z^{l+1} = MLP(LN(ẑ^{l+1})) + ẑ^{l+1}        (1)

wherein: ẑ^l and z^l respectively represent the outputs of the (S)W-MSA module and of the multi-layer perceptron at the l-th layer, and LN denotes layer normalization.

The self-attention mechanism is calculated by equation (2):

Attention(Q, K, V) = SoftMax(QK^T / √d + B) V        (2)

wherein: Q, K, V ∈ R^{M²×d} represent the query, key and value matrices; M² and d respectively represent the number of patches within a window and the dimension of the query/key; and B represents the relative position bias matrix.
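The windowed self-attention of equation (2) can be sketched numerically as follows; this is an illustrative numpy version with random weights (the projection matrices and bias values are placeholders, not the trained network's):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(x, wq, wk, wv, bias):
    """Self-attention over one MxM window, as in equation (2).

    x:    (M*M, d) patch tokens inside one window.
    wq, wk, wv: (d, d) query/key/value projection matrices.
    bias: (M*M, M*M) relative position bias matrix B.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d) + bias)  # (M*M, M*M)
    return attn @ v                              # (M*M, d)

rng = np.random.default_rng(1)
M, d = 4, 8
tokens = rng.standard_normal((M * M, d))
out = window_attention(tokens,
                       rng.standard_normal((d, d)),
                       rng.standard_normal((d, d)),
                       rng.standard_normal((d, d)),
                       rng.standard_normal((M * M, M * M)))
print(out.shape)  # (16, 8)
```

Restricting attention to one window keeps the cost quadratic in M² rather than in the whole image, which is what makes the Swin design tractable for large pathology images.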
Preferably, in the step (2), in the up-sampling process a dense connection structure is adopted, so that the problems of gradient vanishing and overfitting are alleviated while the loss of decoder fineness is reduced. In the dense connection, the up-sampling of the n-th layer is formula (3):

x_n = f_n([x_0, x_1, …, x_{n-1}])        (3)

wherein: f_n represents the up-sampling interpolation method, here bilinear interpolation, applied to the concatenation [·] of all preceding feature maps. Given the known function values at Q_11 = (x_1, y_1), Q_12 = (x_1, y_2), Q_21 = (x_2, y_1), Q_22 = (x_2, y_2), the bilinear interpolation at a pixel (x, y) is formula (4):

f(x, y) = [f(Q_11)(x_2−x)(y_2−y) + f(Q_21)(x−x_1)(y_2−y) + f(Q_12)(x_2−x)(y−y_1) + f(Q_22)(x−x_1)(y−y_1)] / [(x_2−x_1)(y_2−y_1)]        (4)
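The standard bilinear interpolation formula (4) can be checked directly; here is a small self-contained sketch (the example function f(x, y) = x + 2y is our own choice, picked because interpolation reproduces a linear function exactly):

```python
def bilinear(q11, q12, q21, q22, x1, y1, x2, y2, x, y):
    """Bilinear interpolation at (x, y) from four corner values, formula (4).

    q11 = f(x1, y1), q12 = f(x1, y2), q21 = f(x2, y1), q22 = f(x2, y2).
    """
    denom = (x2 - x1) * (y2 - y1)
    return (q11 * (x2 - x) * (y2 - y)
            + q21 * (x - x1) * (y2 - y)
            + q12 * (x2 - x) * (y - y1)
            + q22 * (x - x1) * (y - y1)) / denom

# A function linear in x and y is reproduced exactly: f(x, y) = x + 2*y.
val = bilinear(q11=0 + 2 * 0, q12=0 + 2 * 1, q21=1 + 2 * 0, q22=1 + 2 * 1,
               x1=0, y1=0, x2=1, y2=1, x=0.25, y=0.5)
print(val)  # 1.25
```

Deep-learning frameworks apply exactly this formula when up-sampling feature maps in "bilinear" mode.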
Preferably, in the step (3), the cell image is first point-annotated, and two distance maps are generated that focus on the positive pixels and the negative pixels respectively: a point-annotation distance map D_point and an edge (Voronoi) map D_voronoi.

The point-annotation distance map D_point is used to focus on high-confidence positive pixels. Assuming the annotation point of each nucleus is close to the center of the nucleus, the point annotation is expanded by a distance filter into a reliable nuclear supervision area; D_point is calculated by formula (5), wherein m and n respectively denote the marked points of the point-annotated distance map, and α is a scaling parameter controlling the distribution proportion.

High-confidence negative pixels are concentrated on the Voronoi diagram and are denoted D_voronoi. The partition edges are obtained from the Voronoi diagram and can be further enlarged by the rapidly decreasing response of the distance filter of formula (5); D_voronoi is used to describe the negative pixels with high confidence.
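A sketch of how a distance map and a Voronoi partition can be derived from point annotations follows. Note the exponential decay exp(−d/α) is an assumed stand-in for the patent's distance filter of formula (5), whose exact form is not reproduced in this text:

```python
import numpy as np

def point_maps(shape, points, alpha=10.0):
    """Distance map and Voronoi labels from point annotations.

    shape:  (h, w) image size.
    points: list of (y, x) annotated nucleus centers.
    alpha:  scaling parameter of the (assumed) distance filter.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.asarray(points, dtype=float)             # (N, 2) as (y, x)
    # Distance from every pixel to every annotated point: (h, w, N).
    d = np.sqrt((ys[..., None] - pts[:, 0]) ** 2 +
                (xs[..., None] - pts[:, 1]) ** 2)
    d_min = d.min(axis=-1)
    voronoi = d.argmin(axis=-1)        # nearest-point label per pixel
    d_point = np.exp(-d_min / alpha)   # high near centers, decays outward
    return d_point, voronoi

dp, vor = point_maps((32, 32), [(8, 8), (24, 20)])
print(dp[8, 8], vor[8, 8], vor[24, 20])  # 1.0 0 1
```

Pixels where the Voronoi label changes form the partition edges between nuclei, which supply the high-confidence negative supervision described above.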
Preferably, in the weakly supervised learning of the step (3): first, a polar loss function is used to update the Swin-Transformer-Unet parameters, taking the network's output segmentation map as input; the polar loss function is expressed by equation (6). H(x) performs self-supervised learning on the output segmentation map by correcting the segmentation map into a binary image mask.

At the same time, two sparse loss functions are set for updating the parameters in the transformer network, attending respectively to partial positive and negative pixels; wherein the (·) operation represents the pixel-by-pixel product, and reliable weight masks are extracted by the ReLU operation for the sparse loss calculation, so that L_point focuses only on high-confidence positive pixels and L_voronoi focuses only on high-confidence negative pixels.
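The sparse supervision idea, a loss evaluated as a pixel-by-pixel product with a confidence mask, can be sketched as below. The cross-entropy form and the normalization are assumptions of this sketch, not the patent's exact equations:

```python
import numpy as np

def sparse_losses(pred, point_mask, voronoi_mask, eps=1e-7):
    """Two sparse losses attending to partial positive/negative pixels.

    pred:         (H, W) predicted foreground probabilities.
    point_mask:   1 where a pixel is a high-confidence positive.
    voronoi_mask: 1 where a pixel is a high-confidence negative
                  (Voronoi partition edges between nuclei).
    Each loss is a cross-entropy term masked by a pixel-wise product,
    so unlabeled pixels contribute nothing to the gradient.
    """
    p = np.clip(pred, eps, 1 - eps)
    l_point = -(point_mask * np.log(p)).sum() / max(point_mask.sum(), 1)
    l_voronoi = -(voronoi_mask * np.log(1 - p)).sum() / max(voronoi_mask.sum(), 1)
    return l_point, l_voronoi

pred = np.array([[0.9, 0.1],
                 [0.8, 0.2]])
lp, lv = sparse_losses(pred,
                       np.array([[1, 0], [0, 0]]),   # one confident positive
                       np.array([[0, 1], [0, 0]]))   # one confident negative
print(round(lp, 4), round(lv, 4))  # 0.1054 0.1054
```

Because each mask zeroes out every unlabeled pixel, the network is never penalized on the ambiguous regions between the annotated centers and the Voronoi edges.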
Preferably, in the step (3), in the first-stage coarse segmentation, the initial segmentation map is used in the initial state as an expansion of the point labels; the segmentation model is iterated using the expanded point distance map, and the maps are updated by the newly trained model. The point distance map D_point is updated according to formula (5), with the annotation map P replaced by the coarse segmentation result of the previous round. This operation is repeated several times to obtain a coarse segmentation result, denoted R_coarse.
Preferably, in the step (3), in the second-stage fine segmentation, a local area map supervision method is used: the apparent contour of the input image is extracted as additional supervision. First, the edge image is refined, and the result is denoted E_r:

E_r = (dilation(R_coarse, k) − erosion(R_coarse, k)) & E_Kirsch        (9)

wherein: dilation and erosion are the morphological dilation and erosion of the image over k pixels, respectively, and E_Kirsch represents the edge image obtained by applying the Kirsch operator to the input image.
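Equation (9) can be sketched end to end with plain numpy; the structuring element (a square window), the Kirsch threshold, and the toy inputs are assumptions of this sketch:

```python
import numpy as np

def shift_stack(img, r):
    """Stack every shifted copy of img within a (2r+1)x(2r+1) window,
    zero-padded at the borders."""
    h, w = img.shape
    pad = np.pad(img, r)
    return np.stack([pad[dy:dy + h, dx:dx + w]
                     for dy in range(2 * r + 1)
                     for dx in range(2 * r + 1)])

def dilation(mask, k):
    return shift_stack(mask, k).max(axis=0)  # morphological dilation over k pixels

def erosion(mask, k):
    return shift_stack(mask, k).min(axis=0)  # morphological erosion over k pixels

def kirsch_edges(img, thresh):
    """Binary edge map from the 8-direction Kirsch compass operator."""
    # Clockwise ring of the 8 neighbours, starting at the top-left.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    ring = [5, 5, 5, -3, -3, -3, -3, -3]     # base Kirsch mask on that ring
    nbh = shift_stack(img.astype(float), 1)  # 9 shifted neighbour planes
    responses = []
    for rot in range(8):                     # the 8 rotated compass masks
        vals = ring[rot:] + ring[:rot]
        resp = np.zeros(img.shape)
        for (i, j), v in zip(order, vals):
            resp += v * nbh[i * 3 + j]
        responses.append(resp)
    return (np.max(responses, axis=0) > thresh).astype(np.uint8)

def refine_edges(r_coarse, img, k=1, thresh=200):
    """E_r = (dilation(R_coarse, k) - erosion(R_coarse, k)) & E_Kirsch, eq. (9)."""
    band = dilation(r_coarse, k) - erosion(r_coarse, k)
    return band & kirsch_edges(img, thresh)

mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1            # toy coarse nucleus mask R_coarse
img = np.zeros((5, 5))
img[:, 2:] = 255              # vertical intensity step as a toy image
er = refine_edges(mask, img)
print(er.shape)  # (5, 5)
```

The dilation-minus-erosion term yields a thin band around the coarse boundary, and intersecting it with the Kirsch edges keeps only those band pixels supported by actual image gradients.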
Preferably, in the second-stage sparsely supervised learning of the step (3), in order to implement supplementary boundary supervision, a contour-sensitivity loss L_contour is added to fine-tune the nuclear contour, again using the local area map for supervision; wherein E_r represents the refined edge map and Kirsch represents the Kirsch operator.
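Since the patent does not spell out L_contour here, the sketch below shows one plausible masked form: a cross-entropy restricted to the refined edge pixels E_r. Both the loss form and the toy inputs are assumptions of this illustration:

```python
import numpy as np

def contour_loss(pred, edge_map, eps=1e-7):
    """Contour-sensitivity loss evaluated only on refined edge pixels E_r.

    pred:     (H, W) predicted boundary/foreground probabilities.
    edge_map: (H, W) binary refined edge map E_r from equation (9).
    An edge-masked cross-entropy is assumed; the patent's exact
    expression for L_contour is not reproduced in this text.
    """
    p = np.clip(pred, eps, 1 - eps)
    n = max(edge_map.sum(), 1)   # avoid division by zero on empty maps
    return -(edge_map * np.log(p)).sum() / n

pred = np.full((2, 2), 0.5)
edge = np.array([[1, 0],
                 [0, 1]])
loss = contour_loss(pred, edge)
print(round(loss, 4))  # 0.6931
```

Because only E_r pixels contribute, the gradient pulls the predicted contour toward image edges without disturbing the interior of each nucleus.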
It will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, includes the steps of the methods of the above embodiments. The storage medium may be: ROM/RAM, a magnetic disk, an optical disk, a memory card, or the like. Therefore, corresponding to the method of the invention, the invention also includes a device for pathological image segmentation and classification based on semi-supervised learning, which is generally expressed in the form of functional modules corresponding to the steps of the method. The device includes:
the extraction module is used for enabling a network to adaptively extract multi-scale information from the image based on a U-shaped network structure of Swin-Transformer Blocks;
the up-sampling module adopts a dense connection structure based on bilinear interpolation, reduces the loss of decoder fineness, and alleviates the problems of gradient vanishing and overfitting;
the weak supervision module is used for carrying out rough segmentation by using weak annotation of the cell image based on annotation points, geometric constraints and boundary constraints in combination with a Voronoi division strategy from points to regions, and adjusting the nuclear contour by using image edge prior information in a fine segmentation stage through a contour sensitive constraint function;
a result output module configured to change the last fully connected (linear mapping) layer of the network to output both the segmentation results and the classification results.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications, equivalent variations and modifications made to the above embodiment according to the technical spirit of the present invention still belong to the protection scope of the technical solution of the present invention.