The invention content is as follows:
in order to overcome the defects of the prior art, the invention aims to provide a semi-supervised pathological image segmentation and classification method which can effectively solve the problem of image segmentation under weak annotation, modify the network structure so that segmentation and classification are executed simultaneously, and alleviate the difficulty of image annotation.
The technical scheme of the invention is as follows: the pathological image segmentation and classification method based on semi-supervised learning comprises the following steps:
(1) extracting multi-scale information from image data in an adaptive manner by using a U-shaped network structure based on Swin-Transformer Blocks;
(2) in the up-sampling process, using densely connected bilinear interpolation, so that the problems of gradient vanishing and overfitting are alleviated while the loss of decoder fineness is reduced;
(3) performing weakly supervised segmentation using point annotations of the cell image: in the coarse segmentation stage, adopting negative-boundary sparse supervision derived from the annotation points and geometric constraints, combined with a point-to-region spatially expanded Voronoi partitioning strategy; in the fine segmentation stage, further adjusting the nuclear contour through a contour-sensitive constraint function that exploits edge prior knowledge in the original image;
(4) in the network, outputting the segmentation and classification results simultaneously by modifying the final linear mapping layer.
According to the invention, the Swin-Transformer-based U-shaped network structure effectively extracts multi-scale information from an image, and the dense connections adopted in the up-sampling process reduce the loss of decoder fineness while alleviating gradient vanishing and overfitting. The network structure is modified so that it can perform classification and segmentation simultaneously. Two-stage segmentation is performed using the more readily obtainable cell point annotations: in the first stage, coarse segmentation uses negative-boundary sparse supervision of the annotation points and geometric constraints, combined with a point-to-region spatially expanded Voronoi partitioning strategy; in the second stage, a contour-sensitivity constraint function is provided, and the nuclear contour is further adjusted using edge prior knowledge from the original image.
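The idea of step (4), producing segmentation and classification from one network by modifying the final linear mapping layer, can be sketched as follows. This is a minimal numpy illustration under assumptions of our own (head shapes, global average pooling before the classification head), not the patent's actual implementation:

```python
import numpy as np

def dual_head(features, w_seg, w_cls):
    """Apply two linear mapping heads to one shared decoder feature map.

    features: (H, W, C) decoder output.
    w_seg:    (C, n_seg_classes) per-pixel segmentation head.
    w_cls:    (C, n_img_classes) image-level classification head,
              applied after global average pooling (an assumed choice).
    """
    seg_logits = features @ w_seg          # (H, W, n_seg_classes)
    pooled = features.mean(axis=(0, 1))    # (C,) global average pool
    cls_logits = pooled @ w_cls            # (n_img_classes,)
    return seg_logits, cls_logits

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 8, 16))
seg, cls = dual_head(feats,
                     rng.standard_normal((16, 2)),
                     rng.standard_normal((16, 3)))
print(seg.shape, cls.shape)  # (8, 8, 2) (3,)
```

Both heads share every upstream feature, so one forward pass yields both outputs.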
There is also provided a pathological image segmentation and classification device based on semi-supervised learning, comprising:
the extraction module is used for enabling a network to adaptively extract multi-scale information from the image based on a U-shaped network structure of Swin-Transformer Blocks;
the up-sampling module adopts a dense connection structure based on bilinear interpolation, reduces the loss of decoder fineness, and alleviates the problems of gradient vanishing and overfitting;
the weak supervision module is used for carrying out rough segmentation by using weak annotation of the cell image based on annotation points, geometric constraints and boundary constraints in combination with a Voronoi division strategy from points to regions, and adjusting the nuclear contour by using image edge prior information in a fine segmentation stage through a contour sensitive constraint function;
a result output module configured to change the last fully connected (linear mapping) layer of the network to output both the segmentation results and the classification results.
Detailed Description
As shown in fig. 1, the method for pathological image segmentation and classification based on semi-supervised learning includes the following steps:
(1) extracting multi-scale information from image data in an adaptive manner by using a U-shaped network structure based on Swin-Transformer Blocks;
(2) in the up-sampling process, using densely connected bilinear interpolation, so that the problems of gradient vanishing and overfitting are alleviated while the loss of decoder fineness is reduced;
(3) performing weakly supervised segmentation using point annotations of the cell image: in the coarse segmentation stage, adopting negative-boundary sparse supervision derived from the annotation points and geometric constraints, combined with a point-to-region spatially expanded Voronoi partitioning strategy; in the fine segmentation stage, further adjusting the nuclear contour through a contour-sensitive constraint function that exploits edge prior knowledge in the original image;
(4) in the network, outputting the segmentation and classification results simultaneously by modifying the final linear mapping layer.
Preferably, in the step (1), the Swin-Transformer Blocks perform adaptive feature extraction based on a shifted-window multi-head self-attention module, a multi-layer perceptron, and residual connections. When the Swin-Transformer Blocks perform feature extraction at the l-th layer, the computation is formula (1):

ẑ^l = W-MSA(LN(z^{l-1})) + z^{l-1}
z^l = MLP(LN(ẑ^l)) + ẑ^l
ẑ^{l+1} = SW-MSA(LN(z^l)) + z^l
z^{l+1} = MLP(LN(ẑ^{l+1})) + ẑ^{l+1}        (1)

wherein: ẑ^l and z^l respectively represent the outputs of the (S)W-MSA module and of the multi-layer perceptron at the l-th layer, and LN denotes layer normalization.

The self-attention mechanism is calculated by equation (2):

Attention(Q, K, V) = SoftMax(QK^T / √d + B) V        (2)

wherein: Q, K, V ∈ R^{M²×d} represent the query, key and value matrices; M² and d respectively represent the number of patches within a window and the dimension of the query/key; and B represents the relative position bias matrix.
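The windowed self-attention of equation (2) can be sketched numerically as follows; this is an illustrative numpy version with random weights (the projection matrices and bias values are placeholders, not the trained network's):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(x, wq, wk, wv, bias):
    """Self-attention over one MxM window, as in equation (2).

    x:    (M*M, d) patch tokens inside one window.
    wq, wk, wv: (d, d) query/key/value projection matrices.
    bias: (M*M, M*M) relative position bias matrix B.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d) + bias)  # (M*M, M*M)
    return attn @ v                              # (M*M, d)

rng = np.random.default_rng(1)
M, d = 4, 8
tokens = rng.standard_normal((M * M, d))
out = window_attention(tokens,
                       rng.standard_normal((d, d)),
                       rng.standard_normal((d, d)),
                       rng.standard_normal((d, d)),
                       rng.standard_normal((M * M, M * M)))
print(out.shape)  # (16, 8)
```

Restricting attention to one window keeps the cost quadratic in M² rather than in the whole image, which is what makes the Swin design tractable for large pathology images.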
Preferably, in the step (2), in the up-sampling process a dense connection structure is adopted, so that the problems of gradient vanishing and overfitting are alleviated while the loss of decoder fineness is reduced. In the dense connection, the up-sampling of the n-th layer is formula (3):

x_n = f_n([x_0, x_1, …, x_{n-1}])        (3)

wherein: f_n represents the up-sampling interpolation method, here bilinear interpolation, applied to the concatenation [·] of all preceding feature maps. Given the known function values at Q_11 = (x_1, y_1), Q_12 = (x_1, y_2), Q_21 = (x_2, y_1), Q_22 = (x_2, y_2), the bilinear interpolation at a pixel (x, y) is formula (4):

f(x, y) = [f(Q_11)(x_2−x)(y_2−y) + f(Q_21)(x−x_1)(y_2−y) + f(Q_12)(x_2−x)(y−y_1) + f(Q_22)(x−x_1)(y−y_1)] / [(x_2−x_1)(y_2−y_1)]        (4)
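The standard bilinear interpolation formula (4) can be checked directly; here is a small self-contained sketch (the example function f(x, y) = x + 2y is our own choice, picked because interpolation reproduces a linear function exactly):

```python
def bilinear(q11, q12, q21, q22, x1, y1, x2, y2, x, y):
    """Bilinear interpolation at (x, y) from four corner values, formula (4).

    q11 = f(x1, y1), q12 = f(x1, y2), q21 = f(x2, y1), q22 = f(x2, y2).
    """
    denom = (x2 - x1) * (y2 - y1)
    return (q11 * (x2 - x) * (y2 - y)
            + q21 * (x - x1) * (y2 - y)
            + q12 * (x2 - x) * (y - y1)
            + q22 * (x - x1) * (y - y1)) / denom

# A function linear in x and y is reproduced exactly: f(x, y) = x + 2*y.
val = bilinear(q11=0 + 2 * 0, q12=0 + 2 * 1, q21=1 + 2 * 0, q22=1 + 2 * 1,
               x1=0, y1=0, x2=1, y2=1, x=0.25, y=0.5)
print(val)  # 1.25
```

Deep-learning frameworks apply exactly this formula when up-sampling feature maps in "bilinear" mode.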
Preferably, in the step (3), the cell image is first point-annotated, and two distance maps are generated that focus on the positive pixels and the negative pixels respectively: a point-annotation distance map D_point and an edge (Voronoi) map D_voronoi.

The point-annotation distance map D_point is used to focus on high-confidence positive pixels. Assuming the annotation point of each nucleus is close to the center of the nucleus, the point annotation is expanded by a distance filter into a reliable nuclear supervision area; D_point is calculated by formula (5), wherein m and n respectively denote the marked points of the point-annotated distance map, and α is a scaling parameter controlling the distribution proportion.

High-confidence negative pixels are concentrated on the Voronoi diagram and are denoted D_voronoi. The partition edges are obtained from the Voronoi diagram and can be further enlarged by the rapidly decreasing response of the distance filter of formula (5); D_voronoi is used to describe the negative pixels with high confidence.
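A sketch of how a distance map and a Voronoi partition can be derived from point annotations follows. Note the exponential decay exp(−d/α) is an assumed stand-in for the patent's distance filter of formula (5), whose exact form is not reproduced in this text:

```python
import numpy as np

def point_maps(shape, points, alpha=10.0):
    """Distance map and Voronoi labels from point annotations.

    shape:  (h, w) image size.
    points: list of (y, x) annotated nucleus centers.
    alpha:  scaling parameter of the (assumed) distance filter.
    """
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.asarray(points, dtype=float)             # (N, 2) as (y, x)
    # Distance from every pixel to every annotated point: (h, w, N).
    d = np.sqrt((ys[..., None] - pts[:, 0]) ** 2 +
                (xs[..., None] - pts[:, 1]) ** 2)
    d_min = d.min(axis=-1)
    voronoi = d.argmin(axis=-1)        # nearest-point label per pixel
    d_point = np.exp(-d_min / alpha)   # high near centers, decays outward
    return d_point, voronoi

dp, vor = point_maps((32, 32), [(8, 8), (24, 20)])
print(dp[8, 8], vor[8, 8], vor[24, 20])  # 1.0 0 1
```

Pixels where the Voronoi label changes form the partition edges between nuclei, which supply the high-confidence negative supervision described above.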
Preferably, in the weakly supervised learning of the step (3): first, a polar loss function is used to update the Swin-Transformer-Unet parameters, taking the network's output segmentation map as input; the polar loss function is expressed by equation (6). H(x) performs self-supervised learning on the output segmentation map by correcting the segmentation map into a binary image mask.

At the same time, two sparse loss functions are set for updating the parameters in the transformer network, attending respectively to partial positive and negative pixels; wherein the (·) operation represents the pixel-by-pixel product, and reliable weight masks are extracted by the ReLU operation for the sparse loss calculation, so that L_point focuses only on high-confidence positive pixels and L_voronoi focuses only on high-confidence negative pixels.
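The sparse supervision idea, a loss evaluated as a pixel-by-pixel product with a confidence mask, can be sketched as below. The cross-entropy form and the normalization are assumptions of this sketch, not the patent's exact equations:

```python
import numpy as np

def sparse_losses(pred, point_mask, voronoi_mask, eps=1e-7):
    """Two sparse losses attending to partial positive/negative pixels.

    pred:         (H, W) predicted foreground probabilities.
    point_mask:   1 where a pixel is a high-confidence positive.
    voronoi_mask: 1 where a pixel is a high-confidence negative
                  (Voronoi partition edges between nuclei).
    Each loss is a cross-entropy term masked by a pixel-wise product,
    so unlabeled pixels contribute nothing to the gradient.
    """
    p = np.clip(pred, eps, 1 - eps)
    l_point = -(point_mask * np.log(p)).sum() / max(point_mask.sum(), 1)
    l_voronoi = -(voronoi_mask * np.log(1 - p)).sum() / max(voronoi_mask.sum(), 1)
    return l_point, l_voronoi

pred = np.array([[0.9, 0.1],
                 [0.8, 0.2]])
lp, lv = sparse_losses(pred,
                       np.array([[1, 0], [0, 0]]),   # one confident positive
                       np.array([[0, 1], [0, 0]]))   # one confident negative
print(round(lp, 4), round(lv, 4))  # 0.1054 0.1054
```

Because each mask zeroes out every unlabeled pixel, the network is never penalized on the ambiguous regions between the annotated centers and the Voronoi edges.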
Preferably, in the step (3), in the first-stage coarse segmentation, the initial segmentation map is used in the initial state as an expansion of the point labels; the segmentation model is iterated using the expanded point distance map, and the maps are updated by the newly trained model. The point distance map D_point is updated according to formula (5), with the annotation map P replaced by the coarse segmentation result of the previous round. This operation is repeated several times to obtain a coarse segmentation result, denoted R_coarse.
Preferably, in the step (3), in the second-stage fine segmentation, a local area map supervision method is used: the apparent contour of the input image is extracted as additional supervision. First, the edge image is refined, and the result is denoted E_r:

E_r = (dilation(R_coarse, k) − erosion(R_coarse, k)) & E_Kirsch        (9)

wherein: dilation and erosion are the morphological dilation and erosion of the image over k pixels, respectively, and E_Kirsch represents the edge image obtained by applying the Kirsch operator to the input image.
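Equation (9) can be sketched end to end with plain numpy; the structuring element (a square window), the Kirsch threshold, and the toy inputs are assumptions of this sketch:

```python
import numpy as np

def shift_stack(img, r):
    """Stack every shifted copy of img within a (2r+1)x(2r+1) window,
    zero-padded at the borders."""
    h, w = img.shape
    pad = np.pad(img, r)
    return np.stack([pad[dy:dy + h, dx:dx + w]
                     for dy in range(2 * r + 1)
                     for dx in range(2 * r + 1)])

def dilation(mask, k):
    return shift_stack(mask, k).max(axis=0)  # morphological dilation over k pixels

def erosion(mask, k):
    return shift_stack(mask, k).min(axis=0)  # morphological erosion over k pixels

def kirsch_edges(img, thresh):
    """Binary edge map from the 8-direction Kirsch compass operator."""
    # Clockwise ring of the 8 neighbours, starting at the top-left.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    ring = [5, 5, 5, -3, -3, -3, -3, -3]     # base Kirsch mask on that ring
    nbh = shift_stack(img.astype(float), 1)  # 9 shifted neighbour planes
    responses = []
    for rot in range(8):                     # the 8 rotated compass masks
        vals = ring[rot:] + ring[:rot]
        resp = np.zeros(img.shape)
        for (i, j), v in zip(order, vals):
            resp += v * nbh[i * 3 + j]
        responses.append(resp)
    return (np.max(responses, axis=0) > thresh).astype(np.uint8)

def refine_edges(r_coarse, img, k=1, thresh=200):
    """E_r = (dilation(R_coarse, k) - erosion(R_coarse, k)) & E_Kirsch, eq. (9)."""
    band = dilation(r_coarse, k) - erosion(r_coarse, k)
    return band & kirsch_edges(img, thresh)

mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 1            # toy coarse nucleus mask R_coarse
img = np.zeros((5, 5))
img[:, 2:] = 255              # vertical intensity step as a toy image
er = refine_edges(mask, img)
print(er.shape)  # (5, 5)
```

The dilation-minus-erosion term yields a thin band around the coarse boundary, and intersecting it with the Kirsch edges keeps only those band pixels supported by actual image gradients.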
Preferably, in the second-stage sparsely supervised learning of the step (3), in order to implement supplementary boundary supervision, a contour-sensitivity loss L_contour is added to fine-tune the nuclear contour, again using the local area map for supervision; wherein E_r represents the refined edge map and Kirsch represents the Kirsch operator.
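Since the patent does not spell out L_contour here, the sketch below shows one plausible masked form: a cross-entropy restricted to the refined edge pixels E_r. Both the loss form and the toy inputs are assumptions of this illustration:

```python
import numpy as np

def contour_loss(pred, edge_map, eps=1e-7):
    """Contour-sensitivity loss evaluated only on refined edge pixels E_r.

    pred:     (H, W) predicted boundary/foreground probabilities.
    edge_map: (H, W) binary refined edge map E_r from equation (9).
    An edge-masked cross-entropy is assumed; the patent's exact
    expression for L_contour is not reproduced in this text.
    """
    p = np.clip(pred, eps, 1 - eps)
    n = max(edge_map.sum(), 1)   # avoid division by zero on empty maps
    return -(edge_map * np.log(p)).sum() / n

pred = np.full((2, 2), 0.5)
edge = np.array([[1, 0],
                 [0, 1]])
loss = contour_loss(pred, edge)
print(round(loss, 4))  # 0.6931
```

Because only E_r pixels contribute, the gradient pulls the predicted contour toward image edges without disturbing the interior of each nucleus.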
It will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium and, when executed, includes the steps of the methods of the above embodiments. The storage medium may be: ROM/RAM, a magnetic disk, an optical disk, a memory card, or the like. Therefore, corresponding to the method of the invention, the invention also includes a device for pathological image segmentation and classification based on semi-supervised learning, which is generally expressed in the form of functional modules corresponding to the steps of the method. The device includes:
the extraction module is used for enabling a network to adaptively extract multi-scale information from the image based on a U-shaped network structure of Swin-Transformer Blocks;
the up-sampling module adopts a dense connection structure based on bilinear interpolation, reduces the loss of decoder fineness, and alleviates the problems of gradient vanishing and overfitting;
the weak supervision module is used for carrying out rough segmentation by using weak annotation of the cell image based on annotation points, geometric constraints and boundary constraints in combination with a Voronoi division strategy from points to regions, and adjusting the nuclear contour by using image edge prior information in a fine segmentation stage through a contour sensitive constraint function;
a result output module configured to change the last fully connected (linear mapping) layer of the network to output both the segmentation results and the classification results.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications, equivalent variations and modifications made to the above embodiment according to the technical spirit of the present invention still belong to the protection scope of the technical solution of the present invention.