Disclosure of Invention
Based on the above-mentioned drawbacks of the prior art, an object of the present invention is to provide an automatic segmentation method for medical images based on cognitive deep learning, so as to solve the above-mentioned technical problems.
In order to achieve the above purpose, the invention provides the following technical scheme: the automatic medical image segmentation method based on cognitive deep learning comprises the following steps:
acquiring different types of medical images, and preprocessing the different types of medical images;
Inputting the preprocessed medical images of different types into a trained multi-branch convolutional neural network to obtain output characteristics of medical images of corresponding types of each branch, wherein each branch processes one type of medical image;
Cognizing the output characteristics through channel attention to obtain output characteristic fusion weights;
carrying out output feature combination according to the output features and the output feature fusion weights to obtain fusion segmentation medical images;
and carrying out post-processing on the fusion segmentation medical image to complete automatic segmentation of the medical image.
The invention is further arranged that the preprocessing of the different types of medical images comprises:
normalizing the medical images of different types, and carrying out contrast enhancement on the normalized medical images;
aligning the preprocessed medical images through elastic transformation to enable the positions of key features of different types of medical images to be consistent;
Resampling the medical images of different types after the elastic transformation to ensure that the spatial resolution and the size of the medical images of different types are consistent.
The invention further provides the training logic of the trained multi-branch convolutional neural network, comprising:
acquiring a historical data set, wherein the historical data set comprises a historical medical image data set and a historical segmentation medical image data set;
and performing branch training according to the historical data set to obtain the output characteristics of the medical image, calculating loss according to a loss function, repeating training until the loss is smaller than a loss threshold value, and completing training.
The invention is further arranged such that the loss function is defined over all pixels, where SL is the loss function, n is the total number of pixels, q_j is the true value of pixel j, a corresponding predicted value is produced for pixel j, w1 is a weight coefficient, and q_t takes its value according to a per-pixel rule.
The invention is further configured to obtain an output feature fusion weight by cognizing the output feature through channel attention, including:
The output features are recorded as an output feature set F, wherein F ∈ R^(C×H×W), R denotes the set of real numbers (the elements of the feature map are real-valued), C is the number of channels, H is the height of the output features, and W is the width of the output features;
Performing global average pooling on the output features in the output feature set F of each channel to obtain a channel descriptor;
And obtaining the fusion weight of the output characteristics according to the channel descriptor.
The invention is further configured to perform global average pooling on the output features of each channel in the output feature set F to obtain a channel descriptor, with the calculation logic S_c = (1/(H×W)) × Σ_{l=1..H} Σ_{m=1..W} F(c, l, m), where S_c is the channel descriptor of the c-th channel and F(c, l, m) is the element value of the output feature at height l and width m on the c-th channel.
The invention further provides that the output feature fusion weight is obtained according to the channel descriptor, with the calculation logic Z_c = σ(w2·δ(w3·S_c + b1) + b2), wherein Z_c is the feature fusion weight, σ is a sigmoid activation function, δ is a ReLU activation function, w2 and w3 are weight parameters, and b1 and b2 are bias parameters.
The invention is further configured to perform output feature merging according to the output feature and the output feature fusion weight to obtain a fusion segmentation medical image, including:
Acquiring the output characteristics and the corresponding characteristic fusion weights of each channel;
calculating a weighted output feature according to the output feature and the feature fusion weight;
And splicing the weighted output characteristics to obtain a fused medical image, and dividing the fused medical image to obtain the fused divided medical image.
The invention further provides that the post-processing of the fused segmented medical image comprises:
Smoothing the segmentation boundary and filling the small holes of the fusion segmentation medical image through morphological transformation;
and blurring processing is carried out on the fusion segmentation medical image after morphological transformation through a Gaussian filter, so that image noise is reduced.
The invention provides an automatic segmentation method for medical images based on cognitive deep learning. Different types of medical images are acquired and preprocessed; the preprocessed images are input into a trained multi-branch convolutional neural network to obtain the output features of the corresponding type of medical image from each branch, wherein each branch processes one type of medical image; the output features are cognized through channel attention to obtain output feature fusion weights; output feature merging is carried out according to the output features and the output feature fusion weights to obtain a fused segmented medical image; and the fused segmented medical image is post-processed to complete automatic segmentation of the medical image. The beneficial effects produced include:
1. By using a multi-branch convolutional neural network to process different types of medical images, the method can optimize the processing path and parameters for each image type; this customized processing extracts the key features of each kind of image more effectively, improving segmentation accuracy and adaptability.
2. Dynamic feature fusion: through the channel attention mechanism, the method can effectively assign weights to the features output by each branch network, realizing dynamic fusion of the features. This fusion strategy not only takes the characteristics of each image type into account but also adjusts dynamically according to the specific content, so the final fused image retains important information while reducing information loss and interference.
3. Efficient use of computing resources: the efficient network architecture and post-processing techniques, including global average pooling and feature fusion, optimize the use of computing resources.
4. The morphological transformation and Gaussian filtering in the post-processing step help smooth the segmentation boundary and fill small holes, which improves the visual quality of the segmentation as well as the consistency and reliability of the segmentation result.
The foregoing description is only an overview of the technical solution of the present application; in order that the technical means of the present application may be more clearly understood and implemented in accordance with the contents of the specification, and to make the above and other objects, features and advantages of the present application more readily apparent, the detailed description is given below.
Detailed Description
Further advantages and effects of the present invention will become readily apparent to those skilled in the art from the disclosure herein, by referring to the accompanying drawings and the preferred embodiments. The invention may also be practiced or carried out in other embodiments, and the details of the present description may be modified or varied in various respects without departing from the spirit and scope of the present invention. It should be understood that the preferred embodiments are presented by way of illustration only and not by way of limitation.
It should be noted that the illustrations provided in the following embodiments merely illustrate the basic concept of the present invention by way of illustration, and only the components related to the present invention are shown in the drawings and are not drawn according to the number, shape and size of the components in actual implementation, and the form, number and proportion of the components in actual implementation may be arbitrarily changed, and the layout of the components may be more complicated.
In the following description, numerous details are set forth in order to provide a more thorough explanation of embodiments of the present invention. It will be apparent, however, to one skilled in the art that embodiments of the present invention may be practiced without these specific details. In other embodiments, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the embodiments of the present invention.
The automatic medical image segmentation method based on cognitive deep learning, as shown in fig. 1, comprises the following steps:
acquiring different types of medical images, and preprocessing the different types of medical images;
Inputting the preprocessed medical images of different types into a trained multi-branch convolutional neural network to obtain output characteristics of medical images of corresponding types of each branch, wherein each branch processes one type of medical image;
Cognizing the output characteristics through channel attention to obtain output characteristic fusion weights;
carrying out output feature combination according to the output features and the output feature fusion weights to obtain fusion segmentation medical images;
and carrying out post-processing on the fusion segmentation medical image to complete automatic segmentation of the medical image.
In particular, acquiring and preprocessing different types of medical images involves collecting different types of medical images. These include: X-ray, one of the most common medical imaging modalities, used to examine bones and certain soft tissue structures such as the lungs; Computed Tomography (CT), which uses X-rays and computer technology to generate cross-sectional images that provide more detailed tissue structure information for the diagnosis of various diseases; Magnetic Resonance Imaging (MRI), which generates detailed images of internal body structures using magnetic fields and harmless radio waves, displays soft tissues and organs particularly well, and is often used to examine the brain, spine, and joints; Ultrasound, which generates images using high-frequency sound waves and is often used to examine the fetus of a pregnant woman, the liver, the heart, and blood vessels; Positron Emission Tomography (PET), which provides information about tissue and organ function by detecting the distribution of a radiolabeled drug in vivo and is often used for tumor diagnosis and the evaluation of brain function; and Single Photon Emission Computed Tomography (SPECT), which is similar to PET but uses a single-photon radioactive tracer and is used to detect cardiovascular and nervous system problems. The multi-branch convolutional neural network has multiple parallel convolutional branches for processing the different types of medical images. The channel attention mechanism learns the importance of different feature channels in the neural network; channel attention helps the network focus on the more important feature channels to improve performance. The output feature fusion weights obtained through the channel attention mechanism are used to fuse the output features of each branch. Output feature fusion weights and combines the features according to the output features and the fusion weights to obtain the final fused segmented medical image. Post-processing, such as removing noise, filling holes, and smoothing boundaries, is applied to the fused segmented medical image to generate the final medical image segmentation result.
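As a purely illustrative sketch of the multi-branch structure described above (not the claimed implementation), the parallel branches can be organized along the following lines in PyTorch; the branch depth, channel counts, and the names ModalityBranch and MultiBranchNet are assumptions introduced here for illustration.

import torch
import torch.nn as nn

class ModalityBranch(nn.Module):
    """One convolutional branch; each branch processes one type of medical image."""
    def __init__(self, in_channels: int, out_channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, out_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.body(x)

class MultiBranchNet(nn.Module):
    """Parallel convolutional branches, one per medical image type (e.g. CT, MRI, ultrasound)."""
    def __init__(self, num_branches: int = 3, in_channels: int = 1):
        super().__init__()
        self.branches = nn.ModuleList([ModalityBranch(in_channels) for _ in range(num_branches)])

    def forward(self, images):
        # images[i] is the preprocessed image tensor of type i; one output feature map per branch
        return [branch(x) for branch, x in zip(self.branches, images)]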
The invention is further arranged that the preprocessing of the different types of medical images comprises:
Normalization scales the pixel values of an image to a fixed range so as to eliminate brightness differences between different images, and contrast enhancement enhances the details and contrast in the image by adjusting the dynamic range of pixel values so as to improve image quality;
Elastic transformation is an image transformation technique that deforms an image locally in order to align it with others; in medical imaging it is generally used to align images of different patients so that key features, including organ boundaries, correspond spatially, which helps improve the generalization capability of the neural network;
the medical images of different types are subjected to elastic transformation and then resampled so that their spatial resolution and size are consistent. In particular, resampling adjusts the spatial resolution and size of the images so that different images share the same spatial scale, ensuring that the neural network can handle images of different original resolutions and sizes, because a neural network generally requires inputs of the same size.
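A minimal preprocessing sketch under the above description, assuming a grayscale image held as a NumPy array; only min-max normalization and resampling to a common size are shown, with contrast enhancement and elastic alignment left to standard image-processing routines, and the function names are introduced here for illustration.

import numpy as np
from scipy import ndimage

def normalize(image: np.ndarray) -> np.ndarray:
    # Scale pixel values to [0, 1] to eliminate brightness differences between images.
    lo, hi = float(image.min()), float(image.max())
    return (image - lo) / (hi - lo + 1e-8)

def resample(image: np.ndarray, target_shape) -> np.ndarray:
    # Resample to a common spatial size so all image types share the same resolution.
    factors = [t / s for t, s in zip(target_shape, image.shape)]
    return ndimage.zoom(image, factors, order=1)  # order-1 (linear) interpolation

# usage: bring an image of any size to a normalized 256x256 network input
# img = resample(normalize(raw_image), (256, 256))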
The invention further provides the training logic of the trained multi-branch convolutional neural network, comprising:
The method comprises the step of obtaining a historical data set, wherein the historical data set comprises a historical medical image data set and a historical segmentation medical image data set; the historical medical image data set contains the original medical image data, and the historical segmentation medical image data set contains the corresponding segmentation label data, that is, the accurate boundaries of the regions of interest in the medical images;
Branch training is performed according to the historical data set to obtain the output features of the medical images, loss is calculated according to the loss function, and training is repeated until the loss is smaller than a loss threshold, completing the training. Specifically, the historical data set is used for training the multi-branch convolutional neural network, where each branch is responsible for processing one type of medical image. A medical image is input into the corresponding branch, and the network learns to extract the features of that medical image and to generate output features matching the segmentation label. The loss function is used to evaluate the performance of the branch; it generally measures the difference between the network output and the segmentation label. In each training iteration, the network adjusts its parameters according to the feedback of the loss function so as to reduce the difference between the prediction and the label, until the loss falls below a predefined loss threshold.
The invention is further arranged such that the loss function is defined over all pixels, where SL is the loss function, n is the total number of pixels, q_j is the true value of pixel j, a corresponding predicted value is produced for pixel j, w1 is a weight coefficient, and q_t takes its value according to a per-pixel rule.
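The training loop can be sketched as follows; since the exact formula for SL is not reproduced in the text above, a per-pixel binary cross-entropy with foreground pixels weighted by w1 is used purely as an illustrative stand-in, and names such as branch_net, loader, and loss_threshold are assumptions introduced for illustration.

import torch
import torch.nn.functional as F

def weighted_pixel_loss(pred, target, w1: float = 2.0):
    # Illustrative stand-in for SL: per-pixel binary cross-entropy over the n pixels,
    # where foreground pixels (true value q_j = 1) are weighted by w1.
    weight = torch.where(target > 0.5, torch.full_like(target, w1), torch.ones_like(target))
    return F.binary_cross_entropy(pred, target, weight=weight, reduction="mean")

def train_branch(branch_net, loader, loss_threshold: float = 0.05, lr: float = 1e-3, max_epochs: int = 100):
    opt = torch.optim.Adam(branch_net.parameters(), lr=lr)
    for _ in range(max_epochs):
        epoch_loss = 0.0
        for image, label in loader:                  # historical image and its segmentation label
            pred = torch.sigmoid(branch_net(image))  # predicted value per pixel
            loss = weighted_pixel_loss(pred, label)
            opt.zero_grad()
            loss.backward()
            opt.step()
            epoch_loss += loss.item()
        if epoch_loss / len(loader) < loss_threshold:  # repeat training until loss < threshold
            break
    return branch_net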
The invention is further configured to obtain an output feature fusion weight by cognizing the output feature through channel attention, including:
The output features are recorded as an output feature set F, wherein F ∈ R^(C×H×W), R denotes the set of real numbers (the elements of the feature map are real-valued), C is the number of channels, H is the height of the output features, and W is the width of the output features;
For the output features of each channel, a global average pooling operation is performed; this is an average pooling operation carried out over the whole feature map, in which all elements of each channel are averaged to obtain one feature descriptor for that channel, so that C channels yield C channel descriptors;
And specifically, the channel descriptor is used as input, and the output feature fusion weight is obtained through a preset calculation logic.
The invention is further configured to perform global average pooling on the output features of each channel in the output feature set F to obtain a channel descriptor, with the calculation logic S_c = (1/(H×W)) × Σ_{l=1..H} Σ_{m=1..W} F(c, l, m), where S_c is the channel descriptor of the c-th channel and F(c, l, m) is the element value of the output feature at height l and width m on the c-th channel.
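This computation can be illustrated in a couple of lines of PyTorch on an output feature set of shape (C, H, W); the tensor names are assumptions introduced for illustration.

import torch

feat = torch.randn(64, 128, 128)   # example output feature set F with C = 64, H = W = 128
s = feat.mean(dim=(1, 2))          # channel descriptors: s[c] = (1/(H*W)) * sum over l, m of F(c, l, m)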
The invention further provides that the output feature fusion weights are obtained according to the channel descriptors, with the calculation logic Z_c = σ(w2·δ(w3·S_c + b1) + b2), wherein Z_c is the feature fusion weight, σ is a sigmoid activation function, δ is a ReLU activation function, w2 and w3 are weight parameters, and b1 and b2 are bias parameters. The values of the weight parameters w2 and w3 and the bias parameters b1 and b2 are learned through the back propagation algorithm during training, so that the network can better learn the relationships between features from the input data; these parameters are initialized randomly and then iteratively optimized during training by the back propagation algorithm. ReLU is the rectified linear unit activation function, which ensures that its output is non-negative, and the sigmoid activation function maps its input into the interval (0, 1). With this calculation logic, the output feature fusion weight of each channel can be computed from its channel descriptor, so that more important channels receive larger weights in the final fusion, improving the fusion performance of the output features and the overall performance of the network.
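A minimal sketch of this gating step, implemented with two learnable linear maps playing the roles of (w3, b1) and (w2, b2); the module name and reduction ratio are assumptions introduced for illustration.

import torch
import torch.nn as nn

class ChannelAttentionGate(nn.Module):
    # Maps channel descriptors S (shape (C,)) to fusion weights Z in (0, 1).
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)  # plays the role of w3 and b1
        self.fc2 = nn.Linear(channels // reduction, channels)  # plays the role of w2 and b2
        self.relu = nn.ReLU()        # delta: keeps the intermediate activation non-negative
        self.sigmoid = nn.Sigmoid()  # sigma: maps the output into the interval (0, 1)

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        return self.sigmoid(self.fc2(self.relu(self.fc1(s))))

# usage: z = ChannelAttentionGate(64)(s)   # s is the descriptor vector from global average pooling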
The invention is further configured to perform output feature merging according to the output feature and the output feature fusion weight to obtain a fusion segmentation medical image, including:
The output features and the corresponding feature fusion weight of each channel are acquired; specifically, for each channel k, the corresponding output feature F_k and feature fusion weight Z_k are acquired.
A weighted output feature is calculated according to the output feature and the feature fusion weight; specifically, for each channel k, the output feature F_k is multiplied by the feature fusion weight Z_k to obtain the weighted output feature of that channel. The weighted output features take the importance of each channel into account, weighting the features of the corresponding channel according to its weight;
the weighted output features are stitched to obtain a fused medical image, which is segmented to obtain the fused segmented medical image. Specifically, all the weighted output features are stitched along the channel dimension to obtain the fusion feature map F_s. Stitching here means connecting multiple tensors along a given dimension to form a larger tensor; in the embodiment of the present invention, stitching is performed along the channel dimension, that is, the feature maps of multiple channels are connected in the channel direction. In the medical image segmentation task, the multi-branch convolutional neural network serves as the model; during feature extraction it generates feature maps of multiple channels, each channel representing a different feature representation. Therefore, when merging different features, all of the learned information is retained in order to make full use of features of different types and scales, and the feature maps of each channel are stitched together along the channel dimension to form a more comprehensive feature representation that allows the model to understand the input image more completely.
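The weighting and stitching described above can be sketched as follows, assuming one feature map of shape (C, H, W) and one weight vector of shape (C,) per branch; torch.cat performs the concatenation along the channel dimension, and the function name is an assumption introduced for illustration.

import torch

def fuse_features(features, weights):
    # features[i]: (C, H, W) output features of branch i; weights[i]: (C,) fusion weights for that branch.
    weighted = [z.view(-1, 1, 1) * f for f, z in zip(features, weights)]  # weight each channel by Z_k
    return torch.cat(weighted, dim=0)  # stitch along the channel dimension -> fusion feature map F_s

# usage: fs = fuse_features([f_ct, f_mri], [z_ct, z_mri]); fs then feeds the segmentation step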
The invention further provides that the post-processing of the fused segmented medical image comprises:
A morphological transformation is carried out on the fused segmented medical image to smooth the segmentation boundary and fill small holes; the morphological transformation is an operation based on the shape of the image and is commonly used for smoothing boundaries and filling holes;
A Gaussian filter is a common image blurring method that can effectively reduce noise in an image. In the embodiment of the invention, a Gaussian filter is applied to blur the segmentation result after the morphological transformation, further reducing image noise and making the segmentation boundary smoother and more natural.
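A minimal post-processing sketch with SciPy, assuming a binary segmentation mask; morphological closing and hole filling correspond to the morphological transformation, and Gaussian filtering to the blurring step, while the structuring element size and sigma are assumptions introduced for illustration.

import numpy as np
from scipy import ndimage

def postprocess(mask: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    # mask: binary segmentation result (array of 0s and 1s).
    closed = ndimage.binary_closing(mask, structure=np.ones((3, 3)))      # smooth the segmentation boundary
    filled = ndimage.binary_fill_holes(closed)                            # fill small holes
    blurred = ndimage.gaussian_filter(filled.astype(float), sigma=sigma)  # Gaussian blur to reduce noise
    return (blurred > 0.5).astype(np.uint8)                               # binary mask with smoother edges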
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer program are loaded or executed on a computer, the processes or functions described in accordance with embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired or wireless (e.g., infrared, microwave, etc.) means. The computer readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that contains one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that the term "and/or" merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean that A exists alone, that A and B both exist, or that B exists alone, where A and B may be singular or plural. In addition, the character "/" herein generally indicates that the associated objects are in an "or" relationship, but may also indicate an "and/or" relationship, as may be understood from the context.
In the present application, "at least one" means one or more, and "a plurality" means two or more. "At least one of" the following items or the like means any combination of these items, including any combination of a single item or plural items. For example, "at least one of a, b, and c" may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be singular or plural.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed system may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The storage medium includes a U disk, a removable hard disk, a read-only memory (ROM), a random access memory (random access memory, RAM), a magnetic disk, an optical disk, or other various media capable of storing program codes.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.