CN113870138A

CN113870138A - Low-dose CT image denoising method and system based on three-dimensional U-net

Info

Publication number: CN113870138A
Application number: CN202111163825.1A
Authority: CN
Inventors: 韩玉; 宋晓芙; 李磊; 闫镔; 杨双站; 谭思宇; 刘梦楠; 孙钊颖
Original assignee: PLA Information Engineering University
Current assignee: PLA Information Engineering University
Priority date: 2021-09-30
Filing date: 2021-09-30
Publication date: 2021-12-31

Abstract

The invention belongs to the technical field of CT scanning, and in particular relates to a three-dimensional U-net-based low-dose CT image denoising method and system. The low-dose CT projection data to be processed is regarded as the superposition of a normal-dose CT projection and a noise projection; the low-dose CT projection data to be processed is then denoised by a trained and optimized 3D U-net network. In the denoising process, the 3D U-net encoder extracts volumetric spatial features from the input low-dose CT projection data while performing multiple downsampling operations on the input data to obtain volumetric feature maps that capture underlying features of multi-scale 3D context information; the decoder then reconstructs and progressively upsamples the underlying features to predict a full-resolution denoised image. The invention makes full use of the information in the depth direction of the CT image, can reduce the number of network parameters, avoids bottlenecks in network training, can improve the quality of low-dose CT imaging, and has advantages in preserving important structural information and preventing edge blurring.

Description

Low-dose CT image denoising method and system based on three-dimensional U-net

Technical Field

The invention belongs to the technical field of CT scanning, and particularly relates to a low-dose CT image denoising method and system based on three-dimensional U-net.

Background

X-ray computed tomography (CT) is widely used in clinical, industrial, and public safety inspection as a versatile, high-resolution imaging modality. However, the radiation exposure that accumulates while a patient undergoes a CT scan carries a risk of inducing cancer and causing genetic damage. In view of these risks, the radiation dose is typically lowered during scanning by reducing the tube current or shortening the exposure time of the X-ray tube. The main disadvantage of reducing the radiation dose is the increase in image background noise, which severely compromises the diagnostic information. Generating high-quality images that meet diagnostic requirements while ensuring patient safety has therefore been a challenge for researchers. Many methods have been proposed for noise reduction in low-dose CT. There are generally three strategies for improving low-dose CT image quality: sinogram restoration, iterative reconstruction (IR), and CT image post-processing. However, owing to the complexity of the scanning process, these conventional methods have difficulty modeling the noise accurately and incur a high computational cost. In recent years, with the rapid development of deep learning, new low-dose CT denoising methods have emerged, but these network-based methods often process two-dimensional slices and may ignore spatial information, resulting in an unsatisfactory denoising effect.

Disclosure of Invention

Therefore, the invention provides a three-dimensional U-net-based low-dose CT image denoising method and system, which can rapidly and accurately complete low-dose CT image noise removal by using a three-dimensional deep neural network, avoid loss of detail information and improve CT image reconstruction effect and quality.

According to the design scheme provided by the invention, the low-dose CT image denoising method based on the three-dimensional U-net comprises the following steps:

regarding the low-dose CT projection data to be processed as the superposition of a normal-dose CT projection and a noise projection;

denoising the low-dose CT projection data to be processed by using a trained and optimized three-dimensional U-net network, wherein, in the denoising process, the encoder of the three-dimensional U-net network first extracts volumetric spatial features from the input low-dose CT projection data while performing a plurality of downsampling operations on the input data to obtain volumetric feature maps, so as to capture underlying features of multi-scale three-dimensional context information; the decoder then reconstructs and progressively upsamples the underlying features to predict a full-resolution denoised image.

As the three-dimensional U-net-based low-dose CT image denoising method of the invention, further, in the three-dimensional U-net network, each layer of the encoder's encoding path consists of two consecutive convolution blocks with ReLU activation functions followed by a max pooling layer that performs the downsampling operation; by stacking convolution blocks with downsampling through the max pooling layers, noise is suppressed layer by layer and the local three-dimensional context information contained in the low-dose image projection is transformed into the feature space. Each layer of the decoder's decoding path comprises a deconvolution layer, two convolution layers connected to the deconvolution layer, and a ReLU layer connected to each of the two convolution layers; the deconvolution, convolution and ReLU layers perform cascaded upsampling and pixel-level denoising on the output of the encoder to restore a full-resolution denoised image.

As the three-dimensional U-net-based low-dose CT image denoising method, further, in the training and optimization of the three-dimensional U-net network, Poisson noise is added to high-dose CT projection data, according to the noise characteristics of CT images, to obtain the corresponding low-dose projection data, so as to generate matched high-dose and low-dose three-dimensional volume data samples for network training and optimization.

As the three-dimensional U-net-based low-dose CT image denoising method, further, in generating the three-dimensional volume data samples, fan-beam projection data are first generated from the high-dose CT image using a ray-driven method; simulated Poisson noise is then added to the fan-beam projection data to simulate the corresponding low-dose projection data, and reconstruction software is used to reconstruct the matched high-dose and low-dose three-dimensional volume data samples.

As the low-dose CT image denoising method based on the three-dimensional U-net, the invention further uses the Poisson formula $Z_i \sim \mathrm{Poisson}\{Z_{0i}\exp(-s_i) + r_i\},\ i = 1, 2, \ldots, I$ to generate the corresponding low-dose projection data, wherein $Z_i$ is the number of incident photons along the $i$-th ray path, $Z_{0i}$ is the initial incident intensity of the X-rays, $r_i$ is the background electronic noise along the $i$-th X-ray path, $s_i$ is the line integral of the attenuation coefficient, and $I$ is the number of X-ray paths.

As the low-dose CT image denoising method based on the three-dimensional U-net, further, in the training and optimization of the three-dimensional U-net network, the three-dimensional sample data are preprocessed, and the preprocessed three-dimensional sample data are randomly divided into a training set for network training and a test set for network testing.

As the low-dose CT image denoising method based on the three-dimensional U-net, further, the preprocessing of the three-dimensional volume data samples comprises: first, acquiring three-dimensional image blocks of the same size using a sliding window; then, normalizing the three-dimensional image blocks.

Further, the present invention provides a low dose CT image denoising system based on three-dimensional U-net, comprising: a noise analysis module and a noise processing module, wherein,

the noise analysis module is used for regarding the low-dose CT projection data to be processed as the superposition of the normal-dose CT projection and the noise projection;

the noise processing module is used for denoising the low-dose CT projection data to be processed by using a trained and optimized three-dimensional U-net network, wherein, in the denoising process, the encoder of the three-dimensional U-net network first extracts volumetric spatial features from the input low-dose CT projection data while performing a plurality of downsampling operations on the input data to obtain volumetric feature maps, so as to capture underlying features of multi-scale three-dimensional context information; the decoder then reconstructs and progressively upsamples the underlying features to predict a full-resolution denoised image.

The invention has the beneficial effects that:

The invention makes full use of the information in the depth direction of the CT image, cuts the input three-dimensional data into cubes for processing, and performs a plurality of downsampling operations with the encoder and decoder of the three-dimensional U-net network, thereby reducing the number of network parameters and avoiding bottlenecks in network training. Furthermore, simulation experiment results show that the scheme can significantly improve the imaging quality of low-dose CT, has clear advantages in preserving important structural information and preventing edge blurring, and has good application prospects.

Description of the drawings:

FIG. 1 is a flow chart of a low-dose CT image denoising method in an embodiment;

FIG. 2 is a schematic diagram of a three-dimensional U-net network structure in the embodiment.

Detailed description of the embodiments:

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and technical solutions.

Medical CT images are inherently three-dimensional, and processing only two-dimensional slices often loses detail information and gives a less than ideal result. As shown in FIG. 1, an embodiment of the invention provides a low-dose CT image denoising method based on three-dimensional U-net, comprising the following steps:

S101, regarding the low-dose CT projection data to be processed as the superposition of a normal-dose CT projection and a noise projection;

S102, denoising the low-dose CT projection data to be processed by using a trained and optimized three-dimensional U-net network, wherein, in the denoising process, the encoder of the three-dimensional U-net network first extracts volumetric spatial features from the input low-dose CT projection data while performing a plurality of downsampling operations on the input data to obtain volumetric feature maps, so as to capture underlying features of multi-scale three-dimensional context information; the decoder then reconstructs and progressively upsamples the underlying features to predict a full-resolution denoised image.

This embodiment makes full use of the information in the depth direction of the CT image and uses a three-dimensional U-net network architecture to improve the imaging quality of low-dose CT. The input three-dimensional data is cut into cubes for processing, and the plurality of downsampling operations adopted in the network reduces the number of network parameters and avoids bottlenecks in network training. This addresses the problem that existing network-based methods operating on two-dimensional slices ignore spatial information, which degrades the denoising and image reconstruction results.

As the three-dimensional U-net-based low-dose CT image denoising method in the embodiment of the invention, further, in the three-dimensional U-net network, each layer of the encoder's encoding path consists of two consecutive convolution blocks with ReLU activation functions followed by a max pooling layer that performs the downsampling operation; by stacking convolution blocks with downsampling through the max pooling layers, noise is suppressed layer by layer and the local three-dimensional context information contained in the low-dose image projection is transformed into the feature space. Each layer of the decoder's decoding path comprises a deconvolution layer, two convolution layers connected to the deconvolution layer, and a ReLU layer connected to each of the two convolution layers; the deconvolution, convolution and ReLU layers perform cascaded upsampling and pixel-level denoising on the output of the encoder to restore a full-resolution denoised image. Further, in the training and optimization of the three-dimensional U-net network, the three-dimensional sample data are preprocessed, and the preprocessed three-dimensional sample data are randomly divided into a training set for network training and a test set for network testing. Further, the preprocessing of the three-dimensional volume data samples comprises: first, acquiring three-dimensional image blocks of the same size using a sliding window; then, normalizing the three-dimensional image blocks.

The low-dose CT projection can be regarded as an approximate superposition of the normal-dose projection and a noise projection. The noise in low-dose CT comes mainly from quantum noise; assuming the source emits monochromatic X-rays, the corresponding low-dose projection can be simulated by adding simulated Poisson noise to the normal-dose projection data. The low-dose projections are then reconstructed into a three-dimensional image using filtered back-projection. Solving the three-dimensional low-dose CT denoising problem with a network amounts to finding a function f:

$f_{\text{3D U-Net}}: x \rightarrow y,$

where $x$ denotes the three-dimensional low-dose CT (LDCT) volume data distribution, $y$ denotes the three-dimensional normal-dose CT (NDCT) volume data distribution, the transformation $f$ is optimized so that $f(x)$ approaches $y$ as closely as possible, and the function $f$ is estimated by learning.

Referring to FIG. 2, given an input CT scan $X \in \mathbb{R}^{C \times H \times W \times D}$ with spatial resolution $H \times W$, depth dimension $D$ and channel number $C$, compact underlying features are first generated using a 3D encoder. Each layer of the encoding path consists of two successive 3 × 3 × 3 convolutions, each with a ReLU activation function, followed by a 2 × 2 × 2 max pooling layer with a stride of 2 in each direction to perform the downsampling operation. In the first 4 layers, the number of feature channels is doubled at each downsampling step to avoid a training bottleneck, the convolutional layers having 32, 64, 128 and 256 channels respectively. By stacking 3 × 3 × 3 convolution blocks with downsampling (stride 2), the input image is progressively encoded into a low-resolution, high-level feature representation $F \in \mathbb{R}^{K \times \frac{H}{8} \times \frac{W}{8} \times \frac{D}{8}}$ ($K = 256$), i.e., 1/8 of the input dimensions $H$, $W$ and $D$. Through the encoder, noise is suppressed from lower layers to higher layers, so that the rich local 3D context information contained in the low-dose image is effectively transformed into the feature space. Then, in the decoder, in order to generate a denoising result in the original 3D image space ($H \times W \times D$), a 3D decoder is introduced to perform feature upsampling and pixel-level denoising. In the decoding path, each layer contains a 2 × 2 × 2 deconvolution layer with a stride of 2 in each direction, followed by two 3 × 3 × 3 convolution layers, each followed by a ReLU layer; the last layer has 1 feature channel and a convolution kernel size of 1 × 1 × 1, and outputs the processed three-dimensional volume data. The number of feature channels of each decoder layer is kept consistent with the encoder: the first layer keeps the number of feature channels unchanged and each subsequent layer halves it, giving 256, 128, 64 and 32 feature channels respectively. So that the input and output of the network match exactly, the numbers of feature channels of corresponding encoder and decoder layers are kept consistent. The output of the encoder passes through cascaded upsampling operations and convolution blocks, and the full-resolution denoising result $y \in \mathbb{R}^{H \times W \times D}$ is gradually recovered. In addition, the features of the encoder and decoder are fused through skip connections, which pass layers of the same resolution in the encoding path to the decoding path and provide it with the original high-resolution features. Through the decoder, the image is reconstructed from the features extracted from the input by the convolutional layers, and the denoised three-dimensional volume data is finally output.
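The channel and resolution arithmetic of this encoder-decoder can be checked with a small shape trace. The following is only a minimal sketch in plain Python, not the patented implementation. It assumes three 2 × 2 × 2 pooling steps between the four encoder levels (which is what the stated 1/8 reduction implies), size-preserving padding for the 3 × 3 × 3 convolutions, a decoder that starts from the 256-channel bottleneck, and a hypothetical single-channel 64 × 64 × 64 input block. The function name `trace_unet3d` is illustrative.

```python
def trace_unet3d(in_shape=(1, 64, 64, 64), channels=(32, 64, 128, 256)):
    """Return (stage, (C, H, W, D)) pairs for a 3D U-net style encoder-decoder.

    Assumptions (not stated in the source): convolutions preserve spatial
    size, and pooling is applied between encoder levels only.
    """
    _, h, w, d = in_shape
    stages = []
    # Encoder: two 3x3x3 convolutions per level; a 2x2x2 max pooling with
    # stride 2 halves H, W and D before every level except the first.
    for level, ch in enumerate(channels):
        if level > 0:
            h, w, d = h // 2, w // 2, d // 2
        stages.append((f"enc{level + 1}", (ch, h, w, d)))
    # Decoder: each 2x2x2 deconvolution with stride 2 doubles H, W and D,
    # and the channel count is halved from level to level.
    for level, ch in enumerate(reversed(channels[:-1])):
        h, w, d = h * 2, w * 2, d * 2
        stages.append((f"dec{level + 1}", (ch, h, w, d)))
    # Final 1x1x1 convolution maps to a single output channel.
    stages.append(("out", (1, h, w, d)))
    return stages


for name, shape in trace_unet3d():
    print(name, shape)
```

Under these assumptions the deepest encoder level has 256 channels at 1/8 of the input size in each direction, and the output block has the same spatial size as the input.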

In the training and testing of the three-dimensional U-net, Poisson noise is first added to the high-dose projection data, according to the noise characteristics of CT images, to generate the corresponding low-dose projection data; matched high-dose and low-dose three-dimensional volume data are then generated with reconstruction software, and three-dimensional CT image blocks of the same size are obtained by block extraction. The CT Hounsfield unit (HU) scale is normalized to [0, 1] before the three-dimensional image blocks are input into the network. Data are randomly selected as the training set and the test set. After network training is completed, network performance is tested by taking low-dose three-dimensional CT image blocks with obvious noise as input and denoising them with the 3D U-net network model.
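The normalization and random split described here might look as follows. This is a minimal sketch using NumPy. The HU window (`hu_min`, `hu_max`), the test fraction and both function names are assumptions chosen for illustration, since the source states only that HU values are normalized to [0, 1] and that data are randomly divided into a training set and a test set.

```python
import numpy as np


def normalize_hu(block, hu_min=-1000.0, hu_max=1000.0):
    """Clip HU values to an assumed window and rescale linearly to [0, 1]."""
    clipped = np.clip(np.asarray(block, dtype=np.float32), hu_min, hu_max)
    return (clipped - hu_min) / (hu_max - hu_min)


def split_pairs(n_pairs, test_fraction=0.2, seed=0):
    """Randomly split indices of matched low/normal-dose block pairs.

    Returns (train_indices, test_indices); the same index selects the
    low-dose block and its normal-dose counterpart, so pairs stay matched.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_pairs)
    n_test = int(round(n_pairs * test_fraction))
    return order[n_test:], order[:n_test]
```

Splitting by index rather than shuffling the two block lists separately keeps each low-dose block aligned with its normal-dose target.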

As the three-dimensional U-net-based low-dose CT image denoising method in the embodiment of the invention, further, in the training and optimization of the three-dimensional U-net network, Poisson noise is added to high-dose CT projection data, according to the noise characteristics of CT images, to obtain the corresponding low-dose projection data, so as to generate matched high-dose and low-dose three-dimensional volume data samples for network training and optimization. Furthermore, in generating the three-dimensional volume data samples, fan-beam projection data are first generated from the high-dose CT image using a ray-driven method; simulated Poisson noise is then added to the fan-beam projection data to simulate the corresponding low-dose projection data, and reconstruction software is used to reconstruct the matched high-dose and low-dose three-dimensional volume data samples. Further, the Poisson formula $Z_i \sim \mathrm{Poisson}\{Z_{0i}\exp(-s_i) + r_i\},\ i = 1, 2, \ldots, I$ is used to generate the corresponding low-dose projection data, wherein $Z_i$ is the number of incident photons along the $i$-th ray path, $Z_{0i}$ is the initial incident intensity of the X-rays, $r_i$ is the background electronic noise along the $i$-th X-ray path, $s_i$ is the line integral of the attenuation coefficient, and $I$ is the number of X-ray paths.

First, fan-beam projection data are generated from the normal-dose CT image using the Siddon ray-driven method. Then, according to the following formula:

$$Z_i \sim \mathrm{Poisson}\{Z_{0i}\exp(-s_i) + r_i\}, \quad i = 1, 2, \ldots, I$$

simulated Poisson noise is added to the normal-dose projection data to simulate the corresponding low-dose projection, where $Z_i$ is the number of incident photons along the $i$-th ray path, $Z_{0i}$ is the initial incident intensity of the X-rays, $r_i$ is the background electronic noise along the $i$-th X-ray path, and $s_i$ is the line integral of the attenuation coefficient. In the present method, $Z_{0i}$ is uniformly set to $10^5$. In this way, matched high-dose and low-dose projection data can be generated. Finally, reconstruction software is used to reconstruct matched normal-dose and low-dose three-dimensional CT volume data.
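The noise model above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' simulation code. The function and parameter names are hypothetical, the background term $r_i$ is taken as a constant that defaults to zero, and the final log transform back to line integrals is the usual post-processing step rather than something the source specifies.

```python
import numpy as np


def simulate_low_dose(line_integrals, z0=1e5, background=0.0, seed=0):
    """Sample Z_i ~ Poisson(Z_0i * exp(-s_i) + r_i) for every ray.

    line_integrals: array of normal-dose line integrals s_i (any shape).
    z0:             incident intensity Z_0i, 1e5 as in the text.
    background:     r_i, assumed constant over rays here.
    Returns (photon counts, noisy line integrals).
    """
    rng = np.random.default_rng(seed)
    s = np.asarray(line_integrals, dtype=np.float64)
    expected = z0 * np.exp(-s) + background       # mean photon count per ray
    counts = rng.poisson(expected).astype(np.float64)
    counts = np.maximum(counts, 1.0)              # guard against log(0)
    # Assumed log transform; it ignores r_i, which is exact only when r_i = 0.
    noisy = np.log(z0 / counts)
    return counts, noisy
```

A smaller `z0` gives fewer photons per ray and therefore a noisier simulated projection, which is how different dose levels could be emulated with the same routine.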

In terms of data preprocessing, the denoising performance of deep-learning-based methods depends on the size of the training data set, so a large and effective training data set can improve denoising performance. This requirement is difficult to meet in practice, especially in clinical imaging. In this embodiment, overlapping small blocks are extracted from the CT images, which not only takes the spatial interconnection among blocks into account but also significantly increases the size of the training block data set. The original three-dimensional low-dose and normal-dose CT images are 512 × 512 pixels. Because directly processing the entire patient image is computationally inefficient and infeasible, the denoising model is applied to image blocks. First, an overlapping sliding window with a stride of 1 × 1 × 1 is applied to obtain three-dimensional image blocks of the same size. The CT Hounsfield unit (HU) scale is then normalized to [0, 1] before the three-dimensional image blocks are input into the network. The method makes full use of information in the depth direction, takes three-dimensional cube data as input, obtains multi-scale underlying features while reducing the number of parameters through a plurality of three-dimensional downsampling operations, improves image quality by incorporating three-dimensional volume information, effectively retains the structure and texture information of the CT image, and significantly suppresses noise and artifacts.
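The overlapping sliding-window extraction can be sketched as below. This is a minimal illustration with NumPy. The block size is not given in the source, so `patch=(32, 32, 32)` is a hypothetical default, and the function names are illustrative.

```python
import numpy as np


def extract_patches(volume, patch=(32, 32, 32), stride=(1, 1, 1)):
    """Yield overlapping 3D blocks of identical size from a volume."""
    ph, pw, pd = patch
    sh, sw, sd = stride
    h, w, d = volume.shape
    for i in range(0, h - ph + 1, sh):
        for j in range(0, w - pw + 1, sw):
            for k in range(0, d - pd + 1, sd):
                # Slicing returns a view, so no block data is copied here.
                yield volume[i:i + ph, j:j + pw, k:k + pd]


def count_patches(shape, patch, stride):
    """Number of blocks the sliding window produces for a given volume shape."""
    total = 1
    for n, p, s in zip(shape, patch, stride):
        total *= max(0, (n - p) // s + 1)
    return total
```

With a stride of 1 in every direction the number of blocks grows with the product of `(n - p + 1)` over the three axes, which is why overlapping extraction enlarges the training set so much; a larger stride trades that increase for lower memory and compute.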

Further, based on the above method, an embodiment of the present invention further provides a three-dimensional U-net based low-dose CT image denoising system, including: a noise analysis module and a noise processing module, wherein,

Because medical CT images are three-dimensional and the information in adjacent slices is continuous, processing two-dimensional slices ignores spatial information and the denoising effect is not ideal. The three-dimensional U-net network is therefore used to process the three-dimensional volume data directly to solve the low-dose CT denoising problem. The network is built on an encoder-decoder structure. The network first uses a three-dimensional CNN encoder to extract volumetric spatial features while performing a plurality of downsampling operations on the input three-dimensional image to obtain compact volumetric feature maps, thereby effectively capturing underlying three-dimensional context information at multiple scales. The network decoder then reconstructs the underlying features output by the encoder and performs progressive upsampling to predict a full-resolution denoised image. By generating a large amount of training data, network training can be completed well, and the low-dose CT denoising problem can be solved quickly and simply with the trained parameters.

To verify the validity of the scheme, the following further explanation is made by combining specific experiments:

The experimental environment is as follows: training and testing of the network are done in a PyTorch (version 1.1.0) environment on an AMAX workstation. Both CPUs of the AMAX workstation are Intel Xeon Gold 5118, and the available memory is 128 GB. Network training and testing used a GeForce RTX 2080 Ti graphics card.

The effect of the three-dimensional network is verified through simulation data experiments: the network not only improves image quality but also has clear advantages in preserving important structural information and preventing edge blurring.

Based on the foregoing system, an embodiment of the present invention further provides a server, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described above.

Based on the system, the embodiment of the invention further provides a computer readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method.

The device provided by the embodiment of the present invention has the same implementation principle and technical effect as the system embodiment, and for the sake of brief description, reference may be made to the corresponding content in the system embodiment for the part where the device embodiment is not mentioned.

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing system embodiments, and are not described herein again.

In all examples shown and described herein, any particular value should be construed as merely exemplary, and not as a limitation, and thus other examples of example embodiments may have different values.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and system may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the system according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present invention, which are used for illustrating the technical solutions of the present invention and not for limiting the same, and the protection scope of the present invention is not limited thereto, although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being included therein. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A low-dose CT image denoising method based on three-dimensional U-net is characterized by comprising the following contents:

2. The three-dimensional U-net based low-dose CT image denoising method according to claim 1, wherein, in the three-dimensional U-net network, each layer of the encoder coding path is composed of two consecutive convolution blocks with ReLU activation functions and contains a maximum pooling layer for performing the downsampling operation, and local three-dimensional context information contained in the low-dose image projection is transformed into the feature space by stacking the convolution blocks with downsampling through the maximum pooling layer to suppress noise layer by layer; each layer of the decoding path of the decoder comprises a deconvolution layer, two convolution layers connected with the deconvolution layer and a ReLU layer respectively connected with the two convolution layers, and the deconvolution layer, the convolution layers and the ReLU layer are used for carrying out cascaded upsampling operations and pixel-level denoising on the output of the encoder so as to restore a full-resolution denoised image.

3. The three-dimensional U-net-based low-dose CT image denoising method as claimed in claim 1 or 2, wherein in the three-dimensional U-net network training optimization, according to CT image noise characteristics, Poisson noise is added to high-dose CT projection data to obtain corresponding low-dose projection data so as to generate matched high-dose and low-dose three-dimensional volume data samples for network training optimization.

4. The three-dimensional U-net based low-dose CT image denoising method as claimed in claim 3, wherein in the three-dimensional volume data sample generation, fan-beam projection data are first generated from the high-dose CT image by using a ray-driven method, then simulated Poisson noise is added to the fan-beam projection data to simulate and generate corresponding low-dose projection data, and reconstruction software is used to reconstruct the matched high-dose and low-dose three-dimensional volume data samples.

5. The three-dimensional U-net based low-dose CT image denoising method of claim 4, wherein the Poisson formula $Z_i \sim \mathrm{Poisson}\{Z_{0i}\exp(-s_i) + r_i\},\ i = 1, 2, \ldots, I$ is used to generate the corresponding low-dose projection data, wherein $Z_i$ is the number of incident photons along the $i$-th ray path, $Z_{0i}$ is the initial incident intensity of the X-rays, $r_i$ is the background electronic noise along the $i$-th X-ray path, $s_i$ is the line integral of the attenuation coefficient, and $I$ is the number of X-ray paths.

6. The three-dimensional U-net-based low-dose CT image denoising method of claim 3, wherein in the three-dimensional U-net network training optimization, the three-dimensional sample data is preprocessed, and the preprocessed three-dimensional sample data is randomly divided into a training set for network training and a testing set for network testing.

7. The three-dimensional U-net based low-dose CT image denoising method of claim 4, wherein the three-dimensional volume data sample preprocessing comprises the following steps: firstly, acquiring three-dimensional image blocks with the same size by using a sliding window; and then, carrying out normalization processing on the three-dimensional image block.

8. A low-dose CT image denoising system based on three-dimensional U-net is characterized by comprising: a noise analysis module and a noise processing module, wherein,

9. A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 7.

10. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 7.