
CN112801964B - Multi-label intelligent detection method, device, equipment and medium for lung CT image - Google Patents


Info

Publication number
CN112801964B
CN112801964B (application CN202110076124.8A)
Authority
CN
China
Prior art keywords
image
layer
segmentation
module
category
Prior art date
Legal status
Active
Application number
CN202110076124.8A
Other languages
Chinese (zh)
Other versions
CN112801964A (en)
Inventor
何昆仑
邢宁
聂永康
管希周
马林
刘盼
郭华源
王文君
钟琴
郭桦
丁俊谕
李宗任
刘博罕
赵诚辉
张培芳
Current Assignee
Beijing Ande Yizhi Technology Co ltd
Chinese PLA General Hospital
Original Assignee
Beijing Ande Yizhi Technology Co ltd
Chinese PLA General Hospital
Priority date
Filing date
Publication date
Application filed by Beijing Ande Yizhi Technology Co ltd and Chinese PLA General Hospital
Priority to CN202110076124.8A
Publication of CN112801964A
Application granted
Publication of CN112801964B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10072 Tomographic images
    • G06T2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30061 Lung

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The present application discloses a multi-label intelligent detection method, apparatus, device, and medium for lung CT images: acquire target images input layer by layer, the target images being lung CT images; input the target image into a trained multi-class segmentation model to obtain a multi-class segmentation image and a single-class segmentation image, where the multi-class segmentation image indicates the type label to which at least a part of the region in the target image belongs and the single-class segmentation image indicates the region of interest contained in the target image; and screen the multi-class segmentation image with the single-class segmentation image to obtain the type label of the region of interest in the target image, effectively increasing the image-processing speed without losing features.

Description

Multi-label intelligent detection method, device, equipment and medium for lung CT image
Technical Field
The present disclosure relates generally to the field of image processing, more particularly to medical image processing, and most particularly to a method, an apparatus, an electronic device, and a storage medium for multi-label intelligent detection of lung CT images.
Background
In the medical field, the traditional approach of having medical professionals manually read lung CT images is time-consuming and labor-intensive; machine-vision algorithms based on deep neural networks, developed in recent years, can effectively address this problem.
Disclosure of Invention
In view of the above-mentioned drawbacks and deficiencies in the prior art, it is desirable to provide a method, an apparatus, an electronic device, and a storage medium for multi-label intelligent detection of lung CT images, which effectively increase the image processing speed without losing features.
In a first aspect, the present application provides a multi-label intelligent detection method for a lung CT image, including:
acquiring target images input layer by layer;
inputting the target image into a trained multi-class segmentation model to obtain a multi-class segmentation image and a single-class segmentation image, wherein the multi-class segmentation image is used to indicate the type label to which at least a part of the region in the target image belongs, and the single-class segmentation image is used to indicate the region of interest contained in the target image;
and screening the multi-class segmentation image with the single-class segmentation image to obtain the type label of the region of interest in the target image.
Optionally, the multi-class segmentation model includes a contraction module, an expansion module, and a segmentation module, and the method includes:
down-sampling the target image three times with the contraction module to obtain a down-sampled image;
up-sampling the down-sampled image three times with the expansion module to obtain an up-sampled image;
and performing multi-class segmentation and single-class segmentation on the up-sampled image with the segmentation module to obtain the multi-class segmented image and the single-class segmented image.
Optionally, the contraction module includes two first contraction units connected in sequence and a first normalization layer; each first contraction unit includes a first convolution layer and a first nonlinear layer, and the first normalization layer is disposed between the first convolution layer and the first nonlinear layer of the second of the two first contraction units;
the expansion module includes two first expansion units connected in sequence and a second normalization layer; each first expansion unit includes a second convolution layer and a second nonlinear layer, and the second normalization layer is disposed between the second convolution layer and the second nonlinear layer of the second of the two first expansion units.
Optionally, when the type label meets a preset condition, a contraction module with residual connection is used to down-sample the target image three times to obtain the down-sampled image, and an expansion module with residual connection is used to up-sample the down-sampled image three times to obtain the up-sampled image.
Optionally, the contraction module with residual connection includes two second contraction units connected in sequence, a third normalization layer, and a first residual connection layer; each second contraction unit includes a third convolution layer and a third nonlinear layer, the third normalization layer and the first residual connection layer are disposed between the third convolution layer and the third nonlinear layer of the second of the two second contraction units, and the first residual connection layer is configured to perform a residual-connection calculation on the feature data input to the first second contraction unit and the output data of the third normalization layer;
the expansion module with residual connection includes two second expansion units connected in sequence, a fourth normalization layer, and a second residual connection layer; each second expansion unit includes a fourth convolution layer and a fourth nonlinear layer, the fourth normalization layer and the second residual connection layer are disposed between the fourth convolution layer and the fourth nonlinear layer of the second of the two second expansion units, and the second residual connection layer is configured to perform a residual-connection calculation on the feature data input to the first second expansion unit and the output data of the fourth normalization layer.
Optionally, the acquiring a target image input layer by layer includes:
acquiring an input original image;
normalizing each pixel position in the original image to obtain a normalized value of each pixel position;
determining a clipping center according to the normalized value of each pixel position in the original image;
and cutting the original image into the target image with a preset size according to the cutting center.
In a second aspect, the present application provides a multi-label intelligent detection apparatus for lung CT images, including:
the acquisition module is used for acquiring target images input layer by layer;
the multi-class segmentation model is used for obtaining a multi-class segmentation image and a single-class segmentation image according to the target image, the multi-class segmentation image is used for indicating the type label of at least one part of the region in the target image, and the single-class segmentation image is used for indicating the region of interest contained in the target image;
and the screening module is used to screen the multi-class segmentation image with the single-class segmentation image to obtain the type labels of the regions of interest in the target image.
Optionally, the multi-class segmentation model includes a contraction module, an expansion module, and a segmentation module,
the contraction module is used to down-sample the target image three times to obtain a down-sampled image;
the expansion module is used to up-sample the down-sampled image three times to obtain an up-sampled image;
and the segmentation module is used to perform multi-class segmentation and single-class segmentation on the up-sampled image to obtain the multi-class segmented image and the single-class segmented image.
In a third aspect, embodiments of the present application provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the method described in the embodiments of the present application.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method as described in the embodiments of the present application.
According to the method and apparatus, the multi-class segmentation image and the single-class segmentation image are obtained simultaneously by the multi-class segmentation model, and the single-class segmentation image is used to supervise the multi-class segmentation image, which effectively improves the ability to identify lesion types and reduces the misdiagnosis rate. Meanwhile, using the multi-class segmentation model effectively increases the image-processing speed without losing features.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
fig. 1 is a flowchart of a multi-label intelligent detection method for lung CT images according to the present application;
FIG. 2 is a flowchart of another method for multi-label intelligent detection of CT images of lungs according to the present application;
FIG. 3 is a flowchart of another method for multi-label intelligent detection of lung CT images according to the present application;
FIG. 4 is a schematic structural diagram of a multi-class segmentation model according to the present application;
FIG. 5 is a schematic structural diagram of a shrink module according to the present application;
FIG. 6 is a schematic diagram of an expansion module of the present application;
FIG. 7 is a schematic structural diagram of another shrink module of the present application;
FIG. 8 is a schematic view of another expansion module of the present application;
FIG. 9 is a schematic structural diagram of a multi-class segmentation model according to the present application;
FIG. 10 is a schematic structural diagram of a multi-label intelligent detection apparatus for lung CT images according to an embodiment of the present application;
fig. 11 shows a schematic structural diagram of a computer system suitable for implementing the electronic device or the server according to the embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 is a flowchart of a multi-label intelligent detection method for a lung CT image according to the present application. As shown in fig. 1, a method for multi-label intelligent detection of a lung CT image according to an embodiment of the present application includes:
step 101, acquiring a target image input layer by layer.
Wherein, the target image is a Computed Tomography (CT) image of the lung.
Although current CT imaging equipment can obtain a 3D image of the patient, the present application processes 2D lung images input layer by layer, in consideration of controlling the model size and the image-processing speed. In the embodiment of the present application, the target image may be a lung image.
Further, to increase the image-processing speed, the target image may be cropped to remove regions unrelated to the target region, such as regions unrelated to the lungs.
Specifically, as shown in fig. 2, acquiring a target image input layer by layer includes:
step 201, acquiring an input original image.
Wherein the original image may be a tomographic image of a 3D image of the lungs.
Step 202, normalizing each pixel in the original image to obtain a normalized value of each pixel.
Optionally, the following formula is adopted to calculate the normalized value of each pixel:
x′ = (x − a) / (b − a)
where x is the pixel value before normalization, x′ is the pixel value after normalization, a is the minimum CT value of the lung region, and b is the maximum CT value of the lung region; in the embodiment of the present application, [a, b] may be [−1024, 2048].
Step 203, determining a clipping center according to the normalized value of each pixel in the original image.
Specifically, pixel positions having a normalized value greater than 0.1 are identified, the centroid position of the pixel positions having a normalized value greater than 0.1 is calculated, and the centroid position is taken as the clipping center.
And step 204, cutting the original image into a target image with a preset size according to the cutting center.
Optionally, the original image is cropped, around the cropping center, into a target image of 384 × 384 pixels; in the embodiment of the present application, the original image size is 512 × 512.
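The preprocessing steps above (normalize the CT values, threshold at 0.1, compute the centroid, crop around it) can be sketched as follows. `preprocess` is a hypothetical helper name, and clipping the normalized value to [0, 1] is an assumption not stated in the text:

```python
import numpy as np

def preprocess(original, a=-1024.0, b=2048.0, size=384):
    """Normalize CT values to [0, 1], find the centroid of pixels whose
    normalized value exceeds 0.1, and crop a size x size patch around it."""
    x = np.clip((original - a) / (b - a), 0.0, 1.0)  # normalized value per pixel
    ys, xs = np.nonzero(x > 0.1)                     # pixel positions above threshold
    if len(ys) == 0:                                 # fall back to the image center
        cy, cx = np.array(x.shape) // 2
    else:
        cy, cx = int(ys.mean()), int(xs.mean())      # centroid = cropping center
    half = size // 2
    # keep the crop window inside the image bounds
    cy = min(max(cy, half), x.shape[0] - half)
    cx = min(max(cx, half), x.shape[1] - half)
    return x[cy - half:cy + half, cx - half:cx + half]
```

For a 512 × 512 tomographic slice this yields the 384 × 384 target image described above.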
Step 102, inputting the target image into the trained multi-class segmentation model to obtain a multi-class segmentation image and a single-class segmentation image.
The multi-class segmentation image is used to indicate the type label to which at least a part of the region in the target image belongs, and the single-class segmentation image is used to indicate the region of interest contained in the target image. In the embodiment of the present application, the type label may be an attribute label corresponding to the patient's lesion type, such as a tumor, a nodule, and the like. The region of interest is a lesion region in the target image.
That is, the present application adds, at the end of a conventional deep-learning model, a network head that provides supervision, namely an output that produces the single-class segmentation image, thereby constructing the multi-class segmentation model as a multitask model.
Specifically, the present application employs two different classifiers to perform the multi-class segmentation and the single-class segmentation; in particular, pixel-level classifiers may be employed. That is, in the embodiment of the present application, the multi-class segmentation model has two classifiers: after the lung image is processed for features, the processed feature data is input to both classifiers, and the multi-class segmentation image and the single-class segmentation image are obtained from them respectively. It should be understood that when training the multi-class segmentation model, separate training sets must be made for the multi-class segmentation and the single-class segmentation, and the model is trained with both, to ensure that the multi-class and single-class outputs do not influence each other within the model.
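A minimal sketch of the two pixel-level classifier heads described above, assuming each head is a 1 × 1 convolution over the shared decoder features (the patent does not specify the classifier's exact form). `SegmentationHeads` is a hypothetical name; the channel counts follow the later description, i.e. the number of target classes plus 1 for the multi-class head and 1 for the single-class head:

```python
import torch
import torch.nn as nn

class SegmentationHeads(nn.Module):
    """Two pixel-level classifiers over the shared decoder features:
    a multi-class head (num_classes + 1 channels, background included)
    and a single-class head (1 channel) for region-of-interest detection."""
    def __init__(self, feat_ch, num_classes):
        super().__init__()
        self.multi = nn.Conv2d(feat_ch, num_classes + 1, kernel_size=1)
        self.single = nn.Conv2d(feat_ch, 1, kernel_size=1)

    def forward(self, x):
        # both heads see the same processed feature data
        return self.multi(x), self.single(x)
```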
Step 103, screening the multi-class segmentation image with the single-class segmentation image to obtain the type label of the region of interest in the target image.
Specifically, the multi-class segmented image and the single-class segmented image are overlaid to remove the portions of the multi-class segmented image that do not belong to the single-class segmented image. That is, the method uses the single-class segmented image to screen out the lesion region and then supervises the multi-class segmentation result with it to determine the type of the lesion region.
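The overlay step can be sketched as a simple per-pixel mask: a type label from the multi-class map is kept only where the single-class (region-of-interest) map is positive. `screen_labels` is a hypothetical helper name:

```python
import numpy as np

def screen_labels(multi_class, single_class):
    """Overlay the two outputs: keep a pixel's type label only where the
    single-class (region-of-interest) mask is positive; elsewhere set 0."""
    return np.where(single_class > 0, multi_class, 0)
```

For example, a pixel classified as a lesion type but lying outside the single-class mask is reset to background (0).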
It should be understood that when training the multi-class segmentation, only lesion types may be labeled, so the multi-class segmented image contains only lesion types. Some normal anatomical structures may resemble a lesion in structure but occur at positions where lesions rarely develop; such positions are not labeled as regions of interest in the single-class segmented image, which is trained only on lesions. If such a position is identified as a lesion type in the multi-class segmented image, it is filtered out by the single-class segmented image, so that the final result contains only the types of the actual lesion regions in the target image.
In the embodiment of the present application, the type label is displayed by annotation in the target image; for example, a lesion area is outlined in the target image, and the lesion's type label is written in text beside the outline.
Preferably, the multi-class segmented image is not supervised by the single-class segmented image during the training of the model.
Therefore, the multi-class segmentation image and the single-class segmentation image are obtained simultaneously by the multi-class segmentation model, and the single-class segmentation image is used to supervise the multi-class segmentation image, effectively improving the identification of lesion types and reducing the misdiagnosis rate. Meanwhile, using the multi-class segmentation model effectively increases the image-processing speed without losing features.
As one possible embodiment, the multi-class segmentation model includes a contraction module, an expansion module, and a segmentation module, as shown in fig. 3, the method includes:
and 301, performing down-sampling on the image for three times by using a contraction module to obtain a down-sampled image.
And 302, performing up-sampling on the down-sampled image for three times by using an expansion module to obtain an up-sampled image.
And 303, performing multi-class segmentation and single-class segmentation on the up-sampled image by using a segmentation module to obtain a multi-class segmented image and a single-class segmented image.
By using three downsampling stages, the method effectively reduces the model's computation without losing feature precision and improves the model's computing speed.
That is, as shown in fig. 4, the multi-class segmentation model of the present application may be a U-shaped structure that performs three downsamplings and three upsamplings. If modules with the same number of channels are defined as belonging to the same stage, the contraction path consists of 4 stages: one contraction module with 64 output channels, one with 128, one with 256, and one with 512. These contraction modules are identical in construction except for their different numbers of output channels and the fact that the first stage (the 64-channel contraction module) does not downsample. The number of output channels of the multi-class segmentation is the number of target result classes plus 1, and the number of output channels of the anomaly-detection (single-class) segmentation is 1.
In some embodiments, as shown in fig. 5, the contraction module includes two first contraction units 51 connected in sequence and a first normalization layer 52; the first contraction unit 51 includes a first convolution layer 511 and a first nonlinear layer 512, and the first normalization layer 52 is disposed between the first convolution layer 511 and the first nonlinear layer 512 of the second first contraction unit.
Specifically, after the image feature data is input into the contraction module, it is first down-sampled by a 2 × 2 max-pooling layer with stride 2; the down-sampled result is input to the first contraction unit, and the output of the first contraction unit is passed in sequence through the first convolution layer, the first normalization layer, and the first nonlinear layer of the second contraction unit to obtain the contraction module's output features. The convolution kernel size of the first convolution layer is 3 × 3.
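The contraction module just described can be sketched in PyTorch: a 2 × 2 stride-2 max-pool (skipped in the first stage), a conv → ReLU unit, then a conv → norm → ReLU unit carrying the module's single normalization layer. `BatchNorm2d` is an assumption, since the patent says only "normalization layer"; `ContractionModule` is a hypothetical name:

```python
import torch
import torch.nn as nn

class ContractionModule(nn.Module):
    """Contraction module sketch: optional 2x2 stride-2 max-pool, then
    conv -> ReLU (first unit), then conv -> norm -> ReLU (second unit),
    so normalization is applied exactly once per module."""
    def __init__(self, in_ch, out_ch, downsample=True):
        super().__init__()
        self.pool = nn.MaxPool2d(2, 2) if downsample else nn.Identity()
        self.unit1 = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.unit2 = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),          # the module's single normalization layer
            nn.ReLU(inplace=True))

    def forward(self, x):
        return self.unit2(self.unit1(self.pool(x)))
```

Passing `downsample=False` models the first (64-channel) stage, which does not downsample.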
Further, the expansion path mirrors the contraction path, i.e. it consists of three upsampling stages with 256, 128, and 64 output channels.
As shown in fig. 6, the expansion module includes two first expansion units 61 and a second normalization layer 62, the two first expansion units 61 are connected in sequence, the first expansion unit 61 includes a second convolution layer 611 and a second nonlinear layer 612, and the second normalization layer 62 is disposed between the second convolution layer 611 and the second nonlinear layer 612 of the second first expansion unit 61.
The expansion module places a transposed convolution with kernel size 2 × 2 and stride 2 before the first expansion unit to up-sample the input image feature data; the input to the expansion module is the concatenation of the output of the previous stage and the output of the contraction-path stage with the same resolution.
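A matching sketch of the expansion module: a 2 × 2 stride-2 transposed convolution upsamples, the result is concatenated with the same-resolution contraction-path feature, and two convolution units follow, with the module's single normalization layer in the second. As before, `BatchNorm2d` and the name `ExpansionModule` are assumptions:

```python
import torch
import torch.nn as nn

class ExpansionModule(nn.Module):
    """Expansion module sketch: transposed-conv upsampling, skip
    concatenation with the contraction path, then conv -> ReLU and
    conv -> norm -> ReLU (one normalization per module)."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, 2, stride=2)
        self.unit1 = nn.Sequential(
            nn.Conv2d(out_ch + skip_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.unit2 = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

    def forward(self, x, skip):
        x = torch.cat([self.up(x), skip], dim=1)  # splice with contraction-path output
        return self.unit2(self.unit1(x))
```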
In this way, the normalization layer is used only once per module, which effectively reduces the model's computation without losing feature precision and increases the image-processing speed.
As a possible embodiment, when the type label satisfies the preset condition, a contraction module with residual connection is used to down-sample the target image three times to obtain the down-sampled image, and an expansion module with residual connection is used to up-sample the down-sampled image three times to obtain the up-sampled image.
It should be noted that the preset condition may be the degree of subdivision of the lesion types.
As shown in fig. 7, the contraction module with residual connection includes two second contraction units 71 connected in sequence, a third normalization layer 72, and a first residual connection layer 73; the second contraction unit 71 includes a third convolution layer 711 and a third nonlinear layer 712, the third normalization layer 72 and the first residual connection layer 73 are disposed between the third convolution layer 711 and the third nonlinear layer 712 of the second of the two units, and the first residual connection layer performs a residual-connection calculation on the feature data input to the first second contraction unit and the output data of the third normalization layer.
As shown in fig. 8, the expansion module with residual connection includes two second expansion units 81 connected in sequence, a fourth normalization layer 82, and a second residual connection layer 83; the second expansion unit 81 includes a fourth convolution layer 811 and a fourth nonlinear layer 812, the fourth normalization layer 82 and the second residual connection layer 83 are disposed between the fourth convolution layer 811 and the fourth nonlinear layer 812 of the second of the two units, and the second residual connection layer performs a residual-connection calculation on the feature data input to the first second expansion unit and the output data of the fourth normalization layer.
The first residual connection layer performs an element-wise addition of the down-sampled result and the output of the third normalization layer to obtain residual features, which are then input to the third nonlinear layer.
When the numbers of input and output channels of the contraction module and/or the expansion module differ, the sampled result must first be passed through a 1 × 1 convolution layer with stride 1 to convert it into feature data with the same number of channels as the output, so that the element-wise addition can be performed.
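The residual unit described above can be sketched as follows: conv → ReLU, conv → norm, element-wise addition of the (possibly 1 × 1-projected) input, then the final nonlinearity. `ResidualContraction` is a hypothetical name and `BatchNorm2d` again stands in for the unspecified normalization layer:

```python
import torch
import torch.nn as nn

class ResidualContraction(nn.Module):
    """Residual contraction unit sketch: the normalization output is added
    element-wise to the unit's input before the final ReLU; a 1x1 stride-1
    convolution projects the input when channel counts differ."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.unit1 = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.norm = nn.BatchNorm2d(out_ch)
        # 1x1 stride-1 convolution matches channel counts for element-wise addition
        self.proj = nn.Conv2d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.norm(self.conv2(self.unit1(x)))
        return self.relu(out + self.proj(x))  # residual-connection calculation
```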
Further, as shown in fig. 9, when the type label satisfies the preset condition, the contraction path of the multi-class segmentation model includes one contraction module with 64 output channels, two with 128 output channels, two with 256 output channels, and one with 512 output channels.
It should be understood that, in the embodiment of the present application, the convolution and nonlinear layers in the contraction module with residual connection have the same structure as those in the contraction module described earlier, and likewise for the expansion module with residual connection. In the embodiment of the present application, the ReLU function is selected for the nonlinear layers.
Therefore, when the number of classes in the multi-class segmentation meets the preset condition, contraction and expansion modules with residual connections are adopted, which effectively improves the network model's capacity to express the target-image features and makes it easier to handle the complex requirements of a large number of segmentation classes.
In the embodiment of the present application, when training the multi-class segmentation model, a mixed Loss function of Focal Loss and Dice Loss may be used, and the specific formula is as follows:
L = αL_Focal + L_Dice
where α is a weight coefficient that balances the two loss values and may be set to 5 × 10^-5 in the present application.
L_Focal and L_Dice are expressed as follows:
L_Focal = −(1/n) Σ_{i=1}^{n} Σ_{j=1}^{c} l_{ij} (1 − p_{ij})^λ log(p_{ij})
L_Dice = 1 − (2 Σ_i p_i l_i + ε) / (Σ_i p_i + Σ_i l_i + ε)
where n is the number of pixels, c is the number of segmentation classes, p is the predicted probability output by the model, λ is a coefficient that controls the sample-balance strength of Focal Loss (preferably 2), l is the correct (ground-truth) label, and ε is a set coefficient that keeps the computed Dice result at 1 when both the number of correctly segmented pixels and the number of pixels output by the model are 0.
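Under the definitions above, the mixed loss can be sketched in NumPy for (n, c) probability and one-hot label arrays. The exact reductions (mean over pixels, sums over classes) are assumptions, since the patent's formulas are given only as images:

```python
import numpy as np

def mixed_loss(p, l, alpha=5e-5, lam=2.0, eps=1.0):
    """Mixed Focal + Dice loss sketch.
    p: (n, c) predicted class probabilities, l: (n, c) one-hot labels."""
    # Focal Loss: down-weight well-classified pixels by (1 - p)^lambda
    focal = -np.mean(np.sum(l * (1.0 - p) ** lam * np.log(p + 1e-12), axis=1))
    # Dice Loss: eps keeps the loss at 0 (i.e. Dice = 1) when both sums are 0
    dice = 1.0 - (2.0 * np.sum(p * l) + eps) / (np.sum(p) + np.sum(l) + eps)
    return alpha * focal + dice
```

With perfect one-hot predictions both terms vanish; a uniform prediction yields a loss dominated by the Dice term.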
In summary, the multi-class segmentation image and the single-class segmentation image are obtained simultaneously by the multi-class segmentation model, and the single-class segmentation image is used to supervise the multi-class segmentation image, effectively improving the identification of lesion types and reducing the misdiagnosis rate. Meanwhile, using the multi-class segmentation model effectively increases the image-processing speed without losing features.
It should be noted that while the operations of the method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Rather, the steps depicted in the flowcharts may be executed in a different order.
Fig. 10 is a schematic structural diagram of a multi-label intelligent detection apparatus for lung CT images according to an embodiment of the present application. As shown in fig. 10, the multi-label intelligent detection apparatus 10 for lung CT images includes:
the acquisition module 11 is used for acquiring target images input layer by layer;
a multi-class segmentation model 12, configured to obtain a multi-class segmentation image and a single-class segmentation image according to the target image, where the multi-class segmentation image is used to indicate a type tag to which at least a part of a region in the target image belongs, and the single-class segmentation image is used to indicate a region of interest included in the target image;
and the screening module 13 is configured to screen the multi-class segmented images by using the single-class segmented image to obtain the type tag of the interest region in the target image.
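The screening performed by module 13 can be sketched as a simple masking operation (a minimal illustration under the assumption that the single-class output is a binary region-of-interest mask and that label 0 denotes background; `screen_labels` is a hypothetical name, not the patent's implementation):

```python
import numpy as np

def screen_labels(multi_class, single_class):
    """Keep a pixel's type label only where the single-class
    (region-of-interest) mask is positive; everything outside the
    region of interest is reset to background (0).

    multi_class : (H, W) integer label map from the multi-class head
    single_class: (H, W) binary mask from the single-class head
    """
    return np.where(single_class > 0, multi_class, 0)
```

Labels predicted outside the region of interest are thus discarded, which is how the single-class output supervises the multi-class output at inference time.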
In some embodiments, the multi-class segmentation model 12 includes a contraction module, an expansion module, and a segmentation module,
the contraction module is used for carrying out down-sampling on the target image for three times to obtain a down-sampled image;
the expansion module is used for carrying out up-sampling on the down-sampled image for three times to obtain an up-sampled image;
and the segmentation module is used for carrying out multi-class segmentation and single-class segmentation on the up-sampled image to obtain a multi-class segmented image and a single-class segmented image.
In some embodiments, the contraction module comprises two first contraction units connected in sequence and a first normalization layer; each first contraction unit comprises a first convolutional layer and a first non-linear layer, and the first normalization layer is arranged between the first convolutional layer and the first non-linear layer of the second first contraction unit;

the expansion module comprises two first expansion units connected in sequence and a second normalization layer; each first expansion unit comprises a second convolutional layer and a second non-linear layer, and the second normalization layer is arranged between the second convolutional layer and the second non-linear layer of the second first expansion unit.
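The layer ordering described above can be sketched with placeholder callables (the function name and its arguments are illustrative; the same wiring applies symmetrically to the expansion module):

```python
import numpy as np

def contraction_module(x, conv1, conv2, norm, nonlin):
    """Wiring of one contraction (or, symmetrically, expansion) module:
    unit 1: convolution -> non-linear layer;
    unit 2: convolution -> normalization -> non-linear layer,
    the normalization layer sitting between the second unit's
    convolutional and non-linear layers as described in the text."""
    x = nonlin(conv1(x))          # first unit
    return nonlin(norm(conv2(x))) # second unit with normalization
```

With identity stand-ins for the convolution and normalization layers and ReLU as the non-linearity, the call order can be checked on a toy vector.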
In some embodiments, when the type label satisfies a preset condition, the target image is down-sampled three times by a contraction module containing residual connections to obtain a down-sampled image, and the down-sampled image is up-sampled three times by an expansion module containing residual connections to obtain an up-sampled image.
In some embodiments, the contraction module with residual connections comprises two second contraction units connected in sequence, a third normalization layer, and a first residual connection layer; each second contraction unit comprises a third convolutional layer and a third non-linear layer; the third normalization layer and the first residual connection layer are arranged between the third convolutional layer and the third non-linear layer of the second second contraction unit, and the first residual connection layer is configured to perform a residual connection calculation between the feature data input to the first second contraction unit and the output data of the third normalization layer;

the expansion module with residual connections comprises two second expansion units connected in sequence, a fourth normalization layer, and a second residual connection layer; each second expansion unit comprises a fourth convolutional layer and a fourth non-linear layer; the fourth normalization layer and the second residual connection layer are arranged between the fourth convolutional layer and the fourth non-linear layer of the second second expansion unit, and the second residual connection layer is configured to perform a residual connection calculation between the feature data input to the first second expansion unit and the output data of the fourth normalization layer.
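The residual connection calculation described above can be sketched in the same placeholder style (hypothetical names; identity stand-ins for the convolution and normalization layers):

```python
import numpy as np

def residual_contraction_module(x, conv1, conv2, norm, nonlin):
    """Residual variant: the feature data entering the first unit is
    added to the output of the normalization layer (the residual
    connection calculation) before the final non-linear layer."""
    skip = x                 # feature data input to the first unit
    h = nonlin(conv1(x))     # first unit
    h = norm(conv2(h))       # second unit up to the normalization layer
    return nonlin(h + skip)  # residual connection, then non-linearity
```

The skip path is what lets the module preserve input features when many segmentation classes must be distinguished.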
In some embodiments, the obtaining module 11 is further configured to:
acquiring an input original image;
normalizing each pixel position in the original image to obtain a normalized value of each pixel position;
determining a cropping center according to the normalized value of each pixel position in the original image;
and cropping the original image into a target image of a preset size according to the cropping center.
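These preprocessing steps can be sketched as follows (a minimal illustration; the intensity-centroid rule for choosing the cropping center and the `prepare_target` name are assumptions, since this excerpt does not spell the rule out):

```python
import numpy as np

def prepare_target(original, size=(256, 256)):
    """Normalize pixel values, pick a cropping center from the
    normalized values, and crop to a preset size."""
    img = original.astype(np.float64)
    # per-pixel min-max normalization
    norm = (img - img.min()) / (img.max() - img.min() + 1e-8)
    # cropping center: intensity-weighted centroid (assumed rule)
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    total = norm.sum() + 1e-8
    cy = int((ys * norm).sum() / total)
    cx = int((xs * norm).sum() / total)
    # clamp the window so the crop stays inside the image
    h, w = size
    top = min(max(cy - h // 2, 0), max(img.shape[0] - h, 0))
    left = min(max(cx - w // 2, 0), max(img.shape[1] - w, 0))
    return norm[top:top + h, left:left + w]
```

On a 300×300 image with a bright central region, the crop is centered on that region and returned at the preset 256×256 size.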
In summary, the multi-class segmentation model obtains the multi-class segmentation image and the single-class segmentation image simultaneously, and the single-class segmentation image is used to supervise the multi-class segmentation image, which effectively improves the ability to identify the lesion type and reduces the misdiagnosis rate. Meanwhile, using the multi-class segmentation model can effectively improve the image processing speed without losing features.
It should be understood that the units or modules recited in the apparatus 10 correspond to the various steps of the method described with reference to fig. 1. Thus, the operations and features described above for the method are equally applicable to the apparatus 10 and the units included therein, and will not be described in detail here. Corresponding units in the apparatus 10 may cooperate with units in the electronic device to implement aspects of the embodiments of the present application.
Referring now to fig. 11, which illustrates a schematic structural diagram of a computer system suitable for implementing the electronic device or server of an embodiment of the present application.
as shown in fig. 11, the computer system includes a Central Processing Unit (CPU)1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. In the RAM1003, various programs and data necessary for operation instructions of the system are also stored. The CPU1001, ROM1002, and RAM1003 are connected to each other via a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.
The following components are connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output section 1007 including a display such as a Cathode Ray Tube (CRT) or Liquid Crystal Display (LCD), and a speaker; a storage section 1008 including a hard disk and the like; and a communication section 1009 including a network interface card such as a LAN card or a modem. The communication section 1009 performs communication processing via a network such as the Internet. A drive 1010 is also connected to the I/O interface 1005 as necessary. A removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1010 as necessary, so that a computer program read therefrom is installed into the storage section 1008 as needed.
In particular, according to embodiments of the present application, the process described above with reference to the flowchart of fig. 2 may be implemented as a computer software program. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program comprising program code for performing the method illustrated by the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1009 and/or installed from the removable medium 1011. When executed by the Central Processing Unit (CPU) 1001, the computer program performs the above-described functions defined in the system of the present application.
It should be noted that the computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special-purpose hardware-based systems which perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
The units or modules described in the embodiments of the present application may be implemented by software or hardware. The described units or modules may also be provided in a processor, and may be described as: a processor including an acquisition module, a multi-class segmentation model, and a screening module. The names of these units or modules do not in some cases constitute a limitation on the units or modules themselves; for example, the acquisition module may also be described as "a module for acquiring target images of the lungs input layer by layer".
As another aspect, the present application also provides a computer-readable storage medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable storage medium stores one or more programs which, when executed by one or more processors, perform the multi-label intelligent detection method for lung CT images described herein.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (7)

1. A multi-label intelligent detection method for lung CT images, characterized by comprising:
acquiring target images input layer by layer, the target images being lung CT images;
inputting the target image into a trained multi-class segmentation model to obtain a multi-class segmentation image and a single-class segmentation image, wherein the multi-class segmentation image is used to indicate the type label to which at least a part of the region in the target image belongs, the single-class segmentation image is used to indicate the region of interest contained in the target image, and the multi-class segmentation model has two network heads, one network head being used to output the multi-class segmentation image and the other network head being used to output the single-class segmentation image;
screening the multi-class segmentation image using the single-class segmentation image to obtain the type label of the region of interest in the target image;
wherein the multi-class segmentation model comprises a contraction module, an expansion module and a segmentation module, and the method comprises:
when the type label satisfies a preset condition, down-sampling the target image three times using a contraction module containing residual connections to obtain a down-sampled image;
up-sampling the down-sampled image three times using an expansion module containing residual connections to obtain an up-sampled image;
performing multi-class segmentation and single-class segmentation on the up-sampled image using the segmentation module to obtain the multi-class segmentation image and the single-class segmentation image.
2. The multi-label intelligent detection method for lung CT images according to claim 1, characterized in that:
the contraction module comprises two first contraction units connected in sequence and a first normalization layer, each first contraction unit comprising a first convolutional layer and a first non-linear layer, the first normalization layer being arranged between the first convolutional layer and the first non-linear layer of the second first contraction unit;
the expansion module comprises two first expansion units connected in sequence and a second normalization layer, each first expansion unit comprising a second convolutional layer and a second non-linear layer, the second normalization layer being arranged between the second convolutional layer and the second non-linear layer of the second first expansion unit.
3. The multi-label intelligent detection method for lung CT images according to claim 1, characterized in that:
the contraction module containing residual connections comprises two second contraction units connected in sequence, a third normalization layer and a first residual connection layer, each second contraction unit comprising a third convolutional layer and a third non-linear layer, the third normalization layer and the first residual connection layer being arranged between the third convolutional layer and the third non-linear layer of the second second contraction unit, and the first residual connection layer being used to perform a residual connection calculation between the feature data input to the first second contraction unit and the output data of the third normalization layer;
the expansion module containing residual connections comprises two second expansion units connected in sequence, a fourth normalization layer and a second residual connection layer, each second expansion unit comprising a fourth convolutional layer and a fourth non-linear layer, the fourth normalization layer and the second residual connection layer being arranged between the fourth convolutional layer and the fourth non-linear layer of the second second expansion unit, and the second residual connection layer being used to perform a residual connection calculation between the feature data input to the first second expansion unit and the output data of the fourth normalization layer.
4. The multi-label intelligent detection method for lung CT images according to claim 1, characterized in that acquiring the target images input layer by layer comprises:
acquiring an input original image;
normalizing each pixel position in the original image to obtain a normalized value of each pixel position;
determining a cropping center according to the normalized value of each pixel position in the original image;
cropping the original image into the target image of a preset size according to the cropping center.
5. A multi-label intelligent detection apparatus for lung CT images, characterized by comprising:
an acquisition module for acquiring target images input layer by layer;
a multi-class segmentation model for obtaining a multi-class segmentation image and a single-class segmentation image from the target image, wherein the multi-class segmentation image is used to indicate the type label to which at least a part of the region in the target image belongs, the single-class segmentation image is used to indicate the region of interest contained in the target image, and the multi-class segmentation model has two network heads, one network head being used to output the multi-class segmentation image and the other network head being used to output the single-class segmentation image;
a screening module for screening the multi-class segmentation image using the single-class segmentation image to obtain the type label of the region of interest in the target image;
wherein the multi-class segmentation model comprises a contraction module, an expansion module and a segmentation module, and when the type label satisfies a preset condition:
the contraction module is used to down-sample the target image three times using residual connections to obtain a down-sampled image;
the expansion module is used to up-sample the down-sampled image three times using residual connections to obtain an up-sampled image;
the segmentation module is used to perform multi-class segmentation and single-class segmentation on the up-sampled image to obtain the multi-class segmentation image and the single-class segmentation image.
6. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that when the processor executes the program, the multi-label intelligent detection method for lung CT images according to any one of claims 1-4 is implemented.
7. A computer-readable storage medium on which a computer program is stored, characterized in that when the program is executed by a processor, the multi-label intelligent detection method for lung CT images according to any one of claims 1-4 is implemented.
CN202110076124.8A 2021-01-20 2021-01-20 Multi-label intelligent detection method, device, equipment and medium for lung CT image Active CN112801964B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110076124.8A CN112801964B (en) 2021-01-20 2021-01-20 Multi-label intelligent detection method, device, equipment and medium for lung CT image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110076124.8A CN112801964B (en) 2021-01-20 2021-01-20 Multi-label intelligent detection method, device, equipment and medium for lung CT image

Publications (2)

Publication Number Publication Date
CN112801964A CN112801964A (en) 2021-05-14
CN112801964B true CN112801964B (en) 2022-02-22

Family

ID=75810830

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110076124.8A Active CN112801964B (en) 2021-01-20 2021-01-20 Multi-label intelligent detection method, device, equipment and medium for lung CT image

Country Status (1)

Country Link
CN (1) CN112801964B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114913145B (en) * 2022-05-09 2023-04-07 北京安德医智科技有限公司 Image segmentation method and device, electronic equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108765369A (en) * 2018-04-20 2018-11-06 平安科技(深圳)有限公司 Detection method, device, computer equipment and the storage medium of Lung neoplasm

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11010902B2 (en) * 2018-06-04 2021-05-18 University Of Central Florida Research Foundation, Inc. Capsules for image analysis
CN110060262A (en) * 2019-04-18 2019-07-26 北京市商汤科技开发有限公司 A kind of image partition method and device, electronic equipment and storage medium
CN111340130B (en) * 2020-03-09 2023-12-05 江西省人民医院 A method for detecting and classifying urinary stones based on deep learning and radiomics
CN111563902B (en) * 2020-04-23 2022-05-24 华南理工大学 Lung lobe segmentation method and system based on three-dimensional convolutional neural network
CN111784721B (en) * 2020-07-01 2022-12-13 华南师范大学 Method and system for intelligent segmentation and quantification of ultrasonic endoscopic images based on deep learning
CN112070752B (en) * 2020-09-10 2024-08-13 杭州晟视科技有限公司 Auricle segmentation method and device for medical image and storage medium
CN112184657A (en) * 2020-09-24 2021-01-05 上海健康医学院 Pulmonary nodule automatic detection method, device and computer system

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108765369A (en) * 2018-04-20 2018-11-06 平安科技(深圳)有限公司 Detection method, device, computer equipment and the storage medium of Lung neoplasm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Research on lung tumor image segmentation algorithm based on the U-net network" (基于U-net网络的肺部肿瘤图像分割算法研究); Zhou Luke (周鲁科) et al.; Information & Computer (《信息与电脑》); 2018-03-15; section 2 of the article *

Also Published As

Publication number Publication date
CN112801964A (en) 2021-05-14

Similar Documents

Publication Publication Date Title
CN111325739B (en) Method and device for detecting lung focus and training method of image detection model
US11967181B2 (en) Method and device for retinal image recognition, electronic equipment, and storage medium
CN107274402A (en) A kind of Lung neoplasm automatic testing method and system based on chest CT image
CN113436166A (en) Intracranial aneurysm detection method and system based on magnetic resonance angiography data
CN113902945B (en) A multimodal breast magnetic resonance image classification method and system
CN113947681A (en) Method, apparatus and medium for segmenting medical images
CN117809122B (en) A method, system, electronic device and medium for processing intracranial large blood vessel images
CN111369564B (en) Image processing method, model training method and model training device
CN113469963B (en) Pulmonary artery image segmentation method and device
CN112396605A (en) Network training method and device, image recognition method and electronic equipment
CN112700460A (en) Image segmentation method and system
WO2023005634A1 (en) Method and apparatus for diagnosing benign and malignant pulmonary nodules based on ct images
CN113889238A (en) Image identification method and device, electronic equipment and storage medium
CN116245832B (en) Image processing method, device, equipment and storage medium
CN113450359A (en) Medical image segmentation, display, model training methods, systems, devices, and media
CN113255756A (en) Image fusion method and device, electronic equipment and storage medium
CN112801964B (en) Multi-label intelligent detection method, device, equipment and medium for lung CT image
CN113850796B (en) Lung disease recognition method, device, medium and electronic device based on CT data
CN111462067B (en) Image segmentation method and device
CN112862752A (en) Image processing display method, system electronic equipment and storage medium
CN115100731B (en) Quality evaluation model training method and device, electronic equipment and storage medium
Frotscher et al. Unsupervised anomaly detection using aggregated normative diffusion
CN115222665B (en) Plaque detection method and device, electronic equipment and readable storage medium
CN117291935A (en) Head and neck tumor focus area image segmentation method and computer readable medium
Manokhin et al. Intracranial Hemorrhage Segmentation Using Neural Network and Riesz Fractional Order Derivative-based Texture Enhancement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant