TW202223914A

TW202223914A - Lung nodule detection method on low-dose chest computer tomography images using deep learning and its computer program product

Info

Publication number: TW202223914A
Application number: TW109142064A
Authority: TW
Inventors: 黃兆熊; 程大川
Original assignee: 中國醫藥大學
Priority date: 2020-11-30
Filing date: 2020-11-30
Publication date: 2022-06-16
Also published as: TWI733627B

Abstract

A lung nodule detection system on low-dose chest computer tomography images using deep learning includes a lung nodule detection stage and a false positive elimination stage. In the lung nodule detection stage, an original data set is prepared and a cross-validation is performed; the CT images in the original data set are disassembled into multiple 2D slice images, and the CT images are performed with image enhancement, lung area detected and three-channel superimposition to obtain a first data set; a Mask R-CNN image segmentation model receives the first data set for training, and multiple Mask R-CNN models trained from cross-validation are provided to perform hard negative mining to determine hard negative examples; and the hard negative examples are integrated with multiple training sets generated by cross-validation to complete a second data set for training a Mask R-CNN model. In the false positive elimination stage, the positive cases in the original data set and the negative cases selected from the lung nodule detection stage are combined into a data set, and the CT images of the original data set are cut into cuboid 3D images by taking the center of mass coordinates of the false positive results predicted by the Mask R-CNN model trained in the lung nodule detection stage; and a 3D CNN model receives the cuboid 3D images for training.

Description

Deep learning lung nodule detection method and computer program product based on chest low-dose computed tomography images

本發明關於一種肺結節偵測方法，特別是關於一種基於胸部低劑量電腦斷層影像之深度學習肺結節偵測方法。The present invention relates to a pulmonary nodule detection method, in particular to a deep learning pulmonary nodule detection method based on chest low-dose computed tomography images.

近年來，肺癌已成國人健康重大隱憂，而肺結節偵測是肺癌早期治療的重大關鍵。因此，遂有肺結節偵測系統之發展，以協助放射科醫師之診斷工作，提升其診斷品質及效率。In recent years, lung cancer has become a major health concern for Chinese people, and the detection of pulmonary nodules is an important key to early treatment of lung cancer. Therefore, a pulmonary nodule detection system has been developed to assist radiologists in their diagnostic work and improve their diagnostic quality and efficiency.

已知肺結節偵測雖已朝向基於深度學習和神經網路來發展，然在實際之應用上仍有模型架構複雜、訓練成本過高、準確度不足等缺失。且於完成肺結節偵測後，習知肺結節偵測系統在進行偽陽性排除時，更遭遇偽陽性預測率(false positive rate)不佳之問題。Although it is known that lung nodule detection has been developed based on deep learning and neural networks, there are still shortcomings such as complex model architecture, high training cost, and insufficient accuracy in practical applications. After the detection of pulmonary nodules is completed, the conventional pulmonary nodule detection system suffers from the problem of poor false positive rate when performing false positive exclusion.

有鑑於此，實有需要建立一種基於胸部低劑量電腦斷層影像之深度學習肺結節偵測方法，以有效協助放射科醫師之診斷工作，進而提升其診斷品質及效率。In view of this, it is necessary to establish a deep learning method for detecting pulmonary nodules based on chest low-dose computed tomography images to effectively assist radiologists in their diagnostic work, thereby improving their diagnostic quality and efficiency.

本發明提出一種基於胸部低劑量電腦斷層影像之深度學習肺結節偵測方法，以克服前述習知技術之缺失。The present invention proposes a deep learning pulmonary nodule detection method based on chest low-dose computed tomography images to overcome the aforementioned shortcomings of the prior art.

依據本發明之一特色，係提出一種基於胸部低劑量電腦斷層影像之深度學習肺結節偵測方法，包含：一肺結節偵測階段，包含步驟：(A) 準備一包含多數CT影像及標示多個肺結節的來源資料集，且對該來源資料集進行一交叉驗證，其中，該交叉驗證是以多個訓練集來訓練出多個Mask R-CNN模型及獲得多份測試結果；(B) 將該來源資料集中的CT影像拆解成多數2D的切片影像，並對該CT影像進行影像增強、肺部區域偵測、及三通道合成，以獲得一包含多張合成的2D圖像的第一資料集；(C) 以一Mask R-CNN圖像分割模型，接受該第一資料集來進行訓練，且以該交叉驗證所訓練出的多個Mask R-CNN模型來進行負難例挖掘(Hard Negative Mining)，以篩選出負難例；及(D) 將該負難例與該交叉驗證的多個訓練集整合，以完成一具有陽性肺結節及難負例等二類別的第二資料集，並利用該第二資料集訓練一Mask R-CNN模型；以及一偽陽性排除階段，包含步驟：(E) 將該來源資料集中有標示的肺結節作為正例，再從該肺結節偵測階段的該交叉驗證的測試結果中篩選出偽陽性例作為負例，組合成一個具有結節與非結節等二類別的第三資料集，再從該來源資料集的CT影像，以該肺結節偵測階段訓練出的Mask R-CNN模型預測出偽陽性結果的質心座標為中心點，切割出長方體立體影像；及(F) 以一3D CNN模型接受該長方體立體影像來進行訓練。According to a feature of the present invention, a deep learning pulmonary nodule detection method based on chest low-dose computed tomography images is proposed, which includes: a pulmonary nodule detection stage, including the steps of: (A) preparing a CT image including a plurality of CT images and marking A source data set of lung nodules, and a cross-validation is performed on the source data set, wherein the cross-validation uses multiple training sets to train multiple Mask R-CNN models and obtain multiple test results; (B) The CT images in the source data set are disassembled into a plurality of 2D slice images, and image enhancement, lung region detection, and three-channel synthesis are performed on the CT images to obtain a first 2D image including multiple synthesized 2D images. A data set; (C) use a Mask R-CNN image segmentation model, accept the first data set for training, and use the multiple Mask R-CNN models trained by the cross-validation to mine negative hard examples (Hard Negative Mining) to screen out negative hard cases; and (D) integrating the negative hard cases with multiple training sets of the cross-validation to complete a second class with two categories of positive lung nodules and hard negative cases data set, and use the second data set to train a Mask R-CNN model; and a false positive exclusion stage, including the steps: (E) the lung nodules marked in the source data set are regarded as positive examples, and then from the lung nodules False positive cases were screened out as negative cases from the cross-validation test results in the detection phase, and combined into a third dataset with two categories of nodules and non-nodules, and then from the CT images of the source dataset, the lung The Mask R-CNN model trained in the nodule detection stage predicts the centroid coordinate of the false positive result as the center point, and cuts out a cuboid stereo image; and (F) a 3D CNN model accepts the cuboid stereo image for training.

依據本發明之另一特色，係提出一種電腦程式產品，儲存於一非暫態電腦可讀取媒體之中，該電腦程式產品具備一指令，使一電腦系統進行前述的基於胸部低劑量電腦斷層影像之深度學習肺結節偵測方法。According to another feature of the present invention, a computer program product is provided, stored in a non-transitory computer-readable medium, the computer program product having an instruction to enable a computer system to perform the aforementioned chest-based low-dose computed tomography Image-based deep learning pulmonary nodule detection method.

以上概述與接下來的詳細說明皆為示範性質，是為了進一步說明本發明的申請專利範圍，而有關本發明的其他目的與優點，將在後續的說明與圖式加以闡述。The above overview and the following detailed description are exemplary in nature, and are intended to further illustrate the scope of the present invention, and other objects and advantages of the present invention will be explained in the following descriptions and drawings.

以下說明書將提供本發明的多個實施例。可理解的是，這些實施例並非用以限制。本發明的各實施例的特徵可加以修飾、置換、組合、分離及設計以應用於其他實施例。The following description will provide various embodiments of the present invention. It is understood that these examples are not intended to be limiting. The features of the various embodiments of the invention can be modified, substituted, combined, separated and designed to apply to other embodiments.

本說明書與請求項所使用的序數例如“第一”、“第二”等之用詞，以修飾請求元件，並不意含及代表較大序數之前必然存在較小序數，也不代表某一請求元件與另一請求元件的排列順序、或是製造順序，該些序數的使用僅用來使具有某命名的一請求元件得以和具有相同命名的另一請求元件能作出清楚區分。The ordinal numbers such as "first", "second" and other terms used in this specification and claims are used to modify the requested elements, and do not imply and represent that there must be a smaller ordinal number before a larger ordinal number, nor do they represent a certain request The arrangement order of elements and another requested element, or the manufacturing order, the use of these ordinal numbers is only used to clearly distinguish a requested element with a certain name from another requested element with the same name.

此外，本文關於“當…”或“…時”等描述係表示”當下、之前或之後”等態樣，而不限定為同時發生之情形，在此先行敘明。本文關於“設置於…上”等類似描述係表示兩元件的對應位置關係，除了有特別限定者之外，並不限定兩元件之間是否有所接觸，在此先行敘明。再者，本文若在多個功效(或元件)之間使用“或”一詞，係表示功效(或元件)可獨立存在，但亦不排除多個功效(或元件)可同時存在的態樣。In addition, the descriptions of "when..." or "when" in this document refer to aspects such as "now, before, or after", and are not limited to simultaneous occurrences, which are described in advance. Similar descriptions of "arranged on" and the like herein represent the corresponding positional relationship between the two elements. Unless otherwise specified, it does not limit whether there is contact between the two elements, which is described in advance here. Furthermore, if the word "or" is used between multiple functions (or elements) herein, it means that the functions (or elements) can exist independently, but it does not exclude the aspect that multiple functions (or elements) can exist simultaneously. .

另外，本發明中關於「連接」、「電性連接」或「耦接」等詞，若無特別強調，則表示包括了直接連接與間接連接之態樣。另外，在本揭露中關於「包含」、「包括」、「具有」、「具備」等用語皆屬於開放式之描述，在此先行敘明。In addition, the terms "connection", "electrical connection" or "coupling" in the present invention, if not particularly emphasized, include the aspects of direct connection and indirect connection. In addition, the terms "comprising", "including", "having", and "having" in this disclosure are all open-ended descriptions, and are described here first.

再者，以下關於本發明的各種實施例，皆可透過軟體程式或電子電路的方式實現，且不限於此。Furthermore, the following various embodiments of the present invention can be implemented by means of software programs or electronic circuits, and are not limited thereto.

本發明是一基於胸部低劑量電腦斷層影像之深度學習肺結節偵測方法及電腦程式產品，如圖1 所示，基於胸部低劑量電腦斷層影像之深度學習肺結節偵測方法在結構上包括一肺結節偵測階段11及一偽陽性排除階段12，其中，肺結節偵測階段11主要是負責從2D的CT切片中找出肺結節的可能位置，並計算該位置的質心(center of mass)座標，且於得出質心座標後，再根據該座標從原始3D的CT掃描影像中擷取肺結節的可能位置的立體影像，並將該立體影像輸入偽陽性排除階段12。偽陽性排除階段12則負責將肺結節的可能位置的立體影像分成「陽性」或「陰性」，以將肺結節偵測階段模型產生的偽陽性預測排除，降低系統整體的偽陽性率。The present invention is a deep learning pulmonary nodule detection method and computer program product based on chest low-dose computed tomography images. As shown in FIG. The lung nodule detection stage 11 and a false positive exclusion stage 12, wherein the lung nodule detection stage 11 is mainly responsible for finding out the possible position of the lung nodule from the 2D CT slice, and calculating the center of mass of the position. ) coordinates, and after the centroid coordinates are obtained, a stereoscopic image of possible positions of the lung nodule is extracted from the original 3D CT scan image according to the coordinates, and the stereoscopic image is input into the false positive elimination stage 12 . The false-positive exclusion stage 12 is responsible for classifying the stereoscopic images of possible locations of lung nodules into "positive" or "negative", so as to exclude the false-positive predictions generated by the model in the lung nodule detection stage and reduce the overall false-positive rate of the system.

如圖1 所示，於深度學習肺結節偵測方法的肺結節偵測階段11中，係先進行肺結節偵測的資料準備及處理111。資料準備是用以準備一包含多數CT影像及標示多個肺結節的來源資料集，本發明是基於由美國國家癌症研究所(National Cancer Institute, NCI)建立、及美國國家衛生研究所基金會(Foundation for the National Institutes of Health, FNIH)以及美國食品藥品管理監督局(Food and Drug Administration, FDA)協力組建的LIDC-IDRI (Lung Image Database Consortium image collection and Image Database Resource Initiative)資料集作為原始資料集來獲取本發明所需的來源資料集，其中，LIDC-IDRI原始資料集共包含1018份CT掃描影像，每份掃描含有0到6個肺結節，每份影像均由四位經驗豐富的放射科醫師進行二階段影像標記，在第一階段，每一位放射師獨立進行標註，並將標註的區域診斷為以下三個類別之一：(1)≧3 mm的肺結節；(2)＜3 mm的肺結節；(3)非結節；在第二階段，每位放射科醫師參考另外三位放射科醫師的診斷，並給出自己的最終診斷結果。As shown in FIG. 1 , in the lung nodule detection stage 11 of the deep learning lung nodule detection method, data preparation and processing 111 for lung nodule detection are first performed. Data preparation is used to prepare a source data set containing most CT images and marking a plurality of pulmonary nodules. The LIDC-IDRI (Lung Image Database Consortium image collection and Image Database Resource Initiative) dataset jointly organized by the Foundation for the National Institutes of Health (FNIH) and the U.S. Food and Drug Administration (FDA) serves as the original dataset. To obtain the source data set required by the present invention, the LIDC-IDRI original data set contains a total of 1018 CT scan images, each scan contains 0 to 6 pulmonary nodules, and each image is prepared by four experienced radiologists. Physicians perform two-stage image labeling. In the first stage, each radiologist independently labels and diagnoses the labeled area as one of the following three categories: (1) pulmonary nodules ≥ 3 mm; (2) < 3 Lung nodules of mm; (3) non-nodules; in the second stage, each radiologist referred to the diagnosis of the other three radiologists and gave his own final diagnosis.

本發明為了維持資料集所包含之CT影像在解析度上的一致性，故只從LIDC-IDRI資料集中選取切片厚度大於等於2.5毫米的CT影像、以及存在於這些CT影像中且符合結節直徑≧3 mm、及獲3至4位放射科醫師認可等條件之肺結節標記作為訓練及測試資料，以這些條件完成篩選之來源資料集共包含897份CT影像和1179個陽性肺結節實例。In order to maintain the consistency of the resolution of the CT images included in the data set, the present invention only selects CT images with a slice thickness greater than or equal to 2.5 mm from the LIDC-IDRI data set, and those existing in these CT images and conforming to the nodule diameter ≧ Pulmonary nodule markers of 3 mm and those approved by 3 to 4 radiologists were used as training and testing data. The source datasets screened under these conditions included a total of 897 CT images and 1179 instances of positive pulmonary nodules.

肺結節偵測階段11是由前述之來源資料集(897份CT影像、1179個陽性肺結節實例)來進行模型訓練。此階段之訓練是用以對該來源資料集進行一交叉驗證，其中，該交叉驗證是以多個訓練集來訓練出多個基於區域的遮罩式卷積神經網路(Mask Region-based Convolutional Neural Network，Mask R-CNN )模型及獲得多份測試結果。於本實施例中，該交叉驗證是一10等份交叉驗證(10-fold cross validation)，如圖2所示，此10等份交叉驗證包括：步驟S21：將來源資料集以CT影像為單位，隨機分為子資料集(0)、子資料集(1)、……、子資料集(9)等10等份(每等份含有89~90份CT影像)；以及，以索引值i=0~9，進行以下步驟：步驟S22：選擇子資料集(i)作為測試集(i)，同時將其他9等份子資料集合併為訓練集(i)；及步驟S23：以訓練集(i)訓練出一個Mask R-CNN模型(i)，並在訓練結束後以子資料集(i)測試Mask R-CNN模型(i)之表現；因此，最終可得到10個不同的Mask R-CNN(Mask R-CNN模型(0)~ Mask R-CNN模型(9))以及在各自測試集(測試集(0)~測試集(9))中的測試結果，將此10份測試結果合併，可作為第一階段模型整體表現之評估指標。The pulmonary nodule detection stage 11 is based on the aforementioned source dataset (897 CT images, 1179 positive pulmonary nodule instances) for model training. The training at this stage is used to perform a cross-validation on the source data set, wherein the cross-validation uses multiple training sets to train multiple Mask Region-based Convolutional Neural Networks (Mask Region-based Convolutional Neural Networks). Neural Network, Mask R-CNN) model and obtain multiple test results. In this embodiment, the cross-validation is a 10-fold cross-validation (10-fold cross-validation). As shown in FIG. 2, the 10-fold cross-validation includes: Step S21: use the source data set as a unit of CT image , randomly divided into 10 equal parts such as sub-data set (0), sub-data set (1), ..., sub-data set (9) (each equal part contains 89~90 CT images); and, with index value i =0~9, carry out the following steps: Step S22: select the sub-data set (i) as the test set (i), and simultaneously combine the other 9 equal sub-data sets into the training set (i); and step S23: use the training set ( i) Train a Mask R-CNN model (i), and test the performance of the Mask R-CNN model (i) with a subset of datasets (i) after training; therefore, 10 different Mask R-CNN models are finally obtained. CNN (Mask R-CNN model (0) ~ Mask R-CNN model (9)) and the test results in their respective test sets (test set (0) ~ test set (9)), merge the 10 test results , which can be used as an evaluation index for the overall performance of the first-stage model.

圖3顯示前述肺結節偵測階段11的資料處理主要包含：影像增強(image enhancement)步驟S31、肺部區域偵測(lung area detection)步驟S32、及三通道合成步驟S33等三個處理步驟。首先，由於肺結節偵測階段11的模型的輸入為2D圖像，因此需將來源資料集中的CT影像拆解成許多2D的切片影像以利輸入模型，為此，可利用以下公式1對來源資料集中的897份CT影像中的亨氏單位(Hounsfield Unit，HU)值進行運算，以達到影像增強的效果。

，其中， a和b是由選擇的窗寬(window width)和窗位(window level)決定，於本發明中，由於對所有CT 影像序列選擇的固定窗位為-400 HU、固定窗寬則為1500 HU，因此可得a=([2×(-400)-1500])⁄2=-1150 HU、b=([2×(-400)+1500])⁄2=350 HU。 3 shows that the data processing in the aforementioned lung nodule detection stage 11 mainly includes three processing steps: image enhancement step S31 , lung area detection step S32 , and three-channel synthesis step S33 . First, since the input of the model in the lung nodule detection stage 11 is a 2D image, the CT images in the source data set need to be disassembled into many 2D slice images to facilitate input into the model. To this end, the following formula 1 can be used to pair the source The Hounsfield Unit (HU) values in the 897 CT images in the dataset were calculated to achieve the effect of image enhancement.

, where a and b are determined by the selected window width and window level. In the present invention, since the fixed window level selected for all CT image sequences is -400 HU, the fixed window width is is 1500 HU, so a=([2×(-400)-1500])⁄2=-1150 HU, b=([2×(-400)+1500])⁄2=350 HU.

於肺部區域偵測步驟S32中，是對經過前述影像增強後的切片影像以一連串的影像形態學函數進行肺部區域偵測，藉以去除肺部以外區域的雜訊。更特定地，前述肺部區域偵測步驟S32包含九個步驟如下： (1) 影像二值化(thresholding)步驟321：此步驟使用的閾值為-400 HU，能達到區分肺部區域和肺外體腔的效果，經過二值化之後的影像會變成二值圖像(binary image)； (2) 邊界清除步驟S322：將二值圖像(binary image)中靠近圖像邊界的1值清除； (3) 連通區域標記(label)步驟323：將二值化的圖像區分為一個或多個連通區域； (4) 連通區域篩選步驟324：利用上個步驟223得出的標記，留下影像中面積前三大的三個連通區域的像素值為1，其他部分改成0，以確保所有肺結節都被留下來； (5)侵蝕(erosion)運算步驟325：利用半徑為2的圓盤(disk)作為結構元素(structuring element)對二值化影像進行影像形態學中的侵蝕運算，以將肺結節與血管分離，侵蝕運算的定義如公式2所示：

其中，X是二值化圖像，B是結構元素、Bx代表B在X上的平移； (6) 閉合(closing)運算步驟326：利用半徑為10的圓盤作為結構元素對二值化影像進行影像形態學中的閉合運算，以保留肺壁上的肺結節，閉合運算本質上是利用同一個結構元素，先對影像作擴張(dilation)運算，再作侵蝕運算(定義如公式2)，而擴張運算的定義則如公式3所示，且由侵蝕運算與擴張運算可組合出閉合運算，如公式4所示：

； (7) 擴張運算步驟327：利用半徑為10的圓盤再次對二值化圖像進行擴張運算，以便更加完整地保留肺壁上的結節； (8) 填補空洞步驟328：將運算過程中於連通區域內產生的空洞填補起來，使其成為完整的區域； (9) 影像疊加步驟329：將二值化影像疊加到原始影像上，將對應到二值化影像中像素值為1的區域的原始影像區域留下，其餘區域刪除，而只留下肺部區域。 In the lung region detection step S32, lung region detection is performed on the sliced image after the aforementioned image enhancement with a series of image morphological functions, so as to remove noise in regions other than the lungs. More specifically, the aforementioned lung area detection step S32 includes nine steps as follows: (1) Image binarization (thresholding) step 321: the threshold used in this step is -400 HU, which can distinguish between the lung area and the extra-pulmonary area. For the effect of the body cavity, the image after binarization will become a binary image; (2) Boundary removal step S322: remove the 1 value near the image boundary in the binary image; ( 3) Connected region labeling (label) step 323: distinguish the binarized image into one or more connected regions; (4) Connected region screening step 324: use the label obtained in the previous step 223 to leave the image in the image. The pixel value of the top three connected areas in the area is 1, and the other parts are changed to 0 to ensure that all lung nodules are left; (5) erosion (erosion) operation step 325: use a disk with a radius of 2 ( disk) as a structuring element to perform the erosion operation in image morphology on the binarized image to separate the pulmonary nodule from the blood vessel. The definition of the erosion operation is shown in Equation 2:

Among them, X is the binarized image, B is the structuring element, and Bx represents the translation of B on X; (6) Closing operation step 326: use a disk with a radius of 10 as a structuring element to perform the binarization image Perform the closing operation in image morphology to preserve the lung nodules on the lung wall. The closing operation essentially uses the same structural element to perform a dilation operation on the image first, and then perform an erosion operation (defined as Equation 2), The definition of the expansion operation is shown in Equation 3, and the closure operation can be combined by the erosion operation and the expansion operation, as shown in Equation 4:

(7) Expansion operation step 327: use a disk with a radius of 10 to perform an expansion operation on the binarized image again, so as to preserve the nodules on the lung wall more completely; Fill up the holes generated in the connected area to make it a complete area; (9) Image superimposition step 329: superimpose the binarized image on the original image, and map the area corresponding to the pixel value of 1 in the binarized image The original image area is left, the rest is deleted, and only the lung area is left.

圖4顯示出前述肺部區域偵測的每個步驟S321~S329的輸出圖像，由圖4可知，切片影像經肺部區域偵測步驟32的處理後，即可獲得清晰的肺部影像。Fig. 4 shows the output image of each step S321～S329 of the aforementioned lung region detection, as can be seen from Fig. 4, after the slice image is processed by the lung region detection step 32, a clear lung image can be obtained.

於三通道合成步驟S33中，對於上述經過肺部區域偵測的2D的切片影像，將3張深度上連續的切片影像利用圖片的RGB通道結合成一張合成的2D影像進行訓練，以增加2D影像中的立體空間資訊。且對於同一個肺結節(目標肺結節)，選取3張連續且不同深度的2D影像作為訓練資料，以使得模型可以從2D影像中擷取3D空間的特徵。其中，此3張2D影像的選取方式為：假設目標肺結節的質心深度座標為z，則第一張圖片的R通道會由深度為(z+2)的切片構成，G通道是深度(z+1)的切片，B通道則是深度z的切片；第二張圖片則依序為深度(z+1)、z、(z-1)的切片；第三張圖片則依序為深度z、(z-1)、(z-2)的切片。以此組合便能使模型更準確地辨識質心深度附近切片中的肺結節，也就能從模型的預測中得出更精確的結節質心座標。而由於某些結節的大小不足以合成3張圖像，所以只能將包含該結節的所有切片以相同的處理方式合成1~2張2D圖像，因此實際上包含肺結節的合成的2D圖像僅有3266張，而非1179×3=3537張，且如圖1所示，此3266張包含肺結節的合成的2D圖像即為肺結節偵測階段11所產生的只包含肺結節的單一類別的第一資料集112。In the three-channel synthesis step S33, for the above-mentioned 2D slice images detected in the lung region, three continuous slice images in depth are combined into a synthesized 2D image using the RGB channels of the picture for training, so as to increase the 2D image. 3D spatial information in . And for the same lung nodule (target lung nodule), three consecutive 2D images with different depths are selected as training data, so that the model can extract 3D space features from the 2D images. Among them, the selection method of the three 2D images is as follows: Assuming that the center of mass of the target lung nodule has a depth coordinate of z, the R channel of the first image will be composed of slices with a depth of (z+2), and the G channel is the depth (z+2). z+1), the B channel is the slice of depth z; the second image is the slice of depth (z+1), z, (z-1) in sequence; the third image is the depth in sequence Slices of z, (z-1), (z-2). This combination allowed the model to more accurately identify lung nodules in slices near the centroid depth, which in turn allowed more accurate nodule centroid coordinates to be derived from the model's predictions. However, because some nodules are not large enough to synthesize 3 images, only 1~2 2D images can be synthesized by all the slices containing the nodules in the same way, so the synthesized 2D images actually contain the lung nodules. There are only 3266 images, instead of 1179×3=3537 images, and as shown in Figure 1, the 3266 composite 2D images containing lung nodules are the lung nodule detection stage 11 generated images that only contain lung nodules. A first dataset 112 of a single category.

再如圖1所示，肺結節偵測階段11的模型是一個以ResNet50為骨幹(backbone)的Mask R-CNN圖像分割模型113，此Mask R-CNN圖像分割模型113接受上述第一資料集112的2D圖像作為輸入資料，並輸出一個遮罩(mask)將圖像中的物件予以標示。As shown in FIG. 1 again, the model of the lung nodule detection stage 11 is a Mask R-CNN image segmentation model 113 with ResNet50 as the backbone, and the Mask R-CNN image segmentation model 113 accepts the above-mentioned first data. The 2D image of set 112 is used as input data, and a mask is output to label the objects in the image.

圖5顯示肺結節偵測階段11的模型訓練流程，首先，於步驟S51中，是直接利用前述只包含肺結節的單一類別的第一資料集112來進行訓練，於訓練完成後，請一併參照圖1，步驟S52被執行以進行負難例挖掘(Hard Negative Mining)114，其係以前述交叉驗證所訓練出的10個模型(Mask R-CNN模型(0)~Mask R-CNN模型(9))對各自的訓練集(訓練集(0) ~訓練集(9))進行預測。模型回傳的結果包含其判讀出的肺結節在圖像中的範圍輪廓(segmentation contour)，以及代表該範圍是肺結節之機率的預測分數(score)。接著，根據預設條件：(1)預測分數高於0.5、且(2)與任意結節的交集率(Intersection over Union, IoU)都小於0.01，從模型的預測中篩選出符合該預設條件的預測結果，並將此預測結果定義為負難例(Hard Negative Example)。FIG. 5 shows the model training process of the lung nodule detection stage 11. First, in step S51, the first data set 112 that only contains a single type of lung nodules is directly used for training. Referring to FIG. 1 , step S52 is performed to perform Hard Negative Mining 114, which is 10 models (Mask R-CNN model (0) ~ Mask R-CNN model ( 9)) Predict the respective training sets (training set(0) ~ training set(9)). The results returned by the model include the segmentation contour in the image of the pulmonary nodule it detected, and the predicted score (score) representing the probability that the segment is a pulmonary nodule. Next, according to the preset conditions: (1) the predicted score is higher than 0.5, and (2) the intersection rate with any nodule (Intersection over Union, IoU) is less than 0.01, screen out the predictions from the model that meet the preset conditions. Predict the result, and define this prediction result as a Hard Negative Example.

最後，再如圖1所示，將前述負難例與原本的訓練集(訓練集(0) ~訓練集(9))整合，進而完成一份具有陽性肺結節及負難例等二類別的第二資料集115，並利用此第二資料集115訓練一新的Mask R-CNN模型116，以減少模型預測結果中的偽陽性例。且前述訓練流程的兩個訓練步驟皆是在0.002的學習速率(learning rate)下，訓練180000次迭代(iteration)後完成。Finally, as shown in Figure 1, the aforementioned negative examples are integrated with the original training set (training set (0) ~ training set (9)) to complete a two-category report with positive pulmonary nodules and negative examples. The second dataset 115 is used to train a new Mask R-CNN model 116 to reduce false positives in the model prediction results. The two training steps of the aforementioned training process are both completed after 180,000 iterations of training at a learning rate of 0.002.

請參照圖1，於深度學習肺結節偵測方法的偽陽性排除階段12中，係先進行偽陽性排除的資料準備及處理121，其中，由於偽陽性排除階段12的模型所接受的輸入是3D影像，因此將來源資料集中有標示的結節作為正例，再從肺結節偵測階段11的交叉驗證的測試結果中篩選出偽陽性(false positive)例作為負例，組合成一個具有結節與非結節等二類別的第三資料集，其中，負例的篩選條件為：(1)預測機率＞0.2(預測信心閾值)；(2)預測區域的質心座標與任一結節的直線距離都不小於該結節半徑。將負例(偽陽性)篩選出來後，基於該具有結節與非結節等二類別的第三資料集，可從來源資料集的CT影像，以肺結節偵測階段11的模型預測出偽陽性結果的質心(center of mass)座標為中心點，切割出長40、寬40、高20的長方體立體影像。此長方體立體影像再經由前述公式1所描述的函數進行影像增強，且於影像增強中所使用的固定窗寬及窗位亦與肺結節偵測階段11時所使用的相同。Referring to FIG. 1 , in the false-positive elimination stage 12 of the deep learning lung nodule detection method, data preparation and processing 121 for false-positive elimination are first performed, wherein the input accepted by the model in the false-positive elimination stage 12 is 3D Therefore, the marked nodules in the source data set are regarded as positive cases, and the false positive cases are screened out from the cross-validation test results in stage 11 of lung nodule detection as negative cases, and combined into a nodule with nodules and non-positive cases. The third data set of two categories such as nodules, among which, the screening conditions for negative cases are: (1) prediction probability > 0.2 (prediction confidence threshold); (2) the centroid coordinates of the prediction area and the straight line distance of any nodule are not smaller than the nodule radius. After screening out negative cases (false positives), based on the third dataset with two categories of nodules and non-nodules, false positive results can be predicted from the CT images of the source dataset using the model of lung nodule detection stage 11 The coordinate of the center of mass is the center point, and a cuboid three-dimensional image with a length of 40, a width of 40, and a height of 20 is cut out. The cuboid stereoscopic image is then image enhanced by the function described in Equation 1, and the fixed window width and window level used in the image enhancement are also the same as those used in the lung nodule detection stage 11 .

圖6所示，偽陽性排除階段12的模型是由1層3D卷積層(convolution)、3層3D 混合深度卷積層(mixed depthwise convolution, MixConv)、以及3層全連接層(fully connected layer)組成的三維卷積神經網路(3D Convolutional Neural Network，3D CNN)模型122。本發明在模型中使用批次標準化(batch normalization)和丟棄法(dropout)來避免過度擬和的問題，並在層與層之間插入線性整流函數(Rectified Linear Unit, ReLU)作為激勵函數(activation function)。輸入此3D CNN模型122的長方體立體影像會先通過4層卷積層(1層3D卷積層及3層混合深度卷積層)，以轉換成特徵圖(feature map)後再被壓平並輸入全連接層。而為了充分利用CT影像中不同尺度的特徵並減少計算量，本發明使用MixConv技術，在3層3D混合深度卷積層的每一層中以五種不同大小的卷積核(kernel)對輸入張量進行運算，如圖7顯示出MixConv技術的詳細說明，其中，一MixConv層之輸入張量為X，亦即，X是一大小為(c,h,w,d)之張量，當中c為張量頻道數(channel)，h為高，w為寬，d為深，MixConv技術會將X從c維度均分為五個大小為(c⁄5,h,w,d)的部分張量G1,G2,…,G5，之後，這五個部分張量G1,G2,…,G5會各自以一個卷積核Wi(i=1~5)進行卷積運算，在此設定W1到W5的大小分別為(3,3,3)、(5,5,3)、(7,7,5)、(9,9,7)、(11,11,9)，由於考慮到CT影像在深度方向上的解析度通常較低，卷積核W2到W5在深度軸上比較短，因此在進行卷積運算時，利用適當的零填充(zero-padding)保持各張量之高、寬、深不變，經此卷積運算會對每個部份張量產出一部分輸出張量Yi(i=1~5)，再將Y1到Y5於c維度進行連接(concatenation)後產生輸出張量Y。As shown in Figure 6, the model of the false positive exclusion stage 12 is composed of 1 layer of 3D convolution layer (convolution), 3 layers of 3D mixed depthwise convolution layer (mixed depthwise convolution, MixConv), and 3 layers of fully connected layers (fully connected layers). The three-dimensional convolutional neural network (3D Convolutional Neural Network, 3D CNN) model 122. The present invention uses batch normalization and dropout in the model to avoid the problem of overfitting, and inserts a Rectified Linear Unit (ReLU) between layers as an activation function (activation function) function). The cuboid stereoscopic image input to this 3D CNN model 122 will first pass through 4 layers of convolution layers (1 layer of 3D convolution layers and 3 layers of mixed depth convolution layers) to be converted into feature maps, and then flattened and input to fully connected Floor. In order to make full use of the features of different scales in CT images and reduce the amount of calculation, the present invention uses the MixConv technology, in each layer of the 3-layer 3D mixed depth convolution layer, five different sizes of convolution kernels (kernel) are used to input tensors. Perform operations, as shown in Figure 7 for a detailed description of the MixConv technique, where the input tensor of a MixConv layer is X, that is, X is a tensor of size (c, h, w, d), where c is The number of tensor channels (channel), h is height, w is width, d is depth, MixConv technology will divide X from c dimension into five partial tensors of size (c⁄5,h,w,d) G1, G2,..., G5, after that, these five partial tensors G1, G2,..., G5 will each perform a convolution operation with a convolution kernel Wi (i=1~5), and set the W1 to W5 here. The sizes are (3,3,3), (5,5,3), (7,7,5), (9,9,7), (11,11,9), because considering the depth of CT images The resolution in the direction is usually low, and the convolution kernels W2 to W5 are relatively short on the depth axis. Therefore, when performing convolution operations, use appropriate zero-padding to maintain the height, width and depth of each tensor. Unchanged, after this convolution operation, a part of the output tensor Yi (i=1~5) will be produced for each part of the tensor, and then Y1 to Y5 will be concatenated in the c dimension to produce the output tensor Y .

偽陽性排除階段12的模型的訓練方法是在0.001的學習速率下，訓練90個epochs後完成，且為了達成資料強化(data augmentation)的效果，訓練過程中會對圖像進行隨機旋轉(rotation)，並此外，還可進一步利用Smith (Smith, L.N., A disciplined approach to neural network hyper-parameters: Part 1--learning rate, batch size, momentum, and weight decay. arXiv preprint arXiv:1803.09820, 2018.) 提出的one-cycle訓練策略調整學習速率，以達成更好的訓練效果。The training method of the model in stage 12 of false positive exclusion is to complete the training after 90 epochs at the learning rate of 0.001, and in order to achieve the effect of data augmentation, the image will be randomly rotated during the training process (rotation) , and in addition, it can be further proposed by Smith (Smith, L.N., A disciplined approach to neural network hyper-parameters: Part 1--learning rate, batch size, momentum, and weight decay. arXiv preprint arXiv:1803.09820, 2018.) The one-cycle training strategy adjusts the learning rate to achieve better training results.

由上述之說明可知，本發明藉由在訓練過程中採用的負難例挖掘(HNM)技術，可從訓練流程的角度切入，改善模型的偽陽性預測率(false positive rate)，降低偽陽性排除模型的負擔和訓練成本。且藉由在肺結節偵測階段選用mask RCNN圖像分割模型，而能夠更精準地指出肺結節在圖像中的位置。此外，更藉由在肺結節偵測階段使用較節省運算資源的2D平面圖像訓練模型，並以三通道合成(three-channel superimposition)影像處理技術增加2D圖像中的立體空間資訊，而可在訓練成本和模型表現之間取得平衡，From the above description, it can be seen that the present invention can improve the false positive rate of the model and reduce the false positive rate from the perspective of the training process by adopting the negative difficult mining (HNM) technology in the training process. Model burden and training cost. And by selecting the mask RCNN image segmentation model in the lung nodule detection stage, the position of the lung nodule in the image can be more accurately pointed out. In addition, by using the 2D plane image training model with less computing resources in the lung nodule detection stage, and using the three-channel superimposition image processing technology to increase the three-dimensional spatial information in the 2D image, it is possible to To strike a balance between training cost and model performance,

儘管本發明已透過上述實施例來說明，可理解的是，根據本發明的精神及本發明所主張的申請專利範圍，許多修飾及變化都是可能的。Although the present invention has been illustrated by the foregoing embodiments, it will be understood that many modifications and variations are possible in accordance with the spirit of the invention and the scope of the claimed invention.

11:肺結節偵測階段 12:偽陽性排除階段 111, 121:資料準備及處理 112:第一資料集 113:Mask R-CNN圖像分割模型 114:負難例挖掘 115:第二資料集 116:Mask R-CNN模型 122:3D CNN模型 S21~S23:步驟 S31~S33:步驟 S321~S329:步驟 S51~S52:步驟 11: Pulmonary Nodule Detection Phase 12: False positive exclusion stage 111, 121: Data preparation and processing 112: The first data set 113: Mask R-CNN Image Segmentation Model 114: Negative Hard Case Mining 115: Second data set 116: Mask R-CNN Model 122: 3D CNN Models S21~S23: Steps S31~S33: Steps S321~S329: Steps S51~S52: Steps

圖1是本發明的基於胸部低劑量電腦斷層影像之深度學習肺結節偵測方法的架構圖；圖2是依據本發明的方法的10等份交叉驗證的流程圖；圖3是依據本發明的方法的肺結節偵測階段的資料處理流程圖；圖4顯示依據本發明的方法的肺部區域偵測的輸出圖像；圖5是依據本發明的方法的肺結節偵測階段的模型訓練流程；圖6示意地顯示依據本發明的方法的偽陽性排除階段的3D CNN模型；以及圖7示意地顯示依據本發明的方法於3D CNN模型中使用MixConv技術。 1 is a schematic diagram of a deep learning pulmonary nodule detection method based on chest low-dose computed tomography images of the present invention; Figure 2 is a flow chart of 10-aliquot cross-validation according to the method of the present invention; 3 is a data processing flow chart of the lung nodule detection phase of the method according to the present invention; Fig. 4 shows the output image of lung region detection according to the method of the present invention; 5 is a model training process in a lung nodule detection stage according to the method of the present invention; Figure 6 schematically shows the 3D CNN model of the false positive exclusion stage of the method according to the invention; and Figure 7 schematically shows the use of the MixConv technique in a 3D CNN model according to the method of the present invention.

11:肺結節偵測階段 11: Pulmonary Nodule Detection Phase

12:偽陽性排除階段 12: False positive exclusion stage

111,121:資料準備及處理 111, 121: Data preparation and processing

112:第一資料集 112: The first data set

113:Mask R-CNN圖像分割模型 113: Mask R-CNN Image Segmentation Model

114:負難例挖掘 114: Negative Hard Case Mining

115:第二資料集 115: Second data set

116:Mask R-CNN模型 116: Mask R-CNN Model

122:3D CNN模型 122: 3D CNN Models

Claims

A deep learning pulmonary nodule detection method based on chest low-dose computed tomography images, comprising: A lung nodule detection stage, including steps: (A) Prepare a source dataset containing most CT images and multiple pulmonary nodules, and perform a cross-validation on the source dataset, wherein the cross-validation is based on multiple training sets to train multiple Mask R- CNN model and obtain multiple test results; (B) Disassemble the CT images in the source data set into most 2D slice images, and perform image enhancement, lung region detection, and three-channel synthesis on the CT images to obtain a 2D image containing multiple composites The first dataset of images; (C) Use a Mask R-CNN image segmentation model to accept the first data set for training, and use the multiple Mask R-CNN models trained by the cross-validation to perform negative hard example mining to filter out negative case; and (D) Integrate the negative and difficult examples with multiple training sets of the cross-validation to complete a second dataset with two categories of positive pulmonary nodules and hard negative examples, and use the second dataset to train a Mask R - CNN models; and A false positive exclusion stage, including steps: (E) Take the marked lung nodules in the source data set as positive cases, and then screen out false positive cases as negative cases from the cross-validation test results in the lung nodule detection stage, and combine them into a nodule with nodules and non-nodules Wait for the third data set of the second category, and then use the CT image of the source data set to use the Mask R-CNN model trained in the lung nodule detection stage to predict the centroid coordinates of the false positive results as the center point, and cut out the cuboid stereoscopic images; and (F) A 3D CNN model is trained on the cuboid stereo image.

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images as claimed in claim 1, wherein, in step (A), the cross-validation is a 10-section cross-validation, which includes the steps: (A1) The source dataset is randomly divided into 10 equal sub-data sets, including sub-data sets (0) ~ sub-data sets (9), with CT images as the unit; and (A2) with the index value i=0~9, carry out the following steps (A3)~(A5); (A3) Select the sub-data set (i) as the test set (i), and at the same time combine the other 9 equal sub-data sets into the training set (i); (A4) train a Mask R-CNN model (i) with the training set (i); and (A5) Test the performance of the Mask R-CNN model (i) with the sub-dataset (i) to obtain the test results of the Mask R-CNN model (i) in its test set (i).

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images according to claim 1, wherein, in step (B), the HU value in the CT image of the source data set is based on the following image enhancement formula Perform operations to achieve the effect of image enhancement:

, where a and b are determined by the window width and window level selected for the CT image sequence.

The deep learning lung nodule detection method based on chest low-dose computed tomography images as claimed in claim 1, wherein, in step (B), the lung region detection is performed on the image-enhanced slice images by a series of The Image Morphology function performs lung region detection to remove noise outside the lung.

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images according to claim 4, wherein the lung area detection comprises: image binarization step; boundary removal step; connected area marking step; connected area A screening step; an erosion operation step; a closure operation step; an expansion operation step; a void filling step; and an image overlay step.

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images as claimed in claim 1, wherein, in step (B), the three-channel synthesis is to use the RGB of the image for three consecutive slice images in depth. The channels are combined into a synthetic 2D image for training to increase the three-dimensional spatial information, and for the same lung nodule, three consecutive 2D images with different depths are selected as training data, so that the features of the 3D space can be extracted from the 2D images. .

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images according to claim 1, wherein the Mask R-CNN image segmentation model is a Mask R-CNN image segmentation model with ResNet50 as the backbone.

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images as claimed in claim 1, wherein the mining of negative difficult cases is based on 10 Mask R-CNN models trained by the 10 equal parts cross-validation. The respective training sets are predicted, and then a prediction result that meets a preset condition is selected from the prediction, so as to define the prediction result as a negative difficult case.

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images as claimed in claim 3, wherein, in step (E), image enhancement is further performed on the cut-out cuboid three-dimensional image using the image enhancement formula.

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images as claimed in claim 1, wherein, in step (F), the 3D CNN model is composed of 1 layer of 3D convolutional layers, 3 layers of 3D mixed depth volume It consists of built-up layers and 3 fully connected layers.

The deep learning lung nodule detection method based on chest low-dose computed tomography image as claimed in claim 10, wherein, in the 3D CNN model, batch normalization and discarding method are used to avoid overfitting, and the layer and A linear rectification function is inserted between the layers as the excitation function.

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images according to claim 10, wherein the cuboid stereoscopic image input into the 3D CNN model first passes through four convolution layers to convert into feature maps and then is compressed Flatten and enter the fully connected layer.

The deep learning pulmonary nodule detection method based on chest low-dose computed tomography images as claimed in claim 10, wherein, in the 3D CNN model, each layer of the 3-layer 3D mixed depth convolutional layer is composed of five kinds of Convolution kernels of different sizes operate on the input tensors to make full use of features at different scales in CT images and reduce computation.

A computer program product, stored in a non-transitory computer-readable medium, the computer program product having an instruction to cause a computer system to perform the chest-based low-dose computed tomography of any one of claims 1 to 13 Image-based deep learning pulmonary nodule detection method.