
CN114549462A - Lesion detection method, apparatus, device and medium based on a view-decoupled Transformer model - Google Patents

Lesion detection method, apparatus, device and medium based on a view-decoupled Transformer model

Info

Publication number
CN114549462A
Authority
CN
China
Prior art keywords
data
slice
level feature
image data
model
Prior art date
Legal status
Pending
Application number
CN202210162528.3A
Other languages
Chinese (zh)
Inventor
李灏峰
黄俊嘉
李冠彬
刘周
钟贻洪
陈影影
王云飞
罗德红
万翔
Current Assignee
Chinese University of Hong Kong Shenzhen
Original Assignee
Chinese University of Hong Kong Shenzhen
Priority date
Filing date
Publication date
Application filed by Chinese University of Hong Kong Shenzhen
Priority to CN202210162528.3A
Publication of CN114549462A


Classifications

    • G06T 7/0012 Biomedical image inspection (G06T 7/00 Image analysis)
    • G06F 18/253 Fusion techniques of extracted features (G06F 18/00 Pattern recognition)
    • G06N 3/045 Combinations of networks (G06N 3/00 Computing arrangements based on biological models)
    • G06N 3/08 Learning methods (G06N 3/02 Neural networks)
    • G06T 2207/10088 Magnetic resonance imaging [MRI] (G06T 2207/10 Image acquisition modality)
    • G06T 2207/20081 Training; Learning (G06T 2207/20 Special algorithmic details)
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30016 Brain (G06T 2207/30 Subject of image)
    • G06T 2207/30096 Tumor; Lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a lesion detection method, apparatus, device and storage medium based on a view-decoupled Transformer model. The method comprises: acquiring MRI image data of a user to be tested and preprocessing the MRI image data; obtaining slice data from the preprocessed MRI image data and inputting it into a pre-trained lesion detection model; performing feature extraction on the slice data to obtain high-level feature data; fusing the high-level feature data at different scales from different views to obtain high-level feature prediction data; and predicting the lesion of the user to be tested from the high-level feature prediction data. By fusing the extracted feature data at different scales from different views, the contextual information of the input slices is enhanced, multi-scale feature prediction is realized, the localization accuracy of lesions is improved, early-stage tumors are recognized strongly, and detection accuracy is effectively improved.

Description

Lesion Detection Method, Apparatus, Device and Medium Based on a View-Decoupled Transformer Model

Technical Field

The present invention relates to the technical fields of deep learning and medical imaging, and in particular to a lesion detection method, apparatus, device and storage medium based on a view-decoupled Transformer model.

Background

With the continuous improvement of living standards and the rapid development of medical technology, people have higher demands for health, and medical imaging technology has accordingly developed vigorously. Using medical image analysis to classify tissue and cell images efficiently and accurately can help doctors better explore treatment options for lesions.

At present, in traditional lesion detection, taking brain tumors as an example, template matching is usually used to compute the position of a predefined brain-tumor template in the image; such methods suffer from hand-crafted features that are insufficiently precise. Alternatively, lesion segmentation methods can identify the tumor location, but they require pixel-level labels for every image, demand the cooperation of professional radiologists, and obtaining such labels is expensive. In yet other methods, training relies on datasets of large tumors, or tumor boxes are predicted with existing 2D object-detection networks; these methods have low recognition for small tumors and lack contextual fusion of 3D features.

It follows that, because brain tumors are very small at early stages, existing approaches easily miss or confuse them, making detection difficult and reducing detection accuracy.

Summary of the Invention

In view of this, it is necessary to provide, for the above technical problem, a lesion detection method, apparatus, device and storage medium based on a view-decoupled Transformer model, so as to solve the problem in the prior art that, in automatic brain-tumor detection, early-stage lesions are small and difficult to detect, which affects detection accuracy.

In a first aspect, a lesion detection method based on a view-decoupled Transformer model is provided, comprising:

acquiring MRI image data of a user to be tested, and preprocessing the MRI image data;

obtaining slice data from the preprocessed MRI image data, and inputting it into a pre-trained lesion detection model;

performing feature extraction on the slice data to obtain high-level feature data;

fusing the high-level feature data at different scales from different views to obtain high-level feature prediction data;

and predicting the lesion of the user to be tested from the high-level feature prediction data.

In one embodiment, before the inputting into the pre-trained lesion detection model, the method comprises:

collecting multiple MRI sample image data, and marking the lesion region of the cross-sectional slices of each MRI sample image with bounding boxes;

preprocessing the collected MRI sample image data;

creating an original lesion detection model, converting the preprocessed MRI sample image data into sample slice data, inputting it into the original lesion detection model, and performing iterative learning according to the labels until the pre-trained lesion detection model is obtained.

In one embodiment, the performing feature extraction on the slice data to obtain high-level feature data comprises:

dividing the slice data into a plurality of small patches, the patches being single-channel images;

embedding the single-channel images into a target channel to obtain an image with the target number of channels;

obtaining the high-level feature data from the image with the target number of channels.

In one embodiment, the fusing of the extracted feature data at different scales from different views to obtain high-level feature prediction data comprises:

dividing the high-level feature data into a plurality of small windows, each window containing M×M of the patches;

performing self-attention computation within each window to obtain contextual connections;

fusing the high-level feature data from different views according to the contextual connections to obtain high-level feature prediction data.

In one embodiment, the performing self-attention computation within each shifted window to obtain contextual connections comprises:

shifting each window by M/2 patch regions toward a target direction to form a boundary region;

supplementing the boundary region back to the original region to form a plurality of new windows;

performing self-attention computation within the new windows to obtain contextual connections.

In one embodiment, after the converting of the preprocessed MRI sample image data into sample slice data and inputting it into the original lesion detection model, the method comprises:

extracting a preset number of candidate regions through a region proposal network;

computing the intersection-over-union (IoU) between each candidate region and the ground-truth region;

dividing the candidate regions into positive and negative samples according to the difference between the IoU and a preset IoU threshold, and sampling the positive and negative samples so that the ratio between them matches a preset ratio;

sending the candidate regions that match the preset ratio to a pooling layer for pooling, and performing lesion prediction.

In one embodiment, the obtaining slice data from the preprocessed MRI image data and inputting it into the pre-trained lesion detection model comprises:

performing cross-sectional slicing of the preprocessed MRI image data from different views to obtain multiple slices;

selecting a highly continuous slice sequence from the multiple slices and inputting it into the pre-trained lesion detection model; or

selecting a single slice from the multiple slices and inputting it into the pre-trained lesion detection model.

In a second aspect, a lesion detection apparatus based on a view-decoupled Transformer model is provided, comprising:

an image data acquisition module, configured to acquire MRI image data of a user to be tested and preprocess the MRI image data;

a slice data acquisition module, configured to obtain slice data from the preprocessed MRI image data and input it into a pre-trained lesion detection model;

a feature extraction module, configured to perform feature extraction on the slice data to obtain high-level feature data;

a view decoupling module, configured to fuse the high-level feature data at different scales from different views to obtain high-level feature prediction data;

a prediction module, configured to predict the lesion of the user to be tested from the high-level feature prediction data.

In a third aspect, a computer device is provided, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the steps of the lesion detection method based on the view-decoupled Transformer model described above.

In a fourth aspect, one or more readable storage media are provided, the readable storage media storing computer-readable instructions which, when executed by a processor, implement the steps of the lesion detection method based on the view-decoupled Transformer model described above.

In the above lesion detection method, apparatus, device and storage medium based on a view-decoupled Transformer model, the method comprises: acquiring MRI image data of a user to be tested and preprocessing the MRI image data; obtaining slice data from the preprocessed MRI image data and inputting it into a pre-trained lesion detection model; performing feature extraction on the slice data to obtain feature data at multiple different scales; fusing the feature data at different scales from different views to obtain high-level feature data; and predicting the lesion of the user to be tested from the high-level feature data. In the present application, features are extracted from the slice data and the extracted features are fused at different scales from different views, which enhances the contextual information of the input slices, realizes multi-scale feature prediction, improves lesion localization accuracy, yields a strong recognition effect for small tumors, avoids misses and confusion at early tumor stages, and effectively improves detection accuracy.

Brief Description of the Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.

Fig. 1 is a schematic flowchart of a lesion detection method based on a view-decoupled Transformer model according to an embodiment of the present invention;

Fig. 2 is a schematic flowchart of extracting high-level feature data according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of an implementation scenario of view coupling across different views according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a lesion detection apparatus based on a view-decoupled Transformer model according to an embodiment of the present invention;

Fig. 5 is another schematic structural diagram of a lesion detection apparatus based on a view-decoupled Transformer model according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

In one embodiment, as shown in Fig. 1, a lesion detection method based on a view-decoupled Transformer model is provided, comprising the following steps.

In step S110, MRI image data of the user to be tested is acquired, and the MRI image data is preprocessed.

In this embodiment of the present application, the user's MRI (Magnetic Resonance Imaging) data may be acquired with a magnetic resonance scanner.

In this embodiment of the present application, preprocessing the acquired MRI image data includes aligning voxel or pixel spacing, normalizing the data, and cropping and scaling the MRI data to a pixel size of 512×512.
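As a rough illustration of this preprocessing step, the sketch below resamples an MRI volume to a common voxel spacing, z-score normalizes the intensities, and center-crops or zero-pads each slice to 512×512. The target spacing, the normalization choice and the helper name are illustrative assumptions, not details fixed by the present application.

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess_mri(volume, spacing, target_spacing=(1.0, 1.0, 1.0), size=512):
    """Hypothetical preprocessing: resample, normalize, crop/pad in-plane to size x size.

    volume  : (H, W, D) float array of MRI intensities
    spacing : per-axis voxel spacing of `volume` in mm
    """
    # Resample so every scan shares the same voxel spacing.
    factors = [s / t for s, t in zip(spacing, target_spacing)]
    volume = zoom(volume, factors, order=1)

    # Z-score intensity normalization (one common choice; the application
    # only says "data normalization").
    volume = (volume - volume.mean()) / (volume.std() + 1e-8)

    # Center-crop or zero-pad each in-plane axis to `size`.
    out = np.zeros((size, size, volume.shape[2]), dtype=volume.dtype)
    h, w = volume.shape[:2]
    ch, cw = min(h, size), min(w, size)
    oh, ow = (size - ch) // 2, (size - cw) // 2
    ih, iw = (h - ch) // 2, (w - cw) // 2
    out[oh:oh + ch, ow:ow + cw] = volume[ih:ih + ch, iw:iw + cw]
    return out
```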

In step S120, slice data is obtained from the preprocessed MRI image data and input into the pre-trained lesion detection model.

In this embodiment of the present application, after the preprocessed MRI image data is obtained, cross-sectional slices of the MRI image data may be taken from different views through image segmentation to obtain the slice data; for example, the MRI image data may be sliced by a quantile-based partition method.

In this embodiment of the present application, the slice data may be a slice sequence or a single slice; taking T slices as an example, the slice sequence may be T consecutive slices.

In step S130, feature extraction is performed on the slice data to obtain high-level feature data.

In this embodiment of the present application, a Swin Transformer network combined with a feature pyramid network (FPN) structure may be used as the extraction network. When slice data is input, a patch-partition layer splits each slice into multiple small patches, a linear embedding layer embeds the partitioned patches into C channels, and Swin Transformer blocks (shifted-window modules) and patch-merging layers process them to output high-level feature maps. Then, through convolution and upsampling operations, high-level feature data of the same channel count fused at different scales can be obtained.
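A minimal PyTorch sketch of the patch partition and linear embedding just described, assuming 4×4 patches as in the embodiment detailed later; the strided convolution is the standard way to realize this pair of layers, and the module name is illustrative.

```python
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Illustrative patch partition + linear embedding (Swin-style).

    Splits a (B, 1, H, W) slice into non-overlapping 4x4 patches and
    projects each patch to C channels, giving a (B, C, H/4, W/4) map.
    """
    def __init__(self, in_ch=1, embed_dim=96, patch=4):
        super().__init__()
        self.proj = nn.Conv2d(in_ch, embed_dim, kernel_size=patch, stride=patch)

    def forward(self, x):
        return self.proj(x)

emb = PatchEmbed()
feat = emb(torch.randn(2, 1, 512, 512))  # -> (2, 96, 128, 128)
```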

Here, the high-level feature data may be a feature matrix with a high degree of abstraction and rich expressive power.

Here, a patch may be a single-channel image, and the C channel may be a vector of constant dimension C; by embedding the partitioned patches into the C channels, an image with C channels is obtained.

In this embodiment of the present application, the partitioned patches may pass through multiple shifted-window modules and patch-merging layers to obtain the required high-level feature data. The patch-merging layer may perform a downsampling operation; each downsampling doubles the number of channels of the patches, which effectively reduces the resolution of the high-level feature data.

In step S140, the feature data is fused at different scales from different views to obtain high-level feature prediction data.

In this embodiment of the present application, the MRI image data may be a three-dimensional image, with W denoting the X axis, T the Y axis, and H the Z axis. The different views may denote different views of this three-dimensional image, for example the T×W view, the T×H view and the H×W view.

In this embodiment of the present application, the view-decoupled Transformer may fuse the extracted high-level feature data at different scales from different views. Taking slice data consisting of a sequence of T slices as an example: the high-level feature data may include feature data at multiple scales from the sequence of T slices. In each view, attention is computed with shifted windows; that is, the input high-level feature data is divided into T slices from each view, each slice is then divided into multiple small windows by the shifted-window mechanism, and self-attention is computed inside each window, i.e. for every window on every slice, yielding contextual connections over the three-dimensional image. The sliding property of the shifted windows is then used to fuse the T divided slices, and the middle slice, i.e. slice T/2, is taken as the final high-level feature prediction data for prediction.

In step S150, the lesion of the user to be tested is predicted from the high-level feature prediction data.

In this embodiment of the present application, a cascade R-CNN (region-based convolutional neural network) model predicts the lesion of the user to be tested from the input high-level feature prediction data. Specifically, the cascade R-CNN model is composed of multiple detection models (Faster R-CNN); each detection model is trained on positive and negative samples defined by a different IoU threshold, the output of the preceding detection model serves as the input of the following one, and the IoU threshold for positive and negative samples rises with the order of the detection models.
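A schematic sketch of how such a cascade refines boxes stage by stage with rising IoU thresholds. The thresholds 0.5/0.6/0.7 are common Cascade R-CNN defaults and the head interface is hypothetical; neither is specified by the present application.

```python
# Hypothetical cascade of detection heads with rising IoU thresholds.
IOU_THRESHOLDS = [0.5, 0.6, 0.7]  # assumed values, typical for Cascade R-CNN

def cascade_predict(features, proposals, heads):
    """Each head refines the boxes produced by the previous head.

    heads : callables (features, boxes) -> (refined_boxes, scores),
            the i-th trained on positives defined by IOU_THRESHOLDS[i].
    """
    boxes, scores = proposals, None
    for head, thr in zip(heads, IOU_THRESHOLDS):
        # During training, boxes with IoU >= thr against the ground truth
        # would count as positives for this stage.
        boxes, scores = head(features, boxes)
    return boxes, scores
```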

In this embodiment of the present application, through the cascade R-CNN model, a preset number of candidate regions, for example 300, can be screened out of the input high-level feature prediction data; the candidate regions are input into an ROI pooling layer for pooling, after which class prediction and bounding-box regression are performed to determine whether the user to be tested has a lesion.

An embodiment of the present application provides a lesion detection method based on a view-decoupled Transformer model, comprising: acquiring MRI image data of a user to be tested and preprocessing the MRI image data; obtaining slice data from the preprocessed MRI image data and inputting it into a pre-trained lesion detection model; performing feature extraction on the slice data to obtain feature data at multiple different scales; fusing the feature data at different scales from different views to obtain high-level feature data; and predicting the lesion of the user to be tested from the high-level feature data. In the present application, features are extracted from the slice data and fused at different scales from different views, which enhances the contextual information of the input slices, realizes multi-scale feature prediction, improves lesion localization accuracy, yields a strong recognition effect for small tumors, avoids misses and confusion at early tumor stages, and effectively improves detection accuracy. With this method, when a user undergoes MRI imaging, the position of a suspected tumor can be marked in the 2D MRI slices, helping doctors locate and diagnose quickly.

In one embodiment, the present application further provides an implementation flow of a lesion detection method based on a view-decoupled Transformer model, comprising the following steps.

In step S110, MRI image data of the user to be tested is acquired, and the MRI image data is preprocessed.

In this embodiment of the present application, the user's MRI (Magnetic Resonance Imaging) data may be acquired with a magnetic resonance scanner.

In this embodiment of the present application, preprocessing the acquired MRI image data includes aligning voxel or pixel spacing, normalizing the data, and cropping and scaling the MRI data to a pixel size of 512×512.

In step S120, slice data is obtained from the preprocessed MRI image data and input into the pre-trained lesion detection model.

In an embodiment of the present application, the obtaining slice data from the preprocessed MRI image data and inputting it into the pre-trained lesion detection model comprises:

performing cross-sectional slicing of the preprocessed MRI image data from different views to obtain multiple slices;

selecting a highly continuous slice sequence from the multiple slices and inputting it into the pre-trained lesion detection model; or

selecting a single slice from the multiple slices and inputting it into the pre-trained lesion detection model.

In this embodiment of the present application, after the preprocessed MRI image data is obtained, cross-sectional slices of the MRI image data may be taken from different views through image segmentation to obtain the slice data; for example, the MRI image data may be sliced by a quantile-based partition method.

In this embodiment of the present application, the slice data may be a slice sequence or a single slice; taking T slices as an example, the highly continuous slice sequence may be a group of T consecutive slices. T consecutive slices of the same MRI image data represent the information of the middle slice T/2, and at the volume edges, slices with all-zero pixels can be used as padding.
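A small sketch, under assumed array shapes, of assembling the T consecutive slices centered on a given slice, padding with zero slices at the volume edges as described above.

```python
import numpy as np

def slice_stack(volume, center, T=9):
    """Hypothetical helper: T consecutive slices centered on `center`.

    volume : (H, W, D) preprocessed MRI volume.
    Slices falling outside the volume are replaced by all-zero slices.
    """
    H, W, D = volume.shape
    half = T // 2
    stack = np.zeros((H, W, T), dtype=volume.dtype)
    for i, z in enumerate(range(center - half, center - half + T)):
        if 0 <= z < D:
            stack[:, :, i] = volume[:, :, z]
    return stack
```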

In this embodiment of the present application, the selected slice or slice sequence may be a slice containing the imaging region.

In an embodiment of the present application, before the obtaining slice data from the preprocessed MRI image data and inputting it into the pre-trained lesion detection model, the method comprises:

collecting multiple MRI sample image data, and marking the lesion region of the cross-sectional slices of each MRI sample image with bounding boxes;

preprocessing the collected MRI sample image data;

creating an original lesion detection model, converting the preprocessed MRI sample image data into sample slice data, inputting it into the original lesion detection model, and performing iterative learning according to the labels until the pre-trained lesion detection model is obtained.

In this embodiment of the present application, the MRI sample image data may be MRI images of patients, for example brain MRI images.

In this embodiment of the present application, preprocessing the collected MRI sample image data includes aligning voxel/pixel spacing, data normalization, and cropping and scaling to a pixel size of 512×512.

In this embodiment of the present application, an original lesion detection model is created and the preprocessed sample slice data is input into it; after the original lesion detection model performs feature extraction and view decoupling on the sample slice data, the lesion is predicted by a cascade R-CNN (region-based convolutional neural network) model.

In an embodiment of the present application, after the converting of the preprocessed MRI sample image data into sample slice data and inputting it into the original lesion detection model, the method comprises:

extracting a preset number of candidate regions through a region proposal network;

computing the IoU between each candidate region and the ground-truth region;

dividing the candidate regions into positive and negative samples according to the difference between the IoU and a preset IoU threshold, and sampling the positive and negative samples so that the ratio between them matches a preset ratio;

sending the candidate regions that match the preset ratio to a pooling layer for pooling, and performing lesion prediction.

Specifically, a preset number of candidate regions, for example about 2000, are screened out through the region-based convolutional neural network and input into the subsequent network structure; the IoU between each candidate region and the ground-truth region is computed, and according to the difference between this IoU and the preset IoU threshold, the candidate regions are divided into positive samples (foreground) and negative samples (background). The positive and negative samples are sampled so that their ratio matches a preset ratio, for example 1:3 (the total number of positive and negative samples may be 128). The sampled candidate regions are then input into the ROI pooling layer for class prediction and bounding-box regression, and iterative learning proceeds until the trained lesion detection model is obtained.

Here, the ground-truth region is the real lesion region confirmed by a doctor.
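The sampling step above might look like the following sketch, which divides candidates by an assumed IoU threshold of 0.5 and draws positives and negatives at the 1:3 ratio mentioned in the embodiment (32 + 96 = 128).

```python
import numpy as np

def sample_proposals(ious, iou_thr=0.5, num=128, pos_fraction=0.25):
    """Hypothetical 1:3 positive/negative sampling of candidate regions.

    ious : (N,) max IoU of each candidate region with any ground-truth box.
    Returns indices of sampled positive and negative candidates.
    """
    pos = np.flatnonzero(ious >= iou_thr)   # foreground candidates
    neg = np.flatnonzero(ious < iou_thr)    # background candidates
    n_pos = min(len(pos), int(num * pos_fraction))   # e.g. 32
    n_neg = min(len(neg), num - n_pos)               # e.g. 96 -> 1:3
    pos = np.random.choice(pos, n_pos, replace=False)
    neg = np.random.choice(neg, n_neg, replace=False)
    return pos, neg
```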

In step S130, feature extraction is performed on the slice data to obtain high-level feature data. In this embodiment of the present application, a Swin Transformer network combined with a feature pyramid network (FPN) structure may be used as the extraction network.

In an embodiment of the present application, performing feature extraction on the slice data to obtain high-level feature data comprises:

dividing the slice data into a plurality of small patches, the patches being single-channel images;

embedding the single-channel images into a target channel to obtain an image with the target number of channels;

obtaining the high-level feature data from the image with the target number of channels.

Referring to Fig. 2, slice data of size 1×1×H×W×T is input. The patch-partition layer splits each slice into H/4×W/4 patches, each patch being a 4×4×T single-channel image, and the linear embedding layer embeds them into the target channel, for example the C channel, giving a C×H/4×W/4×T image. The resulting C×H/4×W/4×T image is then processed by multiple Swin Transformer blocks (shifted-window modules) and patch-merging layers to obtain a series of high-level feature maps.

Specifically, in one implementation scenario, taking four shifted-window modules as an example: the first shifted-window module processes the C×H/4×W/4×T feature map, applies a 1×1 convolution, adds the feature maps, and applies a 3×3 convolution to obtain the fused C×H/4×W/4×T feature map; meanwhile, the first module sends its processed C×H/4×W/4×T feature map to the first patch-merging layer for downsampling, forming a 2C×H/4×W/4×T feature map that is input to the second shifted-window module. The second module applies a 1×1 convolution to the 2C×H/4×W/4×T feature map, adds the feature maps, applies a 3×3 convolution, then upsamples by a factor of 2 and fuses the result with the C×H/4×W/4×T feature map produced by the first module's 1×1 convolution; after another 3×3 convolution, the fused C×H/8×W/8×T feature map is obtained. Meanwhile, the second module passes its processed 2C×H/4×W/4×T feature map to the second merging layer for downsampling, forming a 4C×H/4×W/4×T feature map that is input to the third shifted-window module. The third module applies a 1×1 convolution to the 4C×H/4×W/4×T feature map, adds the feature maps, applies a 3×3 convolution, then upsamples by a factor of 2 and fuses with the 2C×H/4×W/4×T feature map from the second module's 1×1 convolution; after another 3×3 convolution, the fused C×H/16×W/16×T feature map is obtained. Meanwhile, the third module sends its processed 4C×H/4×W/4×T feature map to the third merging layer for downsampling, forming an 8C×H/4×W/4×T feature map that is sent to the fourth shifted-window module. The fourth module processes the 8C×H/4×W/4×T feature map, applies a 1×1 convolution, performs 2× up- and down-sampling respectively, and applies 3×3 convolutions to form the C×H/32×W/32×T and C×H/64×W/64×T feature maps.

From this it can be seen that, after multiple 1×1 convolution and upsampling operations, high-level feature maps of the same channel count fused at different scales are obtained, namely C×H/4×W/4×T, C×H/8×W/8×T, C×H/16×W/16×T, C×H/32×W/32×T and C×H/64×W/64×T.
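A condensed PyTorch sketch of this kind of FPN-style fusion: a 1×1 lateral convolution per stage, 2× upsampling and addition of the deeper map, and a 3×3 smoothing convolution. Channel counts and the number of stages are simplified assumptions relative to the embodiment above.

```python
import torch.nn as nn
import torch.nn.functional as F

class FPNFuse(nn.Module):
    """Illustrative multi-scale fusion to a common channel count C."""
    def __init__(self, in_chs=(96, 192, 384, 768), C=96):
        super().__init__()
        self.lateral = nn.ModuleList([nn.Conv2d(c, C, 1) for c in in_chs])
        self.smooth = nn.ModuleList([nn.Conv2d(C, C, 3, padding=1) for _ in in_chs])

    def forward(self, feats):          # feats ordered shallow -> deep
        lats = [l(f) for l, f in zip(self.lateral, feats)]
        outs = [lats[-1]]              # start from the deepest map
        for lat in reversed(lats[:-1]):
            up = F.interpolate(outs[0], size=lat.shape[-2:], mode="nearest")
            outs.insert(0, lat + up)   # add the 2x-upsampled deeper map
        return [s(o) for s, o in zip(self.smooth, outs)]
```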

Here, a patch may be a single-channel image, and the C channel is a vector of constant dimension C; by embedding the partitioned patches into the C channels, an image with C channels is obtained.

In this embodiment of the present application, the partitioned patches may pass through multiple shifted-window modules and patch-merging layers to obtain the required high-level feature data. The patch-merging layer may perform a downsampling operation; each downsampling doubles the number of channels of the patches, which effectively reduces the resolution of the high-level feature data.

In step S140, the high-level feature data is fused at different scales from different views to obtain high-level feature prediction data.

In this embodiment of the present application, the MRI image data may be a three-dimensional image, with W denoting the X axis, T the Y axis, and H the Z axis. The different views may denote different views of this three-dimensional image, for example the T×W view, the T×H view and the H×W view.

In an embodiment of the present application, the fusing of the feature data at different scales from different views to obtain high-level feature prediction data comprises:

dividing the high-level feature data into a plurality of small windows, each window containing M×M of the patches;

performing self-attention computation within each window to obtain contextual connections;

and fusing the high-level feature data of different scales from different views according to the contextual connections to obtain high-level feature prediction data.

In an embodiment of the present application, the high-level feature data includes multiple high-level feature maps of the same channel count fused at different scales, for example C×H/4×W/4×T, C×H/8×W/8×T, C×H/16×W/16×T, C×H/32×W/32×T and C×H/64×W/64×T.

In an embodiment of the present application, the view-decoupled Transformer model first divides the extracted high-level feature maps into multiple small windows through the shifted-window module, each window containing M×M of the patches produced by the patch-partition layer, and performs self-attention computation within each window. Specifically, the attention mechanism first assigns three different vectors to each patch in the window, namely a query vector (Q), a key vector (K) and a value vector (V). For each patch, a score Q·Kᵀ is computed and normalized by dividing by √d_k, where d_k is the dimension of the embedding vectors; a softmax activation is applied to the scores, the result is multiplied with the value vectors to obtain a weighted score for each input embedding, and the weighted results are summed to give the final output Z. The formula is as follows:

Z = softmax(QKᵀ / √d_k) · V

Here, the query vector (Q), key vector (K) and value vector (V) are obtained by multiplying the embedding vectors with respective weight matrices.
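A minimal sketch of this scaled dot-product self-attention inside one window, assuming the window's patch embeddings are already flattened into rows of a matrix; the single-head projection and the dimensions are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WindowAttention(nn.Module):
    """Illustrative single-head self-attention over one M x M window."""
    def __init__(self, dim):
        super().__init__()
        # Q, K, V are produced from the embeddings via learned weight matrices.
        self.qkv = nn.Linear(dim, dim * 3)
        self.scale = dim ** -0.5

    def forward(self, x):              # x: (num_windows, M*M, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v                # Z = softmax(Q K^T / sqrt(d_k)) V

attn = WindowAttention(96)
z = attn(torch.randn(4, 49, 96))       # 4 windows of 7 x 7 patches
```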

Here, the shifted-window module comprises a window attention module and a shifted-window attention module.

In an embodiment of the present application, the performing self-attention computation within each window to obtain contextual connections comprises:

shifting each window by M/2 patch regions toward a target direction to form a boundary region;

supplementing the boundary region back to the original region to form a plurality of new windows;

performing self-attention computation within the new windows to obtain contextual connections.

In this embodiment of the present application, since no information can be exchanged between the originally partitioned windows, the windows can be shifted toward the target direction; for example, each window may be shifted by M/2 patch regions to the right and downward. An offset region is thereby formed on the right and bottom, and this offset region is wrapped back to the left and top, i.e. back to the original region, forming multiple new windows. Performing self-attention computation within these new windows captures correlations between in-plane pixels and thereby the contextual connections.

Here, the offset region is the region by which the original region is shifted.
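The window shift can be realized as a cyclic shift of the feature map, as in the sketch below; the shift direction and the use of torch.roll follow the common Swin-style implementation and are assumptions, not mandated by the text.

```python
import torch

def shifted_window_pass(x, window_attention, M=8):
    """Illustrative cyclic shift around windowed attention (Swin-style).

    x : (B, H, W, C) feature map. Shifting by M/2 lets patches near the
    old window borders fall into common new windows, so information can
    cross the original window boundaries.
    """
    shifted = torch.roll(x, shifts=(-M // 2, -M // 2), dims=(1, 2))
    shifted = window_attention(shifted)   # window self-attention on shifted map
    return torch.roll(shifted, shifts=(M // 2, M // 2), dims=(1, 2))
```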

In this embodiment of the present application, referring to Fig. 3, the view-decoupled Transformer model operates in different views, for example the T×W, T×H and H×W views. In each view, the shifted-window module divides the input high-level feature maps into multiple small windows, and self-attention is computed inside each of the divided windows, yielding direct three-dimensional contextual connections, i.e. the correlations of two-dimensional in-plane pixels. The sliding property of the shifted windows is then used to fuse the T slices, and finally the middle slice, i.e. slice T/2, is taken as the high-level feature prediction data for the final prediction.

Here, the arrows F denote pairwise correlations among two-dimensional in-plane pixels of the different views. These correlations can be determined by the self-attention mechanism described above.

In step S150, the lesion of the user to be tested is predicted from the high-level feature prediction data.

In this embodiment of the present application, a cascade R-CNN (region-based convolutional neural network) model predicts the lesion of the user to be tested from the input high-level feature prediction data. Specifically, the cascade R-CNN model is composed of multiple detection models (Faster R-CNN); each detection model is trained on positive and negative samples defined by a different IoU threshold, the output of the preceding detection model serves as the input of the following one, and the IoU threshold for positive and negative samples rises with the order of the detection models.

In this embodiment of the present application, through the cascade R-CNN model, a preset number of candidate regions, for example 300, can be screened out of the input high-level feature prediction data; the candidate regions are input into an ROI pooling layer for pooling, after which class prediction and bounding-box regression are performed to determine whether the user to be tested has a lesion.

In this embodiment of the present application, features are extracted from the slice data and the extracted features are fused at different scales from different views, which enhances the contextual information of the input slices, realizes multi-scale feature prediction, improves lesion localization accuracy, yields a strong recognition effect for small tumors, avoids misses and confusion at early tumor stages, and effectively improves detection accuracy.

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.

In one embodiment, a lesion detection apparatus based on a view-decoupled Transformer model is provided, corresponding one-to-one to the lesion detection method based on the view-decoupled Transformer model in the above embodiments. As shown in Fig. 4, the lesion detection apparatus based on the view-decoupled Transformer model comprises: an image data acquisition module 10, a slice data acquisition module 20, a feature extraction module 30, a view decoupling module 40 and a prediction module 50. Each functional module is described in detail as follows:

the image data acquisition module 10 is configured to acquire MRI image data of a user to be tested and preprocess the MRI image data;

the slice data acquisition module 20 is configured to obtain slice data from the preprocessed MRI image data and input it into a pre-trained lesion detection model;

the feature extraction module 30 is configured to perform feature extraction on the slice data to obtain high-level feature data;

the view decoupling module 40 is configured to fuse the high-level feature data at different scales from different views to obtain high-level feature prediction data;

the prediction module 50 is configured to predict the lesion of the user to be tested from the high-level feature prediction data.

Referring to Fig. 5, in an embodiment of the present application, the lesion detection apparatus based on the view-decoupled Transformer model further comprises a pre-trained lesion detection model generation module 60, configured to:

collect multiple MRI sample image data, and mark the lesion region of the cross-sectional slices of each MRI sample image with bounding boxes;

preprocess the collected MRI sample image data;

create an original lesion detection model, convert the preprocessed MRI sample image data into sample slice data, input it into the original lesion detection model, and perform iterative learning according to the labels until the pre-trained lesion detection model is obtained.

在一实施例中,该特征提取模块30,还用于:In one embodiment, the feature extraction module 30 is further configured to:

将所述切片数据分割成多个小块,所述小块为单通道图像;dividing the slice data into a plurality of small blocks, the small blocks are single-channel images;

将所述单通道图像嵌入至目标通道中,以获取目标通道数的图像;Embedding the single-channel image into a target channel to obtain an image with a target number of channels;

根据所述目标通道数的图像,获取所述高层特征数据。Acquire the high-level feature data according to the image of the target channel number.

在一实施例中,视角解耦模块40,还用于:In one embodiment, the viewing angle decoupling module 40 is further configured to:

将所述高层特征数据划分为多个小窗口,所述小窗口包括M×M个所述小块;dividing the high-level feature data into a plurality of small windows, and the small windows include M×M small blocks;

在每个所述小窗口中进行自注意力计算,以获取上下文联系;perform self-attention computations in each of the small windows to obtain contextual connections;

根据所述上下文联系,通过不同的视角对所述高层特征数据进行融合,以获取高层特征预测数据。According to the context connection, the high-level feature data is fused through different perspectives to obtain high-level feature prediction data.

在一实施例中,视角解耦模块40,还用于:In one embodiment, the viewing angle decoupling module 40 is further configured to:

将每个所述小窗口分别向目标方向偏移M/2个块区域,形成偏移区域;Offset each of the small windows by M/2 block areas in the target direction to form an offset area;

将所述偏移区域补充至原始区域,以形成多个新的小窗口;supplementing the offset area to the original area to form a plurality of new small windows;

在所述新的小窗口中进行自注意力计算,以获取上下文联系。Self-attention computations are performed in the new widget to obtain contextual connections.

在实施例中,所述预先训练的病灶检测模型生成模块60,还用于:In an embodiment, the pre-trained lesion detection model generation module 60 is further used for:

通过区域生成网络提取预设数量的候选区域中;Extract a preset number of candidate regions through the region generation network;

计算所述候选区域与真实区域之间的交并比;Calculate the intersection ratio between the candidate area and the real area;

根据所述交并比与预设交并比阈值之间的差值,将所述候选区域划分为正样本以及负样本,并对所述正样本以及所述负样本进行采样,以将所述正样本以及所述负样本之间的比例配置为预设比例;According to the difference between the intersection ratio and a preset intersection ratio threshold, the candidate region is divided into positive samples and negative samples, and the positive samples and the negative samples are sampled, so that the The ratio between the positive samples and the negative samples is configured as a preset ratio;

将符合所述预设比例的候选区域发送至池化层进行池化处理,并进行病灶预测。The candidate regions that meet the preset ratio are sent to the pooling layer for pooling processing, and lesion prediction is performed.

在一实施例中,切片数据获取模块20,还用于:In one embodiment, the slice data acquisition module 20 is further configured to:

从进行不同视角对预处理后的MRI图像数据进行横截面切片处理,以获取多张切片;Perform cross-sectional slice processing on the preprocessed MRI image data from different viewing angles to obtain multiple slices;

select a consecutive slice sequence from the plurality of slices and input it into the pre-trained lesion detection model; or

select a single slice from the plurality of slices and input it into the pre-trained lesion detection model (see the slice-selection sketch below).
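A sketch of assembling the slice input from a 3D MRI volume, assuming a (D, H, W) NumPy array; the axis convention and neighborhood size are assumptions.

import numpy as np

def get_slice_input(volume, index, context=1):
    # From a (D, H, W) MRI volume, return either a single slice (context=0)
    # or a consecutive slice sequence centered on `index`, clamped at the
    # volume boundaries.
    lo = max(0, index - context)
    hi = min(volume.shape[0], index + context + 1)
    return volume[lo:hi]  # (2*context+1, H, W) away from the boundaries

volume = np.random.rand(40, 224, 224).astype(np.float32)
single = get_slice_input(volume, 20, context=0)  # one slice
triple = get_slice_input(volume, 20, context=1)  # consecutive sequence
print(single.shape, triple.shape)  # (1, 224, 224) (3, 224, 224)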

In the embodiments of the present application, features are extracted from the slice data, and the extracted feature data is fused at different scales from different views, which enriches the contextual information of the input slices. This realizes multi-scale feature prediction and improves lesion localization accuracy; in particular, it is effective at identifying small tumors, avoiding misses or confusion at the early stage of a tumor and effectively improving detection accuracy.

For the specific limitations of the lesion detection device based on the view-decoupled Transformer model, reference may be made to the limitations of the lesion detection method based on the view-decoupled Transformer model above, which are not repeated here. Each module in the above device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in a computer device in the form of hardware, or stored in a memory of the computer device in the form of software, so that the processor can invoke and execute the operations corresponding to each module.

In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor provides computing and control capabilities. The memory includes a readable storage medium, which stores computer-readable instructions. The network interface communicates with an external terminal through a network connection. When executed by the processor, the computer-readable instructions implement a lesion detection method based on the view-decoupled Transformer model. The readable storage medium provided by this embodiment includes non-volatile and volatile readable storage media.

A computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the steps of the lesion detection method based on the view-decoupled Transformer model described above.

One or more readable storage media are provided, storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the steps of the lesion detection method based on the view-decoupled Transformer model described above.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by computer-readable instructions instructing the relevant hardware. The computer-readable instructions may be stored in a non-volatile or volatile readable storage medium and, when executed, may include the processes of the foregoing method embodiments. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is used only as an example. In practical applications, the above functions may be allocated to different functional units or modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above.

The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be included within the protection scope of the present invention.

Claims (10)

1. A lesion detection method based on a view-decoupled Transformer model, comprising the following steps:
acquiring MRI image data of a user to be detected, and preprocessing the MRI image data;
acquiring slice data according to the preprocessed MRI image data, and inputting the slice data into a pre-trained lesion detection model;
performing feature extraction on the slice data to acquire high-level feature data;
fusing the high-level feature data at different scales from different views to obtain high-level feature prediction data;
and predicting a lesion of the user to be detected according to the high-level feature prediction data.
2. The lesion detection method based on a view-decoupled Transformer model according to claim 1, wherein, before the inputting into the pre-trained lesion detection model, the method comprises:
acquiring a plurality of MRI sample image data, and marking the lesion region of the cross-sectional slices of each MRI sample image data in the form of bounding boxes;
preprocessing the acquired MRI sample image data;
and creating an original lesion detection model, converting the preprocessed MRI sample image data into sample slice data, inputting the sample slice data into the original lesion detection model, and performing iterative learning according to the marks until the pre-trained lesion detection model is obtained.
3. The lesion detection method based on a view-decoupled Transformer model according to claim 1, wherein the performing feature extraction on the slice data to acquire high-level feature data comprises:
dividing the slice data into a plurality of patches, each patch being a single-channel image;
embedding each single-channel image into the target channels to obtain an image with the target number of channels;
and acquiring the high-level feature data from the image with the target number of channels.
4. The lesion detection method based on a view-decoupled Transformer model according to claim 3, wherein the fusing the high-level feature data at different scales from different views to obtain high-level feature prediction data comprises:
dividing the high-level feature data into a plurality of windows, each window containing M×M patches;
performing self-attention computation within each window to obtain contextual connections;
and fusing the high-level feature data at different scales from different views according to the contextual connections to obtain the high-level feature prediction data.
5. The lesion detection method based on a view-decoupled Transformer model according to claim 4, wherein the performing self-attention computation within each window to obtain contextual connections comprises:
shifting each window by M/2 patch regions in a target direction to form shifted regions;
supplementing the shifted regions back to the original region to form a plurality of new windows;
and performing self-attention computation within the new windows to obtain contextual connections.
6. The lesion detection method based on a view-decoupled Transformer model according to claim 2, wherein, after the converting the preprocessed MRI sample image data into sample slice data and inputting the sample slice data into the original lesion detection model, the method comprises:
extracting a preset number of candidate regions through a region proposal network;
calculating the intersection-over-union between each candidate region and the ground-truth region;
dividing the candidate regions into positive samples and negative samples according to the difference between the intersection-over-union and a preset intersection-over-union threshold, and sampling the positive samples and the negative samples to configure the ratio between the positive samples and the negative samples as a preset ratio;
and sending the candidate regions conforming to the preset ratio to a pooling layer for pooling, and performing lesion prediction.
7. The lesion detection method based on a view-decoupled Transformer model according to any one of claims 1 to 6, wherein the acquiring slice data according to the preprocessed MRI image data and inputting the slice data into a pre-trained lesion detection model comprises:
performing cross-sectional slicing of the preprocessed MRI image data from different views to acquire a plurality of slices;
selecting a consecutive slice sequence from the plurality of slices and inputting the consecutive slice sequence into the pre-trained lesion detection model; or,
selecting one slice from the plurality of slices and inputting the selected slice into the pre-trained lesion detection model.
8. A lesion detection device based on a view-decoupled Transformer model, comprising:
an image data acquisition module, configured to acquire MRI image data of a user to be detected and preprocess the MRI image data;
a slice data acquisition module, configured to acquire slice data according to the preprocessed MRI image data and input the slice data into a pre-trained lesion detection model;
a feature extraction module, configured to perform feature extraction on the slice data to obtain high-level feature data;
a view decoupling module, configured to fuse the high-level feature data at different scales from different views to obtain high-level feature prediction data;
and a prediction module, configured to predict a lesion of the user to be detected according to the high-level feature prediction data.
9. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the steps of the lesion detection method based on a view-decoupled Transformer model according to any one of claims 1 to 7.
10. One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the steps of the lesion detection method based on a view-decoupled Transformer model according to any one of claims 1 to 7.
CN202210162528.3A 2022-02-22 2022-02-22 Focus detection method, device, equipment and medium based on visual angle decoupling Transformer model Pending CN114549462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210162528.3A CN114549462A (en) 2022-02-22 2022-02-22 Focus detection method, device, equipment and medium based on visual angle decoupling Transformer model

Publications (1)

Publication Number Publication Date
CN114549462A true CN114549462A (en) 2022-05-27

Family

ID=81677593

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110880177A (en) * 2019-11-26 2020-03-13 北京推想科技有限公司 Image identification method and device
CN113192076A (en) * 2021-04-09 2021-07-30 湖北工业大学 MRI brain tumor image segmentation method combining classification prediction and multi-scale feature extraction
WO2021189909A1 (en) * 2020-09-22 2021-09-30 平安科技(深圳)有限公司 Lesion detection and analysis method and apparatus, and electronic device and computer storage medium
CN113673425A (en) * 2021-08-19 2021-11-19 清华大学 A Transformer-based multi-view target detection method and system
CN114066902A (en) * 2021-11-22 2022-02-18 安徽大学 Medical image segmentation method, system and device based on convolution and transformer fusion

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115311215A (en) * 2022-07-22 2022-11-08 锋睿领创(珠海)科技有限公司 High-speed and high-precision hexahedron detection system and method and storage medium
WO2024041058A1 (en) * 2022-08-25 2024-02-29 推想医疗科技股份有限公司 Follow-up case data processing method and apparatus, device, and storage medium
CN117530719A (en) * 2023-11-17 2024-02-09 深圳市大数据研究院 Heart and lung sound data detection methods, devices, equipment and media based on heart and lung sound separation
CN117611806A (en) * 2024-01-24 2024-02-27 北京航空航天大学 A positive prediction system for prostate cancer surgical margins based on imaging and clinical features
CN117611806B (en) * 2024-01-24 2024-04-12 北京航空航天大学 A positive prediction system for surgical margins of prostate cancer based on imaging and clinical features

Similar Documents

Publication Publication Date Title
CN110111313B (en) Medical image detection method based on deep learning and related equipment
CN110363138B (en) Model training method, image processing method, device, terminal and storage medium
CN111428709B (en) Image processing method, device, computer equipment and storage medium
KR102108050B1 (en) Method for classifying breast cancer histology images through incremental boosting convolution networks and apparatus thereof
WO2020253629A1 (en) Detection model training method and apparatus, computer device, and storage medium
CN114549462A (en) Focus detection method, device, equipment and medium based on visual angle decoupling Transformer model
CN112017189A (en) Image segmentation method and device, computer equipment and storage medium
CN110992439B (en) Fiber bundle tracking method, computer device, and storage medium
CN112488996A (en) Inhomogeneous three-dimensional esophageal cancer energy spectrum CT (computed tomography) weak supervision automatic labeling method and system
CN111210444A (en) Method, apparatus and medium for segmenting multi-modal magnetic resonance image
Wetteland et al. Multiclass Tissue Classification of Whole-Slide Histological Images using Convolutional Neural Networks.
CN113822846A (en) Method, apparatus, device and medium for determining region of interest in medical image
Alisha et al. Cervical cell nuclei segmentation on pap smear images using deep learning technique
KR102330263B1 (en) Method and apparatus for detecting nuclear region using artificial neural network
Zeynali et al. Hybrid cnn-transformer architecture with xception-based feature enhancement for accurate breast cancer classification
CN113379770B (en) Construction method of nasopharyngeal carcinoma MR image segmentation network, image segmentation method and device
EP4327333A1 (en) Methods and systems for automated follow-up reading of medical image data
CN114693671A (en) Lung nodule semi-automatic segmentation method, device, equipment and medium based on deep learning
CN115004241A (en) Shift invariant loss for deep learning based image segmentation
Pang et al. Correlation matters: multi-scale fine-grained contextual information extraction for hepatic tumor segmentation
Xu et al. A survey of transfer learning in breast cancer image classification
Hossain et al. Overlapping cell nuclei segmentation in digital histology images using intensity-based contours
Xiong et al. A Three‐Step Automated Segmentation Method for Early Cervical Cancer MRI Images Based on Deep Learning
CN114241198A (en) Method, apparatus, device and storage medium for acquiring local radiomic features
CN114511494A (en) Gland density grade determining method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination