
CN117197137B - Tissue sample analysis method and system based on hyperspectral images - Google Patents


Info

Publication number
CN117197137B (application CN202311464558.0A)
Authority
CN
China
Prior art keywords
image
model
matrix
hyperspectral
feature
Prior art date
Legal status
Active
Application number
CN202311464558.0A
Other languages
Chinese (zh)
Other versions
CN117197137A
Inventor
李玮
韩景泓
雷晟暄
汪子琪
张彦海
赵晗竹
张彦霖
顾夏铭
张伟师
徐立强
Current Assignee
Shandong University
Original Assignee
Shandong University
Priority date
Filing date
Publication date
Application filed by Shandong University
Priority to CN202311464558.0A
Publication of CN117197137A
Application granted
Publication of CN117197137B


Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of image analysis, and in particular to a tissue sample analysis method and system based on hyperspectral images, comprising the following steps: acquiring hyperspectral images of tissue sample slices to be tested, splicing the image data of set wavelength bands into RGB images, subtracting the ambient-light hyperspectral characteristics of the detection area, obtaining grayscale images through normalization, and generating a two-dimensional data set; feeding the generated two-dimensional data set into a recognition model to obtain a recognition result. The recognition model outputs tumor malignancy scores based on a set function, and the maximum of the tumor malignancy scores is taken as the recognition result.

Description

Tissue sample analysis method and system based on hyperspectral images

Technical field

The present invention relates to the technical field of image analysis, and specifically to a tissue sample analysis method and system based on hyperspectral images.

Background

The statements in this section merely provide background information related to the present invention and do not necessarily constitute prior art.

In addition to information about the photographed target, hyperspectral images also contain spectral information, which can be used to extract information from an image that cannot be observed by the naked eye. For example, hyperspectral images of human or animal tissue samples can reveal morphological changes in the cells or substances they depict. However, the cells or substances of interest exhibit certain similarities in spectral space, so a computer processing such images cannot easily distinguish the required feature information, making it difficult to meet practical needs.

Summary of the invention

To solve the technical problems described in the background above, the present invention provides a tissue sample analysis method and system based on hyperspectral images. Spectral features and texture features extracted from the image form a feature matrix, and a label matrix corresponding to the feature matrix is used so that the computer can determine the corresponding spectral and texture features from the labels, alleviating the difficulty the computer has in distinguishing the required feature information.

To achieve the above objects, the present invention adopts the following technical solutions:

A first aspect of the present invention provides a tissue sample analysis method based on hyperspectral images, including the following steps:

acquiring a hyperspectral image of the tissue sample to be tested, splicing the image data of set wavelength bands into an RGB image, subtracting the ambient-light hyperspectral characteristics of the detection area to obtain the spectral and texture features of the target in the sample, and normalizing the result to obtain a grayscale image and generate a two-dimensional data set;

feeding the generated two-dimensional data set into a recognition model to obtain a score for each column of the label matrix, and taking the maximum score as the recognition result;
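Taking the maximum score as the recognition result is simply an argmax over the model's per-class scores. A minimal sketch (NumPy and the class names are illustrative assumptions, not part of the patent):

```python
import numpy as np

def recognize(scores, class_names):
    """Return the class whose score is largest, i.e. the recognition result
    obtained by taking the maximum of the per-column scores."""
    idx = int(np.argmax(scores))
    return class_names[idx]
```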

wherein the two-dimensional data set has a feature matrix corresponding to the spectral and texture features, and the feature matrix corresponds to a label matrix.

Further, the ambient-light hyperspectral characteristics of the detection area are obtained as follows: a reference object with no fluorescence characteristics is placed on the shooting plane, a hyperspectral image of the reference region is captured, the one-dimensional spectral data of each pixel in the region is extracted, and the pixel spectra are summed column-wise and averaged to obtain a single row of spectral data that serves as the spectral characteristic of the ambient light.
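The column-wise summation and averaging of the reference-region spectra can be sketched as follows, a NumPy illustration under the assumption that the region is given as one row of spectral data per pixel:

```python
import numpy as np

def ambient_spectrum(region):
    """region: (n_pixels, n_bands) array, one row of spectral data per pixel
    in the reference-object area. Returns the single averaged row used as
    the ambient-light spectral characteristic (column-wise sum / count)."""
    region = np.asarray(region, dtype=float)
    return region.sum(axis=0) / region.shape[0]  # equivalent to region.mean(axis=0)
```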

Further, the grayscale image is obtained by normalization and the two-dimensional data set is generated as follows: the value range of the spectral data is compressed into a set interval to obtain normalized one-dimensional data for each pixel, i.e., one-dimensional data for each pixel at different wavelengths; after data expansion, a single-channel grayscale image is obtained and the two-dimensional data set is generated.

Further, the normalized one-dimensional data is expanded into a single-channel grayscale image as follows: the normalized one-dimensional data is replicated a set number of times, and each pixel value is multiplied by a set factor to obtain the grayscale image of all pixels.
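In NumPy, the normalization and expansion steps might look like this. The patent only says the value range is compressed to a set interval ([-1, 1] in the embodiment below) and each pixel value multiplied by a set factor (255); rescaling [-1, 1] to [0, 255] before rounding is an assumption made here so the result is a valid 8-bit image:

```python
import numpy as np

def normalize_spectrum(spec):
    """Compress a pixel's spectral data into [-1, 1] (the interval used in
    the embodiment) by dividing by the maximum absolute value."""
    spec = np.asarray(spec, dtype=float)
    peak = np.max(np.abs(spec))
    return spec / peak if peak > 0 else spec

def expand_to_grayscale(norm_spec, rows=32):
    """Replicate the normalized 1-D data `rows` times (32 in the embodiment)
    and scale to 8-bit grayscale. Mapping [-1, 1] onto [0, 255] first is an
    assumption; the embodiment simply multiplies by 255."""
    tile = np.tile(norm_spec, (rows, 1))
    return np.round((tile + 1.0) / 2.0 * 255).astype(np.uint8)
```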

Further, the training process of the recognition model includes:

acquiring hyperspectral images of pre-prepared tissue samples;

extracting the target spectral features and target texture features from the hyperspectral images and organizing them into a feature matrix;

where each row of the feature matrix represents a sample and each column represents a feature; the corresponding sample labels are organized into label vectors, and in the label matrix formed by all label vectors, each row corresponds to a sample and each column corresponds to a target score;

using the feature matrix and the label matrix as training data, and training the model for the set batch size and number of epochs.

Further, the target spectral features are extracted from the hyperspectral image as follows: a region containing the target is selected on the two-dimensional grayscale image, the one-dimensional normalized spectral data corresponding to each pixel in the region is extracted, and the statistical features mean, variance, energy, spectral skewness, and kurtosis are computed to describe the distribution and variation of the spectrum.
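The five statistical features can be computed per pixel as below. The patent does not give formulas, so standard moment-based definitions are assumed here (population variance, energy as the sum of squares, non-excess kurtosis):

```python
import numpy as np

def spectral_stats(spec):
    """Statistical features of one pixel's normalized 1-D spectrum:
    mean, variance, energy, skewness, kurtosis (moment-based definitions,
    assumed since the patent names the features but not the formulas)."""
    x = np.asarray(spec, dtype=float)
    mean = x.mean()
    var = x.var()                     # population variance
    energy = np.sum(x ** 2)           # sum of squared values
    std = np.sqrt(var)
    if std == 0:
        skew = kurt = 0.0             # flat spectrum: higher moments undefined
    else:
        skew = np.mean((x - mean) ** 3) / std ** 3
        kurt = np.mean((x - mean) ** 4) / std ** 4
    return {"mean": mean, "variance": var, "energy": energy,
            "skewness": skew, "kurtosis": kurt}
```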

Further, the target texture features are extracted from the hyperspectral image as follows:

texture features are extracted based on at least one of a gray-level co-occurrence matrix, a gray-level difference matrix, and local binary patterns;

a matrix is generated from the grayscale relationship between a pixel in the image and its neighboring pixels, and contrast, energy, and homogeneity are extracted from this matrix;

a matrix is generated from the grayscale difference between each pixel in the image and its neighboring pixels, and contrast and roughness are extracted from this matrix;

a binary image is generated by comparing the grayscale value of each pixel with those of its neighboring pixels, and the texture pattern and texture direction are extracted from this binary image.
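A minimal gray-level co-occurrence matrix with the contrast, energy, and homogeneity measures named above might be sketched as follows. A single pixel offset is used, and the measure definitions follow common texture-analysis conventions; both are assumptions, since the patent does not give formulas:

```python
import numpy as np

def glcm_features(img, levels=8, dx=1, dy=0):
    """Gray-level co-occurrence matrix for one pixel offset (dx, dy), plus
    the standard contrast / energy / homogeneity measures. `img` is a 2-D
    array of integer gray levels in [0, levels)."""
    img = np.asarray(img)
    glcm = np.zeros((levels, levels), dtype=float)
    h, w = img.shape
    for y in range(h - dy):            # count co-occurring gray-level pairs
        for x in range(w - dx):
            glcm[img[y, x], img[y + dy, x + dx]] += 1
    glcm /= glcm.sum()                 # normalize to joint probabilities
    i, j = np.indices(glcm.shape)
    contrast = np.sum(glcm * (i - j) ** 2)
    energy = np.sum(glcm ** 2)
    homogeneity = np.sum(glcm / (1.0 + (i - j) ** 2))
    return contrast, energy, homogeneity
```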

A second aspect of the present invention provides a system for implementing the above method, including:

a preprocessing module configured to: acquire a hyperspectral image of the tissue sample to be tested, splice the image data of set wavelength bands into an RGB image, subtract the ambient-light hyperspectral characteristics of the detection area to obtain the spectral and texture features of the target in the sample, and normalize the result to obtain a grayscale image and generate a two-dimensional data set;

an analysis module configured to: feed the generated two-dimensional data set into the recognition model to obtain scores corresponding to the label matrix, and take the maximum score as the recognition result;

wherein the two-dimensional data set has a feature matrix corresponding to the spectral and texture features, and the feature matrix corresponds to the label matrix.

A third aspect of the present invention provides a computer-readable storage medium.

A computer-readable storage medium on which a computer program is stored, which, when executed by a processor, implements the steps of the tissue sample analysis method based on hyperspectral images described above.

A fourth aspect of the present invention provides a computer device.

A computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the tissue sample analysis method based on hyperspectral images described above.

Compared with the prior art, the above technical solutions have the following beneficial effects:

1. The spectral and texture features of the image form a feature matrix, and a label matrix corresponding to the feature matrix is used so that the computer can determine the corresponding spectral and texture features from the labels, alleviating the difficulty the computer has in distinguishing the required feature information.

2. The recognition model takes the hyperspectral image information of a tissue sample as input and outputs the maximum of the target scores introduced during training as the analysis result. It exploits the correlation between the hyperspectral data and the original tissue samples of the training stage to identify the target results contained in the hyperspectral image information of a tissue sample. This makes it easier to determine morphological changes of the target cells or substances, with higher accuracy and stronger objectivity.

3. Data normalization yields one-dimensional data for each pixel at different wavelengths, which shields interference from ambient light, suppresses white noise within the data, and compresses the range of the spectral characteristics while fully preserving them, preventing excessively large gradients during model training.

Description of the drawings

The accompanying drawings, which form a part of the present invention, are provided for further understanding of the invention. The illustrative embodiments and their descriptions explain the invention and do not constitute an improper limitation of it.

Figure 1 is a schematic flow chart of tissue sample analysis based on hyperspectral images, taking tissue samples of early prostate tumors as an example, provided by one or more embodiments of the present invention;

Figure 2 is a schematic diagram of the workflow of the preprocessing module in the hyperspectral-image-based tissue sample analysis system provided by one or more embodiments of the present invention;

Figure 3 is a schematic diagram of the loss function designed for model optimization during hyperspectral-image-based tissue sample analysis, provided by one or more embodiments of the present invention;

Figure 4 is a schematic diagram of the custom attention mechanism introduced to optimize the model during hyperspectral-image-based tissue sample analysis, provided by one or more embodiments of the present invention.

Detailed description

The present invention is further described below in conjunction with the accompanying drawings and embodiments.

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the present invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

As introduced in the background, the cells or substances that are the target of analysis exhibit certain similarities in spectral space, which makes it difficult for a computer to distinguish the required feature information during processing and thus to meet practical needs.

Therefore, the following embodiments provide a tissue sample analysis method and system based on hyperspectral images. The spectral and texture features of the image form a feature matrix, and a label matrix corresponding to the feature matrix is used so that the computer can determine the corresponding spectral and texture features from the labels, alleviating the difficulty the computer has in distinguishing the required feature information.

In the following embodiments, a prostate tissue sample, which is easy to obtain clinically, is taken as an example. After a hyperspectral image is captured, the morphology of tumor cells in the tissue sample serves as the analysis target. Hyperspectral image information of early prostate tumor tissue samples is used as training data; once trained, the recognition model can recognize the hyperspectral image information of tissue samples and produce analysis results for early prostate tumors.

Basic introduction 1: Prostate cancer is a common malignant tumor in men. Prostate-specific antigen (PSA) screening of blood samples is usually used to detect prostate cancer. Because PSA is a tissue-specific rather than tumor-specific marker, PSA testing has low specificity and sensitivity for prostate cancer, so the actual state of canceration cannot be grasped, which affects subsequent testing and judgment. Pathological analysis of tumor tissue, on the other hand, requires tissue sections to be made into specimens, which is complex and time-consuming; moreover, manual inspection is strongly affected by external factors, and the results lack objectivity and accuracy.

Basic introduction 2: Hyperspectral imaging is an image-data technology based on a large number of narrow wavelength bands. It combines imaging with spectroscopy to capture the two-dimensional geometric space and one-dimensional spectral information of a target, yielding continuous, narrow-band image data with high spectral resolution. Common hyperspectral imaging techniques include grating spectroscopy, acousto-optic tunable filter spectroscopy, prism spectroscopy, and chip coating.

Embodiment 1:

As shown in Figures 1 to 4, the tissue sample analysis method based on hyperspectral images includes the following steps:

acquiring a hyperspectral image of the tissue sample to be tested, splicing the image data of set wavelength bands into an RGB image, subtracting the ambient-light hyperspectral characteristics of the detection area to obtain the spectral and texture features of the target in the sample, and normalizing the result to obtain a grayscale image and generate a two-dimensional data set;

feeding the generated two-dimensional data set into a recognition model to obtain a score for each column of the label matrix, and taking the maximum score as the recognition result;

wherein the two-dimensional data set has a feature matrix corresponding to the spectral and texture features, and the feature matrix corresponds to a label matrix.

The spectral feature extraction process for early prostate tumors is as follows:

Select the tumor region: select the region containing the tumor on the two-dimensional grayscale image. Manual annotation, threshold segmentation, and similar methods can be used to obtain the tumor region. Extract the one-dimensional normalized spectral data of each pixel: for the selected tumor region, extract the one-dimensional normalized spectral data corresponding to each pixel. According to the preceding data processing steps, the one-dimensional data of each pixel consists of normalized values at different wavelengths. Compute statistical features: for each pixel's one-dimensional spectral data, compute the statistical features mean, variance, energy, spectral skewness, and kurtosis to describe the distribution and variation of the spectrum. These statistical features can be implemented with standard computational libraries.

The texture feature extraction process for early prostate tumors is as follows: use the gray-level co-occurrence matrix (GLCM), gray-level difference matrix (GLDM), and local binary patterns (LBP) to extract texture features. The GLCM generates a matrix from the grayscale relationship between a pixel in the image and its neighboring pixels; contrast, energy, and homogeneity are then extracted from this matrix. The GLDM generates a matrix from the grayscale difference between each pixel and its neighboring pixels; contrast and roughness are then extracted from this matrix. LBP generates a new binary image by comparing the grayscale values of a pixel with those of its neighbors; texture patterns and texture direction can then be extracted from this binary image.

The feature data and label data are organized as follows. Feature data: the extracted features (spectral features, texture features) are organized into a matrix (the feature matrix), where each row represents a sample and each column a feature. Label data: the corresponding sample labels are organized into a vector (the label vector), in which each element represents the label of one sample. Early prostate tumors have Gleason scores of 2 to 4, i.e., three possible values. This categorical feature with three possible values is converted into three binary features, each representing one possible category, so that each Gleason score is treated as a separate category. All label vectors form the label matrix, where each row corresponds to a sample and each column to a Gleason score.
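The conversion of Gleason scores 2-4 into three binary label columns is one-hot encoding, for example:

```python
import numpy as np

def one_hot_gleason(scores, classes=(2, 3, 4)):
    """Build the label matrix described above: one row per sample, one
    binary column per Gleason score (2-4 in the embodiment)."""
    scores = np.asarray(scores)
    return (scores[:, None] == np.asarray(classes)[None, :]).astype(int)
```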

The process of building the early prostate tumor recognition model is as follows:

Build the model architecture: a ResNet50 convolutional neural network model is selected. The model structure contains convolutional layers, pooling layers, fully connected layers, and other components.

Input layer: specify the shape of the input data and set the dimension of the input layer according to the number of columns of the feature matrix.

Convolutional layers: spatial characteristics of the image are distilled by adding convolutional layers. Set an appropriate kernel size, stride, and padding to match the characteristics of the input data.

Activation function: add an appropriate activation function, such as ReLU, to the convolutional layers to introduce nonlinearity.

Pooling layers: add pooling layers for feature dimensionality reduction and retention of spatial information, choosing an appropriate pooling kernel size and stride.

Fully connected layer: flatten the output of the pooling layers and add a fully connected layer for classification, setting an appropriate number of neurons and activation function.

Output layer: set the dimension of the output layer to match the number of columns of the label matrix, and use the softmax function as the output layer's activation to obtain a probability distribution over the classification results.
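The softmax activation of the output layer, which turns the per-column scores into a probability distribution, can be written as (a standalone NumPy sketch, not the patent's own implementation):

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis: converts class scores
    into a probability distribution over the label-matrix columns."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)
```

Because softmax is monotonic, the argmax of the probabilities equals the argmax of the raw scores, so the recognition result is unchanged by this final layer.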

Compile the model: define the loss function and the optimizer; cross-entropy can serve as the loss function, and the Adam optimizer performs parameter optimization.

Model training: use the feature matrix and label matrix as training data, train the model with the fit() function, and specify an appropriate batch size and number of training epochs.

Model evaluation: evaluate the trained model on the test set and compute metrics such as accuracy, precision, recall, and F1-score.

As shown in Figure 3, the specific loss function is designed as follows:

Define an early-detection penalty factor: this factor is set to 1.5 so that early-detection errors incur a larger loss. Loss computation: during model training, the loss function is computed from the predictions and the true labels; for each sample, the loss is computed from the predicted class and the true class. Weighted loss: a weighted loss is computed from the predicted and true classes; for early tumor detection errors, the loss is multiplied by the early-detection penalty factor so that it is larger. Loss summation: the weighted losses of all samples are summed to obtain the final loss, with further adjustment and optimization applied as needed. Training and optimization: the model is trained and optimized with the defined weighted loss function, choosing an appropriate optimizer and learning rate and tuning the optimization parameters in practice. Evaluation: the trained model is evaluated on the test set, computing metrics such as accuracy, precision, recall, and F1-score.
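A sketch of the weighted loss described above, with the early-detection penalty factor of 1.5. How a sample qualifies as an "early-detection error" is not specified in the patent, so the mask is passed in as an argument here:

```python
import numpy as np

def weighted_cross_entropy(probs, labels, early_error_mask, penalty=1.5):
    """Sum of per-sample cross-entropy losses, with losses on early-detection
    errors multiplied by the penalty factor (1.5 in the embodiment).
    probs: (n, k) predicted probabilities; labels: (n, k) one-hot rows;
    early_error_mask: (n,) booleans marking early-detection errors."""
    probs = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    ce = -np.sum(np.asarray(labels) * np.log(probs), axis=1)  # per-sample CE
    weights = np.where(np.asarray(early_error_mask), penalty, 1.0)
    return float(np.sum(weights * ce))
```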

The model fusion process is as follows: multiple convolutional neural network models with different initial parameters are trained, and voting is used for model fusion. The effectiveness of the fused model is then verified by evaluating its performance metrics, with further adjustment and optimization as needed.
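The voting-based fusion might be sketched as a simple majority vote over the class predictions of the individual models (tie-breaking toward the smallest class id is a choice made here, not specified in the patent):

```python
import numpy as np

def majority_vote(predictions):
    """Fuse class predictions from several models by voting.
    predictions: (n_models, n_samples) array of integer class ids;
    returns the per-sample class with the most votes."""
    preds = np.asarray(predictions)
    n_samples = preds.shape[1]
    fused = np.empty(n_samples, dtype=int)
    for s in range(n_samples):
        fused[s] = np.bincount(preds[:, s]).argmax()  # ties -> smallest id
    return fused
```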

The attention mechanism is introduced as follows: add an attention module, define attention weights, perform weighted feature fusion, train and optimize the model, and evaluate and adjust the model.

Model evaluation and optimization proceed as follows: cross-validation is used for verification. Split the data set into a training set and a test set. Then train the model, evaluate average performance, tune parameters, optimize the model, and analyze the results, using visualization tools such as learning curves and confusion matrices to help understand the model's performance.

Taking the hyperspectral image analysis of prostate tumor samples as an example:

Step S1: collect early prostate tumor samples.

Tumor samples from patients undergoing surgery for early prostate tumors are collected and cryopreserved.

In this embodiment, hyperspectral image information of early prostate tumor samples is acquired as training data. The samples come from early prostate tumor tissue of patients who have undergone prostate biopsy or prostate tumor resection. After the patients are informed of the purpose of the research and the possible risks, and with their consent, the samples are grouped: those with benign pathological results form a separate group, while malignant ones are scored under the Gleason scoring system (the histological grading method commonly used for prostate cancer in the field), with samples of the same score forming a group. Early prostate tumor samples of different grades are thus formed, and all are stored at low temperature.

Step S2: acquire the spectral characteristics of the ambient light.

In this embodiment, a white diffuse-reflection reference object with no fluorescence characteristics is placed on the shooting plane, a hyperspectral image of the reference region is captured, one-dimensional spectral data is extracted for each pixel in the region and stored in a CSV file, and the data is summed column-wise and averaged to obtain a single row of spectral data as the spectral characteristic of the ambient light.

Step S3: scan the early prostate tumor samples hyperspectrally.

The preprocessed specimens are thawed and photographed with a hyperspectral camera to obtain the original hyperspectral images.

The hyperspectral images of this embodiment include a spectral dimension and spatial dimensions, and the SpecView-F software is used to collect the hyperspectral data. The SpecView-F acquisition software is developed for the GaiaSky series of airborne spectral imaging systems, the GaiaTracer document inspection system, and the Image-λ-F series of hyperspectral cameras, and is mainly used to control the spectral imaging system and acquire images, among other functions.

Step S4: perform graphical analysis and data processing on the collected hyperspectral data.

Acquire the image data of the specified bands, splice multiple bands into one RGB image to generate a one-dimensional data set, subtract the ambient-light spectral characteristics (environmental denoising), normalize the data, draw the grayscale image, and generate the two-dimensional data set.

In this embodiment, the raw image data collected by the spectrometer is processed as follows. First, the data is read from the specified file: the specified .mat file is loaded, and the image data of the specified bands is acquired and displayed on screen. The red, green, and blue band data are spliced into one RGB image with the cat function; the R-channel wavelength is selected from [630 nm, 650 nm], the G-channel from [530 nm, 550 nm], and the B-channel from [450 nm, 470 nm], and the one-dimensional spectral data of each pixel of the image is obtained. The ambient-light spectral characteristics are then subtracted (environmental denoising), i.e., the average ambient-light spectrum is subtracted from the one-dimensional spectral data of each pixel in the tumor hyperspectral image.
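The band splicing and ambient-light subtraction could be sketched as follows. The embodiment uses MATLAB's cat function; np.stack is the NumPy equivalent used here, and averaging the bands inside each wavelength window is an assumption (the patent only gives the window ranges):

```python
import numpy as np

def splice_rgb(cube, wavelengths):
    """Splice an RGB image from a hyperspectral cube of shape (h, w, n_bands)
    by averaging the bands inside the R/G/B wavelength windows from the
    embodiment. `wavelengths` gives the center wavelength (nm) per band;
    each window is assumed to contain at least one band."""
    wl = np.asarray(wavelengths, dtype=float)

    def band_mean(lo, hi):
        sel = (wl >= lo) & (wl <= hi)
        return cube[:, :, sel].mean(axis=2)

    r = band_mean(630, 650)
    g = band_mean(530, 550)
    b = band_mean(450, 470)
    return np.stack([r, g, b], axis=2)  # MATLAB cat(3, r, g, b) equivalent

def subtract_ambient(cube, ambient):
    """Subtract the averaged ambient-light spectrum (one value per band)
    from every pixel's 1-D spectral data."""
    return cube - np.asarray(ambient)[None, None, :]
```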

The data are then normalized: the value range of the spectral data is compressed to [-1, 1], giving each pixel a normalized one-dimensional vector of its values at the different wavelengths. This step masks ambient-light interference, suppresses white noise in the data, and compresses the value range of the spectral response while fully preserving the spectral signature, preventing excessively large gradients during model training.
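A minimal sketch of the [-1, 1] normalization, assuming a simple per-spectrum linear min-max rescaling (the source does not specify the exact scheme):

```python
import numpy as np

def normalize_spectrum(spec):
    """Linearly rescale a 1-D spectrum into [-1, 1], preserving its shape."""
    spec = np.asarray(spec, dtype=float)
    lo, hi = spec.min(), spec.max()
    if hi == lo:                      # flat spectrum: map to all zeros
        return np.zeros_like(spec)
    return 2.0 * (spec - lo) / (hi - lo) - 1.0
```
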

The normalized one-dimensional data are then expanded into a single-channel grayscale image: the normalized vector is replicated for 32 rows, each value is multiplied by 255, and the result is saved as a grayscale image. Repeating this for all pixels yields the grayscale images that serve as the two-dimensional data. Note that the normalized vector may also be replicated for more than 32 rows; the resulting two-dimensional data can then be fed into a more complex deep-learning model to extract more complex features.
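The expansion step can be sketched as below. One assumption is made explicit: since the normalized values lie in [-1, 1], they are first shifted to [0, 1] before the multiplication by 255, so that the result is a valid 8-bit grayscale image (the source text only says "multiply by 255").

```python
import numpy as np

def spectrum_to_gray_image(norm_spec, rows=32):
    """Tile a normalized 1-D spectrum into `rows` identical rows and map
    [-1, 1] to [0, 255] as an 8-bit grayscale image."""
    tiled = np.tile(np.asarray(norm_spec, dtype=float), (rows, 1))
    return ((tiled + 1.0) / 2.0 * 255.0).astype(np.uint8)
```
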

Step S5: extract features of early prostate tumors from the hyperspectral image.

Extracting tumor spectral features:

Select the tumor region of interest, i.e. select a region containing the tumor on the two-dimensional grayscale image. The tumor region can be obtained by manual annotation, threshold segmentation, or similar methods.

Extract the one-dimensional normalized spectral data of each pixel: for the selected tumor region, extract the normalized one-dimensional spectrum of every pixel. Per the preceding processing steps, each pixel's one-dimensional data are its normalized values at the different wavelengths.

Compute statistical features: for the one-dimensional spectrum of each pixel, compute statistics that describe the distribution and variation of the spectrum. The mean of the spectrum reflects its central tendency; the variance reflects its dispersion; the energy reflects its amplitude; the skewness reflects the asymmetry of the spectral distribution; and the kurtosis reflects its sharpness.

For each pixel, the computed statistics are stored at the corresponding location to form a feature map. Each pixel thus corresponds to a feature vector containing the statistical characteristics of its spectrum.
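The five per-pixel statistics listed above (mean, variance, energy, skewness, kurtosis) can be sketched as one small function; the standardized third- and fourth-moment definitions used here are the conventional ones, as the source gives no formulas:

```python
import numpy as np

def spectral_stats(spec):
    """Mean, variance, energy, skewness and kurtosis of one pixel spectrum."""
    spec = np.asarray(spec, dtype=float)
    mu = spec.mean()
    var = spec.var()
    energy = np.sum(spec ** 2)
    std = np.sqrt(var) if var > 0 else 1.0
    skew = np.mean(((spec - mu) / std) ** 3)   # asymmetry of the distribution
    kurt = np.mean(((spec - mu) / std) ** 4)   # sharpness of the distribution
    return np.array([mu, var, energy, skew, kurt])
```
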

Extracting tumor texture features. Texture features are first extracted with the gray-level co-occurrence matrix (GLCM). Define the distance and direction of neighboring pixels: choose a suitable neighbor distance and direction, such as horizontal, vertical, or diagonal, according to the characteristics and requirements of the image.

Gray-level quantization: discretize the gray levels of the image into a set of discrete values, using methods such as gray-level binning or histogram equalization.

Construct the gray-level co-occurrence matrix: for each pixel, form the pair of gray values of that pixel and its neighbor at the specified distance and direction, and count the occurrences of each pair in the co-occurrence matrix.

Extract statistical features: derive various statistics from the gray-level co-occurrence matrix, namely contrast, energy, homogeneity, and correlation. These features are obtained from different combinations of the matrix entries; for example, contrast sums the squared gray-level differences of neighboring pixels, and energy sums the squared matrix entries.
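The GLCM construction and three of its statistics can be sketched as follows (a slow but readable reference implementation assuming an 8-bit input image; libraries such as scikit-image provide optimized equivalents):

```python
import numpy as np

def glcm_features(img, levels=8, dx=1, dy=0):
    """Build a gray-level co-occurrence matrix for offset (dy, dx) and derive
    contrast, energy and homogeneity from the normalized matrix."""
    # quantize an 8-bit image into `levels` gray bins
    q = np.floor(img.astype(float) / 256.0 * levels).astype(int).clip(0, levels - 1)
    glcm = np.zeros((levels, levels))
    h, w = q.shape
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[q[y, x], q[y + dy, x + dx]] += 1
    p = glcm / glcm.sum()
    i, j = np.indices((levels, levels))
    contrast = np.sum(p * (i - j) ** 2)          # squared level differences
    energy = np.sum(p ** 2)                      # sum of squared entries
    homogeneity = np.sum(p / (1.0 + np.abs(i - j)))
    return contrast, energy, homogeneity
```
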

The gray-level difference matrix (GLDM) is also used to extract texture features, with the same neighbor distance and direction as the gray-level co-occurrence matrix.

Compute gray-level differences: for each pixel, compute the gray-level difference between it and its neighbor at the specified distance and direction.

Construct the gray-level difference matrix: count, for each difference value of neighboring pixels, its number of occurrences in the difference matrix.

Extract statistical features: various statistics are derived from the gray-level difference matrix, such as contrast and roughness. These are obtained from different combinations of the matrix entries; for example, contrast can be the mean gray-level difference of neighboring pixels, and roughness the standard deviation of the difference matrix.
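The GLDM steps above can be sketched as one function. Following the text, contrast is taken as the mean absolute difference and roughness as its standard deviation; the histogram of difference values serves as the difference matrix:

```python
import numpy as np

def gldm_features(img, dx=1, dy=0, levels=256):
    """Histogram of absolute gray-level differences at offset (dy, dx);
    contrast is the mean difference, roughness its standard deviation."""
    img = np.asarray(img, dtype=int)
    h, w = img.shape
    a = img[0:h - dy, 0:w - dx]
    b = img[dy:h, dx:w]
    diff = np.abs(a - b).ravel()
    hist = np.bincount(diff, minlength=levels)   # the difference "matrix"
    return hist, diff.mean(), diff.std()
```
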

Local binary patterns (LBP) are also used to extract texture features, again with the same neighbor distance and direction as the gray-level co-occurrence matrix.

Compare the center pixel with its neighbors: for each pixel, compare its gray value with the gray values of its neighbors to obtain a binary code.

Construct a binary image: according to the comparison results, convert the binary codes into an image in which each pixel represents a local texture pattern.

Extract statistical features: derive statistics such as texture-pattern frequency and texture-pattern distribution from the binary image, e.g. by computing the histogram, mean, and variance of the texture patterns.
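A basic 8-neighbor LBP, as described in the three steps above, can be sketched as follows (radius-1 neighborhood assumed; the normalized code histogram gives the texture-pattern frequencies):

```python
import numpy as np

def lbp_image(img):
    """8-neighbour LBP: compare each interior pixel with its neighbours and
    pack the 8 comparison bits into one code per pixel."""
    img = np.asarray(img, dtype=int)
    h, w = img.shape
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = img[1:h - 1, 1:w - 1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neigh >= center).astype(np.uint8) << bit
    return codes

def lbp_histogram(codes, bins=256):
    """Normalized frequency of each texture pattern."""
    hist = np.bincount(codes.ravel(), minlength=bins).astype(float)
    return hist / hist.sum()
```
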

Step S6: organize the feature data and label data.

Feature data: the extracted features (spectral features and texture features) are arranged into a matrix (the feature matrix) in which each row represents one sample and each column one feature. This matrix is usually stored as a numpy array.

Label data: the corresponding sample labels (Gleason scores) are arranged into a vector (the label vector) in which each element is the label of one sample. Because the Gleason score of an early prostate tumor is 2-4, i.e. it has 3 possible values, this categorical feature is converted into 3 binary features, one per possible category; each Gleason score is treated as an independent class, as follows:

2 -> [1, 0, 0]

3 -> [0, 1, 0]

4 -> [0, 0, 1]

All label vectors form a label matrix in which each row corresponds to one sample and each column to one Gleason score; a value of 1 means that sample has the score of that column, and 0 means it does not.
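The one-hot encoding above can be sketched as a short function (the function name is illustrative):

```python
import numpy as np

def gleason_to_onehot(scores, classes=(2, 3, 4)):
    """One-hot encode early-stage Gleason scores (2, 3, 4) into a label
    matrix: one row per sample, one column per possible score."""
    index = {c: i for i, c in enumerate(classes)}
    mat = np.zeros((len(scores), len(classes)), dtype=int)
    for row, s in enumerate(scores):
        mat[row, index[s]] = 1
    return mat
```
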

Step S7: build the early prostate tumor recognition model.

The ResNet50 network is selected. Its residual connections address the vanishing-gradient and representational-bottleneck problems of deep networks, allowing deeper networks to be trained.

Input layer: specify the input data shape; the dimension of the input layer is set from the number of columns of the feature matrix.

Convolutional layers: the kernel size may be 3x3 or 5x5; the number of kernels is set as needed, e.g. 32, 64, or 128; the stride is generally 1; the padding may be "valid" or "same", depending on the input size and model requirements; ReLU is generally used as the activation function.

Pooling layers: the pooling kernel is generally 2x2 with a stride of 2; the padding may be "valid" or "same", depending on the input size and model requirements.

Fully connected layers: the number of neurons is set as appropriate; different values can be tried to find the best performance. ReLU is generally used as the activation function.

Output layer: the output dimension is set from the number of columns of the label matrix, i.e. it equals the number of classes.

The softmax activation function converts the output into a probability-distribution classification result.

Loss function and optimizer: for this multi-class task, categorical cross-entropy (categorical_crossentropy) is chosen as the loss function. Because the Gleason score has 3 possible values, each can be treated as an independent class, and the label matrix is converted into 3 binary features; the loss is therefore the cross-entropy over the 3 classes.

In addition, a specific loss function is designed to account for the importance of early tumor detection and for differentiated error penalties, implemented as a weighted loss. Define an early-detection penalty factor: according to the requirements of the problem, define a factor that penalizes errors in early tumor detection. This factor is set to 1.5 so that early-detection errors incur a larger loss.

Compute the loss: during training, the loss is computed from the difference between the prediction and each element of the true label vector; for each sample, the loss is computed from the predicted and true classes.

Weight the loss: compute the weighted loss from the predicted and true classes; for early-tumor detection errors, multiply the loss by the early-detection penalty factor so that they incur a larger loss.

Sum the losses: sum the weighted losses of all samples to obtain the final loss. Further adjustment and optimization can be applied as needed.
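The weighted-loss steps above can be sketched in numpy. This is one interpretation under stated assumptions: samples whose true class is the earliest grade (column 0) and are misclassified receive the 1.5x penalty; the function and argument names are illustrative, not the patent's implementation.

```python
import numpy as np

def weighted_cross_entropy(y_true, y_pred, early_col=0, penalty=1.5, eps=1e-12):
    """Categorical cross-entropy summed over samples, with misclassified
    early-grade samples (one-hot column `early_col`) weighted by `penalty`.
    `y_true` is one-hot; each row of `y_pred` is a probability vector."""
    ce = -np.sum(y_true * np.log(y_pred + eps), axis=1)      # per-sample CE
    wrong = np.argmax(y_pred, axis=1) != np.argmax(y_true, axis=1)
    early = y_true[:, early_col] == 1
    weights = np.where(wrong & early, penalty, 1.0)
    return np.sum(weights * ce)
```
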

Model training and optimization: train and optimize the model with the defined weighted loss function, choosing a suitable optimizer and learning rate and tuning the optimization parameters as needed.

Model evaluation: evaluate the trained model on the test set, computing accuracy, precision, recall, F1-score, and similar metrics; the model's performance can be analyzed further with a confusion matrix.

By designing this specific loss function, the model pays more attention to the accuracy of early tumor detection and penalizes errors according to the penalty factor, which helps improve performance on early tumors. Appropriate adjustments and optimizations should be made for the characteristics of the actual task and data to obtain the best performance.

For the optimizer, Adam can be used, with the learning rate and other parameters set as appropriate. Adam combines momentum with an adaptive learning rate, is widely used in deep learning, and adjusts the learning rate automatically during training. The initial learning rate is set to 0.001, and the learning-rate decay and other parameters are adjusted according to performance during training.

Model training: batch size: set as appropriate; a smaller batch size such as 32 or 64 is usually chosen.

Number of epochs: set as appropriate; enough epochs are usually chosen to reach convergence.

Model evaluation: in the evaluation stage, several metrics are used, including accuracy, precision, recall, and F1-score. They are computed and analyzed as follows:

Accuracy: one of the most common evaluation metrics for classification models, the proportion of correctly classified samples among all samples. Formula: accuracy = (number of correctly predicted samples) / (total number of samples).

Precision: the proportion of samples predicted as positive that are actually positive. Formula: precision = (number of true positives) / (number of samples predicted positive).

Recall: measures how well the model identifies samples of a class, i.e. the proportion of actually positive samples that the model captures. Formula: recall = (number of true positives) / (number of actually positive samples).

F1-score: a metric that balances precision and recall as their harmonic mean. Formula: F1-score = 2 x (precision x recall) / (precision + recall).

These metrics can be analyzed further through the confusion matrix, a two-dimensional matrix that records the correspondence between predictions and true labels. It involves four quantities: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).

TP: the number of positive samples the model predicts as positive.

FP: the number of negative samples the model predicts as positive.

TN: the number of negative samples the model predicts as negative.

FN: the number of positive samples the model predicts as negative.

The above metrics can be computed from the confusion matrix as follows:

Classify each sample as TP, FP, TN, or FN from its prediction and true label.

Compute accuracy, precision, recall, and F1-score from these counts.
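The two steps above can be sketched for the binary case as follows (the multi-class metrics of this method would be obtained per class in the same way):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from TP/FP/TN/FN counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1
```
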

Computing and analyzing the confusion matrix and the above metrics helps reveal the model's performance on different classes, guiding adjustment and optimization. Note that the parameters above are only a reference; specific choices should be tuned to the actual situation and data set. During training, techniques such as cross-validation can be used to evaluate the model on different data subsets and select the best parameters and configuration.

These metrics allow model performance to be evaluated and compared. Accuracy measures overall classification correctness; precision and recall assess the correctness and completeness of positive-class predictions; and F1-score, combining both, gives a more comprehensive assessment.

In addition, the classification ability and discrimination of the model are evaluated by plotting the ROC curve (receiver operating characteristic curve) and computing the AUC (area under the curve). The ROC curve plots recall (the true-positive rate) against 1 - specificity (the false-positive rate); the AUC, the area under this curve, evaluates the model's overall performance.

Step S8: model fusion.

Model fusion combines the predictions of several models, improving the robustness of the overall prediction.

In this embodiment, several convolutional neural network models with different initial parameters are trained and fused by voting. The effectiveness of the fused model is then verified by evaluating its performance metrics, with further adjustment and optimization as needed.

The voting strategy for model fusion proceeds as follows:

Train several independent convolutional neural network models with different initial parameters, making sure each is fully trained and trained on the same training set for fairness.

Predict with each model on the test set: use an independent test set (disjoint from the training and validation sets) to obtain, for every test sample, the prediction of each trained model.

Vote: for each sample, vote among the predictions of all models. Under majority voting, the class with the most votes is the final prediction; in case of a tie, a class can be chosen at random or another decision rule applied.
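The majority-voting step, including the random tie-break mentioned above, can be sketched as follows (function name and array layout are illustrative):

```python
import numpy as np

def majority_vote(predictions, rng=None):
    """Per-sample majority vote across models; ties broken at random.
    `predictions` is (n_models, n_samples) of predicted class labels."""
    rng = np.random.default_rng() if rng is None else rng
    fused = []
    for col in np.asarray(predictions).T:       # one column per sample
        values, counts = np.unique(col, return_counts=True)
        winners = values[counts == counts.max()]
        fused.append(winners[0] if len(winners) == 1 else rng.choice(winners))
    return np.array(fused)
```
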

Compute performance metrics: from the fused predictions, compute accuracy, precision, recall, F1-score, and so on; a confusion matrix can be used for further analysis.

Adjust and optimize: tune parameters further according to the fusion performance; the number of voting models can be increased or decreased to find the best fusion strategy.

Note that the voting models should be diverse, i.e. differ from one another to some degree; this can be achieved by varying the architecture, the optimization parameters, the loss function, and so on.

Finally, model fusion exploits the predictions of several models to improve the accuracy and robustness of the overall prediction, though it trades off computing resources and time, since training and predicting with several models may require more of both.

Step S9: introduce a custom attention mechanism, as shown in Figure 4.

Define the attention module: it takes a feature map (the output of an intermediate layer of ResNet50) as input and, through convolution, BatchNormalization, and a sigmoid activation, outputs a weight matrix of the same size.

Insert the attention module into ResNet: place the module at a suitable position (for example, after the output of a residual block), then weight the feature map with the module's output (the weight matrix) to obtain a new feature map. This attention weighting makes the network focus on regions with larger weights (module outputs close to 1).
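The attention weighting can be sketched in numpy as below. This is a deliberately reduced illustration of the idea, not the module itself: the 1x1 convolution is written as a per-pixel linear map over channels, BatchNormalization is omitted for brevity, and the parameter names are assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_weight(feature_map, w, b):
    """Spatial-attention sketch: a 1x1 convolution (per-pixel linear map over
    channels) followed by sigmoid yields a weight in (0, 1) at every spatial
    position; the feature map is rescaled by these weights.
    feature_map: (H, W, C); w: (C,); b: scalar."""
    logits = feature_map @ w + b        # (H, W)
    weights = sigmoid(logits)           # attention map in (0, 1)
    return feature_map * weights[..., None]   # broadcast over channels
```

Because the weights lie strictly between 0 and 1, the module can only attenuate features; positions the network deems important keep weights close to 1 and pass through almost unchanged.
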

Model training and optimization: train and optimize the model with the weighted feature representation, using the designed loss function and optimizer and tuning parameters as needed.

Model evaluation: evaluate the trained model on the test set, computing accuracy, precision, recall, F1-score, and similar metrics.

In practice, the attention maps can also be visualized to help understand what the model attends to in the image.

With the attention mechanism, the convolutional neural network focuses more precisely on the important features in the image, improving the model's performance and expressiveness.

Step S10: model evaluation and optimization.

Evaluate the model's performance with methods such as cross-validation, then adjust its parameters according to the results to optimize performance. The steps are:

Split the data set: first divide the image data set into a training set and a test set. During cross-validation the training set is further split into training and validation parts: with 10-fold cross-validation, the training data are divided into 10 subsets, and in each round 9 subsets serve as training images and the remaining subset as validation images.
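The 10-fold split can be sketched as an index generator (shuffling with a fixed seed is an assumption for reproducibility; scikit-learn's KFold provides the same functionality):

```python
import numpy as np

def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices, split them into k folds, and yield
    (train_indices, val_indices) pairs, one per fold."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val
```
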

Train the model: for each training/validation split, train on the training data, evaluate on the validation data, and record the result of each evaluation.

Average performance: after the 10 rounds of training and validation, average all validation results to obtain the model's mean performance.

Parameter tuning: based on the mean performance, adjust parameters such as the learning rate, the size and number of convolution kernels, and the number of neurons in the fully connected layers; then repeat the data splitting, training, and performance averaging until the best parameters are found.

Model optimization: after the best parameters are found, train the model with them and run a final evaluation on the test set. If performance can still improve, a more complex model structure can be tried, or techniques such as regularization and batch normalization introduced.

Result analysis: after training and optimization, analyze the process and results. Visual tools such as learning curves and confusion matrices help in understanding performance; if the model predicts poorly on certain classes, the data can be analyzed further to find possible causes.

Step S11: carry out prediction.

Feed the preprocessed new data into the optimized model, which outputs a prediction of the malignancy grade of the early prostate tumor.

The above method requires no preparation of complex tumor tissue slices: a hyperspectral camera directly captures a hyperspectral image of the region to be examined, the image is processed, and the early prostate tumor recognition model identifies early prostate tumors in the region and scores their malignancy. All operations need only one set of equipment, and the whole procedure takes little time, saving considerable effort.

Early prostate tumors are detected with hyperspectral imaging, patients are graded by pathological results, and different hyperspectral signatures are analyzed, demonstrating the value of identifying early prostate tumors from hyperspectral images.

Artificial intelligence is used to learn from the hyperspectral data of tissue samples of prostate cancer patients with different pathological grades, deriving the correlation between hyperspectral data and the malignancy of early prostate cancer and thereby predicting the pathological results of prostate cancer patients.

Spectral and texture features are combined for recognition: spectral features carry information on spectral intensity and its variation across bands, while texture features carry spatial relationships and structural information between image pixels. Being complementary, their combination captures richer and more diverse feature information, improving recognition accuracy and robustness and strengthening the model's ability to recognize prostate tumors.

Spectral and texture features respond differently to interference and noise. Spectral features are sensitive to changes in spectral intensity and help resist the effects of illumination and stray light; texture features capture subtle texture variation in local image regions and are somewhat robust to uneven lighting and image noise. Combining both, together with environmental denoising, strengthens the model's resistance to interference such as ambient light and improves the accuracy of prostate tumor recognition.

Loss-function design: by penalizing early tumor detection errors more heavily through a purpose-built loss function, the model pays more attention to early-stage tumors and works harder to reduce such errors. This penalty mechanism gives an extra incentive to capture the signatures of early tumors more sensitively, improving early detection accuracy. Such a loss guides the model to weigh early tumor detection more heavily during optimization and raises its sensitivity to early-stage tumors; by increasing the penalty on early detection errors, the model tends to reduce them, improving overall performance and early-recognition ability.

The custom attention mechanism helps the model attend to the important tumor-related features when processing hyperspectral images. By adjusting attention over specific spectra or specific regions, the model captures information related to prostate tumors more accurately, improving recognition performance. By assigning attention weights, the mechanism reduces the influence of redundant bands, lowering the computational and learning burden of redundant information and improving efficiency and accuracy. The spectral signatures of early prostate tumors may vary with individual differences and with lesion type and size; the custom attention mechanism dynamically adjusts the attention weights to the characteristics and needs of the input, letting the model adaptively attend to discriminative, important features in different samples and improving the robustness and generalization of recognition.

Manual effort is required only to acquire the tumor hyperspectral images; image analysis and data processing need no further human intervention. The classification result is objective and accurate, manpower and material costs are greatly reduced, and the barrier to operation is lowered.

Postoperative tumor tissue from early-stage prostate cancer patients is used as the sample source; it is easy to obtain clinically and provides sufficient sample space to continuously optimize and refine the recognition model.

Embodiment 2:

A system implementing the above method comprises:

a preprocessing module configured to: acquire a hyperspectral image of the tissue sample to be tested, stitch the image data of the set bands into an RGB image, subtract the ambient-light hyperspectral signature of the detection area to obtain the spectral features and texture features of the target in the sample, perform normalization to obtain a grayscale image, and generate a two-dimensional dataset;
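The preprocessing chain above can be sketched as follows. The array layout (height × width × bands) is an assumption, and the mean-over-bands grayscale collapse is a simplified stand-in for the copy-and-scale data expansion the method describes; the ambient-light subtraction and per-pixel [0, 1] normalization follow the description.

```python
import numpy as np


def pixel_normalize(cube: np.ndarray, ambient: np.ndarray) -> np.ndarray:
    """Subtract the ambient-light spectrum from every pixel, then rescale
    each pixel's 1-D spectrum into [0, 1] (cube: H x W x bands)."""
    corrected = cube - ambient  # ambient: (bands,), broadcasts over H x W
    lo = corrected.min(axis=2, keepdims=True)
    hi = corrected.max(axis=2, keepdims=True)
    return (corrected - lo) / np.maximum(hi - lo, 1e-12)  # guard flat spectra


def to_gray(norm_cube: np.ndarray, scale: float = 255.0) -> np.ndarray:
    """Collapse the normalized spectra into a single-channel grayscale
    image; averaging over bands is used here as a simple stand-in for
    the patent's copy-and-multiply data expansion."""
    return (norm_cube.mean(axis=2) * scale).astype(np.uint8)
```

The resulting grayscale image, flattened per pixel, is the two-dimensional dataset fed to the recognition model.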

an analysis module configured to: use the generated two-dimensional dataset as the input of the recognition model, obtain the scores corresponding to the label matrix, and take the maximum score as the recognition result;

wherein the two-dimensional dataset has a feature matrix corresponding to the spectral features and texture features, and the feature matrix corresponds to the label matrix.

Embodiment 3:

This embodiment provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the hyperspectral-image-based tissue sample analysis method of Embodiment 1.

Embodiment 4:

This embodiment provides a computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, it implements the steps of the hyperspectral-image-based tissue sample analysis method of Embodiment 1.

The steps and networks involved in Embodiments 2 to 4 above correspond to Embodiment 1; for specific implementations, see the description of Embodiment 1. The term "computer-readable storage medium" should be understood to include a single medium or multiple media containing one or more instruction sets, and also to include any medium capable of storing, encoding, or carrying an instruction set for execution by a processor that causes the processor to perform any method of the present invention.

The above are only preferred embodiments of the present invention and are not intended to limit it; those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (8)

1. A tissue sample analysis method based on hyperspectral images, characterized by comprising the following steps:
acquiring a hyperspectral image of a tissue sample to be tested, stitching the image data of set bands into an RGB image, and subtracting the ambient-light hyperspectral signature of the detection area to obtain the spectral features and texture features of the target in the tissue sample; performing normalization to obtain a grayscale image and generating a two-dimensional dataset;
using the generated two-dimensional dataset as the input of a recognition model, obtaining the scores corresponding to the label matrix, and taking the maximum score as the recognition result;
wherein the two-dimensional dataset has a feature matrix corresponding to the spectral features and the texture features, the feature matrix corresponding to the label matrix;
the attention mechanism is introduced as follows: adding an attention module, defining attention weights, performing feature-weighted fusion, model training and optimization, and model evaluation and adjustment; defining the attention module: the module receives a feature map as input and, through convolution, batch normalization, and a sigmoid activation function, outputs a weight matrix of the same size; inserting the attention module into the model: the attention module defined above is inserted into the model, and the feature map is weighted with the module's output to obtain a new feature map; model training and optimization: the model is trained and optimized using the weighted feature representation; the designed specific loss function and an optimizer are selected, and the parameters are adjusted and optimized according to the actual situation; model evaluation: the trained model is evaluated on a test set, computing the accuracy, precision, recall, and F1-score metrics;
the specific loss function is designed as follows: defining an early-detection penalty factor, set to 1.5; during model training, the loss function is computed from the prediction and the true label: for each sample, a loss is computed from the predicted class and the true class; weighting the loss function: the weighted loss is computed from the predicted class and the true class, and for early tumor detection errors the loss is multiplied by the early-detection penalty factor; summing the loss function: the weighted losses of all samples are summed to obtain the final loss function, which may be further adjusted as needed; model training and optimization: the model is trained and optimized with the weighted loss function so defined; an optimizer and learning rate are selected, and the optimization parameters are adjusted according to the actual situation; the trained model is evaluated on a test set, computing the accuracy, precision, recall, and F1-score metrics;
the normalization to obtain a grayscale image and generate a two-dimensional dataset is specifically: compressing the value range of the spectral data into a set interval to obtain normalized one-dimensional data for each pixel, namely the one-dimensional data of each pixel at different wavelengths, and obtaining a single-channel grayscale image after data expansion, thereby generating the two-dimensional dataset;
each row of the feature matrix represents one sample and each column represents one feature; the corresponding sample labels are arranged into a label vector, and in the label matrix formed by all the label vectors each row corresponds to one sample and each column corresponds to one target score;
and training the model with the feature matrix and the label matrix as training data, for the set batch size and the set number of epochs.
2. The tissue sample analysis method based on hyperspectral images as claimed in claim 1, wherein the process of obtaining the hyperspectral characteristics of the ambient light in the detection area is specifically: placing a reference object with no fluorescence characteristic on the imaging plane, acquiring a hyperspectral image of the reference-object area, extracting the one-dimensional spectral data of each pixel in the area, and summing by column and averaging to obtain one row of spectral data as the spectral signature of the ambient light.
3. The tissue sample analysis method based on hyperspectral images as claimed in claim 1, wherein the process of obtaining the single-channel grayscale image from the normalized one-dimensional data after data expansion is specifically: copying the normalized one-dimensional data a set number of times, and multiplying the data of each pixel by a set factor to obtain the grayscale image of all pixels.
4. The tissue sample analysis method based on hyperspectral images as claimed in claim 1, wherein the training process of the recognition model includes:
acquiring hyperspectral images of tissue samples prepared in advance;
and extracting the target spectral features and target texture features in the hyperspectral images, and organizing them into a feature matrix.
5. The tissue sample analysis method based on hyperspectral images as claimed in claim 4, wherein the extraction of the target spectral features in the hyperspectral image is specifically: selecting a region containing the target on the two-dimensional grayscale image, extracting the normalized one-dimensional spectral data corresponding to each pixel in the region, and computing the statistical features mean, variance, energy, spectral gradient, and kurtosis to describe the distribution and variation of the spectrum.
6. The tissue sample analysis method based on hyperspectral image as claimed in claim 4, wherein extracting the target texture feature in the hyperspectral image comprises:
texture features are extracted based on at least one of a gray level co-occurrence matrix, a gray level difference matrix, and a local binary pattern.
7. The tissue sample analysis method based on hyperspectral images as claimed in claim 6, wherein extracting the target texture features in the hyperspectral image further comprises:
generating a matrix from the gray-level relationship between a pixel and its neighboring pixels in the image, and extracting contrast, energy, and homogeneity from the matrix;
generating a matrix from the gray-level differences between each pixel and its neighboring pixels, and extracting contrast and roughness from the matrix;
and generating a binary image by comparing the gray values of a pixel and its neighboring pixels, and extracting the texture pattern and texture direction from the binary image.
8. A tissue sample analysis system based on hyperspectral images, based on the tissue sample analysis method based on hyperspectral images of any one of claims 1 to 7, characterized in that it comprises:
a preprocessing module configured to: acquire a hyperspectral image of a tissue sample to be tested, stitch the image data of set bands into an RGB image, subtract the ambient-light hyperspectral signature of the detection area to obtain the spectral features and texture features of the target in the sample, perform normalization to obtain a grayscale image, and generate a two-dimensional dataset;
an analysis module configured to: use the generated two-dimensional dataset as the input of the recognition model, obtain the scores corresponding to the label matrix, and take the maximum score as the recognition result;
wherein the two-dimensional dataset has a feature matrix corresponding to the spectral features and texture features, the feature matrix corresponding to the label matrix.
CN202311464558.0A 2023-11-07 2023-11-07 Tissue sample analysis method and system based on hyperspectral images Active CN117197137B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311464558.0A CN117197137B (en) 2023-11-07 2023-11-07 Tissue sample analysis method and system based on hyperspectral images


Publications (2)

Publication Number Publication Date
CN117197137A CN117197137A (en) 2023-12-08
CN117197137B true CN117197137B (en) 2024-02-09

Family

ID=88990955


Country Status (1)

Country Link
CN (1) CN117197137B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117953970B (en) * 2024-03-27 2024-06-11 山东大学 A lung cancer multi-gene detection method and system based on hyperspectral imaging
CN118711173A (en) * 2024-05-30 2024-09-27 中国人民解放军总医院第一医学中心 A normal bone marrow cell classification method based on medical hyperspectral microscopy images and deep learning system
CN118329808B (en) * 2024-06-17 2024-08-23 南通海润新材料科技有限公司 Intelligent identification method and system for textile fibers based on spectrum data
CN118392806B (en) * 2024-06-27 2024-10-01 陕西中医药大学 A method for sampling and estimating virus concentration in water environment based on artificial intelligence
CN119992539B (en) * 2025-04-16 2025-07-11 泗水县锦川花生食品有限公司 A method and system for rapid screening of mold in peanuts

Citations (6)

Publication number Priority date Publication date Assignee Title
CN106503739A (en) * 2016-10-31 2017-03-15 中国地质大学(武汉) The target in hyperspectral remotely sensed image svm classifier method and system of combined spectral and textural characteristics
CN111881953A (en) * 2020-07-14 2020-11-03 安徽大学 Remote sensing hyperspectral image classification method based on local binary pattern and KNN classifier
CN111881933A (en) * 2019-06-29 2020-11-03 浙江大学 Hyperspectral image classification method and system
CN112784747A (en) * 2021-01-22 2021-05-11 哈尔滨工业大学 Multi-scale eigen decomposition method for hyperspectral remote sensing image
CN116167964A (en) * 2022-11-28 2023-05-26 山东大学 Tumor classification method and system based on tumor hyperspectral image
CN116863293A (en) * 2023-06-12 2023-10-10 航天科工深圳(集团)有限公司 Marine target detection method under visible light based on improved YOLOv7 algorithm

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US11657274B2 (en) * 2017-11-03 2023-05-23 Siemens Aktiengesellschaft Weakly-supervised semantic segmentation with self-guidance


Non-Patent Citations (5)

Title
Fast Self-Attentive Multimodal Retrieval; Jonatas Wehrmann et al.; 2018 IEEE Winter Conference on Applications of Computer Vision (WACV); pp. 1871-1878 *
Hyperspectral Image Restoration Using Adaptive Anisotropy Total Variation and Nuclear Norms; Ting Hu et al.; IEEE Transactions on Geoscience and Remote Sensing; vol. 59, no. 2; pp. 1516-1533 *
Semi-supervised classification of hyperspectral images based on spatial features and texture information; Cheng Zhihui, Xie Fuding; Bulletin of Surveying and Mapping, no. 12; pp. 56-59 *
Fault diagnosis of rotor-bearing systems under time-varying speed based on a dual-threshold attention generative adversarial network and small samples; Shao Haidong et al.; Journal of Mechanical Engineering; vol. 59, no. 12; pp. 215-224 *
Caregiver burden and burnout among caregivers of Alzheimer's disease patients: current status and coping strategies; Ren Liling et al.; Medical Journal of West China; vol. 31, no. 8; pp. 1298-1301 *

Also Published As

Publication number Publication date
CN117197137A (en) 2023-12-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant