
CN116128784A - Image processing method, device, storage medium and terminal - Google Patents


Info

Publication number
CN116128784A
CN116128784A (application CN202111319114.9A)
Authority
CN
China
Prior art keywords
feature
lung
image
vector
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111319114.9A
Other languages
Chinese (zh)
Inventor
王静雯 (Wang Jingwen)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shiyuan Artificial Intelligence Innovation Research Institute Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shiyuan Artificial Intelligence Innovation Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd, Guangzhou Shiyuan Artificial Intelligence Innovation Research Institute Co Ltd filed Critical Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority to CN202111319114.9A
Publication of CN116128784A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30061Lung

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The invention discloses an image processing method, device, storage medium and terminal. The method comprises: acquiring a lung CT scan image to be processed; inputting the image into a pre-trained image processing model; and outputting a plurality of lung nodule parameter values corresponding to the lung CT scan image. The image processing model is generated by training on a final target vector; the final target vector is generated by feature fusion of existing clinical information with a plurality of prediction vectors, and the prediction vectors are in turn generated by fusing the feature vector of each image in an existing historical lung CT scan image sequence with the shooting interval duration between adjacent images. Because the prediction vectors generated from CT images taken over several consecutive years are fused with both timing and clinical information, the final feature values are richer, the trained model is more accurate, and the accuracy of identifying lung nodule parameters in images is improved.

Description

Image processing method, device, storage medium and terminal
Technical Field
The present invention relates to the field of deep learning technologies, and in particular, to an image processing method, an image processing device, a storage medium, and a terminal.
Background
Global cancer statistics estimated 18.1 million new cancer cases and 9.6 million cancer deaths in 2018, with lung cancer accounting for the largest share of both new cases and deaths, at 11.6% and 18.4% respectively (roughly 1 in every 5 cancer deaths). Lung cancer survival is closely related to the clinical stage at diagnosis. Because early symptoms of lung cancer are not obvious, diagnosis often comes late, the opportunity for surgical treatment is lost, and the prognosis is poor.
Among existing lung cancer screening methods, low-dose computed tomography (CT) has long been considered a potential early screening tool, and for high-risk groups it can currently reduce lung cancer mortality by 20%. However, owing to equipment and personnel costs and the complexity of the task, accurately outputting specific lung nodule parameters from CT images remains challenging. Pulmonary nodules exhibit a wide range of shapes and features, which makes the parameters describing them difficult to identify and thereby reduces the accuracy of identifying pulmonary nodule parameters in images.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, a storage medium and a terminal. The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended neither to identify key/critical elements nor to delineate the scope of such embodiments. Its sole purpose is to present some concepts in a simplified form as a prelude to the more detailed description that is presented later.
In a first aspect, an embodiment of the present application provides an image processing method, including:
acquiring a lung CT scanning image to be processed;
inputting the lung CT scan image into a pre-trained image processing model; the image processing model is generated by training on a final target vector, the final target vector is generated by feature fusion of the existing clinical information with a plurality of prediction vectors, and the prediction vectors are generated by fusing the feature vector of each image in the existing historical lung CT scan image sequence with the shooting interval duration between adjacent images;
and outputting a plurality of lung nodule parameter values corresponding to the lung CT scan image.
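The three claimed steps can be sketched as a minimal inference path. The `toy_model` stand-in, the function name, and the parameter layout (diameter, malignancy probability) are illustrative assumptions, not the patent's actual model:

```python
import numpy as np

def process_ct_image(ct_volume, model):
    """Acquire a lung CT volume, input it to the model, output per-nodule parameter values."""
    assert ct_volume.ndim == 3          # depth x height x width voxel grid
    return model(ct_volume)             # one row of parameter values per detected nodule

# Toy stand-in for the pre-trained model: fixed parameters for two nodules,
# with an assumed [diameter_mm, malignancy_probability] layout.
toy_model = lambda volume: np.array([[12.0, 0.91], [4.5, 0.30]])

params = process_ct_image(np.zeros((64, 128, 128)), toy_model)
print(params.shape)  # (2, 2)
```

In the claimed method the `model` argument would be the pre-trained image processing model described below; here it is only a placeholder returning fixed values.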
Optionally, generating the pre-trained image processing model comprises:
constructing an image processing model; the image processing model comprises a lung nodule detection network, a feature extractor, a normalization layer and a full connection layer;
acquiring an existing historical lung CT scanning image sequence of a plurality of continuous periods;
generating a plurality of combined feature vector sequences according to the historical lung CT scanning image sequences, the lung nodule detection network, the feature extractor and the normalization layer;
calculating a first search key and first information content of each feature vector in each combined feature vector sequence according to the fully connected layer;
acquiring the shooting interval duration of the historical lung CT scan image corresponding to each feature vector, and generating a plurality of prediction feature vectors after feature fusion according to the first search key, the first information content and the shooting interval duration of each feature vector in each combined feature vector sequence;
extracting the existing clinical information, and generating a final target vector after feature fusion is carried out on the clinical information and a plurality of predicted feature vectors;
calculating a cross entropy loss value based on the final target vector, and generating the pre-trained image processing model when the cross entropy loss value reaches a minimum.
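The final training step can be illustrated with a minimal numpy sketch of a cross entropy loss computed from a final target vector. The classifier head `W`, the 1 x 64 vector size, and the two-class layout are assumptions for illustration only:

```python
import numpy as np

def cross_entropy(logits, label):
    """Numerically stable softmax cross entropy for one sample."""
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label] + 1e-12)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 2)) * 0.01      # assumed 2-class head on the final target vector
final_target_vector = rng.normal(size=64)
label = 1                                # y_i: cancer within one year

loss = cross_entropy(final_target_vector @ W, label)
# Training would update the network parameters and repeat until this loss is minimal.
print(float(loss))
```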
Optionally, generating a plurality of combined feature vector sequences from the historical lung CT scan image sequence, the lung nodule detection network, the feature extractor, the normalization layer, includes:
determining an image with the smallest interval from the current moment in the historical lung CT scanning image sequence as a target lung CT scanning image;
inputting the target lung CT scan image into the lung nodule detection network, and outputting a plurality of lung nodule data corresponding to the target lung CT scan image;
cutting a plurality of CT image cut-block sequences of a preset size from each historical lung CT scan image according to the plurality of lung nodule data;
inputting each CT image cut-block sequence into the feature extractor, and outputting a plurality of first feature vectors of each CT image cut-block sequence;
extracting the exact-center feature of each of the plurality of first feature vectors, inputting it into the normalization layer, and generating a plurality of second feature vectors of each CT image cut-block sequence;
and selecting and combining, from the plurality of second feature vectors of each CT image cut-block sequence, a plurality of feature vectors of the same lung nodule at different time points to obtain a plurality of combined feature vector sequences.
Optionally, generating a plurality of predicted feature vectors after feature fusion according to the first search key, the first information content and the shooting interval duration of each feature vector in each combined feature vector sequence includes:
fusing each feature vector in each combined feature vector sequence with the shooting interval duration corresponding to each feature vector to generate a first fusion feature of each feature vector;
calculating the attention weight value of each feature vector according to the first fusion feature of each feature vector;
the attention weight value of each feature vector and the corresponding first information content are multiplied and summed to generate a plurality of predictive feature vectors.
Optionally, generating the final target vector after feature fusion of the clinical information with the plurality of prediction feature vectors includes:
constructing key feature vectors according to the existing clinical information;
calculating a second search key and second information content of each predictive feature vector in the plurality of predictive feature vectors;
calculating the attention weight of each predictive feature vector based on the key feature vector and the second search key of each predictive feature vector;
and multiplying and summing the attention weight of each prediction feature vector and the corresponding second information content to generate a final target vector.
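A minimal numpy sketch of these four steps, with assumed dimensions (1 x 64 predicted feature vectors, a 1 x 16 clinical key vector) and hypothetical matrices `W_q`/`W_v` standing in for the fully connected layers:

```python
import numpy as np

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def fuse_clinical(pred_vectors, clinical_key, W_q, W_v):
    """Project each predicted feature vector to a second search key and second
    information content, score the keys against the clinical key vector, and
    weight-sum the contents into the final target vector."""
    keys = pred_vectors @ W_q                 # second search key per vector
    values = pred_vectors @ W_v               # second information content
    weights = softmax(keys @ clinical_key)    # attention weight per vector
    return weights @ values                   # weighted sum -> final target vector

rng = np.random.default_rng(1)
preds = rng.normal(size=(5, 64))              # five predicted feature vectors
clinical = rng.normal(size=16)                # key vector built from clinical info
W_q, W_v = rng.normal(size=(64, 16)), rng.normal(size=(64, 64))

final = fuse_clinical(preds, clinical, W_q, W_v)
print(final.shape)  # (64,)
```

The dot-product scoring used here is one common way to compute attention weights from a key vector; the patent does not fix the exact scoring function.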
Optionally, the generating the pre-trained image processing model when the cross entropy loss value reaches a minimum includes:
when the cross entropy loss value has not reached the minimum, updating the network parameters of the image processing model and returning to the step of acquiring the existing historical lung CT scan image sequences of a plurality of consecutive periods.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
the image acquisition module is used for acquiring a lung CT scanning image to be processed;
the image input module is used for inputting the lung CT scan image into a pre-trained image processing model; the image processing model is generated by training on a final target vector, the final target vector is generated by feature fusion of the existing clinical information with a plurality of prediction vectors, and the prediction vectors are generated by fusing the feature vector of each image in the existing historical lung CT scan image sequence with the shooting interval duration between adjacent images;
and the image output module is used for outputting a plurality of lung nodule parameter values corresponding to the lung CT scan image.
Optionally, the apparatus further includes:
the model building module is used for building an image processing model; the image processing model comprises a lung nodule detection network, a feature extractor, a normalization layer and a full connection layer;
the image sequence acquisition module is used for acquiring the existing historical lung CT scanning image sequences in a plurality of continuous periods;
the combined feature vector sequence generation module is used for generating a plurality of combined feature vector sequences according to the historical lung CT scanning image sequence, a lung nodule detection network, a feature extractor and a normalization layer;
the parameter calculation module is used for calculating a first search key and first information content of each feature vector in each combined feature vector sequence according to the full connection layer;
the feature fusion module is used for acquiring shooting interval duration of the historical lung CT scanning image corresponding to each feature vector, and generating a plurality of prediction feature vectors after feature fusion according to a first search key, first information content and shooting interval duration of each feature vector in each combined feature vector sequence;
the final target vector generation module is used for extracting the existing clinical information, and generating the final target vector after feature fusion of the clinical information with the plurality of prediction feature vectors;
and the model generation module is used for calculating a cross entropy loss value based on the final target vector and generating a pre-trained image processing model when the cross entropy loss value reaches the minimum.
In a third aspect, embodiments of the present application provide a computer storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor and to perform the above-described method steps.
In a fourth aspect, embodiments of the present application provide a terminal, which may include: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps described above.
The technical scheme provided by the embodiment of the application can comprise the following beneficial effects:
In the embodiment of the application, the image processing device first acquires a lung CT scan image to be processed, then inputs the image into a pre-trained image processing model, and finally outputs a plurality of lung nodule parameter values corresponding to the lung CT scan image. The image processing model is generated by training on a final target vector; the final target vector is generated by feature fusion of the existing clinical information with a plurality of prediction vectors, and the prediction vectors are generated by fusing the feature vector of each image in the existing historical lung CT scan image sequence with the shooting interval duration between adjacent images. Because the prediction vectors generated from the existing CT images of a plurality of consecutive periods are first fused with the shooting intervals between adjacent images, and the fused features are then fused with the vector corresponding to the existing clinical information, the final feature values are richer, the trained model is more accurate, and the accuracy of identifying lung nodule parameters in images is further improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a schematic flow chart of an image processing method according to an embodiment of the present application;
FIG. 2 is a flowchart of an image processing model training method according to an embodiment of the present disclosure;
FIG. 3 is a process schematic diagram of an image processing model training process provided in an embodiment of the present application;
FIG. 4 is a schematic illustration of lung nodule changes at different moments in time provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a process for fusing identical nodule features at different timings according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a process for fusing existing clinical information with fused nodule features according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural view of another image processing apparatus provided in the embodiment of the present application;
Fig. 9 is a schematic structural diagram of a combined feature vector sequence generating module according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a feature fusion module according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a final target vector generation module according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
The following description and the drawings sufficiently illustrate specific embodiments of the invention to enable those skilled in the art to practice them.
It should be understood that the described embodiments are merely some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention as detailed in the accompanying claims.
In the description of the present invention, it should be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art. Furthermore, in the description of the present invention, unless otherwise indicated, "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may indicate: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
The application provides an image processing method, device, storage medium and terminal to solve the above problems in the related art. In the technical scheme provided by the application, the prediction vectors generated from the existing CT images of a plurality of consecutive periods are first fused with the shooting intervals between adjacent images, and the fused features are then fused with the vector corresponding to the existing clinical information; the final feature values are therefore richer, the trained model is more accurate, and the accuracy of identifying lung nodule parameters in images is further improved. This is described in detail below by way of exemplary embodiments.
The image processing method provided in the embodiment of the present application will be described in detail with reference to fig. 1 to 6. The method may be implemented by a computer program and may run on an image processing device based on the von Neumann architecture. The computer program may be integrated into an application or may run as a standalone tool-class application.
Referring to fig. 1, a flowchart of an image processing method is provided in an embodiment of the present application. As shown in fig. 1, the method of the embodiment of the present application may include the following steps:
s101, acquiring a lung CT scanning image to be processed;
CT, i.e. computed tomography, uses precisely collimated X-ray beams, gamma rays, ultrasonic waves and the like, together with detectors of extremely high sensitivity, to scan cross-sections around a certain part of the human body one after another. It has the characteristics of fast scan times and clear images, and can be used to examine various diseases. Depending on the type of radiation used, it can be classified into X-ray CT (X-CT), gamma-ray CT (γ-CT), and so on. A lung CT scan image is an image obtained by continuously scanning a human lung with computed tomography.
In general, the lung CT scan image to be processed may be obtained from a local gallery or may be a CT scan image received in real time.
In one possible implementation, when an image needs to be processed, the local gallery is first opened; a large number of CT scan images are stored in the gallery in advance, and the lung CT scan image to be processed is then searched for and retrieved from among them.
In another possible implementation, when an image needs to be processed, a communication connection is first established between the user terminal and the CT scanner. After receiving a scanning instruction, the CT scanner performs multiple scans, then sends the lung CT scan images generated by the scans to the user terminal, and the user terminal thereby obtains the lung CT scan image to be processed.
S102, inputting the lung CT scanning image into a pre-trained image processing model;
the image processing model is generated based on final target vector training, the final target vector is generated by feature fusion based on the existing clinical information and a plurality of predictive vectors, and the plurality of predictive vector features are generated by feature fusion based on the shooting interval duration between the feature vector of each image in the existing historical lung CT scanning image sequence and the adjacent image.
Typically, the pre-trained image processing model is a mathematical model that predicts the risk of lung cancer, and the image processing model is composed of a lung nodule detection network, a feature extractor, a normalization layer, and a fully connected layer.
Specifically, the lung nodule detection network is a neural network for detecting lung nodules in lung CT scan images. The network is trained in advance; a lung CT scan image to be processed need only be input into the lung nodule detection network for it to output the lung nodules present in the CT image.
In one possible implementation manner, after acquiring the lung CT scan image based on step S101, the user terminal first accesses a pre-trained image processing model, then inputs the lung CT scan image into the pre-trained image processing model for processing, and finally outputs a plurality of lung nodule parameter values corresponding to the lung CT scan image after the model processing is finished.
And S103, outputting a plurality of lung nodule parameter values corresponding to the lung CT scan image.
In one possible implementation, after the plurality of lung nodule parameter values are obtained, they may serve as reference data for a doctor's diagnosis, from which the doctor may assess the patient's current condition.
In the embodiment of the application, the image processing device first acquires a lung CT scan image to be processed, then inputs the image into a pre-trained image processing model, and finally outputs a plurality of lung nodule parameter values corresponding to the lung CT scan image. The image processing model is generated by training on a final target vector; the final target vector is generated by feature fusion of the existing clinical information with a plurality of prediction vectors, and the prediction vectors are generated by fusing the feature vector of each image in the existing historical lung CT scan image sequence with the shooting interval duration between adjacent images. Because the prediction vectors generated from the existing CT images of a plurality of consecutive periods are first fused with the shooting intervals between adjacent images, and the fused features are then fused with the vector corresponding to the existing clinical information, the final feature values are richer, the trained model is more accurate, and the accuracy of identifying lung nodule parameters in images is further improved.
Referring to fig. 2, a flowchart of an image processing model training method is provided in an embodiment of the present application. As shown in fig. 2, the method of the embodiment of the present application may include the following steps:
s201, constructing an image processing model;
the image processing model comprises a lung nodule detection network, a feature extractor, a normalization layer and a full connection layer;
In the embodiment of the application, an initial image processing model needs to be constructed when training the image processing model. First, the currently existing lung nodule detection network, feature extractor, normalization layer and fully connected layer are obtained according to the model construction parameters; the image processing model is then built from the lung nodule detection network, feature extractor, normalization layer and fully connected layer.
S202, acquiring an existing historical lung CT scanning image sequence of a plurality of continuous periods;
The historical lung CT scan image sequence contains a plurality of lung CT scan images, taken over a plurality of consecutive periods.
For example, the data set of lung CT scan images in a historical lung CT scan image sequence covering a plurality of consecutive periods is

{(x_i^T, x_i^{T-1}, x_i^{T-2}, y_i)}, i = 1, …, n

where n represents the number of CT pictures contained in the dataset, x_i^T represents the CT picture of the i-th patient taken in year T, x_i^{T-1} and x_i^{T-2} similarly represent the CT pictures taken by the i-th patient in years T-1 and T-2, and y_i is a label for whether the patient has cancer within one year.
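Under the assumption (consistent with the index set {T-2, T-1, T} used later) that each sample carries three consecutive yearly scans, one dataset entry might be laid out as follows; the volume shape and key names are illustrative only:

```python
import numpy as np

# One training sample: three consecutive yearly CT volumes plus the label y_i
# indicating whether the patient develops cancer within one year.
sample = {
    "x_T":   np.zeros((64, 128, 128)),   # most recent scan, year T
    "x_T-1": np.zeros((64, 128, 128)),   # year T-1
    "x_T-2": np.zeros((64, 128, 128)),   # year T-2
    "y": 0,
}
dataset = [sample]                       # n such samples in the full dataset
print(len(dataset), sample["x_T"].shape)
```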
S203, generating a plurality of combined feature vector sequences according to the historical lung CT scanning image sequences, a lung nodule detection network, a feature extractor and a normalization layer;
In the embodiment of the application, after the historical lung CT scan image sequence is obtained, the image in the sequence with the smallest interval from the current moment is first determined as the target lung CT scan image. The target image is input into the lung nodule detection network, which outputs a plurality of lung nodule data corresponding to it. According to the plurality of lung nodule data, a plurality of CT image cut-block sequences of a preset size are then cut from each historical lung CT scan image. Each cut-block sequence is input into the feature extractor, which outputs a plurality of first feature vectors for that sequence; the exact-center feature of each first feature vector is extracted and input into the normalization layer, generating a plurality of second feature vectors for each cut-block sequence. Finally, feature vectors of the same lung nodule at different time points are selected from the second feature vectors of each cut-block sequence and combined, obtaining a plurality of combined feature vector sequences.
In one possible implementation, as shown in FIG. 3, the historical lung CT scan closest to the current time, denoted I_T, is first determined from the historical lung CT scan image sequence {I_{T-2}, I_{T-1}, I_T}. I_T is then input into the pulmonary nodule detection network D, which outputs a plurality of lung nodule data {(x_j, y_j, z_j, p_j)}, j = 1, ..., m, corresponding to I_T, where m represents the number of nodules predicted to be contained in the sample and (x_j, y_j, z_j, p_j) represents the center-point coordinates and probability of each nodule. According to the probability of each lung nodule, the n lung nodules with the highest probability are selected from the m lung nodules. Then, centered on each of the n nodule center-point coordinates (x_j, y_j, z_j), cubes of 96×96×96 are cut from each image in the historical sequence I_{T-2}, I_{T-1} and I_T, giving one predicted dicing sequence per period. The predicted slices are input into the feature extractor E, which outputs a plurality of first feature vectors for each CT image dicing sequence, namely 3 groups of feature vectors of size 5×64×24. Since the nodules are all located at the very center of each slice and have a comparatively small volume, the very center of each feature vector is further extracted to obtain 5×64×12 features, which are passed through a max-pooling layer to finally obtain the feature vectors of the CT scan images at the different periods, f_{T-2}, f_{T-1} and f_T, each of size 5×64. Finally, a plurality of feature vectors of the same lung nodule at different time points are selected from these per-period feature vectors and combined, so that a plurality of combined feature vector sequences are obtained.
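The dicing step above can be sketched in code. This is a minimal NumPy illustration, not the patent's implementation: the function names, the zero-padding policy at volume borders, and the plain top-n sort over probabilities are all assumptions.

```python
import numpy as np

def select_top_nodules(nodules, n):
    """Keep the n nodules with the highest predicted probability p.
    nodules: list of (x, y, z, p) tuples, as output by the detection network D."""
    return sorted(nodules, key=lambda t: t[3], reverse=True)[:n]

def crop_cube(volume, center, size=96):
    """Cut a size x size x size cube centered on a nodule center point.
    Zero-padding at the borders is an assumption of this sketch."""
    half = size // 2
    padded = np.pad(volume, half, mode="constant")
    x, y, z = (int(c) + half for c in center)  # shift coordinates into padded space
    return padded[x - half:x + half, y - half:y + half, z - half:z + half]

# Example: crop one cube per selected nodule from one period's CT volume.
volume = np.zeros((128, 128, 128), dtype=np.float32)
nodules = [(60, 60, 60, 0.9), (30, 30, 30, 0.4), (100, 100, 100, 0.7)]
cubes = [crop_cube(volume, nod[:3]) for nod in select_top_nodules(nodules, 2)]
print(len(cubes), cubes[0].shape)  # 2 (96, 96, 96)
```

The same crop is repeated on each historical image in the sequence, so each nodule yields one cube per period.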
For example, as shown in fig. 4, CT images taken of the same patient at adjacent times show the change trend of the nodules. In (1)-(2) it can be seen that the nodule increases significantly over time, while the nodule sizes in (3)-(4) do not change significantly across images taken at different times.
S204, calculating a first search key and first information content of each feature vector in each combined feature vector sequence according to the full connection layer;
in one possible implementation, for example, as shown in fig. 5, after combining a plurality of feature vectors of the same lung nodule at different time sequences to obtain a plurality of combined feature vector sequences, calculating a first search key and a first information content of each feature vector in each combined feature vector sequence according to a full connection layer, where a calculation formula is as follows:
q_i = W_q·f_i,  v_i = W_v·f_i,  i ∈ {T-2, T-1, T};
wherein W_q and W_v are both fully connected layers and f_i is a 1×64 feature vector; after calculation by the above formula, a q of size 1×6 and a v of size 1×64 are obtained, where q represents the search key of each feature and v represents the information content of each feature.
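As a hedged sketch of this step: each 1×64 nodule feature passes through two fully connected (linear) layers giving a 1×6 search key and a 1×64 information content. The weight matrices below stand in for trained layer weights and are randomly initialized purely for illustration; the function name is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
W_q = rng.normal(size=(6, 64))   # fully connected layer producing the 1x6 search key
W_v = rng.normal(size=(64, 64))  # fully connected layer producing the 1x64 information content

def search_key_and_content(f):
    """f: 64-dim nodule feature vector -> (search key q, information content v)."""
    return W_q @ f, W_v @ f

f_i = rng.normal(size=64)        # stand-in for one feature vector f_i
q, v = search_key_and_content(f_i)
print(q.shape, v.shape)  # (6,) (64,)
```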
S205, acquiring shooting interval duration of a historical lung CT scanning image corresponding to each feature vector, and generating a plurality of predicted feature vectors after feature fusion according to a first search key, first information content and shooting interval duration of each feature vector in each combined feature vector sequence;
in the embodiment of the application, when generating a plurality of prediction feature vectors, each feature vector in each combined feature vector sequence and shooting interval duration corresponding to the feature vector are fused to generate first fusion features of each feature vector, then an attention weight value of each feature vector is calculated according to the first fusion features of each feature vector, and finally the attention weight value of each feature vector and first information content corresponding to the attention weight value of each feature vector are multiplied and summed to generate a plurality of prediction feature vectors.
For example, as shown in fig. 5, after obtaining the first search key and the first information content of each feature vector, since the CT images are representations of the nodules photographed at different times, the number of days in the corresponding time interval is appended to each vector q, so that q becomes a 1×7 vector. Adding the time intervals helps the network learn richer and more useful information. The attention value of each nodule feature vector is then calculated: because the most recently captured nodule features carry the most useful information, the search key at time T is used together with the search keys at times T-2 and T-1 to calculate the attention weight of the search key at each moment. The calculation formula is as follows:
a_i = exp(s_i) / Σ_k exp(s_k),  s_i = ⟨q_i, q_m⟩,  i ∈ {T-2, T-1, T};

wherein ⟨·,·⟩ is the vector inner product operation, and q_m = q_T.
Finally, the attention weight a_i of the search key at each moment and the first information content v_i corresponding to it are multiplied and summed, generating a plurality of vectors {h_1, h_2, h_3, h_4, h_5} of size 1×64, one per nodule. The calculation formula is as follows:

h = Σ_i a_i·v_i,  i ∈ {T-2, T-1, T}.
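The fusion in this step can be sketched as follows, under the stated formulas: the interval in days is appended to each 1×6 search key (giving 1×7), the scores s_i = ⟨q_i, q_T⟩ are softmax-normalized, and h = Σ a_i·v_i. The dictionary layout and the max-subtraction for a numerically stable softmax are implementation choices of this sketch, not taken from the patent.

```python
import numpy as np

def fuse_timepoints(qs, vs, days):
    """qs: {t: 1x6 search key}, vs: {t: 1x64 information content},
    days: {t: shooting interval in days relative to the latest scan}."""
    keys = {t: np.append(q, days[t]) for t, q in qs.items()}  # q becomes 1x7
    q_m = keys["T"]                            # the latest scan's key is the query (q_m = q_T)
    s = {t: float(k @ q_m) for t, k in keys.items()}          # s_i = <q_i, q_m>
    mx = max(s.values())                       # subtract max for a stable softmax
    e = {t: np.exp(v - mx) for t, v in s.items()}
    z = sum(e.values())
    a = {t: e[t] / z for t in e}               # attention weights a_i, summing to 1
    return sum(a[t] * vs[t] for t in vs)       # h = sum_i a_i * v_i

qs = {t: np.ones(6) for t in ("T-2", "T-1", "T")}
vs = {t: np.full(64, 2.0) for t in ("T-2", "T-1", "T")}
days = {"T-2": 180.0, "T-1": 90.0, "T": 0.0}
h = fuse_timepoints(qs, vs, days)
print(h.shape)  # (64,)
```

Running this once per nodule produces the vectors {h_1, ..., h_5}.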
S206, extracting the existing clinical information, and generating a final target vector after feature fusion of the clinical information with the plurality of predicted feature vectors;
in the embodiment of the application, firstly, a key feature vector is constructed according to the existing clinical information, then, a second search key and a second information content of each predictive feature vector in a plurality of predictive feature vectors are calculated, then, the attention weight of each predictive feature vector is calculated based on the key feature vector and the second search key of each predictive feature vector, and finally, the attention weight of each predictive feature vector and the second information content corresponding to the attention weight of each predictive feature vector are multiplied and summed to generate a final target vector.
In one possible implementation, for example as shown in fig. 6, 7 items of clinical information of relatively great importance are first selected: {age, gender, whether smoking, location of the maximum nodule in the CT image, diameter of the maximum nodule in the CT image, edge of the maximum nodule in the CT image, total number of nodules in the CT image}. These are constructed into a 1×7 vector, which passes through one fully connected layer to finally obtain a 1×6 vector, namely the key feature vector q_m. Each of the vectors {h_1, h_2, h_3, h_4, h_5} of size 1×64 is then transformed to obtain a q of size 1×6 and a v of size 1×64, where q represents the search key of each feature and v represents the information content of each feature. The attention value of each nodule feature vector is then calculated from the key feature vector q_m and each 1×6 q; the calculation formula of the attention value is:
a_i = exp(s_i) / Σ_k exp(s_k),  s_i = ⟨q_i, q_m⟩,  i = {1, 2, ..., 5};

wherein ⟨·,·⟩ is the vector inner product operation. Here q_m is the vector obtained from the clinical information in the first step, so the attention weight of each nodule is calculated using the clinical attributes. Each attention value is multiplied by the corresponding information vector v and the products are summed to obtain the final target vector b. The calculation formula of the final target vector b is:

b = Σ_{i=1}^{5} a_i·v_i.
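A minimal sketch of this clinical-attention fusion, with the five per-nodule keys and contents stacked as arrays; the array layout and function name are assumptions of this sketch.

```python
import numpy as np

def final_target_vector(q_m, qs, vs):
    """q_m: 1x6 key feature vector built from the clinical information;
    qs: the 5 search keys stacked as a (5, 6) array;
    vs: the 5 information contents stacked as a (5, 64) array.
    Returns b = sum_i softmax(<q_i, q_m>)_i * v_i."""
    s = qs @ q_m                  # s_i = <q_i, q_m>
    e = np.exp(s - s.max())       # numerically stable softmax
    a = e / e.sum()               # attention weight of each nodule
    return a @ vs                 # final target vector b, of size 1x64

q_m = np.ones(6)                  # stand-in for the key from the 7 clinical items
qs = np.ones((5, 6))              # search keys of the fused vectors h_1..h_5
vs = np.tile(np.arange(1.0, 6.0)[:, None], (1, 64))  # illustrative contents
b = final_target_vector(q_m, qs, vs)
print(b.shape)  # (64,)
```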
S207, calculating a cross entropy loss value based on the final target vector, and generating a pre-trained image processing model when the cross entropy loss value reaches the minimum.
In the embodiment of the present application, after the final target vector b is obtained, the cancer probability of the lung cancer patient is calculated according to the final target vector, then the cross entropy loss value is calculated according to the cancer probability of the lung cancer patient, and finally when the cross entropy loss value reaches the minimum, a pre-trained image processing model is generated, or when the cross entropy loss value does not reach the minimum, the network parameters of the image processing model are updated, and the step of obtaining the historical lung CT scan image sequence of the existing continuous multiple periods is continuously performed, so that the model training process is continuously performed. The calculation formula of the cross entropy loss value is as follows:
Loss = −Σ_i [ y_i·log(p_i) + (1 − y_i)·log(1 − p_i) ];

wherein y_i is the label of whether the patient suffers from cancer within one year, and p_i is the cancer probability of the lung cancer patient.

The cancer probability is finally calculated as: p = W_b·b, wherein W_b is a network parameter of the image processing model.
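Assuming the loss is the standard binary cross-entropy over patients (the exact summation and averaging convention is not spelled out in the text), it can be sketched as:

```python
import numpy as np

def cross_entropy_loss(y, p, eps=1e-7):
    """y: 0/1 labels of cancer within one year; p: predicted cancer probabilities.
    Probabilities are clipped away from 0 and 1 so the logs stay finite."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

y = np.array([1.0, 0.0, 1.0])
p = np.array([0.9, 0.2, 0.6])
print(round(cross_entropy_loss(y, p), 4))  # 0.2798
```

Training then minimizes this value, updating the network parameters and repeating the steps above until the loss reaches its minimum.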
Further, to enhance the interpretability of the model, seven items of clinical information with a greater impact on lung cancer are used and normalized. When the self-attention mechanism fuses the nodule features, the normalized clinical information is compared with each nodule to calculate the weights. Thus, nodules whose clinical attributes are similar receive a larger proportion of the attention, which can improve prediction accuracy and greatly enhances the medical interpretability of the model.
In the embodiment of the application, an image processing device firstly acquires a lung CT scanning image to be processed, then inputs the image into a pre-trained image processing model, and finally outputs a plurality of lung nodule parameter values corresponding to the lung CT scanning image; the image processing model is generated based on final target vector training, the final target vector is generated by feature fusion based on the existing clinical information and a plurality of predictive vectors, and the plurality of predictive vector features are generated by feature fusion based on shooting interval duration between feature vectors of each image in the existing historical lung CT scanning image sequence and adjacent images. According to the method and the device, the feature fusion is carried out on the plurality of prediction vectors generated according to the existing CT images in a plurality of continuous periods and the shooting interval time between the adjacent images, and then the feature fusion is carried out on the plurality of fused features and the vectors corresponding to the existing clinical information, so that the final feature value is richer, the training model accuracy is higher, and the accuracy of identifying the lung nodule parameters in the images is further improved.
The following are apparatus embodiments of the present invention, which may be used to perform the method embodiments of the present invention. For details not disclosed in the apparatus embodiments of the present invention, please refer to the method embodiments of the present invention.
Referring to fig. 7, a schematic diagram of an image processing apparatus according to an exemplary embodiment of the present invention is shown. The image processing apparatus may be implemented as all or part of the terminal by software, hardware or a combination of both. The apparatus 1 comprises an image acquisition module 10, an image input module 20, a cancer risk determination module 30.
An image acquisition module 10 for acquiring a lung CT scan image to be processed;
an image input module 20 for inputting the lung CT scan image into a pre-trained image processing model; the image processing model is generated based on final target vector training, the final target vector is generated by feature fusion based on the existing clinical information and a plurality of predictive vectors, and the plurality of predictive vector features are generated by feature fusion based on shooting interval duration between feature vectors of each image in the existing historical lung CT scanning image sequence and adjacent images;
the image output module 30 is configured to output a plurality of lung nodule parameter values corresponding to the lung CT scan image.
Optionally, as shown in fig. 8, the apparatus 1 further includes:
a model construction module 40 for constructing an image processing model; the image processing model comprises a lung nodule detection network, a feature extractor, a normalization layer and a full connection layer;
an image sequence acquisition module 50 for acquiring an existing historical lung CT scan image sequence for a plurality of consecutive time periods;
a combined feature vector sequence generating module 60, configured to generate a plurality of combined feature vector sequences according to the historical lung CT scan image sequence, the lung nodule detection network, the feature extractor, and the normalization layer;
a parameter calculation module 70, configured to calculate a first search key and a first information content of each feature vector in each of the combined feature vector sequences according to the full connection layer;
the feature fusion module 80 is configured to obtain a shooting interval duration of a historical lung CT scan image corresponding to each feature vector, and generate a plurality of predicted feature vectors after feature fusion according to a first search key, a first information content and a shooting interval duration of each feature vector in each combined feature vector sequence;
a final target vector generation module 90, configured to extract the existing clinical information, and generate a final target vector after feature fusion with the plurality of predicted feature vectors according to the clinical information;
The model generation module 100 is configured to calculate a cross entropy loss value based on the final target vector, and generate a pre-trained image processing model when the cross entropy loss value reaches a minimum.
Optionally, as shown in fig. 9, the combined feature vector sequence generating module 60 includes:
a target lung CT scan image determining unit 601, configured to determine an image with a smallest distance from a current time interval in the historical lung CT scan image sequence as a target lung CT scan image;
a plurality of lung nodule data output units 602, configured to input the target lung CT scan image into the lung nodule detection network, and output a plurality of lung nodule data corresponding to the target lung CT scan image;
a CT image dicing unit 603, configured to dice a plurality of CT image dicing sequences of a preset size from each of the historical lung CT scan images according to the plurality of lung nodule data;
a first feature vector output unit 604, configured to input each of the CT image slice sequences into the feature extractor, and output a plurality of first feature vectors of each of the CT image slice sequences;
a second feature vector output unit 605, configured to extract a positive center feature of each of the plurality of first feature vectors and generate a plurality of second feature vectors of each CT image slice sequence after inputting the positive center feature into the normalization layer;
The combined feature vector sequence generating unit 606 is configured to select and combine a plurality of feature vectors of the same lung nodule on different time sequences from the plurality of second feature vectors of each CT image slice sequence, so as to obtain a plurality of combined feature vector sequences.
Optionally, as shown in fig. 10, the feature fusion module 80 includes:
a first fusion feature generating unit 801, configured to fuse each feature vector in each combined feature vector sequence with the shooting interval duration corresponding to each feature vector, and generate a first fusion feature of each feature vector;
an attention weight calculation unit 802, configured to calculate an attention weight of each feature vector according to the first fused feature of each feature vector;
a predictive feature vector generating unit 803 is configured to multiply and sum the attention weight value of each feature vector and the first information content corresponding to the attention weight value of each feature vector, and generate a plurality of predictive feature vectors.
Optionally, as shown in fig. 11, the final target vector generating module 90 includes:
a key feature vector construction unit 901, configured to construct a key feature vector according to the existing clinical information;
A parameter calculating unit 902, configured to calculate a second search key and a second information content of each of the plurality of prediction feature vectors;
an attention weight calculation unit 903, configured to calculate an attention weight of each predicted feature vector based on the key feature vector and a second search key of each predicted feature vector;
a final target vector generating unit 904, configured to multiply and sum the attention weight of each prediction feature vector and the second information content corresponding to the attention weight of each prediction feature vector, to generate a final target vector.
It should be noted that, when the image processing apparatus provided in the foregoing embodiment executes the image processing method, the division into the foregoing functional modules is only used as an example; in practical application, the functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the image processing apparatus and the image processing method provided in the foregoing embodiments belong to the same concept; for the detailed implementation process, refer to the method embodiments, which are not described herein again.
The foregoing embodiment numbers of the present application are merely for description and do not represent the advantages or disadvantages of the embodiments.
In the embodiment of the application, an image processing device firstly acquires a lung CT scanning image to be processed, then inputs the image into a pre-trained image processing model, and finally outputs a plurality of lung nodule parameter values corresponding to the lung CT scanning image; the image processing model is generated based on final target vector training, the final target vector is generated by feature fusion based on the existing clinical information and a plurality of predictive vectors, and the plurality of predictive vector features are generated by feature fusion based on shooting interval duration between feature vectors of each image in the existing historical lung CT scanning image sequence and adjacent images. According to the method and the device, the feature fusion is carried out on the plurality of prediction vectors generated according to the existing CT images in a plurality of continuous periods and the shooting interval time between the adjacent images, and then the feature fusion is carried out on the plurality of fused features and the vectors corresponding to the existing clinical information, so that the final feature value is richer, the training model accuracy is higher, and the accuracy of identifying the lung nodule parameters in the images is further improved.
The present invention also provides a computer readable medium having stored thereon program instructions which, when executed by a processor, implement the image processing method provided by the above-described respective method embodiments.
The invention also provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the image processing method of the various method embodiments described above.
Referring to fig. 12, a schematic structural diagram of a terminal is provided in an embodiment of the present application. As shown in fig. 12, terminal 1000 can include: at least one processor 1001, at least one network interface 1004, a user interface 1003, a memory 1005, at least one communication bus 1002.
Wherein the communication bus 1002 is used to enable connected communication between these components.
The user interface 1003 may include a Display screen (Display) and a Camera (Camera), and the optional user interface 1003 may further include a standard wired interface and a wireless interface.
The network interface 1004 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
Wherein the processor 1001 may include one or more processing cores. The processor 1001 connects various parts within the entire electronic device 1000 using various interfaces and lines, and performs various functions of the electronic device 1000 and processes data by executing or executing instructions, programs, code sets, or instruction sets stored in the memory 1005, and invoking data stored in the memory 1005. Alternatively, the processor 1001 may be implemented in at least one hardware form of digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), programmable logic array (Programmable Logic Array, PLA). The processor 1001 may integrate one or a combination of several of a central processing unit (Central Processing Unit, CPU), an image processor (Graphics Processing Unit, GPU), and a modem, etc. The CPU mainly processes an operating system, a user interface, an application program and the like; the GPU is used for rendering and drawing the content required to be displayed by the display screen; the modem is used to handle wireless communications. It will be appreciated that the modem may not be integrated into the processor 1001 and may be implemented by a single chip.
The Memory 1005 may include a random access Memory (Random Access Memory, RAM) or a Read-Only Memory (Read-Only Memory). Optionally, the memory 1005 includes a non-transitory computer readable medium (non-transitory computer-readable storage medium). The memory 1005 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 1005 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the above-described respective method embodiments, etc.; the storage data area may store data or the like referred to in the above respective method embodiments. The memory 1005 may also optionally be at least one storage device located remotely from the processor 1001. As shown in fig. 12, an operating system, a network communication module, a user interface module, and an image processing application program may be included in the memory 1005, which is one type of computer storage medium.
In the terminal 1000 shown in fig. 12, a user interface 1003 is mainly used for providing an input interface for a user, and acquiring data input by the user; and the processor 1001 may be configured to call an image processing application program stored in the memory 1005, and specifically perform the following operations:
Acquiring a lung CT scanning image to be processed;
inputting the lung CT scanning image into a pre-trained image processing model; the image processing model is generated based on final target vector training, the final target vector is generated by feature fusion based on the existing clinical information and a plurality of predictive vectors, and the plurality of predictive vector features are generated by feature fusion based on shooting interval duration between feature vectors of each image in the existing historical lung CT scanning image sequence and adjacent images;
and outputting a plurality of lung nodule parameter values corresponding to the lung CT scan image.
In one embodiment, the processor 1001, when executing the generation of the pre-trained image processing model, specifically performs the following operations:
constructing an image processing model; the image processing model comprises a lung nodule detection network, a feature extractor, a normalization layer and a full connection layer;
acquiring an existing historical lung CT scanning image sequence of a plurality of continuous periods;
generating a plurality of combined feature vector sequences according to the historical lung CT scanning image sequences, the lung nodule detection network, the feature extractor and the normalization layer;
calculating a first search key and first information content of each feature vector in each combined feature vector sequence according to the full connection layer;
Acquiring shooting interval duration of a historical lung CT scanning image corresponding to each feature vector, and generating a plurality of prediction feature vectors after feature fusion according to a first search key, first information content and shooting interval duration of each feature vector in each combined feature vector sequence;
extracting the existing clinical information, and generating a final target vector after feature fusion is carried out on the clinical information and a plurality of predicted feature vectors;
a cross entropy loss value is calculated based on the final target vector and when the cross entropy loss value reaches a minimum, a pre-trained image processing model is generated.
In one embodiment, the processor 1001, when executing the generation of a plurality of combined feature vector sequences from the historical lung CT scan image sequence, the lung nodule detection network, the feature extractor, the normalization layer, specifically performs the following:
determining an image with the smallest interval from the current moment in the historical lung CT scanning image sequence as a target lung CT scanning image;
inputting the CT scan image of the target lung into a lung nodule detection network, and outputting a plurality of lung nodule data corresponding to the CT scan image of the target lung;
cutting a plurality of CT image cutting sequences with preset sizes from each historical lung CT scanning image according to the lung nodule data;
Inputting each CT image slicing sequence into a feature extractor, and outputting a plurality of first feature vectors of each CT image slicing sequence;
extracting the right center feature of each first feature vector in the plurality of first feature vectors, inputting the right center feature into the normalization layer, and generating a plurality of second feature vectors of each CT image cutting sequence;
and selecting and combining a plurality of feature vectors of the same lung nodule on different time sequences from a plurality of second feature vectors of each CT image cutting sequence to obtain a plurality of combined feature vector sequences.
In one embodiment, the processor 1001 specifically performs the following operations when performing feature fusion according to the first search key, the first information content, and the shooting interval duration of each feature vector in each combined feature vector sequence to generate a plurality of predicted feature vectors:
fusing each feature vector in each combined feature vector sequence with the shooting interval duration corresponding to each feature vector to generate a first fusion feature of each feature vector;
calculating the attention weight value of each feature vector according to the first fusion feature of each feature vector;
the attention weight value of each feature vector and the corresponding first information content are multiplied and summed to generate a plurality of predictive feature vectors.
In one embodiment, the processor 1001, when performing feature fusion with a plurality of predicted feature vectors according to clinical information to generate a final target vector, specifically performs the following operations:
constructing key feature vectors according to the existing clinical information;
calculating a second search key and second information content of each predictive feature vector in the plurality of predictive feature vectors;
calculating the attention weight of each predictive feature vector based on the key feature vector and the second search key of each predictive feature vector;
and multiplying and summing the attention weight of each prediction feature vector and the corresponding second information content to generate a final target vector.
In one embodiment, the processor 1001, when executing the generation of the pre-trained image processing model when the cross entropy loss value reaches a minimum, specifically performs the following operations:
when the cross entropy loss value does not reach the minimum, updating network parameters of the image processing model, and continuing to perform the step of acquiring the existing historical lung CT scan image sequence for a plurality of consecutive periods.
In the embodiment of the application, an image processing device firstly acquires a lung CT scanning image to be processed, then inputs the image into a pre-trained image processing model, and finally outputs a plurality of lung nodule parameter values corresponding to the lung CT scanning image; the image processing model is generated based on final target vector training, the final target vector is generated by feature fusion based on the existing clinical information and a plurality of predictive vectors, and the plurality of predictive vector features are generated by feature fusion based on shooting interval duration between feature vectors of each image in the existing historical lung CT scanning image sequence and adjacent images. According to the method and the device, the feature fusion is carried out on the plurality of prediction vectors generated according to the existing CT images in a plurality of continuous periods and the shooting interval time between the adjacent images, and then the feature fusion is carried out on the plurality of fused features and the vectors corresponding to the existing clinical information, so that the final feature value is richer, the training model accuracy is higher, and the accuracy of identifying the lung nodule parameters in the images is further improved.
Those skilled in the art will appreciate that all or part of the processes in the above-described embodiment methods may be implemented by a computer program instructing related hardware; the image processing program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, or the like.
The foregoing disclosure is only illustrative of the preferred embodiments of the present application and is not intended to limit the scope of the claims; equivalent changes made according to the claims of the present application shall still fall within the scope of the claims.

Claims (10)

1. An image processing method, the method comprising:
acquiring a lung CT scanning image to be processed;
inputting the lung CT scanning image into a pre-trained image processing model; the image processing model is generated based on final target vector training, the final target vector is generated based on the existing clinical information and a plurality of prediction vectors, and the plurality of prediction vector features are generated based on feature fusion between feature vectors of each image in the existing historical lung CT scanning image sequence and shooting interval time between adjacent images;
And outputting a plurality of lung nodule parameter values corresponding to the lung CT scan image.
2. The method of claim 1, wherein generating the pre-trained image processing model comprises:
constructing an image processing model; the image processing model comprises a lung nodule detection network, a feature extractor, a normalization layer and a full connection layer;
acquiring an existing historical lung CT scanning image sequence of a plurality of continuous periods;
generating a plurality of combined feature vector sequences according to the historical lung CT scanning image sequence, a lung nodule detection network, a feature extractor and a normalization layer;
calculating a first search key and first information content of each feature vector in each combined feature vector sequence according to the full connection layer;
acquiring shooting interval time of a historical lung CT scanning image corresponding to each feature vector, and generating a plurality of predicted feature vectors after feature fusion according to a first search key, first information content and shooting interval time of each feature vector in each combined feature vector sequence;
extracting the existing clinical information, and generating a final target vector after feature fusion is carried out on the clinical information and the plurality of predicted feature vectors;
And calculating a cross entropy loss value based on the final target vector, and generating a pre-trained image processing model when the cross entropy loss value reaches the minimum.
3. The method of claim 2, wherein the generating a plurality of combined feature vector sequences from the historical lung CT scan image sequence, the lung nodule detection network, the feature extractor and the normalization layer comprises:
determining the image in the historical lung CT scan image sequence whose acquisition time is closest to the current moment as a target lung CT scan image;
inputting the target lung CT scan image into the lung nodule detection network, and outputting a plurality of lung nodule data corresponding to the target lung CT scan image;
cropping a plurality of CT image patch sequences of preset size from each of the historical lung CT scan images according to the lung nodule data;
inputting each CT image patch sequence into the feature extractor, and outputting a plurality of first feature vectors of each CT image patch sequence;
extracting the central feature of each of the plurality of first feature vectors, inputting the central feature into the normalization layer, and generating a plurality of second feature vectors of each CT image patch sequence;
and selecting, from the plurality of second feature vectors of each CT image patch sequence, a plurality of feature vectors of the same lung nodule at different points in the time sequence, and combining them to obtain a plurality of combined feature vector sequences.
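One way to realize the patch-cropping step of claim 3 is to cut a fixed-size cube around each detected nodule centre; the zero-padding strategy and the 32-voxel default below are assumptions for illustration, not details from the claim:

```python
import numpy as np

def crop_patch(volume: np.ndarray, center, size: int = 32) -> np.ndarray:
    """Crop a size**3 cube centred on a detected lung-nodule position,
    zero-padding the volume so border nodules still yield full cubes."""
    half = size // 2
    padded = np.pad(volume, half)  # pad equally on all six faces
    z, y, x = (int(c) + half for c in center)
    return padded[z - half:z + half, y - half:y + half, x - half:x + half]

# Toy 4x4x4 "CT volume" and a 2-voxel patch around voxel (2, 2, 2).
volume = np.arange(4 ** 3, dtype=float).reshape(4, 4, 4)
patch = crop_patch(volume, center=(2, 2, 2), size=2)
```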
4. The method of claim 3, wherein the generating a plurality of prediction feature vectors after feature fusion according to the first search key, the first information content and the acquisition interval of each feature vector in each of the combined feature vector sequences comprises:
fusing each feature vector in each combined feature vector sequence with the acquisition interval corresponding to that feature vector to generate a first fusion feature of each feature vector;
calculating an attention weight value of each feature vector according to the first fusion feature of each feature vector;
and multiplying the attention weight value of each feature vector by its corresponding first information content and summing the products to generate the plurality of prediction feature vectors.
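The three fusion steps of claim 4 can be sketched as a simple time-aware attention; the concatenation-based fusion, the softmax normalization, and all weight shapes below are illustrative assumptions, not the patented implementation:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_feature_vector(features: np.ndarray,   # (n, d) per-scan feature vectors
                           intervals: np.ndarray,  # (n,) acquisition intervals
                           contents: np.ndarray,   # (n, dv) first information contents
                           w_attn: np.ndarray      # (d + 1,) scoring weights
                           ) -> np.ndarray:
    # Step 1: fuse each feature vector with its acquisition interval
    # (here by concatenation, one possible fusion).
    fused = np.concatenate([features, intervals[:, None]], axis=1)
    # Step 2: attention weight of each feature from its first fusion feature.
    weights = softmax(fused @ w_attn)
    # Step 3: attention-weighted sum of the information contents.
    return weights @ contents

rng = np.random.default_rng(0)
pred = predict_feature_vector(rng.normal(size=(3, 4)),
                              np.array([90.0, 180.0, 0.0]),
                              rng.normal(size=(3, 8)),
                              rng.normal(size=5))
```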
5. The method of claim 4, wherein the generating a final target vector after performing feature fusion on the clinical information and the plurality of prediction feature vectors comprises:
constructing a key feature vector according to the existing clinical information;
calculating a second search key and a second information content of each of the plurality of prediction feature vectors;
calculating an attention weight of each prediction feature vector based on the key feature vector and the second search key of each prediction feature vector;
and multiplying the attention weight of each prediction feature vector by its corresponding second information content and summing the products to generate the final target vector.
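Claim 5's clinical-information fusion is a second attention pass; the sketch below assumes the second search keys and second information contents come from plain linear projections of the prediction vectors, which is one choice among many:

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())
    return e / e.sum()

def final_target_vector(clinical_key: np.ndarray,  # (dk,) key from clinical info
                        preds: np.ndarray,         # (n, d) prediction feature vectors
                        w_key: np.ndarray,         # (dk, d) second-search-key projection
                        w_value: np.ndarray        # (dv, d) second-content projection
                        ) -> np.ndarray:
    keys = preds @ w_key.T                  # second search keys, (n, dk)
    values = preds @ w_value.T              # second information contents, (n, dv)
    weights = softmax(keys @ clinical_key)  # attention weight per prediction vector
    return weights @ values                 # weighted sum -> final target vector

rng = np.random.default_rng(1)
target = final_target_vector(rng.normal(size=6),
                             rng.normal(size=(4, 10)),
                             rng.normal(size=(6, 10)),
                             rng.normal(size=(8, 10)))
```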
6. The method of claim 2, wherein the generating a pre-trained image processing model when the cross entropy loss value reaches its minimum comprises:
when the cross entropy loss value has not yet reached its minimum, updating the network parameters of the image processing model, and returning to the step of acquiring an existing historical lung CT scan image sequence over a plurality of consecutive periods.
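The stopping rule of claim 6 (keep updating parameters until the cross entropy loss no longer decreases) can be sketched as a plain training loop; the tolerance value and the toy loss schedule are illustrative assumptions:

```python
def train_until_minimum(step_fn, max_epochs: int = 100, tol: float = 1e-6):
    """Run step_fn (one parameter-update pass returning the cross-entropy
    loss) until the loss stops improving by more than tol."""
    best = float("inf")
    for epoch in range(max_epochs):
        loss = step_fn(epoch)
        if best - loss < tol:  # no further decrease: minimum reached
            return epoch, best
        best = loss
    return max_epochs, best

# Toy loss schedule that plateaus at 0.2 after a few updates.
epochs_run, final_loss = train_until_minimum(lambda e: max(0.2, 1.0 - 0.3 * e))
```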
7. An image processing apparatus, characterized in that the apparatus comprises:
the image acquisition module is used for acquiring a lung CT scan image to be processed;
the image input module is used for inputting the lung CT scan image into a pre-trained image processing model; the image processing model is trained based on a final target vector, the final target vector is generated based on existing clinical information and a plurality of prediction feature vectors, and the plurality of prediction feature vectors are generated by feature fusion between the feature vectors of each image in an existing historical lung CT scan image sequence and the acquisition intervals between adjacent images;
and the image output module is used for outputting a plurality of lung nodule parameter values corresponding to the lung CT scan image.
8. The apparatus of claim 7, wherein the apparatus further comprises:
the model construction module is used for constructing an image processing model; the image processing model comprises a lung nodule detection network, a feature extractor, a normalization layer and a fully connected layer;
the image sequence acquisition module is used for acquiring an existing historical lung CT scan image sequence over a plurality of consecutive periods;
the combined feature vector sequence generation module is used for generating a plurality of combined feature vector sequences according to the historical lung CT scan image sequence, the lung nodule detection network, the feature extractor and the normalization layer;
the parameter calculation module is used for calculating a first search key and a first information content of each feature vector in each combined feature vector sequence according to the fully connected layer;
the feature fusion module is used for acquiring the acquisition interval of the historical lung CT scan image corresponding to each feature vector, and generating a plurality of prediction feature vectors after feature fusion according to the first search key, the first information content and the acquisition interval of each feature vector in each combined feature vector sequence;
the final target vector generation module is used for extracting the existing clinical information, and generating a final target vector after performing feature fusion on the clinical information and the plurality of prediction feature vectors;
and the model generation module is used for calculating a cross entropy loss value based on the final target vector, and generating a pre-trained image processing model when the cross entropy loss value reaches its minimum.
9. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the method steps of any of claims 1-6.
10. A terminal, comprising: a processor and a memory; wherein the memory stores a computer program adapted to be loaded by the processor and to perform the method steps of any of claims 1-6.
CN202111319114.9A 2021-11-09 2021-11-09 Image processing method, device, storage medium and terminal Pending CN116128784A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111319114.9A CN116128784A (en) 2021-11-09 2021-11-09 Image processing method, device, storage medium and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111319114.9A CN116128784A (en) 2021-11-09 2021-11-09 Image processing method, device, storage medium and terminal

Publications (1)

Publication Number Publication Date
CN116128784A true CN116128784A (en) 2023-05-16

Family

ID=86308560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111319114.9A Pending CN116128784A (en) 2021-11-09 2021-11-09 Image processing method, device, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN116128784A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117763182A (en) * 2023-11-14 2024-03-26 清华大学深圳国际研究生院 Ship image retrieval method, device, equipment and storage medium
CN117763182B (en) * 2023-11-14 2024-09-06 清华大学深圳国际研究生院 Ship image retrieval method, device, equipment and storage medium
CN117503205A (en) * 2024-01-05 2024-02-06 广州索诺康医疗科技有限公司 Space composite imaging method, system, medium and terminal based on palm super equipment
CN117503205B (en) * 2024-01-05 2024-04-09 广州索诺康医疗科技有限公司 Space composite imaging method, system, medium and terminal based on palm super equipment

Similar Documents

Publication Publication Date Title
TWI832966B (en) Method and apparatus for automated target and tissue segmentation using multi-modal imaging and ensemble machine learning models
WO2020211293A1 (en) Image segmentation method and apparatus, electronic device and storage medium
US10726295B2 (en) Control method and recording medium
CN112541928A (en) Network training method and device, image segmentation method and device and electronic equipment
CN112399816A (en) Information processing apparatus and model generation method
WO2019215604A1 (en) Systems and methods for detecting an indication of a visual finding type in an anatomical image
CN109074869B (en) Medical diagnosis support device, information processing method, and medical diagnosis support system
CN107274402A (en) A kind of Lung neoplasm automatic testing method and system based on chest CT image
US11875898B2 (en) Automatic condition diagnosis using an attention-guided framework
CN116128784A (en) Image processing method, device, storage medium and terminal
KR102750145B1 (en) Method and apparatus for distinguishing lesion
CN113164142B (en) Similarity determination device, method, and program
US20230126877A1 (en) Synthetic data generation and annotation for tumor diagnostics and treatment
Pham et al. Chest x-rays abnormalities localization and classification using an ensemble framework of deep convolutional neural networks
CN113963198B (en) Hyperspectral image classification method, hyperspectral image classification device, storage medium and terminal
US20250086787A1 (en) Liver cirrhosis detection in digital images
CN110992312A (en) Medical image processing method, device, storage medium and computer equipment
CN117495693A (en) Image fusion method, system, medium and electronic device for endoscope
US11776121B2 (en) Method and apparatus for providing information needed for diagnosis of lymph node metastasis of thyroid cancer
CN116521915A (en) Retrieval method, system, equipment and medium for similar medical images
US12277691B2 (en) Method and system for outputting a slice corresponding to a target item for 3D image including a plurality of slices
CN117059263A (en) Method and system for determining occurrence probability of pulmonary artery high pressure based on double-view chest radiography
CN108875814A (en) Picture retrieval method, device and electronic equipment
CN115705932A (en) Lung nodule follow-up visit prediction method, device, equipment and storage medium
KR101455686B1 (en) Inspection location extraction method of three-dimensional ultrasound.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination