
CN116797828A - Method and device for processing oral full-view film and readable storage medium - Google Patents

Method and device for processing oral full-view film and readable storage medium Download PDF

Info

Publication number
CN116797828A
Authority
CN
China
Prior art keywords
network
oral
full
oral cavity
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310700778.2A
Other languages
Chinese (zh)
Inventor
韩婧
刘剑楠
翟广涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ninth Peoples Hospital Shanghai Jiaotong University School of Medicine
Original Assignee
Ninth Peoples Hospital Shanghai Jiaotong University School of Medicine
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ninth Peoples Hospital Shanghai Jiaotong University School of Medicine filed Critical Ninth Peoples Hospital Shanghai Jiaotong University School of Medicine
Priority to CN202310700778.2A priority Critical patent/CN116797828A/en
Publication of CN116797828A publication Critical patent/CN116797828A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a method and a device for processing oral panoramic films, and a readable storage medium, in the technical field of image processing. The method comprises the following steps: acquiring oral panoramic films, classifying and annotating them, and establishing an oral panoramic film database; acquiring an original oral panoramic film and reconstructing it to obtain a high-resolution image; and constructing and training a recognition network and a classification network, then recognizing the high-resolution image with the trained networks to obtain a recognition result. The invention extracts multi-level image features with a convolutional neural network from deep learning, maps global features to a high-resolution image with a multi-layer perceptron, and designs a classification network based on DenseNet with an attention-mechanism module together with a recognition network based on YOLO-X, enabling more accurate recognition and classification of oral panoramic films.

Description

Method and device for processing oral full-view film and readable storage medium
Technical Field
The invention relates to the technical field of image processing, and in particular to a method and a device for processing oral panoramic films, and a readable storage medium.
Background
Currently, the jawbone, located in the middle and lower thirds of the face, is an important tissue for maintaining facial form and function. The oral panoramic film is the most common, economical and convenient imaging examination in dentistry. Medical image classification based on deep learning methods is the current approach to computer-aided processing and analysis of large-scale medical image data, and artificial intelligence research in the oral field has already achieved certain results.
However, recognition of oral panoramic films is still performed mainly by hand; it is easily influenced by personal experience, which reduces recognition efficiency and accuracy.
Therefore, how to provide an oral panoramic film processing method capable of solving the above problems is an issue that persons skilled in the art need to address.
Chinese patent application CN113516639A provides a method for training a detection model based on oral panoramic X-ray films. During model training, no resolution enhancement is applied to the input image, so the resolution of the input image cannot be guaranteed; in the subsequent training process, confidence and loss parameters must be calculated, the overall processing is complex, and the training accuracy of the model cannot be guaranteed.
Chinese patent application CN113888535A discloses a method and a system for recognizing the impaction type of wisdom teeth. The processing mainly comprises: processing an acquired oral periapical image with a trained wisdom-tooth segmentation model to obtain a segmented image; cropping the wisdom-tooth region of the segmented image to obtain a wisdom-tooth region image; and processing the wisdom-tooth region image with a trained impaction-type recognition model. During training of the segmentation model, feature-enhancement preprocessing is applied only to the tooth region of the image; in the subsequent training of the recognition network no processing is applied to the image, and no resolution enhancement is applied to the input image, which reduces the training accuracy of the model. Moreover, the structures of the segmentation model and the recognition model are not improved, so the recognition accuracy is low.
Disclosure of Invention
The present invention aims to overcome the above defects of the prior art by providing a method and an apparatus for processing oral panoramic films, and a readable storage medium.
The aim of the invention can be achieved by the following technical scheme:
As a first aspect of the present invention, there is provided a method for processing oral panoramic films, comprising the following steps:
acquiring oral panoramic films, classifying and annotating them, and establishing an oral panoramic film database;
acquiring an original oral panoramic film and reconstructing it to obtain a high-resolution image;
constructing and training a recognition network and a classification network;
and recognizing the high-resolution image with the trained recognition network and classification network to obtain a recognition result.
Preferably, the process of reconstructing the original oral panoramic film comprises the following steps:
downsampling the original oral panoramic film;
establishing a reconstruction network, and extracting high-dimensional features and corresponding coordinates from the downsampled original oral panoramic film with the reconstruction network;
upsampling the high-dimensional features and combining them with the coordinates to form a feature matrix;
and processing the feature matrix with the reconstruction network to obtain a high-resolution image.
Preferably, the reconstruction network comprises an encoder and a decoder connected in sequence; the encoder uses a convolutional neural network to extract multi-level features of the image, and the decoder takes the form of an implicit function, using a multi-layer perceptron to map global features to a high-resolution image.
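As an illustration of this encoder-decoder scheme, the sketch below shows the implicit-function idea in plain NumPy: encoder features are upsampled to the target grid, concatenated with normalized pixel coordinates, and passed through a small MLP that predicts an intensity at each queried coordinate. The feature dimensions, the nearest-neighbour upsampling, and the random, untrained weights are illustrative assumptions, not the actual network of the invention.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_coords(h, w):
    """Normalized (x, y) coordinates in [-1, 1] for every pixel of an h x w grid."""
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    return np.stack([xs, ys], axis=-1)               # (h, w, 2)

def upsample_nearest(feat, h, w):
    """Nearest-neighbour upsampling of an (h0, w0, c) feature map to (h, w, c)."""
    h0, w0, _ = feat.shape
    ri = np.arange(h) * h0 // h
    ci = np.arange(w) * w0 // w
    return feat[ri][:, ci]

def mlp_decode(feat_up, coords, w1, b1, w2, b2):
    """Tiny MLP mapping [feature ; coordinate] -> pixel intensity."""
    x = np.concatenate([feat_up, coords], axis=-1)   # (h, w, c+2) feature matrix
    hidden = np.maximum(x @ w1 + b1, 0.0)            # ReLU hidden layer
    return hidden @ w2 + b2                          # (h, w, 1) predicted intensities

# Toy "encoder output": an 8x8 map of 16-dim features from a downsampled panoramic film
feat = rng.normal(size=(8, 8, 16))
H, W = 32, 32                                        # target high-resolution grid
coords = make_coords(H, W)
feat_up = upsample_nearest(feat, H, W)

# Random placeholder MLP weights (a real decoder is trained end-to-end with the encoder)
w1 = rng.normal(size=(16 + 2, 64)); b1 = np.zeros(64)
w2 = rng.normal(size=(64, 1));      b2 = np.zeros(1)

hr = mlp_decode(feat_up, coords, w1, b1, w2, b2)
print(hr.shape)                                      # (32, 32, 1)
```

In a trained system the encoder and MLP would be optimized jointly, so that querying the MLP on a denser coordinate grid yields the super-resolved panoramic image.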
Preferably, the construction and training of the recognition network and the classification network, and the recognition of the high-resolution image with the trained networks to obtain a recognition result, comprise the following steps:
dividing the images contained in the oral panoramic film database into a training set and a test set;
constructing the recognition network and the classification network, training them with the training set, and saving them once the loss value has decreased to a stable level;
and testing the recognition network and the classification network with the test set, then recognizing the high-resolution image with the tested networks.
Preferably, in training the classification network, regions of interest (ROIs) are first extracted from the training set and the test set, and the classification network is trained and tested with these ROIs.
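As a minimal sketch of this ROI step, the snippet below crops a labelled region of interest out of a panoramic image so that it can be fed to the classification network. The (x1, y1, x2, y2) box format and the border clamping are illustrative assumptions.

```python
import numpy as np

def crop_roi(image, box):
    """Crop an (x1, y1, x2, y2) bounding box out of an (H, W) grayscale image,
    clamping the box to the image borders."""
    h, w = image.shape[:2]
    x1, y1, x2, y2 = box
    x1, x2 = max(0, x1), min(w, x2)
    y1, y2 = max(0, y1), min(h, y2)
    return image[y1:y2, x1:x2]

panorama = np.zeros((1000, 2000), dtype=np.uint8)   # placeholder panoramic film
roi = crop_roi(panorama, (150, 300, 650, 700))      # roughly marked lesion region
print(roi.shape)                                    # (400, 500)
```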
Preferably, the recognition network is a YOLO-X network, and the classification network comprises a DenseNet and an attention-mechanism module connected in sequence.
As a second aspect of the present invention, there is provided a processing apparatus using any of the above methods for processing oral panoramic films, comprising:
a construction module for acquiring oral panoramic films, classifying and annotating them, and establishing an oral panoramic film database;
a reconstruction module, comprising a reconstruction network, for acquiring an original oral panoramic film and reconstructing it to obtain a high-resolution image;
and a recognition module, comprising a recognition network and a classification network, for constructing and training the two networks and recognizing the high-resolution image with the trained networks to obtain a recognition result.
Preferably, the reconstruction network comprises an encoder and a decoder connected in sequence;
the encoder comprises a convolutional neural network for extracting multi-level features of an image;
and the decoder takes the form of an implicit function and comprises a multi-layer perceptron for mapping global features to a high-resolution image.
Preferably, the recognition network is a YOLO-X network, and the classification network comprises a DenseNet and an attention-mechanism module connected in sequence.
As a third aspect of the present invention, there is provided a computer-readable storage medium having a computer program stored thereon which, when executed by a processor, implements the oral panoramic film processing method according to any one of claims 1 to 6.
Compared with the prior art, the invention has the following beneficial effects:
the invention discloses a method, a device and a storage medium for processing full-view oral cavity pictures, which are based on a convolutional neural network in deep learning and design a classification network to correctly identify and classify common jaw diseases in normal full-view oral cavity pictures.
1) The super-resolution reconstruction method is used for improving the problem that the diagnosis is affected due to low resolution in remote transmission and analysis of the oral cavity full-view film, and realizing efficient transmission and identification. In the established reconstruction network, the encoder adopts a convolutional neural network to extract the multi-level characteristics of the image; the decoder takes the form of a hidden function. A multi-layer perceptron MLP is used to map global features into high resolution images. Compared with a digital image represented in a discrete pixel form, the hidden function can establish a mapping relation between coordinates and a high-resolution image through a continuous function (MLP) by means of the advantage of coordinate approximation continuity, so that a more fine and high-fidelity high-resolution image can be reconstructed.
2) And analyzing and processing the full-view image of the oral cavity based on a convolutional neural network in deep learning, automatically identifying and marking the tissue structure, and qualitatively analyzing the focus to realize intelligent classification. The identification network adopts a YOLO-X network, and positions tumors in a rectangular frame in the form of candidate frames, so that position information is provided for a subsequent intelligent classification algorithm; the classification network comprises DenseNet and an attention mechanism module which are connected in sequence, and the dense connection structure of the DenseNet can solve the degradation problem caused by the increase of the network depth, and meanwhile, the dense connection also has the advantages of enhancing the image signal transmission and the image characteristic reuse, thereby being beneficial to constructing a deeper intelligent tumor classification network. The attention mechanism module consists of two parts, channel attention and spatial attention, which can show interdependencies between modeled network channels and between network spaces. The network is applied to the processing of the oral cavity full-view film, and can adaptively calibrate the channel and the space weight, thereby being beneficial to distinguishing the characteristics contained in the image.
In summary, the invention realizes the preliminary screening and classification of the jawbone diseases by applying the deep convolution network in the oral cavity full-scene analysis, reduces the workload of doctors, and is beneficial to and relieves the problems of the continuous increase of the oral cavity medical field, the medical resource shortage and the maldistribution.
Drawings
FIG. 1 is a general flow chart of the method for processing oral panoramic films according to the present invention;
FIG. 2 is a schematic block diagram of the reconstruction network according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the structure of the recognition network and the classification network according to an embodiment of the present invention;
FIG. 4 is a schematic block diagram of the apparatus for processing oral panoramic films according to the present invention.
Detailed Description
The invention will now be described in detail with reference to the drawings and specific examples. The present embodiment is implemented on the premise of the technical scheme of the present invention, and a detailed implementation manner and a specific operation process are given, but the protection scope of the present invention is not limited to the following examples.
Example 1
Referring to FIG. 1, an embodiment of the invention discloses a method for processing oral panoramic films, comprising the following steps:
S1, acquiring oral panoramic films, classifying and annotating them, and establishing an oral panoramic film database;
S2, acquiring an original oral panoramic film and reconstructing it to obtain a high-resolution image;
and S3, constructing and training a recognition network and a classification network, and recognizing the high-resolution image with the trained networks to obtain a recognition result.
Specifically, the process of establishing the oral panoramic film database comprises:
For jawbone diseases, cases of four common conditions, namely osteomyelitis, cysts, benign tumors and malignant tumors, are selected from the pathological diagnosis system: about 500-600 first-diagnosis cases, excluding recurrent cases after treatment. About 500-600 normal panoramic films with relatively complete dentition are selected from the imaging system, from young patients undergoing orthodontic or orthognathic treatment.
Oral panoramic X-ray films are retrieved from the imaging system according to pathological category, selecting pre-treatment rather than post-treatment panoramic films. Panoramic films of the different disease categories are grouped and annotated, with the position and extent of the disease roughly marked on each film to facilitate subsequent analysis.
In a specific embodiment, the process of reconstructing the original oral panoramic film comprises:
downsampling the original oral panoramic film;
establishing a reconstruction network, and extracting high-dimensional features and corresponding coordinates from the downsampled original oral panoramic film with the reconstruction network;
upsampling the high-dimensional features and combining them with the coordinates to form a feature matrix;
and processing the feature matrix with the reconstruction network to obtain a high-resolution image.
Referring to FIG. 2, in a specific embodiment, the reconstruction network comprises an encoder and a decoder connected in sequence. The encoder uses a convolutional neural network to extract multi-level features of the image, and the decoder takes the form of an implicit function: a multi-layer perceptron (MLP) maps the global features (coordinates plus multi-level features) to a high-resolution image. Compared with a digital image represented as discrete pixels, the implicit function exploits the approximate continuity of coordinates to establish, through a continuous function (the MLP), a mapping between coordinates and the high-resolution image, so that when the network is applied to oral panoramic film processing, a finer, higher-fidelity high-resolution image can be reconstructed.
In a specific embodiment, the process of constructing and training the recognition network and the classification network, and recognizing the high-resolution image with the trained networks to obtain a recognition result, comprises:
dividing the images contained in the oral panoramic film database into a training set and a test set; for example, the data set may comprise 2500 images of osteomyelitis, cysts, benign tumors and malignant tumors, divided into a training set and a test set at a ratio of 7:3, with the four categories evenly distributed across the two sets;
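The 7:3 division with the four disease categories evenly represented in both sets amounts to a per-class shuffle-and-split, sketched below. The file names and the per-class count of 625 (4 × 625 = 2500) are made-up placeholders for illustration.

```python
import random

def stratified_split(items_by_class, seed=42):
    """Split each class independently at a 7:3 ratio so the ratio holds per class."""
    rng = random.Random(seed)
    train, test = [], []
    for label, items in items_by_class.items():
        items = list(items)
        rng.shuffle(items)
        cut = len(items) * 7 // 10          # 70% of each class goes to training
        train += [(p, label) for p in items[:cut]]
        test  += [(p, label) for p in items[cut:]]
    return train, test

classes = ["osteomyelitis", "cyst", "benign_tumor", "malignant_tumor"]
dataset = {c: [f"{c}_{i:04d}.png" for i in range(625)] for c in classes}
train, test = stratified_split(dataset)
print(len(train), len(test))                # 1748 752
```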
constructing the recognition network and the classification network, training them with the training set, and saving them once the loss value has decreased to a stable level;
and testing the recognition network and the classification network with the test set, then recognizing the high-resolution image with the tested networks.
In a specific embodiment, during training of the classification network, regions of interest (Region of Interest, ROI) are first extracted from the training set and the test set, and the classification network is trained and tested with these ROIs.
Referring to FIG. 3, in a specific embodiment, the recognition network is a YOLO-X network that localizes tumors with rectangular candidate boxes, providing position information for the subsequent intelligent classification algorithm. The classification network comprises a DenseNet and an attention-mechanism module connected in sequence: the dense connections of DenseNet mitigate the degradation caused by increasing network depth and also strengthen the propagation of image signals and the reuse of image features, which helps build a deeper intelligent tumor-classification network. The attention-mechanism module consists of channel attention and spatial attention, which explicitly model the interdependencies between network channels and between spatial positions; applied to oral panoramic film processing, the network adaptively recalibrates channel and spatial weights, helping to distinguish the features contained in the image.
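The channel-plus-spatial attention described here (in the spirit of CBAM) can be sketched in NumPy as follows. The random weights and the simplified spatial branch (a real module would typically apply a 7×7 convolution) are illustrative assumptions, not the actual module of the invention.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """x: (C, H, W). Squeeze spatially with avg and max pooling, pass both through
    a shared two-layer MLP, and produce one weight per channel."""
    avg = x.mean(axis=(1, 2))                 # (C,)
    mx  = x.max(axis=(1, 2))                  # (C,)
    mlp = lambda v: np.maximum(v @ w1, 0.0) @ w2
    return sigmoid(mlp(avg) + mlp(mx))        # (C,) weights in (0, 1)

def spatial_attention(x):
    """x: (C, H, W). Pool along the channel axis and squash to an (H, W) map.
    (Simplified: the avg and max maps are averaged instead of convolved.)"""
    avg = x.mean(axis=0)
    mx  = x.max(axis=0)
    return sigmoid((avg + mx) / 2.0)          # (H, W) weights in (0, 1)

def cbam(x, w1, w2):
    x = x * channel_attention(x, w1, w2)[:, None, None]   # recalibrate channels
    return x * spatial_attention(x)[None, :, :]           # recalibrate positions

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
feat = rng.normal(size=(C, H, W))             # placeholder DenseNet feature map
w1 = rng.normal(size=(C, C // 2))             # channel-reduction MLP weights
w2 = rng.normal(size=(C // 2, C))
out = cbam(feat, w1, w2)
print(out.shape)                              # (8, 16, 16)
```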
Example 2
Referring to FIG. 4, an embodiment of the invention further provides a processing apparatus using the method for processing oral panoramic films according to any of the above embodiments, comprising:
a construction module for acquiring oral panoramic films, classifying and annotating them, and establishing an oral panoramic film database;
a reconstruction module for acquiring an original oral panoramic film and reconstructing it to obtain a high-resolution image; the reconstruction module comprises a reconstruction network, which comprises an encoder and a decoder connected in sequence: the encoder comprises a convolutional neural network for extracting multi-level features of the image, and the decoder takes the form of an implicit function and comprises a multi-layer perceptron for mapping global features to a high-resolution image;
and a recognition module for constructing and training a recognition network and a classification network, and recognizing the high-resolution image with the trained networks to obtain a recognition result. The recognition network is a YOLO-X network, and the classification network comprises a DenseNet and an attention-mechanism module connected in sequence.
Example 3
An embodiment of the invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the method for processing oral panoramic films according to any of the above embodiments.
The foregoing describes preferred embodiments of the invention in detail. It should be understood that a person of ordinary skill in the art can make numerous modifications and variations according to the concept of the invention without creative effort. Therefore, all technical solutions that a person skilled in the art can obtain through logical analysis, reasoning or limited experiments based on the prior art and in accordance with the inventive concept shall fall within the scope of protection defined by the claims.

Claims (10)

1. A method for processing oral panoramic films, characterized by comprising the following steps:
acquiring oral panoramic films, classifying and annotating them, and establishing an oral panoramic film database;
acquiring an original oral panoramic film and reconstructing it to obtain a high-resolution image;
constructing and training a recognition network and a classification network;
and recognizing the high-resolution image with the trained recognition network and classification network to obtain a recognition result.
2. The method according to claim 1, characterized in that reconstructing the original oral panoramic film comprises:
downsampling the original oral panoramic film;
establishing a reconstruction network, and extracting high-dimensional features and corresponding coordinates from the downsampled original oral panoramic film with the reconstruction network;
upsampling the high-dimensional features and combining them with the coordinates to form a feature matrix;
and processing the feature matrix with the reconstruction network to obtain a high-resolution image.
3. The method according to claim 2, characterized in that the reconstruction network comprises an encoder and a decoder connected in sequence; the encoder uses a convolutional neural network to extract multi-level features of the image, and the decoder takes the form of an implicit function, using a multi-layer perceptron to map global features to a high-resolution image.
4. The method according to claim 1, characterized in that constructing and training the recognition network and the classification network, and recognizing the high-resolution image with the trained networks to obtain a recognition result, comprise:
dividing the images contained in the oral panoramic film database into a training set and a test set;
constructing the recognition network and the classification network, training them with the training set, and saving them once the loss value has decreased to a stable level;
and testing the recognition network and the classification network with the test set, then recognizing the high-resolution image with the tested networks.
5. The method according to claim 4, characterized in that, in training the classification network, regions of interest (ROIs) are first extracted from the training set and the test set, and the classification network is trained and tested with these ROIs.
6. The method according to claim 4, characterized in that the recognition network is a YOLO-X network, and the classification network comprises a DenseNet and an attention-mechanism module connected in sequence.
7. A processing apparatus using the method for processing oral panoramic films according to any one of claims 1 to 6, characterized by comprising:
a construction module for acquiring oral panoramic films, classifying and annotating them, and establishing an oral panoramic film database;
a reconstruction module, comprising a reconstruction network, for acquiring an original oral panoramic film and reconstructing it to obtain a high-resolution image;
and a recognition module, comprising a recognition network and a classification network, for constructing and training the two networks and recognizing the high-resolution image with the trained networks to obtain a recognition result.
8. The processing apparatus according to claim 7, characterized in that the reconstruction network comprises an encoder and a decoder connected in sequence;
the encoder comprises a convolutional neural network for extracting multi-level features of an image;
and the decoder takes the form of an implicit function and comprises a multi-layer perceptron for mapping global features to a high-resolution image.
9. The processing apparatus according to claim 7, characterized in that the recognition network is a YOLO-X network and the classification network comprises a DenseNet and an attention-mechanism module connected in sequence.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method for processing oral panoramic films according to any one of claims 1 to 6.
CN202310700778.2A 2023-06-14 2023-06-14 Method and device for processing oral full-view film and readable storage medium Pending CN116797828A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310700778.2A CN116797828A (en) 2023-06-14 2023-06-14 Method and device for processing oral full-view film and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310700778.2A CN116797828A (en) 2023-06-14 2023-06-14 Method and device for processing oral full-view film and readable storage medium

Publications (1)

Publication Number Publication Date
CN116797828A true CN116797828A (en) 2023-09-22

Family

ID=88046362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310700778.2A Pending CN116797828A (en) 2023-06-14 2023-06-14 Method and device for processing oral full-view film and readable storage medium

Country Status (1)

Country Link
CN (1) CN116797828A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118335357A (en) * 2024-06-12 2024-07-12 中国人民解放军联勤保障部队第九六四医院 Cloud computing-based children oral health management system and method
CN119006916A (en) * 2024-05-14 2024-11-22 江苏省人民医院(南京医科大学第一附属医院) YOLOv 8-based primary central nervous system lymphoma detection and classification method and system


Similar Documents

Publication Publication Date Title
CN109389584A (en) Multiple dimensioned rhinopharyngeal neoplasm dividing method based on CNN
Kong et al. Automated maxillofacial segmentation in panoramic dental x-ray images using an efficient encoder-decoder network
CN107492071A (en) Medical image processing method and equipment
CN114287915A (en) Noninvasive scoliosis screening method and system based on back color image
CN117876690A (en) A method and system for multi-tissue segmentation of ultrasound images based on heterogeneous UNet
CN106780728A (en) A kind of single organ method for splitting and system based on medical image
CN114037665A (en) Mandibular neural tube segmentation method, mandibular neural tube segmentation device, electronic apparatus, and storage medium
CN114187293A (en) Oral and palatal soft and hard tissue segmentation method based on attention mechanism and ensemble registration
CN115272283A (en) A kind of endoscopic OCT image segmentation method, equipment, medium and product for colorectal tumor
CN116152235A (en) Cross-modal synthesis method for medical image from CT (computed tomography) to PET (positron emission tomography) of lung cancer
Chen et al. Detection of various dental conditions on dental panoramic radiography using faster R-CNN
CN116797731A (en) Artificial intelligence-based oral cavity CBCT image section generation method
CN112150564B (en) Medical image fusion algorithm based on deep convolution neural network
CN116797828A (en) Method and device for processing oral full-view film and readable storage medium
CN113822904B (en) Image labeling device, method and readable storage medium
CN115409812A (en) CT image automatic classification method based on fusion time attention mechanism
CN116310523A (en) Lumbar disc herniation and nerve root compression relation grading method and system based on vision transducer
CN117710325A (en) Method and system for automatic identification and tracking of parathyroid glands during endoscopic thyroid surgery
CN114549558B (en) Breast tumor segmentation method based on multimodal feature fusion Vnet
CN112967295B (en) Image processing method and system based on residual network and attention mechanism
CN114757894A (en) A system for analyzing bone tumor lesions
CN117476219B (en) Auxiliary method and auxiliary system for locating CT tomographic images based on big data analysis
CN117576492B (en) Automatic focus marking and identifying device for gastric interstitial tumor under gastric ultrasonic endoscope
CN117036878B (en) Method and system for fusing artificial intelligent prediction image and digital pathological image
CN119417821B (en) Gastrointestinal surgery auxiliary system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination