Disclosure of Invention
The application provides a deep-learning-based method for detecting and classifying bone marrow cell images, which specifically comprises the following steps: collecting and labeling bone marrow cell images, and determining a training set and a test set from the bone marrow cell images; performing preprocessing and data amplification on the data in the training set; in response to completion of the preprocessing and data amplification, constructing a target detection model, wherein constructing the target detection model comprises introducing a general score loss and optimizing the target detection model according to the general score loss; in response to completion of the construction of the target detection model, training the target detection model on the processed training set; and in response to completion of the training of the target detection model, inputting the test set into the target detection model and outputting the classification and detection results.
As above, wherein the bone marrow cell images are real bone marrow cell images under a microscope, and the training set and the test set are randomly divided from the real bone marrow cell images.
The above method, wherein collecting and labeling the bone marrow cell images further comprises marking a target rectangular frame for every cell in the real bone marrow cell images with a labeling tool, generating the coordinates of the top-left and bottom-right vertices of each target rectangular frame in the pixel coordinate system, and labeling the corresponding categories, thereby generating a labeling file.
The above method, wherein preprocessing and amplifying the data in the training set comprises screening the samples in the training set, removing unlabeled samples and samples whose labels do not belong to the defined categories, and keeping the categories with more than 100 cell samples to construct the data set.
As described above, constructing the target detection model comprises using the Faster R-CNN model as the basic model, optimizing the Faster R-CNN model, and using the optimized Faster R-CNN model as the target detection model of this embodiment.
As above, wherein the Faster R-CNN model is composed of two network parts, an RPN and a Fast R-CNN network;
the complete loss function of the RPN network is defined as:

L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda_1 \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)

wherein L({p_i}, {t_i}) is the weighted sum, L_cls is the loss function of the classification task, L_reg is the loss function of the regression task, t_i is the parameterized coordinate vector of the i-th template frame, t_i* is the parameterized coordinate vector of the target rectangular frame corresponding to the i-th template frame, p_i is the predicted probability of the i-th template frame, p_i* is the real label corresponding to the i-th template frame, N_cls is the number of all template frames, N_reg is the number of all samples, {p_i} is the set of predicted probabilities of all template frames, {t_i} is the set of predicted coordinate vectors of all template frames, and λ_1 is a weighting coefficient.
As above, wherein the loss function L_cls of the classification task is specifically expressed as:

L_{cls}(p_i, p_i^*) = -\left[ p_i^* \log p_i + (1 - p_i^*) \log(1 - p_i) \right]

wherein p_i is the predicted probability of the i-th template frame and N is the number of predicted template frames over which the classification loss is accumulated.
As above, wherein the loss function L_reg of the regression task is specifically expressed as:

L_{reg}(t^u, v) = \sum_{z \in \{c, d, w, h\}} \mathrm{smooth}_{L_1}(t_z^u - v_z)

wherein, for any parameter z in c, d, w, h, the following is defined:

\mathrm{smooth}_{L_1}(z) = \begin{cases} 0.5\, z^2, & |z| < 1 \\ |z| - 0.5, & \text{otherwise} \end{cases}

That is, the loss of the regression task is calculated with the smooth-L1 function, wherein c and d are the coordinates of the center point of the candidate rectangular frame, w and h are respectively the width and height of the candidate rectangular frame, t_z^u is the corresponding coordinate value of the predicted rectangular frame of the u-th class object, and v_z is the corresponding coordinate value of the target rectangular frame.
As above, the candidate rectangular frames predicted by the RPN network are processed; specifically, the coordinates of the center point of each candidate rectangular frame and its width and height are parameterized to generate a 4-dimensional vector t.
A bone marrow image cell image detection classification system based on deep learning specifically comprises: the device comprises a data acquisition unit, a processing unit, a model building unit, a training unit and an output unit.
The data acquisition unit is used for acquiring the collected and labeled bone marrow cell images, and determining a training set and a test set from the bone marrow cell images;
the processing unit is used for preprocessing and amplifying data in the training set;
the model construction unit is used for constructing a target detection model;
the training unit is used for training the target detection model according to the processed training set;
and the output unit is used for inputting the test set into the target detection model to obtain a detection result.
The application has the following beneficial effects:
the data used for constructing the target detection model in the application is simple in source, the required data is derived from the myeloid metaplasia cells under the real microscope visual field, a complex experiment process is not needed, and the cost is low. The target detection model constructed by the method is designed based on deep learning target detection and classification algorithm, the target cells can be automatically and accurately calibrated and 21 types of cells can be classified, and the classification result is efficient and accurate.
Detailed Description
The technical solutions in the embodiments of the present application are clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The application relates to a method and a system for detecting and classifying bone marrow cell images based on deep learning. With the method and the system, target cells can be automatically and accurately located, and 21 categories of cells can be classified.
Example one
As shown in fig. 1, the method for detecting and classifying myeloid images based on deep learning provided by the present application specifically includes the following steps:
step S110: collecting and labeling the bone marrow image cell image, and determining a training set and a testing set according to the bone marrow image cell image.
The embodiment stores data and marking information in a uniform format, and randomly divides a training set and a test set.
In this embodiment, a total of 4451 real bone marrow cell images with a resolution of 4000 × 3000 were collected with the hidrosun HDS-BFS high-speed micro-scanning imaging system; a real bone marrow cell image under the microscope is shown in fig. 3. These 4451 real bone marrow cell images serve as the samples of the training set and the test set.
Further, a target rectangular frame is marked for every cell in the real bone marrow cell images with a labeling tool, the coordinates of the top-left and bottom-right vertices of the target rectangular frame in the pixel coordinate system are generated, and the corresponding category of each cell is labeled, thereby generating a labeling file. The labeling file contains the X-axis and Y-axis coordinates of the top-left vertex and the X-axis and Y-axis coordinates of the bottom-right vertex of the target rectangular frame.
Specifically, 21 classes of targets are eosinophil, promyelocytic, megaerythrocytic, heterolymphocyte, megaerythrocytic, neutrophilic, mesoerythrocytic, neutrophilic, naive lymphocyte, naive plasma cell, neutrophilic baculocyte, naive monocyte, late erythrocytic, primordial lymphocyte, primordial granulocyte, abnormal promyelocytic, and primordial monocyte, respectively.
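For illustration, a minimal sketch of reading one such labeling file is given below; it assumes a Pascal VOC-style XML layout such as that produced by common labeling tools, which may differ from the exact format used in this embodiment.

```python
import xml.etree.ElementTree as ET

def parse_annotation(xml_path):
    """Read one VOC-style labeling file and return a list of
    (class_name, x_min, y_min, x_max, y_max) tuples.

    The field names ('object', 'name', 'bndbox', ...) follow the Pascal VOC
    convention and are assumptions about the labeling tool's output format.
    """
    root = ET.parse(xml_path).getroot()
    records = []
    for obj in root.iter("object"):
        name = obj.find("name").text                  # cell category label
        box = obj.find("bndbox")
        x_min = int(float(box.find("xmin").text))     # top-left X
        y_min = int(float(box.find("ymin").text))     # top-left Y
        x_max = int(float(box.find("xmax").text))     # bottom-right X
        y_max = int(float(box.find("ymax").text))     # bottom-right Y
        records.append((name, x_min, y_min, x_max, y_max))
    return records
```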
Further, 80% of the images were randomly selected as samples of the training set, and the remaining 20% were selected as samples of the testing set.
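A minimal sketch of such a random 80/20 split is given below; the function name and the fixed seed are illustrative.

```python
import random

def split_dataset(image_ids, train_ratio=0.8, seed=0):
    """Randomly split image identifiers into training and test subsets."""
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]

# e.g. with the 4451 collected images:
# train_ids, test_ids = split_dataset(range(4451))
```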
Step S120: and preprocessing and data amplification are carried out on the data in the training set.
Specifically, the samples of the training set are first screened, samples without labels and samples whose labels do not belong to the defined categories are removed, and the categories with more than 100 cell samples are kept to construct the data set, which needs to include 21 categories.
Further, the remaining 3653 training samples are expanded with 5 data enhancement methods, namely horizontal flipping, vertical flipping, rotation, translation, and addition of Gaussian noise.
These data enhancement methods are known in the prior art, and their operation is not described in detail here.
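For illustration, a possible augmentation pipeline covering these five methods is sketched below using the albumentations library; the library choice, probabilities and limits are assumptions, and the bounding boxes are transformed together with the images so that the labels stay consistent.

```python
import albumentations as A

augment = A.Compose(
    [
        A.HorizontalFlip(p=0.5),              # horizontal flipping
        A.VerticalFlip(p=0.5),                # vertical flipping
        A.Rotate(limit=30, p=0.5),            # rotation
        A.ShiftScaleRotate(shift_limit=0.1,   # translation (shift only)
                           scale_limit=0.0,
                           rotate_limit=0,
                           p=0.5),
        A.GaussNoise(p=0.3),                  # additive Gaussian noise
    ],
    # Transform the labeled rectangles together with the image so the
    # annotations remain valid after augmentation.
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)

# augmented = augment(image=image, bboxes=boxes, labels=labels)
```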
Step S130: and responding to the data in the training set for preprocessing and data amplification, and constructing a target detection model.
The target detection model constructed in this embodiment is a mathematical model, built on deep-learning technology, for classifying bone marrow morphology images; the cells entering the field of view are classified and detected by the constructed recognition model.
Specifically, this embodiment uses the Faster R-CNN model as the basic model, optimizes the model on this basis, and uses the optimized Faster R-CNN model as the target detection model of this embodiment. In practical application, a bone marrow cell image is input, and the category and position of every detected cell are output.
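For illustration, a minimal sketch of building such a detector with torchvision is given below (assuming a recent torchvision release); the helper name and the use of a ResNet-50 FPN backbone are assumptions, since the application does not specify the backbone.

```python
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

def build_marrow_detector(num_cell_classes=21):
    """Build a Faster R-CNN detector whose box head is sized for the cell classes.

    torchvision reserves class index 0 for the background, hence num_cell_classes + 1.
    """
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_cell_classes + 1)
    return model
```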
The Faster R-CNN model is composed of two network parts, an RPN (Region Proposal Network) and a Fast R-CNN network: the RPN is used to select candidate rectangular frames, and the Fast R-CNN network performs accurate classification and regression of the targets.
The candidate rectangular frames are the candidate rectangular frames corresponding to the multiple categories in the real bone marrow cell image, one category corresponding to one candidate rectangular frame; in this embodiment the number of candidate rectangular frames is more than one.
The complete loss function of the RPN network is the weighted sum of the loss function of the classification task and the loss function of the regression task, and is specifically expressed as:

L(\{p_i\}, \{t_i\}) = \frac{1}{N_{cls}} \sum_i L_{cls}(p_i, p_i^*) + \lambda_1 \frac{1}{N_{reg}} \sum_i p_i^* L_{reg}(t_i, t_i^*)

wherein L({p_i}, {t_i}) is the weighted sum, L_cls is the loss function of the classification task, L_reg is the loss function of the regression task, t_i is the parameterized coordinate vector of the i-th template frame, t_i* is the parameterized coordinate vector of the real target rectangular frame corresponding to the i-th template frame, p_i is the predicted probability of the i-th template frame, p_i* is the real label corresponding to the i-th template frame (1 if the template frame is a positive example, 0 otherwise), N_cls is the number of all template frames, N_reg is the number of all samples, {p_i} is the set of predicted probabilities of all template frames, {t_i} is the set of predicted coordinate vectors of all template frames, and λ_1 is a weighting coefficient, which defaults to 1 in the present application.
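For illustration, a minimal PyTorch-style sketch of computing such a weighted classification-plus-regression loss is given below; the tensor names, shapes and the use of binary cross-entropy with logits are assumptions for the sketch, not details taken from the application.

```python
import torch
import torch.nn.functional as F

def rpn_loss(pred_logits, pred_deltas, labels, target_deltas, n_cls, n_reg, lam=1.0):
    """Weighted sum of a classification term and a regression term, in the spirit
    of the RPN loss above.

    pred_logits:   (A,) objectness logits of the template (anchor) frames
    pred_deltas:   (A, 4) predicted parameterized coordinates t_i
    labels:        (A,) 1 for positive anchors, 0 for negative anchors
    target_deltas: (A, 4) parameterized ground-truth coordinates t_i*
    """
    cls_term = F.binary_cross_entropy_with_logits(
        pred_logits, labels.float(), reduction="sum") / n_cls
    pos = labels == 1                 # only positive anchors contribute to regression
    reg_term = F.smooth_l1_loss(
        pred_deltas[pos], target_deltas[pos], reduction="sum") / n_reg
    return cls_term + lam * reg_term
```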
In particular, the loss function L_cls of the classification task is specifically expressed as:

L_{cls}(p_i, p_i^*) = -\left[ p_i^* \log p_i + (1 - p_i^*) \log(1 - p_i) \right]

wherein p_i is the predicted probability of the i-th template frame and N is the number of predicted template frames over which the classification loss is accumulated. The classification task here is a binary classification task, i.e., the probability that the current predicted region belongs to a foreground cell or to the background is predicted according to the loss function.
The loss function L_reg of the regression task is specifically expressed as:

L_{reg}(t^u, v) = \sum_{z \in \{c, d, w, h\}} \mathrm{smooth}_{L_1}(t_z^u - v_z)

wherein, for any parameter z in c, d, w, h, the following is defined:

\mathrm{smooth}_{L_1}(z) = \begin{cases} 0.5\, z^2, & |z| < 1 \\ |z| - 0.5, & \text{otherwise} \end{cases}

wherein c and d are the coordinates of the center point of the candidate rectangular frame, w and h are respectively the width and height of the candidate rectangular frame, t_z^u is the corresponding coordinate value of the candidate rectangular frame for the u-th class object, and v_z is the corresponding coordinate value of the target rectangular frame.
Further, in order to ensure translation invariance of the coordinates and consistency of the width and height, the coordinates of the center point of the candidate rectangular frame and its width and height are parameterized to generate a 4-dimensional vector t. The processed center-point coordinates (t_c, t_c*, t_d, t_d*) are expressed as:

t_c = (c - c_a)/w_a, \quad t_d = (d - d_a)/h_a, \quad t_c^* = (c^* - c_a)/w_a, \quad t_d^* = (d^* - d_a)/h_a

wherein (c, d) are the center coordinates of the candidate rectangular frame before parameterization, c being the X-axis coordinate and d the Y-axis coordinate, (c_a, d_a) are the coordinates of the template frame (anchor), (c*, d*) are the coordinates of the target rectangular frame, and (w_a, h_a) are the width and height of the template frame.

The processed width and height (t_w, t_w*, t_h, t_h*) of the rectangular frame are expressed as:

t_w = \log(w / w_a), \quad t_h = \log(h / h_a), \quad t_w^* = \log(w^* / w_a), \quad t_h^* = \log(h^* / h_a)

wherein w and h are the width and height of the candidate rectangular frame before parameterization, (w*, h*) are the width and height of the target rectangular frame, and (w_a, h_a) are the width and height of the template frame.
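A minimal sketch of this parameterization in Python is given below; the function and variable names are illustrative.

```python
import math

def parameterize_box(box, anchor):
    """Convert a (center_x, center_y, width, height) box into the 4-dimensional
    vector t relative to a template (anchor) frame, as in the formulas above."""
    c, d, w, h = box
    c_a, d_a, w_a, h_a = anchor
    t_c = (c - c_a) / w_a        # normalized center-x offset
    t_d = (d - d_a) / h_a        # normalized center-y offset
    t_w = math.log(w / w_a)      # log-scale width ratio
    t_h = math.log(h / h_a)      # log-scale height ratio
    return t_c, t_d, t_w, t_h
```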
The probability of each candidate rectangular frame is predicted by the RPN network; this probability models how likely the predicted candidate rectangular frame is to be a real target.
Further, in the Fast R-CNN network part, since the candidate rectangular frames are selected for the Fast R-CNN network by the RPN network, a RoI pooling (Region of Interest pooling) layer is added to the Fast R-CNN network. The RoI pooling layer down-samples each region of interest to features of the same size, specifically a 7 × 7 feature map. The feature map is then flattened, and an output vector is obtained after a series of fully connected layers.
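For illustration, a minimal sketch using the RoIPool operator from torchvision.ops is given below; the feature-map size, channel count and spatial_scale value are assumptions made only for the example.

```python
import torch
from torchvision.ops import RoIPool

# Pool each candidate region to a fixed 7x7 grid, as described above.
roi_pool = RoIPool(output_size=(7, 7), spatial_scale=1.0 / 16)  # 1/16: assumed feature-map stride

feature_map = torch.randn(1, 256, 64, 64)            # dummy backbone features
# Each row is (batch_index, x1, y1, x2, y2) in image coordinates.
rois = torch.tensor([[0, 10.0, 10.0, 200.0, 200.0],
                     [0, 50.0, 30.0, 300.0, 260.0]])
pooled = roi_pool(feature_map, rois)                  # shape: (2, 256, 7, 7)
flattened = pooled.flatten(start_dim=1)               # fed into the fully connected layers
```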
Furthermore, classification prediction and regression prediction are carried out on the obtained feature vectors. In this application the classification prediction task is a binary classification task, i.e., foreground cell versus background, and the classification prediction can be implemented with the prior art; in this way the Fast R-CNN network is able to obtain the target class.
The loss function of the regression prediction is specifically expressed as:

L(p, u, t^u, v) = L_{cls}(p, u) + \lambda_2 [u \ge 1] L_{reg}(t^u, v)

wherein L(p, u, t^u, v) is the loss function, L_cls is the loss function of the classification task, L_reg is the loss function of the regression task, p is the softmax probability distribution predicted by the classifier, u is the true classification label of the corresponding target, t^u is the coordinate vector of the corresponding category u predicted by the regressor for the prediction rectangular frame, v = (v_x, v_y, v_w, v_h) is the coordinate vector of the real target rectangular frame, and λ_2 is a weighting coefficient, which defaults to 1 in the present application.
The RPN network and the Fast R-CNN network together form the Faster R-CNN network, and the probabilities of the candidate rectangular frames in the bone marrow cell image are predicted twice through the Faster R-CNN network.
In this embodiment, a general score loss is introduced after the candidate rectangular frames are predicted.
Because the frequencies of different types of bone marrow cells differ greatly, in order to handle this extreme data imbalance, this embodiment provides a general score loss and adds it to the original Faster R-CNN network; that is, positive and negative samples are evaluated through a classification probability value instead of a heuristic sampling training strategy, thereby addressing the long-tail distribution of bone marrow cells.
Obtaining the general score loss specifically comprises the following steps:
step S1301: a scoring task is determined.
The scoring task is to divide positive and negative samples.
Specifically, since the candidate rectangular frames and the target rectangular frames have already been obtained as described above, the intersection over union (IoU) of a candidate rectangular frame with the target rectangular frame is compared with a specified threshold. If the IoU of a candidate rectangular frame exceeds the specified threshold, the candidate rectangular frame is a positive sample; otherwise it is a negative sample. The positive sample set is denoted N and the negative sample set is denoted P.
Preferably, IoU is a standard for measuring how accurately a corresponding object is detected in a particular data set. IoU is a simple metric: whenever a task requires finding a predicted region in the output, IoU can be used to measure the quality of that prediction.
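A minimal sketch of how IoU can be computed for two axis-aligned rectangles is given below; it illustrates the metric and is not code taken from the application.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# e.g. a candidate rectangular frame counts as a positive sample when
# iou(candidate, target) exceeds the specified threshold.
```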
Step S1302: and predicting the score difference value of the candidate rectangular frame according to the scoring task.
Specifically, the score difference of every two candidate rectangular frames is calculated within the positive and negative samples, the score difference of two candidate rectangular frames being specifically expressed as:

x_{ij} = s_i - s_j

wherein s_i and s_j are the scores of the two candidate rectangular frames; the score of a candidate rectangular frame is output by the RPN network and the Fast R-CNN network together with the probability of the candidate rectangular frame after the candidate rectangular frame has been predicted.

Further, a step function is defined on the score differences between the candidate rectangular frames:

H(x_{ij}) = \begin{cases} 0, & x_{ij} < 0 \\ 1, & x_{ij} \ge 0 \end{cases}

wherein x_{ij} < 0 indicates that the score of candidate rectangular frame i is less than the score of candidate rectangular frame j, and x_{ij} \ge 0 indicates that the score of candidate rectangular frame i is greater than or equal to the score of candidate rectangular frame j.
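For illustration, a minimal sketch of computing the pairwise score differences and applying the step function is given below; it follows the convention reconstructed above (x_ij = s_i − s_j), which is an assumption rather than the application's exact formulation.

```python
import torch

def pairwise_differences(scores):
    """x_ij = s_i - s_j for every pair of candidate-frame scores."""
    return scores.unsqueeze(1) - scores.unsqueeze(0)

def step(x):
    """Step function over the score differences: 1 where s_i >= s_j, else 0."""
    return (x >= 0).float()

scores = torch.tensor([0.9, 0.3, 0.6])
H = step(pairwise_differences(scores))   # (3, 3) matrix of pairwise comparisons
```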
Step S1303: and obtaining the loss of the general score according to the score difference.
Specifically, the loss of the general score is determined according to a step function derived from the score difference.
The general score loss is determined by two ranking losses, loss_RS(i) and loss*_RS(i), whose difference enters the final loss calculation: loss_RS(i) represents the original ranking loss (i.e., the ranking loss that takes all positive and negative samples into account), and loss*_RS(i) represents the ranking loss inside the positive samples (i.e., only the ranking among the positive samples is considered).
In particular, the original ranking loss loss_RS(i) of a candidate rectangular frame i is expressed in terms of the following quantities: r, the ranking position (i.e., the rank of the score of candidate rectangular frame i among all predicted rectangular frames); r_p, the ranking position counted among the positive samples only; λ, a weighting coefficient; H(x_ij), the step function defined above; and y_j, the maximum IoU value between candidate rectangular frame j and the target rectangular frames it intersects. N denotes the positive sample set, P denotes the negative sample set, and j denotes a candidate rectangular frame belonging to the positive or the negative sample set.
The ranking loss loss*_RS(i) inside the positive samples is computed in the same manner, except that only the positive samples are taken into account.
as can be seen from the above equation, for positive samples with unchanged relative sorting positions, the original sorting loss is the same as the value of the loss after sorting, and the original sorting loss and the loss after sorting cancel each other in the final loss calculation, so that the influence on the positive samples with correct sorting is weakened. The positive samples originally sorted after the negative samples have larger score loss, that is, the value of the general score loss is larger, and the general score loss is added into the original Faster R-CNN network to form the target detection model provided by the embodiment, and the general score loss can promote the target detection model to enhance the detection capability of the positive samples.
Step S140: and responding to the completion of the construction of the target detection model, and training the target detection model according to the processed training set.
Specifically, in this embodiment the Momentum algorithm is selected as the gradient descent method for training the target detection model, with the momentum set to 0.9, the learning rate initialized to 0.001 and the weight decay set to 0.0001, and the target detection model is trained for multiple rounds.
The batch size of each iteration is 1; the learning rate is decayed during training, being reduced to 0.3 times its previous value every 4 rounds, and the model is trained iteratively for 20 rounds.
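A minimal PyTorch training-loop sketch using these hyperparameters is given below; the detector construction and the train_loader yielding (images, targets) pairs are assumptions standing in for the components described in steps S110 to S130.

```python
import torch
import torchvision

# 21 cell classes + background; train_loader is assumed to yield batches from the
# preprocessed and augmented training set of step S120.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=22)
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=0.0001)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=4, gamma=0.3)  # 0.3x every 4 rounds

model.train()
for epoch in range(20):                        # 20 training rounds
    for images, targets in train_loader:       # images: list of tensors; targets: list of dicts
        loss_dict = model(images, targets)      # torchvision detectors return a dict of loss terms
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```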
Step S150: and responding to the completion of the training of the target detection model, inputting the test set into the target detection model, and outputting a classification detection result.
Specifically, after the training of the target detection model is completed, the test set undergoes the same data preprocessing as in step S120, the processed test set is input into the target detection model, and finally the predicted target categories and the top-left and bottom-right corner coordinates of the target rectangular frames are output and displayed visually in the bone marrow cell image, as shown in fig. 4, forming a bone marrow cell analysis result map.
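For illustration, a minimal inference sketch is given below; it assumes a trained torchvision-style detector model and a preprocessed image tensor test_image, and the confidence threshold is illustrative.

```python
import torch

# Assumed: `model` is the trained detector, `test_image` is a preprocessed
# bone marrow cell image tensor of shape (3, H, W) with values in [0, 1].
model.eval()
with torch.no_grad():
    prediction = model([test_image])[0]    # torchvision detectors take a list of images

boxes = prediction["boxes"]    # (K, 4) top-left and bottom-right corner coordinates
labels = prediction["labels"]  # (K,) predicted cell categories
scores = prediction["scores"]  # (K,) confidence of each detection

for box, label, score in zip(boxes, labels, scores):
    if score > 0.5:            # illustrative confidence threshold
        x1, y1, x2, y2 = box.tolist()
        print(f"class {int(label)}: top-left ({x1:.0f}, {y1:.0f}), bottom-right ({x2:.0f}, {y2:.0f})")
```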
Example two
As shown in fig. 2, the present application provides a bone marrow image cell image detecting and classifying system based on deep learning, which specifically includes: data acquisition unit 210, processing unit 220, model construction unit 230, training unit 240, and output unit 250.
The data acquiring unit 210 is used for acquiring the collected and labeled bone marrow image cell image, and determining a training set and a testing set according to the bone marrow image cell image.
Preferably, the data acquisition unit comprises an optical microscope, an image digitizing device and a digital microscope. In the process of collecting and labeling the bone marrow cell images, the bone marrow smear specimen is optically magnified by the microscope, and the morphological changes of the cells can be observed directly through the eyepiece.
The digital microscope integrates an optical microscope and an image digitization device, and directly outputs a digital microscopic image, namely a bone marrow image cell image collected by the application. Wherein the image digitization device adopts a high-definition digital camera.
The processing unit 220 is connected to the data acquisition unit 210 and is used for preprocessing and data amplification of the data in the training set.
The model construction unit 230 is connected to the processing unit 220 for constructing the object detection model.
The training unit 240 is connected to the model building unit 230, and is configured to perform training of the target detection model according to the processed training set.
The output unit 250 is connected to the training unit 240, and is configured to input the test set to the target detection model, so as to obtain a detection result.
The application has the following beneficial effects:
the data used for constructing the target detection model in the application is simple in source, the required data is derived from the myeloid metaplasia cells under the real microscope visual field, a complex experiment process is not needed, and the cost is low. The target detection model constructed by the method is designed based on deep learning target detection and classification algorithm, the target cells can be automatically and accurately calibrated and 21 types of cells can be classified, and the classification result is efficient and accurate.
Although the present application has been described with reference to examples, which are intended to be illustrative only and not to be limiting of the application, changes, additions and/or deletions may be made to the embodiments without departing from the scope of the application.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.