
CN114359303B - Image segmentation method and device - Google Patents


Info

Publication number
CN114359303B
Authority
CN
China
Prior art keywords
image data, predicted, loss value, image, data
Prior art date
2021-12-28
Legal status
Active
Application number
CN202111623073.2A
Other languages
Chinese (zh)
Other versions
CN114359303A (en)
Inventor
汪龙
王康
刘德龙
陈波扬
Current Assignee
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd
Priority to CN202111623073.2A
Publication of CN114359303A
Application granted
Publication of CN114359303B


Classifications

    • G06T 7/11: Region-based segmentation (under G06T 7/00 Image analysis; G06T 7/10 Segmentation, edge detection)
    • G06N 3/045: Combinations of networks (under G06N 3/02 Neural networks; G06N 3/04 Architecture, e.g. interconnection topology)
    • G06N 3/08: Learning methods (under G06N 3/02 Neural networks)
    • G06T 2207/20081: Training; Learning (under G06T 2207/00 Indexing scheme for image analysis or image enhancement; G06T 2207/20 Special algorithmic details)


Abstract


The present disclosure provides an image segmentation method and device. The method comprises: inputting an image to be segmented into a pre-trained image segmentation network to obtain a segmented image; the image segmentation network is trained in the following manner: inputting small sample data of a real image and target synthetic image data into a corresponding image segmentation network respectively to obtain first predicted image data and second predicted image data; the small sample data of the real image and the target synthetic image data contain the same target object; obtaining a first loss value based on the second predicted image data and the target synthetic image annotation data; obtaining a second loss value based on the first predicted image data and the second predicted image data; after adjusting the parameters of the image segmentation network using the first loss value and the second loss value, returning to the step of inputting small sample data of the real image and the target synthetic image data into the corresponding image segmentation network respectively, until the first loss value and the second loss value meet the specified conditions, thereby improving the image segmentation effect.

Description

Image segmentation method and device
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image segmentation method and apparatus.
Background
Deep learning plays an important role in smart agriculture; in particular, fruit and vegetable image recognition based on convolutional neural networks provides a theoretical basis for the research and development of agricultural picking robots. However, in real field or greenhouse environments the background is complex and occlusion is severe, so rapidly and accurately identifying fruit occluded by stems, leaves, buds and other objects is challenging.
Currently, neural-network-based image segmentation requires a large number of labeled training samples during network training. In real fruit and vegetable environments, however, data samples are very scarce, so the image segmentation model cannot be sufficiently trained, and the trained model segments images poorly.
Disclosure of Invention
Exemplary embodiments of the present disclosure provide an image segmentation method, apparatus, electronic device, and computer storage medium for improving the image segmentation effect.
A first aspect of the present disclosure provides an image segmentation method, the method comprising:
inputting an image to be segmented into a pre-trained image segmentation network to obtain a segmented image;
wherein the image segmentation network is trained by:
Respectively inputting small sample data and target synthetic image data of a real image into a corresponding image segmentation network to obtain first predicted image data and second predicted image data, wherein the first predicted image data is obtained based on the small sample data of the real image, the second predicted image data is obtained based on the target synthetic image data, and the small sample data of the real image and the image data in the target synthetic image data contain the same target object;
obtaining a first loss value based on the second predicted image data and preset target synthetic image annotation data;
obtaining a second loss value according to the first predicted image data and the second predicted image data;
And after the parameters of the image segmentation networks are adjusted by using the first loss value and the second loss value, returning to the step of respectively inputting the small sample data of the real image and the target synthetic image data into the corresponding image segmentation network until the first loss value and the second loss value meet the specified condition.
In this embodiment, adjusting the image segmentation network with the first loss value, obtained from the second predicted image data and the preset target synthetic image annotation data, reduces the gap between the prediction for the target synthetic image data and its true annotation data; adjusting the network with the second loss value, obtained from the first predicted image data and the second predicted image data, reduces the gap between the prediction for the real image and the prediction for the target synthetic image data. Together, these adjustments reduce the gap between the predicted image data of the real image and the annotation data and improve the image segmentation effect.
In one embodiment, the obtaining the first loss value based on the second predicted image data and the preset target synthetic image annotation data includes:
for any pixel point in the second predicted image data, obtaining an intermediate loss value of the pixel point based on a first predicted object of the pixel point in the second predicted image data and a labeling object of the pixel point in the target synthetic image labeling data;
And obtaining the first loss value according to the intermediate loss value corresponding to each pixel point in the second predicted image data.
In this way, the intermediate loss value of each pixel is obtained from the first predicted object of that pixel in the second predicted image data and the labeling object of that pixel in the target synthetic image annotation data, and the first loss value is then derived from these intermediate loss values, so the obtained first loss value is more accurate.
In one embodiment, the first loss value is obtained by the following formula:
L_seg = - Σ_{h=0}^{H} Σ_{w=0}^{W} y_{h,w} · log( ŷ_{h,w} )

where L_seg is the first loss value, H is the height of the second predicted image data, W is the width of the second predicted image data, ŷ_{h,w} is the first predicted object of the pixel whose ordinate is h and whose abscissa is w in the second predicted image data, and y_{h,w} is the labeling object of that pixel in the target synthetic image annotation data, with h ∈ [0, H] and w ∈ [0, W].
In one embodiment, the obtaining a second loss value according to the first predicted image data and the second predicted image data includes:
inputting the first predicted image data into a full convolution neural network to obtain a first predicted image characteristic;
inputting the second predicted image data into a full convolution neural network to obtain second predicted image characteristics;
and obtaining the second loss value based on the first predicted image characteristic and the second predicted image characteristic.
In this embodiment, the first predicted image feature and the second predicted image feature are obtained through the full convolution neural network, and the second loss value is obtained based on these features; because the full convolution neural network captures more detailed features, the second loss value is determined more accurately.
In one embodiment, the second loss value is obtained by:
Obtaining a first predicted value corresponding to the first predicted image data by using a second predicted object of each pixel point in a first predicted image feature corresponding to the first predicted image data;
obtaining a second predicted value corresponding to the second predicted image data according to a third predicted object of each pixel point in a second predicted image feature corresponding to the second predicted image data;
And obtaining the second loss value based on the first predicted value and the second predicted value.
In one embodiment, before the small sample data of the real image and the target composite image data are respectively input into the corresponding image segmentation network, the method further comprises:
and inputting the small sample data of the real image and the synthetic image data into a pre-trained cycle-consistent adversarial network to obtain the target synthetic image data.
In this embodiment, the small sample data of the real image and the synthetic image data are input into the pre-trained cycle-consistent adversarial network to obtain the target synthetic image data, so that the synthetic image data is closer to the real image in color distribution, which further improves the accuracy of image segmentation.
A second aspect of the present disclosure provides an image segmentation apparatus, the apparatus comprising:
the segmentation module is used for inputting the image to be segmented into a pre-trained image segmentation network to obtain a segmented image;
wherein the image segmentation network is trained by:
The prediction module is used for respectively inputting small sample data of a real image and target synthetic image data into a corresponding image segmentation network to obtain first prediction image data and second prediction image data, wherein the first prediction image data is obtained based on the small sample data of the real image, the second prediction image data is obtained based on the target synthetic image data, and the small sample data of the real image and the image data in the target synthetic image data contain the same target object;
the first loss value determining module is used for obtaining a first loss value based on the second predicted image data and preset target synthetic image annotation data;
a second loss value determining module, configured to obtain a second loss value according to the first predicted image data and the second predicted image data;
And the adjusting module is used for returning to the step of respectively inputting the small sample data of the real image and the target synthetic image data into the corresponding image segmentation network after the parameters of the image segmentation network are adjusted by using the first loss value and the second loss value until the first loss value and the second loss value meet the specified condition.
In one embodiment, the first loss value determining module is specifically configured to:
for any pixel point in the second predicted image data, obtaining an intermediate loss value of the pixel point based on a first predicted object of the pixel point in the second predicted image data and a labeling object of the pixel point in the target synthetic image labeling data;
And obtaining the first loss value according to the intermediate loss value corresponding to each pixel point in the second predicted image data.
In one embodiment, the first loss value determining module is specifically configured to:
the first loss value is obtained by the following formula:
L_seg = - Σ_{h=0}^{H} Σ_{w=0}^{W} y_{h,w} · log( ŷ_{h,w} )

where L_seg is the first loss value, H is the height of the second predicted image data, W is the width of the second predicted image data, ŷ_{h,w} is the first predicted object of the pixel whose ordinate is h and whose abscissa is w in the second predicted image data, and y_{h,w} is the labeling object of that pixel in the target synthetic image annotation data, with h ∈ [0, H] and w ∈ [0, W].
In one embodiment, the second loss value determining module is specifically configured to:
inputting the first predicted image data into a full convolution neural network to obtain a first predicted image characteristic;
inputting the second predicted image data into a full convolution neural network to obtain second predicted image characteristics;
and obtaining the second loss value based on the first predicted image characteristic and the second predicted image characteristic.
In one embodiment, the second loss value determining module is specifically configured to:
the second loss value is obtained by:
Obtaining a first predicted value corresponding to the first predicted image data by using a second predicted object of each pixel point in a first predicted image feature corresponding to the first predicted image data;
obtaining a second predicted value corresponding to the second predicted image data according to a third predicted object of each pixel point in a second predicted image feature corresponding to the second predicted image data;
And obtaining the second loss value based on the first predicted value and the second predicted value.
In one embodiment, the apparatus further comprises:
And the target synthetic image data determining module is used for inputting the small sample data of the real image and the synthetic image data into a pre-trained cycle-consistent adversarial network to obtain the target synthetic image data before the small sample data of the real image and the target synthetic image data are respectively input into the corresponding image segmentation network.
According to a third aspect of embodiments of the present disclosure, there is provided an electronic device, comprising:
the system comprises at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions for execution by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform the method as described in the first aspect.
According to a fourth aspect provided by embodiments of the present disclosure, there is provided a computer storage medium storing a computer program for performing the method according to the first aspect.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings that are needed in the description of the embodiments will be briefly described below, it will be apparent that the drawings in the following description are only some embodiments of the present disclosure, and that other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.
FIG. 1 is the first schematic flowchart of an image segmentation method according to an embodiment of the present disclosure;
FIG. 2 is the second schematic flowchart of an image segmentation method according to an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of determining a second loss value according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a cycle-consistent adversarial network according to an embodiment of the present disclosure;
FIG. 5 is a color distribution correlation comparison chart according to an embodiment of the present disclosure;
FIG. 6 is the third schematic flowchart of an image segmentation method according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of an image segmentation apparatus according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are some embodiments of the present disclosure, but not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without inventive effort, based on the embodiments in this disclosure are intended to be within the scope of this disclosure.
The term "and/or" in the embodiments of the present disclosure describes an association relationship of association objects, which indicates that three relationships may exist, for example, a and/or B may indicate that a exists alone, while a and B exist together, and B exists alone. The character "/" generally indicates that the context-dependent object is an "or" relationship.
The application scenario described in the embodiments of the present disclosure is for more clearly describing the technical solution of the embodiments of the present disclosure, and does not constitute a limitation on the technical solution provided by the embodiments of the present disclosure, and as a person of ordinary skill in the art can know that, with the appearance of a new application scenario, the technical solution provided by the embodiments of the present disclosure is equally applicable to similar technical problems. In the description of the present disclosure, unless otherwise indicated, the meaning of "a plurality" is two or more.
In the prior art, image segmentation is performed by a neural network, and training the neural network requires a large number of labeled training samples. However, in real fruit and vegetable environments, data samples are very scarce, so the image segmentation model cannot be sufficiently trained, and the trained model segments images poorly.
Accordingly, the present disclosure provides an image segmentation method in which the image segmentation network is adjusted using a first loss value, obtained from the second predicted image data and the preset target synthetic image annotation data, to reduce the gap between the prediction for the target synthetic image data and the real annotation data, and using a second loss value, obtained from the first predicted image data and the second predicted image data, to reduce the gap between the prediction for the real image and the prediction for the target synthetic image data. This further reduces the gap between the predicted image data of the real image and the annotation data and improves the image segmentation effect.
The training method of the image segmentation network is described in detail below. As shown in fig. 1, which is a schematic flowchart of training the image segmentation network, the method may include the following steps:
Step 101, respectively inputting small sample data of a real image and target synthetic image data into a corresponding image segmentation network to obtain first predicted image data and second predicted image data, wherein the first predicted image data is obtained based on the small sample data of the real image, the second predicted image data is obtained based on the target synthetic image data, and the small sample data of the real image and the image data in the target synthetic image data contain the same target object;
If the application scenario is a fruit and vegetable scene, the target objects include leaves, fruits, pedicels, stems, buds, etc. The specific target objects may be set according to the actual situation of the actual scene; the target objects in this embodiment are used for illustration only and do not limit the present application.
Note that an FRRN (full-resolution residual network) is used as the image segmentation network in this embodiment. The choice of image segmentation network may be set according to the actual situation, and this embodiment is not limited in this respect.
Step 102, obtaining a first loss value based on the second predicted image data and preset target synthetic image annotation data;
in one embodiment, the first loss value is obtained by:
For any pixel in the second predicted image data, an intermediate loss value of the pixel is obtained based on the first predicted object of the pixel in the second predicted image data and the labeling object of the pixel in the target synthetic image annotation data, and the first loss value is then obtained from the intermediate loss values corresponding to the pixels in the second predicted image data. The first loss value may be determined by formula (1):

L_seg = - Σ_{h=0}^{H} Σ_{w=0}^{W} y_{h,w} · log( ŷ_{h,w} )   ... (1)

where L_seg is the first loss value, H is the height of the second predicted image data, W is the width of the second predicted image data, ŷ_{h,w} is the first predicted object of the pixel whose ordinate is h and whose abscissa is w in the second predicted image data, and y_{h,w} is the labeling object of that pixel in the target synthetic image annotation data, with h ∈ [0, H] and w ∈ [0, W].
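A minimal sketch of this computation follows (the tensor names are illustrative, not taken from the patent; F.cross_entropy computes the per-pixel intermediate losses and averages them, whereas formula (1) sums them):

    import torch
    import torch.nn.functional as F

    def first_loss(pred_logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # pred_logits: (B, C, H, W) class scores predicted for the target
        #              synthetic image data (the second predicted image data).
        # labels:      (B, H, W) integer labeling objects from the target
        #              synthetic image annotation data.
        # F.cross_entropy evaluates the per-pixel intermediate loss values
        # and reduces them over all pixels, as in formula (1).
        return F.cross_entropy(pred_logits, labels)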
Step 103, obtaining a second loss value according to the first predicted image data and the second predicted image data;
It should be noted that the execution order of step 102 and step 103 is not limited here: step 102 may be executed first, step 103 may be executed first, or the two steps may be executed simultaneously.
In one embodiment, the second loss value may be obtained by inputting the first predicted image data into a fully convolutional neural network to obtain a first predicted image feature, and inputting the second predicted image data into a fully convolutional neural network to obtain a second predicted image feature, and obtaining the second loss value based on the first predicted image feature and the second predicted image feature.
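As a sketch of this step, the full convolution neural network can be a small stack of convolutional layers acting on the predicted probability maps (the layer sizes below are illustrative assumptions, not the patent's architecture):

    import torch
    import torch.nn as nn

    class FullyConvNet(nn.Module):
        # Maps a predicted probability map (B, C, H, W) to a per-pixel
        # feature/score map, from which the second loss value is computed.
        def __init__(self, num_classes: int):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(num_classes, 64, 4, stride=2, padding=1),
                nn.LeakyReLU(0.2),
                nn.Conv2d(64, 128, 4, stride=2, padding=1),
                nn.LeakyReLU(0.2),
                nn.Conv2d(128, 1, 4, stride=2, padding=1),
            )

        def forward(self, pred: torch.Tensor) -> torch.Tensor:
            return self.net(pred)

Being fully convolutional, the network preserves spatial detail instead of collapsing it into a single vector, which is why more detailed features are available when determining the second loss value.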
Step 104, judging whether the first loss value and the second loss value meet the specified conditions, if yes, ending the flow, and if not, executing step 105;
The specified condition in this embodiment is that the first loss value is smaller than a first preset threshold value, and the second loss value is smaller than a second preset threshold value.
Step 105, adjusting the parameters of each image segmentation network by using the first loss value and the second loss value, and returning to step 101.
Fig. 2 is a training flowchart of the image segmentation network. As can be seen from fig. 2, the target synthetic image data is input into the second image segmentation network and the small sample data of the real image is input into the first image segmentation network, yielding the first predicted image data and the second predicted image data. A first loss value is then obtained based on the second predicted image data and the preset target synthetic image annotation data, and the parameters of the first image segmentation network are adjusted through the first loss value. The first predicted image data and the second predicted image data are also input into a full convolution neural network to obtain a second loss value, and the parameters of the second image segmentation network are adjusted based on the second loss value. Since the parameters of the first image segmentation network and the second image segmentation network are shared, adjusting the parameters of either network also adjusts the parameters of the other.
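Because the parameters are shared, a sketch of the training loop can use a single segmentation module for both branches (FRRN, loader, num_classes and the thresholds are placeholders; second_loss is the helper sketched after formula (4) below; the adversarial update of the full convolution neural network itself is omitted for brevity):

    import torch

    seg_net = FRRN(num_classes)              # shared-parameter segmentation network (placeholder backbone)
    fcn = FullyConvNet(num_classes)          # full convolution neural network (see the sketch above)
    opt = torch.optim.Adam(seg_net.parameters(), lr=1e-4)

    for real_imgs, synth_imgs, synth_labels in loader:
        pred_real = seg_net(real_imgs)       # first predicted image data
        pred_synth = seg_net(synth_imgs)     # second predicted image data

        loss1 = first_loss(pred_synth, synth_labels)     # formula (1)
        loss2 = second_loss(fcn, pred_real, pred_synth)  # formulas (2)-(4)

        opt.zero_grad()
        (loss1 + loss2).backward()           # one update adjusts both branches
        opt.step()

        # The "specified condition" of step 104: both losses below their thresholds.
        if loss1.item() < thresh1 and loss2.item() < thresh2:
            break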
In one embodiment, as shown in fig. 3, to determine the second loss value, the method may include the following steps:
Step 301, obtaining a first predicted value corresponding to the first predicted image data by using a second predicted object of each pixel point in a first predicted image feature corresponding to the first predicted image data, where the first predicted value may be determined according to formula (2):
L_D(I_t) = - Σ_{h=0}^{M} Σ_{w=0}^{N} log( f^t_{h,w} )   ... (2)

where L_D(I_t) is the first predicted value, f^t_{h,w} is the second predicted object of the pixel whose ordinate is h and whose abscissa is w in the first predicted image feature corresponding to the first predicted image data, M is the height of the first predicted image data, N is the width of the first predicted image data, h ∈ [0, M], and w ∈ [0, N].
Step 302, obtaining a second predicted value corresponding to the second predicted image data according to a third predicted object of each pixel point in a second predicted image feature corresponding to the second predicted image data, wherein the second predicted value can be determined according to a formula (3):
L_D(I_s) = - Σ_{h=0}^{H} Σ_{w=0}^{W} log( 1 - f^s_{h,w} )   ... (3)

where L_D(I_s) is the second predicted value, f^s_{h,w} is the third predicted object of the pixel whose ordinate is h and whose abscissa is w in the second predicted image feature corresponding to the second predicted image data, H is the height of the second predicted image data, W is the width of the second predicted image data, h ∈ [0, H], and w ∈ [0, W].
The execution order of step 301 and step 302 is likewise not limited here: step 301 may be executed first, step 302 may be executed first, or the two steps may be executed simultaneously.
And step 303, obtaining the second loss value based on the first predicted value and the second predicted value, where the second loss value L_p can be obtained by formula (4):

L_p = L_D(I_t) + L_D(I_s)   ... (4)
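A sketch of this computation, under the assumption that L_D is a per-pixel binary cross-entropy on the full convolution neural network's output (the equation images for formulas (2) and (3) are not reproduced in the text, so the 0/1 targets below are illustrative):

    import torch
    import torch.nn.functional as F

    def second_loss(fcn, pred_real, pred_synth):
        feat_real = fcn(pred_real.softmax(dim=1))    # first predicted image feature
        feat_synth = fcn(pred_synth.softmax(dim=1))  # second predicted image feature

        # L_D(I_t): first predicted value from the real-image branch, formula (2)
        l_t = F.binary_cross_entropy_with_logits(
            feat_real, torch.ones_like(feat_real))
        # L_D(I_s): second predicted value from the synthetic-image branch, formula (3)
        l_s = F.binary_cross_entropy_with_logits(
            feat_synth, torch.zeros_like(feat_synth))

        return l_t + l_s                             # formula (4)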
In order to improve the accuracy of the image segmentation, in one implementation, before step 101 is performed, the small sample data of the real image and the synthetic image data are input into a pre-trained cycle-consistent adversarial network to obtain the target synthetic image data.
Fig. 4 is a schematic structural diagram of the cycle-consistent adversarial network, which includes a generator G1, a generator G2, a generator F1, a generator F2, a discriminator D1, and a discriminator D2.
The generators G1 and G2 are trained to convert the input image data into the other domain (i.e., the domain of the image data input to the corresponding generator). The corresponding discriminators D1 and D2 are trained to distinguish the generated image data from image data of the original domain, thereby obtaining intermediate losses used to adjust the parameters of each generator. The generators F1 and F2 convert the first set of images produced by the generators G1 and G2 back to the original domain to generate a second set of images; the parameters of each generator are then adjusted by comparing the second set of images with the initial input images to calculate a cycle-consistency loss. Through training, each generator transforms images into the opposite domain while maintaining geometric consistency across the image.
As shown in fig. 4, since the network is symmetric, the inputs on both sides undergo the same operations; the workflow of the network is described below taking the synthetic image data input as an example.
The synthetic image data is input into the generator G1 and the small sample data of the real image is input into the generator F1; the generator G1 generates an image A similar to the small sample data of the real image. The image A and the small sample data of the real image are input into the discriminator D1 to obtain a first intermediate loss, and the parameters of the generator G1 and the generator F2 are adjusted using the first intermediate loss. The image A is then input into the generator F2 to obtain an image C similar to the synthetic image, a second intermediate loss is obtained from the image C and the synthetic image data, and the parameters of the generator G1 and the generator F1 are adjusted using the second intermediate loss. The process then returns to inputting the synthetic image data into the generator G1 and the small sample data of the real image into the generator F1, until the first intermediate loss and the second intermediate loss meet specified conditions, at which point the obtained image C is determined to be the target synthetic image data.
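A sketch of one training step on this side of the network (the generator and discriminator architectures are placeholders; the second intermediate loss is assumed to be an L1 cycle-consistency loss, as is conventional for cycle-consistent networks):

    import torch
    import torch.nn.functional as F

    def cycle_step(synth, real, G1, F2, D1, opt_g, opt_d):
        # opt_g optimizes the parameters of G1 and F2; opt_d those of D1.
        img_a = G1(synth)  # image A: synthetic data translated toward the real domain

        # First intermediate loss: D1 learns to tell image A from real small samples.
        real_score, fake_score = D1(real), D1(img_a.detach())
        d_loss = (F.binary_cross_entropy_with_logits(real_score, torch.ones_like(real_score))
                  + F.binary_cross_entropy_with_logits(fake_score, torch.zeros_like(fake_score)))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()

        # Second intermediate loss: image C must reconstruct the original input
        # (assumed L1 cycle consistency) while image A must fool D1.
        img_c = F2(img_a)
        adv_score = D1(img_a)
        g_loss = (F.binary_cross_entropy_with_logits(adv_score, torch.ones_like(adv_score))
                  + F.l1_loss(img_c, synth))
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()
        return img_c  # once the losses converge, taken as the target synthetic image data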
In this way, this embodiment makes the color distribution of the target synthetic image data closer to that of the real image. Fig. 5 lists, for each target object in the fruit and vegetable images, the color distribution correlation between the small sample data of the real image and the synthetic image data, and between the small sample data of the real image and the target synthetic image data. The average color distribution correlation between the small sample data of the real image and the synthetic image data is 0.62, whereas the average color distribution correlation between the small sample data of the real image and the target synthetic image data is 0.79. It follows that the color distribution of the target synthetic image data is closer to the real image.
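The color distribution correlation itself can be computed, for example, as a per-channel histogram correlation; a sketch using OpenCV follows (one standard choice, not necessarily the measure used in the patent):

    import cv2
    import numpy as np

    def color_distribution_correlation(img_a: np.ndarray, img_b: np.ndarray) -> float:
        # Mean histogram correlation over the three color channels of two images.
        scores = []
        for ch in range(3):
            h_a = cv2.calcHist([img_a], [ch], None, [256], [0, 256])
            h_b = cv2.calcHist([img_b], [ch], None, [256], [0, 256])
            cv2.normalize(h_a, h_a)
            cv2.normalize(h_b, h_b)
            scores.append(cv2.compareHist(h_a, h_b, cv2.HISTCMP_CORREL))
        return float(np.mean(scores))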
In one embodiment, after training the image segmentation network, the image to be segmented is input into the pre-trained image segmentation network to obtain a segmented image.
The segmented image of this embodiment can be compared with segmented images of the prior art through pixel accuracy and the mean intersection-over-union. Pixel accuracy can be determined by equation (5):

P_A = p / n   ... (5)

where P_A is the pixel accuracy, p is the number of pixels in the segmented image whose second prediction object matches the labeling object, and n is the total number of pixels in the segmented image.
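A sketch of this metric on integer label maps (the names are illustrative):

    import numpy as np

    def pixel_accuracy(pred: np.ndarray, label: np.ndarray) -> float:
        # Formula (5): pixels whose predicted object matches the labeling
        # object, divided by the total number of pixels.
        return float((pred == label).sum()) / pred.size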
And the mean intersection-over-union MIoU can be determined by equation (6):

MIoU = (1 / (k + 1)) · Σ_{i=0}^{k} p_ii / ( Σ_{j=0}^{k} p_ij + Σ_{j=0}^{k} p_ji - p_ii )   ... (6)

where k is the number of target objects, p_ii is the number of pixels in the segmented image whose second prediction object and labeling object are both i, p_ij is the number of pixels whose labeling object is i but whose second prediction object is j, and p_ji is the number of pixels whose labeling object is j but whose second prediction object is i.
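And a corresponding sketch for the mean intersection-over-union (classes absent from both maps are skipped here to avoid dividing by zero):

    import numpy as np

    def mean_iou(pred: np.ndarray, label: np.ndarray, k: int) -> float:
        # Formula (6): for each class i, p_ii over the union
        # (sum_j p_ij + sum_j p_ji - p_ii), averaged over the classes.
        ious = []
        for i in range(k + 1):
            inter = np.logical_and(pred == i, label == i).sum()   # p_ii
            union = (pred == i).sum() + (label == i).sum() - inter
            if union > 0:
                ious.append(inter / union)
        return float(np.mean(ious))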
Thus, the quality of the segmented image can be evaluated by the pixel accuracy and the mean intersection-over-union.
For further understanding of the technical solution of the present disclosure, the following detailed description with reference to fig. 6 may include the following steps:
step 601, inputting the small sample data of the real image and the synthetic image data into a pre-trained cycle-consistent adversarial network to obtain target synthetic image data;
Step 602, respectively inputting small sample data of a real image and target synthetic image data into a corresponding image segmentation network to obtain first predicted image data and second predicted image data, wherein the first predicted image data is obtained based on the small sample data of the real image, the second predicted image data is obtained based on the target synthetic image data, and the small sample data of the real image and the image data in the target synthetic image data contain the same target object;
Step 603, obtaining a first loss value based on the second predicted image data and preset target synthetic image annotation data;
step 604, obtaining a second loss value according to the first predicted image data and the second predicted image data;
Step 605, judging whether the first loss value and the second loss value meet the specified conditions, if yes, executing step 607, and if not, executing step 606;
Step 606, after the parameters of the image segmentation networks are adjusted by using the first loss value and the second loss value, returning to the step 602;
Step 607, inputting the image to be segmented into a pre-trained image segmentation network to obtain a segmented image.
Based on the same inventive concept, the image segmentation method described above in the present disclosure may also be implemented by an image segmentation apparatus. The effects of the image segmentation apparatus are similar to those of the method described above and are not repeated here.
Fig. 7 is a schematic structural view of an image dividing apparatus according to an embodiment of the present disclosure.
As shown in fig. 7, the image segmentation apparatus 700 of the present disclosure may include a segmentation module 710, a prediction module 720, a first loss value determination module 730, a second loss value determination module 740, and an adjustment module 750.
The segmentation module 710 is configured to input an image to be segmented into a pre-trained image segmentation network to obtain a segmented image;
wherein the image segmentation network is trained by:
A prediction module 720, configured to input small sample data of a real image and target synthetic image data into corresponding image segmentation networks, respectively, to obtain first predicted image data and second predicted image data, where the first predicted image data is obtained based on the small sample data of the real image, the second predicted image data is obtained based on the target synthetic image data, and the small sample data of the real image and the image data in the target synthetic image data contain the same target object;
a first loss value determining module 730, configured to obtain a first loss value based on the second predicted image data and preset target synthetic image annotation data;
a second loss value determining module 740, configured to obtain a second loss value according to the first predicted image data and the second predicted image data;
And the adjusting module 750 is configured to return to a step of respectively inputting the small sample data of the real image and the target composite image data into the corresponding image segmentation network after adjusting the parameters of the image segmentation network by using the first loss value and the second loss value, until the first loss value and the second loss value meet a specified condition.
In one embodiment, the first loss value determining module 730 is specifically configured to:
for any pixel point in the second predicted image data, obtaining an intermediate loss value of the pixel point based on a first predicted object of the pixel point in the second predicted image data and a labeling object of the pixel point in the target synthetic image labeling data;
And obtaining the first loss value according to the intermediate loss value corresponding to each pixel point in the second predicted image data.
In one embodiment, the first loss value determining module 730 is specifically configured to:
the first loss value is obtained by the following formula:
L_seg = - Σ_{h=0}^{H} Σ_{w=0}^{W} y_{h,w} · log( ŷ_{h,w} )

where L_seg is the first loss value, H is the height of the second predicted image data, W is the width of the second predicted image data, ŷ_{h,w} is the first predicted object of the pixel whose ordinate is h and whose abscissa is w in the second predicted image data, and y_{h,w} is the labeling object of that pixel in the target synthetic image annotation data, with h ∈ [0, H] and w ∈ [0, W].
In one embodiment, the second loss value determining module 740 is specifically configured to:
inputting the first predicted image data into a full convolution neural network to obtain a first predicted image characteristic;
inputting the second predicted image data into a full convolution neural network to obtain second predicted image characteristics;
and obtaining the second loss value based on the first predicted image characteristic and the second predicted image characteristic.
In one embodiment, the second loss value determining module 740 is specifically configured to:
the second loss value is obtained by:
Obtaining a first predicted value corresponding to the first predicted image data by using a second predicted object of each pixel point in a first predicted image feature corresponding to the first predicted image data;
obtaining a second predicted value corresponding to the second predicted image data according to a third predicted object of each pixel point in a second predicted image feature corresponding to the second predicted image data;
And obtaining the second loss value based on the first predicted value and the second predicted value.
In one embodiment, the apparatus further comprises:
The target synthetic image data determining module 760 is configured to input the small sample data of the real image and the synthetic image data into a pre-trained cycle-consistent adversarial network to obtain the target synthetic image data before the small sample data of the real image and the target synthetic image data are respectively input into the corresponding image segmentation network.
Having described an image segmentation method and apparatus according to an exemplary embodiment of the present disclosure, next, an electronic device according to another exemplary embodiment of the present disclosure is described.
Those skilled in the art will appreciate that the various aspects of the present disclosure may be implemented as a system, method, or program product. Accordingly, aspects of the present disclosure may be embodied in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.) or an embodiment combining hardware and software aspects, which may be referred to herein collectively as a "circuit," a "module," or a "system."
In some possible implementations, an electronic device according to the present disclosure may include at least one processor, and at least one computer storage medium. Wherein the computer storage medium stores program code which, when executed by a processor, causes the processor to perform the steps in the image segmentation method according to various exemplary embodiments of the disclosure described above in this specification. For example, the processor may perform steps 101-105 as shown in FIG. 1.
An electronic device 800 according to such an embodiment of the present disclosure is described below with reference to fig. 8. The electronic device 800 shown in fig. 8 is merely an example and should not be construed to limit the functionality and scope of use of embodiments of the present disclosure in any way.
As shown in fig. 8, the electronic device 800 is embodied in the form of a general-purpose electronic device. The components of electronic device 800 may include, but are not limited to, at least one processor 801 described above, at least one computer storage medium 802 described above, and a bus 803 connecting the various system components, including computer storage medium 802 and processor 801.
Bus 803 represents one or more of several types of bus structures, including a computer storage media bus or computer storage media controller, a peripheral bus, a processor, or a local bus using any of a variety of bus architectures.
Computer storage media 802 may include readable media in the form of volatile computer storage media, such as random access computer storage media (RAM) 821 and/or cache storage media 822, and may further include read only computer storage media (ROM) 823.
Computer storage media 802 can also include a program/utility 825 having a set (at least one) of program modules 824, such program modules 824 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
The electronic device 800 may also communicate with one or more external devices 804 (e.g., keyboard, pointing device, etc.), one or more devices that enable a user to interact with the electronic device 800, and/or any device (e.g., router, modem, etc.) that enables the electronic device 800 to communicate with one or more other electronic devices. Such communication may occur through an input/output (I/O) interface 805. Also, the electronic device 800 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network, such as the Internet, through a network adapter 806. As shown, network adapter 806 communicates with other modules for electronic device 800 over bus 803. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 800, including, but not limited to, microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
In some possible embodiments, aspects of an image segmentation method provided by the present disclosure may also be implemented in the form of a program product comprising program code for causing a computer device to carry out the steps of the image segmentation method according to the various exemplary embodiments of the present disclosure as described above in the present specification, when the program product is run on the computer device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of a readable storage medium include an electrical connection having one or more wires, a portable disk, a hard disk, a random access computer storage medium (RAM), a read-only computer storage medium (ROM), an erasable programmable read-only computer storage medium (EPROM or flash memory), an optical fiber, a portable compact disc read-only computer storage medium (CD-ROM), an optical computer storage medium, a magnetic computer storage medium, or any suitable combination thereof.
The program product of image segmentation of embodiments of the present disclosure may employ a portable compact disc read-only computer storage medium (CD-ROM) and include program code and may be run on an electronic device. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the consumer electronic device, partly on the consumer electronic device, as a stand-alone software package, partly on the consumer electronic device, partly on the remote electronic device, or entirely on the remote electronic device or server. In the case of remote electronic devices, the remote electronic device may be connected to the consumer electronic device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external electronic device (e.g., connected through the internet using an internet service provider).
It should be noted that although several modules of the apparatus are mentioned in the detailed description above, this division is merely exemplary and not mandatory. Indeed, the features and functions of two or more modules described above may be embodied in one module in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module described above may be further divided into a plurality of modules to be embodied.
Furthermore, although the operations of the methods of the present disclosure are depicted in the drawings in a particular order, this is not required or suggested that these operations must be performed in this particular order or that all of the illustrated operations must be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform.
It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk computer storage media, CD-ROM, optical computer storage media, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable computer storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable computer storage medium produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present disclosure without departing from the spirit or scope of the disclosure. Thus, the present disclosure is intended to include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (12)

1. An image segmentation method, the method comprising:
inputting an image to be segmented into a pre-trained image segmentation network to obtain a segmented image;
wherein the image segmentation network is trained by:
Inputting small sample data of a real image and synthetic image data into a pre-trained cycle-consistent adversarial network to obtain target synthetic image data, wherein the small sample data of the real image and the image data in the target synthetic image data contain the same target object;
Respectively inputting the small sample data of the real image and the target synthetic image data into a corresponding image segmentation network to obtain first predicted image data and second predicted image data, wherein the first predicted image data is obtained based on the small sample data of the real image, and the second predicted image data is obtained based on the target synthetic image data;
Obtaining a first loss value based on the second predicted image data and preset target synthetic image annotation data, and
Obtaining a second loss value according to the first predicted image data and the second predicted image data, and,
And after the parameters of the image segmentation networks are adjusted by using the first loss value and the second loss value, returning to the step of respectively inputting the small sample data of the real image and the target synthetic image data into the corresponding image segmentation network until the first loss value and the second loss value meet the specified condition.
2. The method according to claim 1, wherein the obtaining a first loss value based on the second predicted image data and target synthetic image annotation data set in advance includes:
for any pixel point in the second predicted image data, obtaining an intermediate loss value of the pixel point based on a first predicted object of the pixel point in the second predicted image data and a labeling object of the pixel point in the target synthetic image labeling data;
And obtaining the first loss value according to the intermediate loss value corresponding to each pixel point in the second predicted image data.
3. The method of claim 2, wherein the first loss value is obtained by the following formula:
L_seg = - Σ_{h=0}^{H} Σ_{w=0}^{W} y_{h,w} · log( ŷ_{h,w} )

wherein L_seg is the first loss value, H is the height of the second predicted image data, W is the width of the second predicted image data, ŷ_{h,w} is the first predicted object in the second predicted image data of the pixel point having an ordinate h and an abscissa w, and y_{h,w} is the labeling object of that pixel point in the target synthetic image annotation data, with h ∈ [0, H] and w ∈ [0, W].
4. The method of claim 1, wherein deriving a second loss value from the first predicted image data and the second predicted image data comprises:
Inputting the first predicted image data into a full convolutional neural network to obtain a first predicted image feature, and
Inputting the second predicted image data into a full convolution neural network to obtain second predicted image characteristics;
and obtaining the second loss value based on the first predicted image characteristic and the second predicted image characteristic.
5. The method of claim 4, wherein the second loss value is obtained by:
Obtaining a first predicted value corresponding to the first predicted image data by using a second predicted object of each pixel point in the first predicted image characteristic corresponding to the first predicted image data,
Obtaining a second predicted value corresponding to the second predicted image data according to a third predicted object of each pixel point in a second predicted image feature corresponding to the second predicted image data;
And obtaining the second loss value based on the first predicted value and the second predicted value.
6. An image segmentation apparatus, the apparatus comprising:
the segmentation module is used for inputting the image to be segmented into a pre-trained image segmentation network to obtain a segmented image;
wherein the image segmentation network is trained by:
Inputting small sample data of a real image and synthetic image data into a pre-trained cycle-consistent adversarial network to obtain target synthetic image data, wherein the small sample data of the real image and the image data in the target synthetic image data contain the same target object;
The prediction module is used for respectively inputting the small sample data of the real image and the target synthetic image data into a corresponding image segmentation network to obtain first prediction image data and second prediction image data, wherein the first prediction image data is obtained based on the small sample data of the real image, and the second prediction image data is obtained based on the target synthetic image data;
a first loss value determining module, used for obtaining a first loss value based on the second predicted image data and the preset target synthetic image annotation data;
a second loss value determining module, used for obtaining a second loss value according to the first predicted image data and the second predicted image data; and
an adjusting module, used for adjusting parameters of the image segmentation network by using the first loss value and the second loss value, and then returning to the step of respectively inputting the small sample data of the real image and the target synthetic image data into the corresponding image segmentation network, until the first loss value and the second loss value meet a specified condition.
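The training procedure recited in claim 6 amounts to a loop: synthesize target synthetic image data with the cycle-consistent adversarial network (its exact inputs are simplified here), run both branches through the segmentation network, compute the two losses, update the parameters, and repeat until the losses meet the condition. The sketch below is schematic, assuming a single shared segmentation network, a batch size of 1 for the annotation-based loss, the `first_loss` and `second_loss` sketches above, and a simple threshold standing in for the unspecified condition:

```python
def train_segmentation(seg_net, cyclegan, optimizer, real_samples,
                       synthetic_images, annotations, fcn,
                       threshold=0.05, max_steps=10_000):
    # Target synthetic image data: the cycle-consistent adversarial network
    # restyles the synthetic images so that they and the real small-sample
    # data contain the same target objects.
    target_synth = cyclegan(synthetic_images).detach()
    for _ in range(max_steps):
        first_pred = seg_net(real_samples)    # first predicted image data
        second_pred = seg_net(target_synth)   # second predicted image data
        # First loss: second prediction vs. preset annotation data
        # (batch of 1 assumed; (C, H, W) -> (H, W, C) probabilities).
        l1 = first_loss(second_pred[0].softmax(0).permute(1, 2, 0), annotations)
        # Second loss: consistency between the two predicted maps.
        l2 = second_loss(first_pred, second_pred, fcn)
        if l1.item() < threshold and l2.item() < threshold:
            break                             # specified condition (assumed)
        optimizer.zero_grad()
        (l1 + l2).backward()
        optimizer.step()
    return seg_net
```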
7. The apparatus of claim 6, wherein the first loss value determining module is specifically configured to:
for any pixel point in the second predicted image data, obtaining an intermediate loss value of the pixel point based on a first predicted object of the pixel point in the second predicted image data and an annotation object of the pixel point in the target synthetic image annotation data;
and obtaining the first loss value according to the intermediate loss values corresponding to the pixel points in the second predicted image data.
8. The apparatus of claim 7, wherein the first loss value determining module is specifically configured to:
the first loss value is obtained by the following formula:
L_{seg} = -\sum_{h=0}^{H} \sum_{w=0}^{W} Y^{(h,w)} \log\left(P^{(h,w)}\right)
wherein L_{seg} is the first loss value, H is the height of the second predicted image data, W is the width of the second predicted image data, P^{(h,w)} is the first predicted object, in the second predicted image data, of the pixel point with ordinate h and abscissa w, Y^{(h,w)} is the annotation object, in the target synthetic image annotation data, of the pixel point with ordinate h and abscissa w, h ∈ [0, H], and w ∈ [0, W].
9. The apparatus of claim 6, wherein the second loss value determining module is specifically configured to:
inputting the first predicted image data into a fully convolutional neural network to obtain a first predicted image feature, and
inputting the second predicted image data into the fully convolutional neural network to obtain a second predicted image feature;
and obtaining the second loss value based on the first predicted image feature and the second predicted image feature.
10. The apparatus of claim 9, wherein the second loss value determining module is specifically configured to:
the second loss value is obtained by:
obtaining a first predicted value corresponding to the first predicted image data according to a second predicted object of each pixel point in the first predicted image feature corresponding to the first predicted image data, and
obtaining a second predicted value corresponding to the second predicted image data according to a third predicted object of each pixel point in the second predicted image feature corresponding to the second predicted image data;
and obtaining the second loss value based on the first predicted value and the second predicted value.
11. An electronic device, comprising at least one processor and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method of any one of claims 1-5.
12. A computer storage medium, wherein the computer storage medium stores a computer program for performing the method according to any one of claims 1-5.
CN202111623073.2A 2021-12-28 2021-12-28 Image segmentation method and device Active CN114359303B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111623073.2A CN114359303B (en) 2021-12-28 2021-12-28 Image segmentation method and device

Publications (2)

Publication Number Publication Date
CN114359303A (en) 2022-04-15
CN114359303B (en) 2024-12-24

Family

ID=81103519

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111623073.2A Active CN114359303B (en) 2021-12-28 2021-12-28 Image segmentation method and device

Country Status (1)

Country Link
CN (1) CN114359303B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507993A * 2020-03-18 2020-08-07 南方电网科学研究院有限责任公司 Image segmentation method and device based on generative adversarial network, and storage medium
CN113822428A (en) * 2021-08-06 2021-12-21 中国工商银行股份有限公司 Neural network training method and device and image segmentation method

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11011275B2 (en) * 2018-02-12 2021-05-18 Ai.Skopy, Inc. System and method for diagnosing gastrointestinal neoplasm
US11132792B2 (en) * 2018-02-22 2021-09-28 Siemens Healthcare Gmbh Cross domain medical image segmentation
CN110503654B (en) * 2019-08-01 2022-04-26 中国科学院深圳先进技术研究院 A method, system and electronic device for medical image segmentation based on generative adversarial network
CN111783986B (en) * 2020-07-02 2024-06-14 清华大学 Network training method and device, and gesture prediction method and device
CN112183627B (en) * 2020-09-28 2024-07-19 中星技术股份有限公司 Method for generating prediction density map network and vehicle annual inspection number detection method
CN113706551A (en) * 2021-04-14 2021-11-26 腾讯科技(深圳)有限公司 Image segmentation method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN114359303A (en) 2022-04-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant