
CN112581446A - Method, device and equipment for detecting salient object of image and storage medium - Google Patents


Info

Publication number
CN112581446A
Authority
CN
China
Prior art keywords
image
significance
saliency
salient
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011479093.2A
Other languages
Chinese (zh)
Other versions
CN112581446B (en)
Inventor
吕朋伟
姜文杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Insta360 Innovation Technology Co Ltd
Original Assignee
Insta360 Innovation Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Insta360 Innovation Technology Co Ltd filed Critical Insta360 Innovation Technology Co Ltd
Priority to CN202011479093.2A priority Critical patent/CN112581446B/en
Publication of CN112581446A publication Critical patent/CN112581446A/en
Priority to PCT/CN2021/138277 priority patent/WO2022127814A1/en
Application granted granted Critical
Publication of CN112581446B publication Critical patent/CN112581446B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G06T 7/0002 Image analysis; inspection of images, e.g. flaw detection
    • G06N 3/045 Neural networks; architecture; combinations of networks
    • G06N 3/08 Neural networks; learning methods
    • G06T 7/11 Segmentation; region-based segmentation
    • G06T 7/12 Segmentation; edge-based segmentation
    • G06T 7/136 Segmentation; edge detection involving thresholding
    • G06T 2207/10024 Image acquisition modality; color image
    • G06T 2207/20081 Special algorithmic details; training; learning
    • G06T 2207/20084 Special algorithmic details; artificial neural networks [ANN]
    • G06T 2207/20132 Image segmentation details; image cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the technical field of image processing and provides a method, device, equipment and storage medium for detecting a salient object in an image. The method includes: first acquiring an image to be detected; then detecting the image to be detected with a saliency detection model to obtain all the salient objects in the image; next calculating a saliency score for each salient object separately; and finally ranking all the salient objects by their saliency scores and determining the salient object with the highest saliency score to be the target salient object in the image to be detected, thereby improving the speed and accuracy of recognizing salient objects in multi-scene images.

Description

Method, device and equipment for detecting salient object of image and storage medium
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a method, a device, equipment and a storage medium for detecting a salient object of an image.
Background
With the rapid development of information technology, the cameras and portable cameras on mobile electronic devices are continuously upgraded, so that recording and sharing information with images has become the norm. Images have become a main data resource of the information society, which leads to an ever-increasing demand for data processing and, in turn, for higher information-processing efficiency. For a given image, people are usually interested in only part of it, the part that best represents the image content and most attracts the viewer's attention; this part is the salient region. How to obtain the salient region of an image automatically has therefore become increasingly important.
In recent years, convolutional neural networks have been widely applied in the field of machine vision because of their ability to extract image features automatically; fully convolutional networks in particular have greatly improved the performance of salient object detection. However, current saliency detection methods based on deep neural networks generally apply image transformations, such as scaling and feature extraction, to the image containing the salient objects. During these transformations, small-scale salient objects are easily degraded, which leads to missed detection of small salient targets. In addition, existing saliency detection methods mainly focus on specific objects in specific fields, such as people, animals and plants, and lack recognition of the richer variety of salient objects found in daily-life scenes. Finally, when several salient objects are present, the lack of comparative analysis among them causes ambiguity about which object is the salient one.
Disclosure of Invention
The invention aims to provide a method, a device, equipment and a storage medium for detecting salient objects in images, so as to solve the problem that, because the prior art cannot provide an effective method for detecting salient objects in images, the detection of salient objects in images of various scenes is slow and inaccurate.
In one aspect, the present invention provides a method for detecting a salient object in an image, the method comprising the steps of:
acquiring an image to be detected;
detecting the image to be detected through a significance detection model to obtain all significance objects in the image to be detected;
calculating a saliency score of each of the saliency objects separately;
and performing significance sorting on all the significant objects according to the significance scores to obtain a significant object with the maximum significance score value, and determining the significant object as a target significant object in the image to be detected.
Preferably, the step of calculating a saliency score for each said salient object separately comprises:
respectively calculating a first significance score of each significant object, and calculating a first significance mean value according to the obtained first significance scores of all the significant objects;
determining a significance threshold value according to the first significance mean value;
respectively cutting the outline area of each salient object according to the saliency threshold;
calculating a second significance mean value according to all the clipped significance objects;
and respectively calculating a second significance score of each significant object according to the calculated second significance mean value and a preset proportionality coefficient, and determining the obtained second significance score as the significance score.
Preferably, the method further comprises:
training a preset neural network with preset training data to learn the mapping relationship between an image and the salient objects in the image, so as to obtain the saliency detection model, wherein the training data comprises an image data set without salient objects and an image data set containing salient objects.
Further preferably, the preset neural network is a U-Net network, and/or a classical significance detection network.
Further preferably, the U-Net network comprises a down-sampling layer that includes a jump connection module, and the jump connection module comprises a depthwise separable convolution layer and a max-pooling layer.
In another aspect, the present invention provides a salient object detection apparatus for an image, the apparatus comprising:
the detection image acquisition unit is used for acquiring an image to be detected;
the salient object obtaining unit is used for detecting the image to be detected through a salient detection model to obtain all salient objects in the image to be detected;
a saliency score calculation unit for calculating a saliency score of each of the saliency objects, respectively; and
and the significance sorting unit is used for performing significance sorting on all the significance objects according to the significance scores to obtain a significance object with the maximum significance score value and determining the significance object as a target significance object in the image to be detected.
Preferably, the saliency score calculation unit includes:
a first mean value calculating unit, configured to calculate a first saliency score of each of the saliency objects, and calculate a first saliency mean value according to the obtained first saliency scores of all the saliency objects;
a threshold determination unit, configured to determine a significance threshold according to the first significance mean;
the region clipping unit is used for clipping the outline region of each salient object according to the saliency threshold;
the second mean value calculating unit is used for calculating a second significance mean value according to all the clipped significant objects; and
and the score calculating unit is used for respectively calculating a second significance score of each significant object according to the calculated second significance mean value and a preset proportionality coefficient, and determining the obtained second significance score as the significance score.
Preferably, the apparatus further comprises:
and the detection model training unit is used for training a preset neural network with preset training data to learn the mapping relationship between an image and the salient objects in the image, so as to obtain the saliency detection model, wherein the training data comprises an image data set without salient objects and an image data set containing salient objects.
In another aspect, the present invention further provides an image processing apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the salient object detection method of the image when executing the computer program.
In another aspect, the present invention further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the method for detecting a salient object in an image.
According to the method, an image to be detected is obtained, the image to be detected is detected through a saliency detection model to obtain all saliency objects in the image to be detected, then the saliency score of each saliency object is calculated respectively, all saliency objects are subjected to saliency sorting according to the saliency scores, the saliency object with the largest score value after sorting is determined as a target saliency object in the image to be detected, and therefore the recognition speed and the recognition accuracy of the saliency objects in the multi-scene image are improved.
Drawings
Fig. 1 is a flowchart of an implementation of a salient object detection method for an image according to an embodiment of the present invention;
fig. 2 is a flowchart of an implementation of a salient object detection method for an image according to a second embodiment of the present invention;
fig. 3 is a flowchart of an implementation of a salient object detection method for an image according to a third embodiment of the present invention;
fig. 4 is a schematic diagram of a jump connection module in the salient object detection method for an image according to the third embodiment of the present invention;
fig. 5 is a schematic structural diagram of a salient object detection apparatus for an image according to a fourth embodiment of the present invention;
fig. 6 is a schematic structural diagram of a salient object detection apparatus of an image according to a fifth embodiment of the present invention; and
fig. 7 is a schematic structural diagram of an image processing apparatus according to a sixth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The following detailed description of specific implementations of the present invention is provided in conjunction with specific embodiments:
the first embodiment is as follows:
fig. 1 shows an implementation flow of a salient object detection method for an image according to a first embodiment of the present invention, and for convenience of description, only the relevant portions of the image according to the first embodiment of the present invention are shown, which is detailed as follows:
in step S101, an image to be detected is acquired.
The embodiment of the invention is suitable for image processing equipment for image display, acquisition and the like. In the embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (e.g., a cloud storage space) by an image processing device.
In step S102, the to-be-detected image is detected by the saliency detection model, and all saliency objects in the to-be-detected image are obtained.
In the embodiment of the invention, the obtained image to be detected is detected through the saliency detection model, all saliency objects on the image to be detected are obtained, and relevant attribute information (such as a contour region, position information, color and the like) of each saliency object is obtained.
When the image to be detected is detected by the saliency detection model, the input image is preferably subjected to feature extraction and image segmentation by a U-Net network and/or a classical saliency detection network so as to obtain all the salient objects in the image to be detected, which improves the discriminability and accuracy of the saliency detection.
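As an illustration of this step, the following sketch shows one way in which a saliency map produced by such a model could be split into individual salient objects together with the attribute information mentioned above. The `saliency_model` callable, the binarisation threshold and the use of OpenCV connected components are assumptions made for the sketch and are not details taken from this disclosure.

```python
import numpy as np
import cv2


def extract_salient_objects(image_bgr, saliency_model, bin_thresh=0.5):
    """Run a saliency model and split its map into per-object records.

    `saliency_model` is assumed to map an HxWx3 image to an HxW saliency
    map with values in [0, 1]; the binarisation threshold is illustrative.
    """
    sal_map = saliency_model(image_bgr)                       # HxW, float in [0, 1]
    binary = (sal_map >= bin_thresh).astype(np.uint8)         # coarse foreground mask
    num, labels = cv2.connectedComponents(binary)             # one label per object

    objects = []
    for label in range(1, num):                               # label 0 is background
        mask = (labels == label)
        ys, xs = np.nonzero(mask)
        objects.append({
            "mask": mask,                                      # contour region
            "bbox": (xs.min(), ys.min(), xs.max(), ys.max()),  # position information
            "mean_color": image_bgr[mask].mean(axis=0),        # color attribute
            "saliency": sal_map * mask,                        # per-pixel saliency
        })
    return objects
```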
In step S103, a saliency score is calculated for each salient object, respectively.
In the embodiment of the present invention, the saliency score of each salient object is calculated individually based on the relevant attribute information of that salient object, and the saliency score is a pixel-level value.
Preferably, the calculation of the saliency score for each salient object is achieved by:
(1) and respectively calculating a first significance score of each significant object, and calculating a first significance mean value according to the obtained first significance scores of all significant objects.
In the embodiment of the present invention, the first saliency score of each salient object is calculated according to the relevant attribute information of the salient object, and the first saliency mean is calculated according to the obtained first saliency scores of all the salient objects.
As an example, a relative relationship between each salient object and the size of the image to be detected and a color difference between each salient object and the image to be detected may be determined according to the contour region, the position information, and the color of the salient object, a first saliency score of each salient object may be determined according to the relative relationship between the sizes and the color difference, and finally, an average of the first saliency scores of all the salient objects may be calculated to obtain a first saliency average.
(2) A significance threshold is determined from the first significance mean.
In the embodiment of the present invention, the significance threshold is determined according to the first significance mean, and the significance threshold is smaller than the first significance mean, for example, if the first significance mean is M0, the significance threshold M1 is 0.2 × M0.
(3) And respectively clipping the outline area of each salient object according to the saliency threshold.
In the embodiment of the invention, the contour region of each salient object is respectively clipped according to the saliency threshold, and only the contour region of the salient object higher than the current saliency threshold is reserved.
(4) And calculating a second significance mean value according to all the clipped significance objects.
In the embodiment of the invention, the second saliency score of each saliency object is recalculated according to the clipped outline region reserved by each saliency object, and the second saliency average is calculated according to the obtained second saliency score.
(5) And respectively calculating a second significance score of each significant object according to the calculated second significance mean value and a preset proportionality coefficient, and determining the obtained second significance score as a significance score.
In the embodiment of the invention, the scale coefficient is determined by the area of each salient object: the larger the area, the larger the scale coefficient. The second saliency score of each salient object is then obtained by multiplying the calculated second saliency mean by the scale coefficient of that object.
The calculation of the saliency score of each salient object is realized through the steps (1) to (5), so that the priority of the salient objects is clarified through the comparative analysis among a plurality of salient objects in one image.
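Putting steps (1) to (5) and the subsequent ranking together, the following sketch shows one possible pixel-level implementation. It assumes that each salient object is represented as in the extraction sketch above (a mask, a mean color and a per-pixel saliency map); the formula used for the first saliency score, the use of the threshold M1 as a per-pixel cutoff and the linear, area-proportional scale coefficient are all assumptions, since the description specifies them only qualitatively (apart from the example factor of 0.2).

```python
import numpy as np


def first_saliency_score(obj, image_bgr):
    """Step (1), one plausible form: the object's mean pixel saliency,
    weighted by its relative size and by its color difference from the
    whole image (the exact formula is not given in the description)."""
    h, w = image_bgr.shape[:2]
    mean_sal = obj["saliency"][obj["mask"]].mean()
    area_ratio = obj["mask"].sum() / float(h * w)
    color_diff = np.linalg.norm(
        obj["mean_color"] - image_bgr.reshape(-1, 3).mean(axis=0)) / 255.0
    return mean_sal * (0.5 + area_ratio) * (0.5 + color_diff)


def rank_salient_objects(objects, image_bgr, thresh_factor=0.2):
    """Steps (1)-(5) followed by the ranking of step S104."""
    if not objects:
        return [], []

    # (1) first saliency scores and their mean M0
    first_scores = [first_saliency_score(o, image_bgr) for o in objects]
    m0 = float(np.mean(first_scores))

    # (2) saliency threshold below the first mean, e.g. M1 = 0.2 * M0
    m1 = thresh_factor * m0

    # (3) clip each contour region: keep only pixels above the threshold
    clipped = [np.where(o["saliency"] >= m1, o["saliency"], 0.0) for o in objects]

    # (4) second saliency mean over all clipped objects
    kept_means = [c[c > 0].mean() if (c > 0).any() else 0.0 for c in clipped]
    m2 = float(np.mean(kept_means))

    # (5) second score: second mean times an area-proportional scale coefficient
    total_area = float(sum((c > 0).sum() for c in clipped)) or 1.0
    second_scores = [m2 * ((c > 0).sum() / total_area) for c in clipped]

    # step S104: rank by score; the highest-scoring object is the target
    order = np.argsort(second_scores)[::-1]
    return [objects[i] for i in order], [second_scores[i] for i in order]
```

With the object records produced by the extraction sketch, `ranked, scores = rank_salient_objects(objects, image)` returns the objects in descending saliency order, so `ranked[0]` plays the role of the target salient object of step S104.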
In step S104, all salient objects are subjected to saliency sorting according to the saliency scores, and the salient object with the highest score value after sorting is determined as the target salient object in the image to be detected.
In the embodiment of the invention, all the salient objects are ranked by their saliency scores, in ascending or descending order; the salient object with the highest saliency score is the most salient target object in the current image to be detected and is determined to be the target salient object in the image to be detected.
In the embodiment of the invention, all salient objects in the image to be detected are detected through the salient detection model, the salient score of each salient object is respectively calculated, and all salient objects are subjected to salient sequencing according to the salient scores to obtain the most salient target object in the image to be detected, so that the identification speed and the identification accuracy of the salient objects in the multi-scene image are improved.
Example two:
fig. 2 shows an implementation flow of a salient object detection method for an image according to a second embodiment of the present invention, and for convenience of description, only the relevant portions of the embodiment of the present invention are shown, which is detailed as follows:
In step S201, a preset neural network is trained with preset training data to learn the mapping relationship between an image and the salient objects in the image, so as to obtain a saliency detection model, where the training data includes an image data set without salient objects and an image data set containing salient objects.
The embodiment of the invention is applicable to image processing devices used for image display, acquisition and the like. In the embodiment of the present invention, the training data composed of the image data set without salient objects and the image data set containing salient objects may be a standard data set, such as the ImageNet data set, or a customized image training data set; an image in the data set containing salient objects may contain one or more salient objects. When the preset neural network is trained, the fine contours of the salient objects in the image data set containing salient objects are first annotated manually, but the annotated salient objects are not divided into specific categories: all salient objects are assigned to one class and all other, non-salient regions of the image to another class, yielding image/saliency-result pairs. The annotated image data set and the image data set without salient objects are then used to train the preset neural network to learn the mapping relationship between an image and the salient objects in the image, producing the saliency detection model and thereby improving the training speed and training effect of the network.
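As an illustration of this training step, the following sketch trains a saliency network on such binary annotations. The `model` argument stands for any U-Net-style network (for example the one sketched in the third embodiment below), and the loss function, optimiser, batch size and epoch count are illustrative assumptions rather than values taken from this disclosure.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader


def train_saliency_model(model, dataset, epochs=20, lr=1e-3, device="cuda"):
    """Minimal training sketch. `dataset` is assumed to yield (image, mask)
    pairs where the mask is binary: 1 for any annotated salient object and 0
    elsewhere (images without salient objects simply have an all-zero mask),
    and the mask has the same shape as the model output."""
    model = model.to(device)
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    criterion = nn.BCEWithLogitsLoss()           # binary: salient vs. non-salient
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)

    model.train()
    for _ in range(epochs):
        for images, masks in loader:
            images, masks = images.to(device), masks.to(device).float()
            logits = model(images)               # predicted saliency map (logits)
            loss = criterion(logits, masks)
            optimiser.zero_grad()
            loss.backward()
            optimiser.step()
    return model
```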
Preferably, the preset neural network is a U-Net network and/or a classical significance detection network, so that the significance degree and the accuracy of significance detection of the neural network are improved.
Further preferably, the U-Net network is an improved U-Net network whose downsampling layers include a jump connection module, and the jump connection module includes a depthwise separable convolution layer (SepConv) and a max pooling layer (Max Pooling); this prevents the details of small salient objects in the image from being lost excessively during the downsampling of the saliency detection model and reduces the probability that small salient objects are missed.
In step S202, an image to be detected is acquired.
In step S203, the to-be-detected image is detected by the saliency detection model, and all saliency objects in the to-be-detected image are obtained.
In step S204, a saliency score is calculated for each salient object, respectively.
In step S205, all salient objects are subjected to saliency sorting according to the saliency scores, and the salient object with the largest score value after sorting is determined as the target salient object in the image to be detected.
In the embodiment of the present invention, the detailed implementation of steps S202 to S205 can refer to the description of steps S101 to S104 in the first embodiment, and will not be described herein again.
In the embodiment of the invention, a preset neural network is trained through training data composed of an image data set without a salient object and an image data set containing the salient object to obtain a saliency detection model, all salient objects in an image to be detected are detected through the saliency detection model, the saliency score of each salient object is respectively calculated, all salient objects are subjected to saliency sequencing according to the saliency scores to obtain the most salient target object in the image to be detected, and therefore the recognition speed and the recognition accuracy of the salient objects in the multi-scene image are improved.
Example three:
fig. 3 shows an implementation flow of a salient object detection method for an image provided by the third embodiment of the present invention, and for convenience of description, only the parts related to the third embodiment of the present invention are shown, which are detailed as follows:
in step S301, an image to be detected is acquired.
The embodiment of the invention is suitable for image processing equipment for image display, acquisition and the like. In the embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (e.g., a cloud storage space) by an image processing device.
In step S302, the image to be detected is detected through the improved U-Net network to obtain all the salient objects in the image to be detected, wherein the downsampling layers of the improved U-Net network include jump connection modules.
In the embodiment of the invention, the saliency detection model is an improved U-Net network whose downsampling layers include a jump connection module. The improved U-Net network performs feature extraction and image segmentation on the input image to be detected to obtain all the salient objects in the image, together with the relevant attribute information of each salient object (such as its contour region, position information and color). The jump connection module does not change the overall U-Net structure; a jump connection module is placed in the downsampling step of each layer of the U-shaped structure of the U-Net network.
Preferably, the jump connection module included in the downsampling structure of the improved U-Net network includes a depthwise separable convolution layer (SepConv) and a max pooling layer (Max Pooling), so that the details of small salient objects in the image are not lost excessively during downsampling, reducing the probability that small salient objects are missed.
Further preferably, fig. 4 shows the structure of a jump connection module. The jump connection module includes two SepConv layers, a Leaky ReLU (leaky rectified linear unit) activation function and a Max Pooling layer. The jump connection realized by the Max Pooling layer compresses the features before downsampling and passes the compressed features directly to the feature extraction module after downsampling, so that more of the original pre-downsampling features are retained; this further prevents the details of small salient objects in the image from being lost during downsampling and reduces the probability that small salient objects are missed. Illustratively, after a feature a is input into the jump connection module, a feature b is obtained by depthwise separable convolution through the two SepConv layers; at the same time, the Max Pooling layer performs a max pooling operation on the feature a to obtain a feature c; finally, the jump connection module fuses the features b and c to obtain and output a feature d.
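A minimal PyTorch sketch of such a jump connection module is given below. The description specifies the ingredients (two SepConv layers, a Leaky ReLU and a Max Pooling branch that carries the input feature past the convolutions before the two branches are fused) but not every detail; the stride-2 second convolution, the Leaky ReLU slope and the channel concatenation used to fuse b and c are therefore assumptions chosen only so that the two branches line up spatially.

```python
import torch
import torch.nn as nn


class SepConv(nn.Module):
    """Depthwise separable convolution: depthwise 3x3 followed by pointwise 1x1."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class SkipConnectionDownsample(nn.Module):
    """Sketch of the jump connection module: feature a passes through two
    SepConv layers with Leaky ReLU to give b, a max pooling branch compresses
    a into c, and b and c are fused into the output d."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Sequential(
            SepConv(in_ch, out_ch),              # feature b, first SepConv
            nn.LeakyReLU(0.1, inplace=True),
            SepConv(out_ch, out_ch, stride=2),   # feature b, second SepConv (downsampled)
            nn.LeakyReLU(0.1, inplace=True),
        )
        self.pool = nn.MaxPool2d(kernel_size=2)  # feature c: compressed copy of a

    def forward(self, a):
        b = self.conv(a)                         # learned, downsampled features
        c = self.pool(a)                         # original features, max-pooled
        d = torch.cat([b, c], dim=1)             # fuse b and c into d
        return d
```

For example, `SkipConnectionDownsample(64, 128)` applied to a 1x64x224x224 tensor produces a 1x192x112x112 output: the 128 learned channels of feature b concatenated with the 64 max-pooled channels of feature c.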
In step S303, a saliency score is calculated for each salient object, respectively.
In step S304, all salient objects are subjected to saliency sorting according to the saliency scores, and the salient object with the highest score value after sorting is determined as the target salient object in the image to be detected.
In the embodiment of the present invention, the detailed implementation of steps S303 to S304 may refer to the description of steps S103 to S104 in the first embodiment, and will not be described herein again.
In the embodiment of the invention, all salient objects in the image to be detected are detected through an improved U-Net network containing a jump connection module in down-sampling, the salient score of each salient object is respectively calculated, and all salient objects are subjected to salient sequencing according to the salient scores to obtain the most salient target object in the image to be detected, so that the identification speed and the identification accuracy of the salient objects in the multi-scene image are improved.
Example four:
fig. 5 shows a structure of a salient object detection apparatus of an image according to a fourth embodiment of the present invention, and for convenience of description, only a part related to the embodiment of the present invention is shown, where:
a detection image acquisition unit 51 for acquiring an image to be detected.
The embodiment of the invention is suitable for image processing equipment for image display, acquisition and the like. In the embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (e.g., a cloud storage space) by an image processing device.
And the salient object obtaining unit 52 is configured to detect the image to be detected through the salient detection model, and obtain all salient objects in the image to be detected.
In the embodiment of the invention, the obtained image to be detected is detected through the saliency detection model, all saliency objects on the image to be detected are obtained, and relevant attribute information (such as a contour region, position information, color and the like) of each saliency object is obtained.
And a saliency score calculation unit 53 for calculating a saliency score of each of the saliency objects, respectively.
In the embodiment of the present invention, the saliency score of each salient object is calculated individually based on the relevant attribute information of that salient object, and the saliency score is a pixel-level value.
And the significance sorting unit 54 is used for performing significance sorting on all significance objects according to the significance scores, and determining the significance object with the largest score value after sorting as the target significance object in the image to be detected.
In the embodiment of the invention, all the salient objects are ranked by their saliency scores, in ascending or descending order; the salient object with the highest saliency score is the most salient target object in the current image to be detected and is determined to be the target salient object in the image to be detected.
In the embodiment of the present invention, each unit of the salient object detecting apparatus for an image may be implemented by a corresponding hardware or software unit, and each unit may be an independent software or hardware unit, or may be integrated into a software or hardware unit, which is not limited herein.
Example five:
fig. 6 shows a structure of a salient object detection apparatus of an image provided in the fifth embodiment of the present invention, and for convenience of description, only a part related to the fifth embodiment of the present invention is shown, where:
the detection model training unit 61 is configured to perform learning training on a mapping relationship between an image and a salient object in the image on a preset neural network through preset training data to obtain a salient detection model, where the training data includes an image data set that does not include the salient object and an image data set that includes the salient object.
The embodiment of the invention is suitable for image processing equipment for image display, acquisition and the like. In the embodiment of the present invention, the training data composed of the image dataset without a salient object and the image dataset with a salient object may be a standard dataset, such as an Imagenet dataset, or may be a customized image training dataset, where one or more salient objects may be in the image of the image dataset with a salient object. When the preset neural network is trained, firstly, the fine contour of the salient object on the image is marked out in an artificial mode on the image data set containing the salient object, but the marked salient object is not divided into specific categories, namely, all the salient objects are classified into one category, other non-salient areas on the image are classified into another category, an image pair of the image and a salient result is obtained, then the marked image data set and the image data set without the salient object are used for learning and training the mapping relation between the image and the salient object in the image on the preset neural network, and a salient detection model is obtained, so that the training speed and the training effect of the network are improved.
And a detection image acquisition unit 62 for acquiring an image to be detected.
In the embodiment of the present invention, the image to be detected may be captured in real time by a mobile electronic device with a camera, or may be acquired from a preset storage location (e.g., a cloud storage space) by an image processing device.
And the salient object obtaining unit 63 is configured to detect the image to be detected through the salient detection model, and obtain all salient objects in the image to be detected.
In the embodiment of the invention, the obtained image to be detected is detected through the saliency detection model, all saliency objects on the image to be detected are obtained, and relevant attribute information (such as a contour region, position information, color and the like) of each saliency object is obtained.
When the image to be detected is detected through the saliency detection model, preferably, the input image to be detected is subjected to feature extraction and image segmentation through a U-Net network and/or a classical saliency detection network, so that the saliency degree and the accuracy of saliency detection are improved.
Still preferably, the saliency detection model is an improved U-Net network whose downsampling layers include jump connection modules, where each jump connection module includes a depthwise separable convolution layer (SepConv) and a max pooling layer (Max Pooling). The jump connection modules do not change the overall U-Net structure, and there is a jump connection module in the downsampling step of each layer of the U-Net network, so that the details of small salient objects in the image are not lost excessively during downsampling and the probability that small salient objects are missed is reduced.
Further preferably, the jump connection module includes two SepConv layers, a Leaky ReLU (leaky rectified linear unit) activation function and a Max Pooling layer. The jump connection realized by the Max Pooling layer compresses the features before downsampling and passes the compressed features directly to the feature extraction module after downsampling, retaining more of the original pre-downsampling features, which further prevents the details of small salient objects from being lost during downsampling and reduces the probability that they are missed. Illustratively, after a feature a is input into the jump connection module, a feature b is obtained by depthwise separable convolution through the two SepConv layers; at the same time, the Max Pooling layer performs a max pooling operation on the feature a to obtain a feature c; finally, the jump connection module fuses the features b and c to obtain and output a feature d.
And a saliency score calculation unit 64 for calculating a saliency score of each saliency object separately.
In the embodiment of the present invention, the saliency score of each salient object is calculated individually based on the relevant attribute information of that salient object, and the saliency score is a pixel-level value.
And the significance sorting unit 65 is used for performing significance sorting on all significance objects according to the significance scores, and determining the significance object with the largest score value after sorting as the target significance object in the image to be detected.
In the embodiment of the invention, all the salient objects are ranked by their saliency scores, in ascending or descending order; the salient object with the highest saliency score is the most salient target object in the current image to be detected and is determined to be the target salient object in the image to be detected.
Wherein, preferably, the significant score calculating unit 64 includes:
the first mean calculating unit 641 is configured to calculate a first saliency score of each saliency object, and calculate a first saliency mean according to the obtained first saliency scores of all saliency objects.
In the embodiment of the present invention, the first saliency score of each salient object is calculated according to the relevant attribute information of the salient object, and the first saliency mean is calculated according to the obtained first saliency scores of all the salient objects.
As an example, a relative relationship between each salient object and the size of the image to be detected and a color difference between each salient object and the image to be detected may be determined according to the contour region, the position information, and the color of the salient object, a first saliency score of each salient object may be determined according to the relative relationship between the sizes and the color difference, and finally, an average of the first saliency scores of all the salient objects may be calculated to obtain a first saliency average.
A threshold determining unit 642, configured to determine a significance threshold according to the first significance mean.
In the embodiment of the present invention, the significance threshold is determined according to the first significance mean, and the significance threshold is smaller than the first significance mean, for example, if the first significance mean is M0, the significance threshold M1 is 0.2 × M0.
A region clipping unit 643, configured to clip the outline region of each salient object according to the saliency threshold.
In the embodiment of the invention, the contour region of each salient object is respectively clipped according to the saliency threshold, and only the contour region of the salient object higher than the current saliency threshold is reserved.
A second mean calculation unit 644, configured to calculate a second significant mean according to all the significant objects after clipping.
In the embodiment of the invention, the second saliency score of each saliency object is recalculated according to the clipped outline region reserved by each saliency object, and the second saliency average is calculated according to the obtained second saliency score.
And a score calculating unit 645, configured to calculate a second saliency score of each saliency object according to the calculated second saliency mean and a preset proportionality coefficient, and determine the obtained second saliency score as a saliency score.
In the embodiment of the invention, the scale coefficient is determined by the area of each salient object: the larger the area, the larger the scale coefficient. The second saliency score of each salient object is then obtained by multiplying the calculated second saliency mean by the scale coefficient of that object.
In the embodiment of the present invention, each unit of the salient object detecting apparatus for an image may be implemented by a corresponding hardware or software unit, and each unit may be an independent software or hardware unit, or may be integrated into a software or hardware unit, which is not limited herein.
Example six:
fig. 7 shows a configuration of an image processing apparatus according to a sixth embodiment of the present invention, and for convenience of explanation, only a part related to the embodiment of the present invention is shown.
The image processing apparatus 7 of the embodiment of the present invention includes a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and executable on the processor 70. The processor 70, when executing the computer program 72, implements the steps in the above-described embodiment of the method for detecting salient objects of an image, such as the steps S101 to S104 shown in fig. 1. Alternatively, the processor 70, when executing the computer program 72, implements the functions of the units in the above-described apparatus embodiments, such as the functions of the units 51 to 54 shown in fig. 5.
In the embodiment of the invention, all salient objects in the image to be detected are detected through the salient detection model, the salient score of each salient object is respectively calculated, all the salient objects are subjected to salient sorting according to the salient scores, and the salient object with the largest value after sorting is determined as the target salient object in the image to be detected, so that the identification speed and the identification accuracy of the salient object in the multi-scene image are improved.
The image processing device of the embodiment of the invention can be a smart phone or a personal computer. The steps implemented when the processor 70 in the image processing apparatus 7 executes the computer program 72 to implement the method for detecting a salient object in an image may refer to the description of the foregoing method embodiments, and are not repeated herein.
Example seven:
in an embodiment of the present invention, a computer-readable storage medium is provided, which stores a computer program that, when executed by a processor, implements the steps in the above-described salient object detection method embodiment of the image, for example, steps S101 to S104 shown in fig. 1. Alternatively, the computer program may be adapted to perform the functions of the units of the above-described device embodiments, such as the functions of the units 51 to 54 shown in fig. 5, when executed by the processor.
In the embodiment of the invention, all salient objects in the image to be detected are detected through the salient detection model, the salient score of each salient object is respectively calculated, all the salient objects are subjected to salient sorting according to the salient scores, and the salient object with the largest value after sorting is determined as the target salient object in the image to be detected, so that the identification speed and the identification accuracy of the salient object in the multi-scene image are improved.
The computer readable storage medium of the embodiments of the present invention may include any entity or device capable of carrying computer program code, a recording medium, such as a ROM/RAM, a magnetic disk, an optical disk, a flash memory, or the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A method for detecting salient objects in an image, the method comprising the steps of:
acquiring an image to be detected;
detecting the image to be detected through a significance detection model to obtain all significance objects in the image to be detected;
calculating a saliency score of each of the saliency objects separately;
and performing significance sorting on all the significant objects according to the significance scores to obtain a significant object with the maximum significance score value, and determining the significant object as a target significant object in the image to be detected.
2. The method of claim 1, wherein the step of separately calculating a saliency score for each said salient object comprises:
respectively calculating a first significance score of each significant object, and calculating a first significance mean value according to the obtained first significance scores of all the significant objects;
determining a significance threshold value according to the first significance mean value;
respectively cutting the outline area of each salient object according to the saliency threshold;
calculating a second significance mean value according to all the clipped significance objects;
and respectively calculating a second significance score of each significant object according to the calculated second significance mean value and a preset proportionality coefficient, and determining the obtained second significance score as the significance score.
3. The method of claim 1, wherein before the detecting the image to be detected by the saliency detection model to obtain all the saliency objects in the image to be detected, the method further comprises:
and performing learning training on a preset neural network through preset training data on a mapping relation between an image and a salient object in the image to obtain the salient detection model, wherein the training data comprises an image data set without the salient object and an image data set containing the salient object.
4. The method of claim 3, wherein the pre-set neural network is a U-Net network, and/or a classical significance detection network.
5. The method of claim 4, wherein the U-Net network comprises a downsampling layer including a skip connection module, the skip connection module comprising a depth separable convolutional layer and a max-pooling layer.
6. An apparatus for detecting a salient object in an image, the apparatus comprising:
the detection image acquisition unit is used for acquiring an image to be detected;
the salient object obtaining unit is used for detecting the image to be detected through a salient detection model to obtain all salient objects in the image to be detected;
a saliency score calculation unit for calculating a saliency score of each of the saliency objects, respectively; and
and the significance sorting unit is used for performing significance sorting on all the significance objects according to the significance scores to obtain a significance object with the maximum significance score value and determining the significance object as a target significance object in the image to be detected.
7. The apparatus of claim 6, wherein the prominence score calculation unit comprises:
a first mean value calculating unit, configured to calculate a first saliency score of each of the saliency objects, and calculate a first saliency mean value according to the obtained first saliency scores of all the saliency objects;
a threshold determination unit, configured to determine a significance threshold according to the first significance mean;
the region clipping unit is used for clipping the outline region of each salient object according to the saliency threshold;
the second mean value calculating unit is used for calculating a second significance mean value according to all the clipped significant objects; and
and the score calculating unit is used for respectively calculating a second significance score of each significant object according to the calculated second significance mean value and a preset proportionality coefficient, and determining the obtained second significance score as the significance score.
8. The apparatus of claim 6, wherein the apparatus further comprises:
and the detection model training unit is used for performing learning training on a mapping relation between an image and a salient object in the image on a preset neural network through preset training data to obtain the salient detection model, wherein the training data comprises an image data set without the salient object and an image data set containing the salient object.
9. An image processing apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN202011479093.2A 2020-12-15 2020-12-15 A method, device, equipment and storage medium for detecting significant objects in an image Active CN112581446B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011479093.2A CN112581446B (en) 2020-12-15 2020-12-15 A method, device, equipment and storage medium for detecting significant objects in an image
PCT/CN2021/138277 WO2022127814A1 (en) 2020-12-15 2021-12-15 Method and apparatus for detecting salient object in image, and device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011479093.2A CN112581446B (en) 2020-12-15 2020-12-15 A method, device, equipment and storage medium for detecting significant objects in an image

Publications (2)

Publication Number Publication Date
CN112581446A true CN112581446A (en) 2021-03-30
CN112581446B CN112581446B (en) 2024-12-13

Family

ID=75135251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011479093.2A Active CN112581446B (en) 2020-12-15 2020-12-15 A method, device, equipment and storage medium for detecting significant objects in an image

Country Status (2)

Country Link
CN (1) CN112581446B (en)
WO (1) WO2022127814A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113592390A (en) * 2021-07-12 2021-11-02 嘉兴恒创电力集团有限公司博创物资分公司 Warehousing digital twin method and system based on multi-sensor fusion
WO2022127814A1 (en) * 2020-12-15 2022-06-23 影石创新科技股份有限公司 Method and apparatus for detecting salient object in image, and device and storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7643063B2 (en) * 2021-02-10 2025-03-11 日本電気株式会社 DATA GENERATION DEVICE, DATA GENERATION METHOD, AND PROGRAM
CN115439726B (en) * 2022-11-07 2023-02-07 腾讯科技(深圳)有限公司 Image detection method, device, equipment and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105513080A (en) * 2015-12-21 2016-04-20 南京邮电大学 Infrared image target salience evaluating method
CN106296638A (en) * 2015-06-04 2017-01-04 欧姆龙株式会社 Significance information acquisition device and significance information acquisition method
CN108345892A (en) * 2018-01-03 2018-07-31 深圳大学 A kind of detection method, device, equipment and the storage medium of stereo-picture conspicuousness
CN109472259A (en) * 2018-10-30 2019-03-15 河北工业大学 Image co-saliency detection method based on energy optimization
CN109509191A (en) * 2018-11-15 2019-03-22 中国地质大学(武汉) A kind of saliency object detection method and system
CN110399847A (en) * 2019-07-30 2019-11-01 北京字节跳动网络技术有限公司 Extraction method of key frame, device and electronic equipment
CN110648334A (en) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 Multi-feature cyclic convolution saliency target detection method based on attention mechanism
CN110853053A (en) * 2019-10-25 2020-02-28 天津大学 Salient object detection method taking multiple candidate objects as semantic knowledge
CN111399731A (en) * 2020-03-12 2020-07-10 深圳市腾讯计算机系统有限公司 Image manipulation intent processing method, recommended method, device, electronic device, and storage medium
CN111524145A (en) * 2020-04-13 2020-08-11 北京智慧章鱼科技有限公司 Intelligent picture clipping method and system, computer equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109146892B (en) * 2018-07-23 2020-06-19 北京邮电大学 Image clipping method and device based on aesthetics
CN109040605A (en) * 2018-11-05 2018-12-18 北京达佳互联信息技术有限公司 Shoot bootstrap technique, device and mobile terminal and storage medium
CN112581446B (en) * 2020-12-15 2024-12-13 影石创新科技股份有限公司 A method, device, equipment and storage medium for detecting significant objects in an image

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296638A (en) * 2015-06-04 2017-01-04 欧姆龙株式会社 Significance information acquisition device and significance information acquisition method
CN105513080A (en) * 2015-12-21 2016-04-20 南京邮电大学 Infrared image target salience evaluating method
CN108345892A (en) * 2018-01-03 2018-07-31 深圳大学 A kind of detection method, device, equipment and the storage medium of stereo-picture conspicuousness
CN109472259A (en) * 2018-10-30 2019-03-15 河北工业大学 Image co-saliency detection method based on energy optimization
CN109509191A (en) * 2018-11-15 2019-03-22 中国地质大学(武汉) A kind of saliency object detection method and system
CN110399847A (en) * 2019-07-30 2019-11-01 北京字节跳动网络技术有限公司 Extraction method of key frame, device and electronic equipment
CN110648334A (en) * 2019-09-18 2020-01-03 中国人民解放军火箭军工程大学 Multi-feature cyclic convolution saliency target detection method based on attention mechanism
CN110853053A (en) * 2019-10-25 2020-02-28 天津大学 Salient object detection method taking multiple candidate objects as semantic knowledge
CN111399731A (en) * 2020-03-12 2020-07-10 深圳市腾讯计算机系统有限公司 Image manipulation intent processing method, recommended method, device, electronic device, and storage medium
CN111524145A (en) * 2020-04-13 2020-08-11 北京智慧章鱼科技有限公司 Intelligent picture clipping method and system, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李建伟 等: "基于条件生成对抗网络的视频显著性目标检测", 《传感器与微系统》, vol. 38, no. 11, pages 129 - 132 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022127814A1 (en) * 2020-12-15 2022-06-23 影石创新科技股份有限公司 Method and apparatus for detecting salient object in image, and device and storage medium
CN113592390A (en) * 2021-07-12 2021-11-02 嘉兴恒创电力集团有限公司博创物资分公司 Warehousing digital twin method and system based on multi-sensor fusion

Also Published As

Publication number Publication date
CN112581446B (en) 2024-12-13
WO2022127814A1 (en) 2022-06-23

Similar Documents

Publication Publication Date Title
CN112581446B (en) A method, device, equipment and storage medium for detecting significant objects in an image
CN112381104B (en) Image recognition method, device, computer equipment and storage medium
CN112861575A (en) Pedestrian structuring method, device, equipment and storage medium
CN109376256B (en) Image searching method and device
CN113378770B (en) Gesture recognition method, device, equipment and storage medium
WO2021184718A1 (en) Card border recognition method, apparatus and device, and computer storage medium
CN111553302B (en) Key frame selection method, apparatus, device, and computer-readable storage medium
CN109726678B (en) License plate recognition method and related device
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
CN114721403B (en) Automatic driving control method, device and storage medium based on OpenCV
CN111415373A (en) Target tracking and segmenting method, system and medium based on twin convolutional network
CN111753766B (en) Image processing method, device, equipment and medium
Jency et al. Traffic Sign Recognition System for Autonomous Vehicles using Deep Learning
CN116052230A (en) Palm vein recognition method, device, equipment and storage medium
CN113688839B (en) Video processing method and device, electronic equipment and computer readable storage medium
CN114596440A (en) Semantic segmentation model generation method and device, electronic equipment and storage medium
CN113657283A (en) Visual positioning method and device and electronic equipment
CN111862159A (en) Improved target tracking and segmentation method, system and medium for twin convolutional network
US20220122341A1 (en) Target detection method and apparatus, electronic device, and computer storage medium
CN112419227B (en) Underwater target detection method and system based on small target search scaling technology
CN111124862B (en) Intelligent device performance testing method and device and intelligent device
CN114550288A (en) Event data based action identification method and device
CN107220650B (en) Food image detection method and device
CN108171149B (en) A face recognition method, device, device and readable storage medium
CN112989924A (en) Target detection method, target detection device and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant