Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing an image saliency target detection method based on the maximum neighborhood and superpixel segmentation, thereby improving the accuracy of image saliency target detection.
The technical idea of the invention is as follows: in the Lab color space, the two-norm of the difference between the color vector of each pixel point and the average color vector over the maximum neighborhood centered at that pixel point is taken as the saliency value of the current pixel point, yielding an initial saliency image of the image to be detected; the saliency value of each superpixel block is then determined from the initial saliency image and the superpixel segmentation result of the image to be detected, yielding the final saliency image. The specific implementation steps are as follows:
(1) performing superpixel segmentation on an image to be detected:
performing superpixel segmentation on the image to be detected to obtain and store K superpixel blocks, wherein K ≥ 200;
(2) counting the frequency of each color in the image to be detected:
dividing each of the three color channels of the RGB color space into N equal parts, wherein N ≥ 10, to obtain N³ colors, and counting the frequency of occurrence of each of the N³ colors in the image to be detected;
(3) carrying out color substitution on an image to be detected:
arranging all the counted colors in descending order of frequency of occurrence, accumulating the frequencies in that order until the accumulated total reaches 80% of the total number of pixels M of the image to be detected, retaining the representative colors C = {C_p1, C_p2, …, C_pi, …, C_pp} whose frequencies are included in the accumulation, and replacing the colors C′ = {C_t1, C_t2, …, C_tj, …, C_tt} whose frequencies did not participate in the accumulation with the representative colors C, to obtain a color-substituted image;
(4) preprocessing the image after color substitution:
performing Gaussian filtering on the color-substituted image, and converting the filtered image from the RGB to the Lab color space to obtain the preprocessed image in Lab space;
(5) calculating an initial saliency image of an image to be detected:
(5a) performing color channel separation on the preprocessed image in Lab space to obtain the color vector I(x, y) of each pixel point, wherein (x, y) are the coordinates of the pixel point;
(5b) calculating the average color vector Iμ(x, y) over the maximum neighborhood centered at each pixel position (x, y), and taking the two-norm of the difference between I(x, y) and Iμ(x, y) as the saliency value of the current pixel point;
(5c) normalizing the significance values of all the pixel points to obtain an initial significance image sm of the image to be detected;
(6) determining significance values for K superpixel blocks:
(6a) taking the average saliency value T of the initial saliency image sm of the image to be detected as a threshold, marking the pixel points in sm whose saliency value exceeds the threshold as 1 and the remaining pixel points as 0, to obtain a saliency label for each pixel point;
(6b) judging, for each superpixel block, whether more than half of its pixels carry saliency label 1; if so, taking 1 as the saliency value K_l of the superpixel block, otherwise taking 0 as K_l, thereby obtaining the saliency values of all K superpixel blocks;
(7) acquiring a final saliency image and outputting:
assigning the saliency value of each of the K superpixel blocks to every pixel contained in that block to obtain a saliency map SM′, and taking the maximum connected domain in SM′ as the final saliency image and outputting it.
Compared with the prior art, the invention has the following advantages:
1) The invention adopts a saliency value calculation method based on the maximum neighborhood and superpixel segmentation. After the initial saliency image is obtained through the maximum neighborhood calculation, the saliency value of each superpixel block is determined by combining the superpixel segmentation result with the initial saliency image, and that value is assigned to the pixel points contained in the block to obtain a saliency detection image; the maximum connected domain of the saliency detection image is then taken as the finally output saliency target detection image. Non-target regions in the image are thereby effectively removed, and simulation results show that the invention can accurately detect the salient target of an image and improves the accuracy of saliency target detection.
2) In the image preprocessing process, a color substitution operation is performed on the image to be detected: the main colors of the image are retained and are used to replace the non-main colors, which reduces the color interference of non-target regions and further improves the accuracy of saliency target detection.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments.
Referring to fig. 1, an image saliency target detection method based on maximum neighborhood and super-pixel segmentation includes the following steps:
step 1) performing superpixel segmentation on an image to be detected:
the image to be detected is segmented using the SLIC superpixel segmentation method, where SLIC stands for simple linear iterative clustering. The SLIC algorithm considers both the spatial and the color distance between pixel points and segments the image into superpixel blocks each containing a number of pixel points, finally obtaining and storing K superpixel blocks. Comparing the experimental effects of several commonly used values K = 200, 250, 300, 400 and 500, the segmentation number giving the best experimental effect is K = 200. The image to be detected is shown in FIG. 2; the salient target in the image is a flower, and the non-target regions include the leaves and branches of the flower;
step 2) counting the frequency of each color in the image to be detected:
each of the three color channels of the RGB color space is divided into N equal parts. The range of each RGB channel is 0-255, so the RGB color space can be modeled as a cube; evenly dividing each edge of the cube into N parts divides the color space into N³ small cubes, each corresponding to one quantized color. Comparing the experimental effects of several commonly used values N = 10, 14, 16 and 32, the best experimental effect is obtained with N = 16, dividing the RGB color space into 16³ colors; the frequency of occurrence of each of the 16³ colors in the image to be detected is then counted;
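The quantization and counting of this step can be sketched as follows. This is an illustrative Python/NumPy version (the invention's simulations use Matlab; the helper name `color_frequencies` is hypothetical):

```python
import numpy as np

def color_frequencies(img, n_bins=16):
    """Quantize each RGB channel (0-255) into n_bins equal parts and
    count how often each of the n_bins**3 quantized colors occurs."""
    # Map each channel value to a bin index in [0, n_bins - 1].
    bins = (img.astype(np.int64) * n_bins) // 256
    # Combine the three per-channel bin indices into one color index.
    idx = (bins[..., 0] * n_bins + bins[..., 1]) * n_bins + bins[..., 2]
    return np.bincount(idx.ravel(), minlength=n_bins ** 3)

# Tiny 2x2 test image: three red-ish pixels and one blue pixel.
img = np.array([[[250, 0, 0], [255, 3, 2]],
                [[248, 1, 0], [0, 0, 255]]], dtype=np.uint8)
counts = color_frequencies(img, n_bins=16)
print(counts.sum())   # total number of pixels: 4
print(counts.max())   # the three red-ish pixels share one bin: 3
```

With N = 16 the index space has 16³ = 4096 entries, matching the 16³ colors counted in the text.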
step 3) carrying out color substitution on the image to be detected:
all the counted colors are arranged in descending order of frequency of occurrence, and the frequencies are accumulated in that order until the accumulated total reaches 80% of the total number of pixels M of the image to be detected. The representative colors C = {C_p1, C_p2, …, C_pi, …, C_pp}, whose frequencies are included in the accumulation, are retained; the representative colors are the colors occurring most frequently in the image to be detected, including the colors of the salient target. The representative colors C are then used to replace the colors C′ = {C_t1, C_t2, …, C_tj, …, C_tt} whose frequencies did not participate in the accumulation:
wherein the replacement of the colors C′ = {C_t1, C_t2, …, C_tj, …, C_tt}, whose frequencies did not participate in the accumulation, with the representative colors C proceeds as follows:
step 3a) calculating the Euclidean distance between each color C_tj whose frequency did not participate in the accumulation and each representative color C_pi in C = {C_p1, C_p2, …, C_pi, …, C_pp}; the calculation formula is:
d(C_tj, C_pi) = √((C_tj,R − C_pi,R)² + (C_tj,G − C_pi,G)² + (C_tj,B − C_pi,B)²)
wherein C_tj,R and C_pi,R represent the R components, C_tj,G and C_pi,G represent the G components, and C_tj,B and C_pi,B represent the B components;
step 3b) selecting the representative color C_p′ with the minimum value among the calculated Euclidean distances, and replacing the color C_tj in the image to be detected with C_p′, wherein the selection formula is:
C_p′ = argmin_{C_pi ∈ C} d(C_tj, C_pi);
step 3c) through steps 3a) and 3b), the low-frequency colors C′ in the image to be detected are replaced by the representative colors C, yielding a color-substituted image that contains only the representative colors C. Since the high-frequency colors include the colors of the target region, this color substitution effectively reduces the color interference of non-target regions;
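Steps 3a) and 3b) together amount to a nearest-neighbor search in RGB space. A minimal Python/NumPy sketch (illustrative only; `substitute_colors` is a hypothetical helper, operating on lists of colors rather than a whole image):

```python
import numpy as np

def substitute_colors(low_freq_colors, rep_colors):
    """Replace each low-frequency color C_tj with the representative
    color C_p' at minimum Euclidean distance in RGB (steps 3a-3b)."""
    colors = np.asarray(low_freq_colors, dtype=np.float64)  # shape (t, 3)
    reps = np.asarray(rep_colors, dtype=np.float64)         # shape (p, 3)
    # Pairwise Euclidean distances between every C_tj and every C_pi.
    d = np.sqrt(((colors[:, None, :] - reps[None, :, :]) ** 2).sum(-1))
    nearest = d.argmin(axis=1)            # index of C_p' for each C_tj
    return reps[nearest].astype(np.uint8)

reps = [[255, 0, 0], [0, 0, 255]]     # two representative colors
low = [[250, 10, 5], [20, 0, 240]]    # two low-frequency colors
print(substitute_colors(low, reps))   # each maps to its nearest rep
```

The near-red color maps to pure red and the near-blue color to pure blue, as the minimum-distance rule requires.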
step 4) preprocessing the image after color substitution:
the color-substituted image is smoothed by Gaussian filtering, which effectively smooths the image; a 3 × 3 filtering template with σ = 0.5 is adopted. The filtered image is then converted from the RGB color space to the Lab color space, yielding the preprocessed image in Lab space. The Lab space provides both the luminance and the color information of the image and shows the difference between different colors more fully. The conversion uses the standard formulas, first mapping RGB linearly to the intermediate XYZ space and then computing:
L = 116 · f(Y/Yn) − 16
a = 500 · (f(X/Xn) − f(Y/Yn))
b = 200 · (f(Y/Yn) − f(Z/Zn))
wherein f(t) = t^(1/3) for t > (6/29)³ and f(t) = t/(3 · (6/29)²) + 4/29 otherwise, (Xn, Yn, Zn) is the reference white point, R, G and B represent the red, green and blue color components, and L, a and b represent, respectively, the luminance, the green-to-red and the blue-to-yellow color components after the color space conversion;
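The 3 × 3, σ = 0.5 Gaussian template of this preprocessing step can be sketched in Python/NumPy as follows (illustrative only; the simulations use Matlab, and `gaussian_filter3` is a hypothetical name):

```python
import numpy as np

def gaussian_filter3(channel, sigma=0.5):
    """Smooth one image channel with a 3x3 Gaussian template,
    analogous to a MATLAB-style fspecial('gaussian', 3, 0.5) kernel."""
    ax = np.array([-1.0, 0.0, 1.0])
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2 * sigma ** 2))
    g /= g.sum()                        # normalize weights to sum to 1
    padded = np.pad(channel.astype(np.float64), 1, mode="edge")
    out = np.zeros(channel.shape, dtype=np.float64)
    h, w = channel.shape
    for dy in range(3):                 # accumulate the 9 shifted terms
        for dx in range(3):
            out += g[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

flat = np.full((4, 4), 100.0)
print(gaussian_filter3(flat))   # a constant channel stays constant
```

Because the kernel weights sum to 1, a constant region is unchanged, while isolated peaks are spread over their 3 × 3 neighborhood.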
step 5) calculating an initial saliency image of the image to be detected:
step 5a) color channel separation is carried out on the image preprocessed in the Lab space, and a color vector I (x, y) of each pixel point is obtained:
the preprocessed image in Lab space is separated into the three channels L, a and b; I(x, y) is composed of the luminance component value L(x, y) and the color component values a(x, y) and b(x, y), combined as follows:
I(x,y)=(L(x,y),a(x,y),b(x,y))
wherein, (x, y) represents the coordinates of the pixel;
step 5b) calculating the average color vector Iμ(x, y) over the maximum neighborhood centered at each pixel position, and taking the two-norm of the difference between I(x, y) and Iμ(x, y) as the saliency value of the current pixel:
step 5b1) the maximum neighborhood is the largest rectangular region centered at the pixel position (x, y) that still fits in the image; it provides a more reasonable local area for calculating the saliency value of the pixel point (x, y). The average color vector Iμ(x, y) over the maximum neighborhood is calculated as:
Iμ(x, y) = (1/A) · Σ_{i = x − x0}^{x + x0} Σ_{j = y − y0}^{y + y0} I(i, j)
x0 = min(x, w − x)
y0 = min(y, h − y)
A = (2x0 + 1)(2y0 + 1)
wherein w and h represent the width and height of the image to be detected, I(i, j) is the color vector of the pixel with coordinates (i, j), x0 and y0 represent half the width and half the height of the maximum neighborhood centered at (x, y), and A represents the total number of pixel points contained in that neighborhood;
step 5b2) combine I (x, y) and IμTaking the two-norm of the (x, y) difference as the significance value of the current pixel point, wherein the calculation formula is as follows:
S(x,y)=||Iμ(x,y)-I(x,y)||2
wherein S(x, y) represents the saliency value calculated for the pixel at coordinates (x, y).
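The maximum-neighborhood saliency of steps 5b1) and 5b2) can be sketched as follows. This is an illustrative Python/NumPy version (the direct double loop is written for clarity, not speed; `initial_saliency` is a hypothetical name, and 0-based array indexing is assumed, so the half-sizes use w − 1 − x rather than the 1-based w − x of the text):

```python
import numpy as np

def initial_saliency(I):
    """S(x, y) = ||I_mu(x, y) - I(x, y)||_2, where I_mu is the mean
    color vector over the maximum centered neighborhood of (x, y)."""
    h, w, _ = I.shape
    I = I.astype(np.float64)
    S = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            # Half-sizes of the largest rectangle centered at (x, y)
            # that still fits inside the image (0-based indexing).
            x0 = min(x, w - 1 - x)
            y0 = min(y, h - 1 - y)
            patch = I[y - y0:y + y0 + 1, x - x0:x + x0 + 1]
            I_mu = patch.mean(axis=(0, 1))       # average color vector
            S[y, x] = np.linalg.norm(I_mu - I[y, x])
    return S

# A single distinct pixel in a flat image gets a positive saliency;
# border pixels whose maximum neighborhood is uniform get zero.
I = np.zeros((3, 3, 3))
I[1, 1] = [10.0, 0.0, 0.0]
print(initial_saliency(I))
```

A uniform image yields S ≡ 0 everywhere, since I(x, y) then equals Iμ(x, y) at every position.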
Step 5c) the saliency values of all pixel points obtained in step 5b) are normalized to the range 0-255 to obtain the initial saliency image sm of the image to be detected. The initial saliency image sm is a detection result resembling a gray-scale image; the larger the saliency value of a pixel point, the more likely its position belongs to a salient target in the image;
step 6) determining the significance values of the K superpixel blocks:
step 6a) the average saliency value T of the initial saliency image sm of the image to be detected is taken as a threshold; pixel points in sm whose saliency value exceeds the threshold are marked 1 and the remaining pixel points are marked 0, giving the saliency label of each pixel point. The average saliency value T is an overall expression of the saliency of the initial saliency image sm, so thresholding against T reflects the degree of saliency of each pixel point in the image to be detected:
step 6a1) the average saliency value T of the initial saliency image sm is calculated as:
T = (λ / (w × h)) · Σ_{x} Σ_{y} sm(x, y)
wherein λ is a threshold parameter; comparing the experimental effects of several commonly used values λ = 1, 1.1, 1.2 and 1.4, the best experimental effect is obtained with λ = 1.2; sm(x, y) represents the saliency value at position (x, y) in the initial saliency image;
step 6a2) the saliency label of each pixel point is calculated as:
sm′(x, y) = 1 if sm(x, y) > T, and sm′(x, y) = 0 otherwise,
wherein sm′(x, y) is the saliency label of the pixel at coordinates (x, y);
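Steps 6a1) and 6a2) can be sketched as follows (illustrative Python/NumPy; `saliency_labels` is a hypothetical helper):

```python
import numpy as np

def saliency_labels(sm, lam=1.2):
    """Binarize the initial saliency image: threshold T = lam * mean(sm);
    pixels above T get label 1, the rest 0 (steps 6a1-6a2)."""
    T = lam * sm.mean()                    # lam / (w*h) * sum of sm
    return (sm > T).astype(np.uint8), T

sm = np.array([[10.0, 200.0],
               [20.0, 30.0]])
labels, T = saliency_labels(sm)
print(T)        # T = 1.2 * 65 = 78
print(labels)   # only the 200-valued pixel exceeds T
```

With λ = 1.2 only pixels clearly above the image's average saliency are kept, which suppresses weakly salient background.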
step 6b) for each superpixel block it is judged whether more than half of its pixels carry saliency label 1; if so, 1 is taken as the saliency value K_l of the superpixel block, otherwise 0 is taken as K_l, giving the saliency values of all K superpixel blocks. A superpixel block comprises a series of pixel points of similar color and brightness, so judging whether more than half of the pixel saliency labels in the block are 1 represents the saliency of the block more accurately and reduces the interference of non-salient pixel points within the block. K_l is calculated as:
K_l = 1 if Σ_{(x, y) ∈ block l} sm′(x, y) > n_l / 2, and K_l = 0 otherwise,
wherein n_l is the number of pixel points contained in the l-th superpixel block, l = 1, 2, …, K, and K is the number of superpixel blocks;
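The majority vote of step 6b) can be sketched as follows (illustrative Python/NumPy; `segments` is assumed to hold the superpixel index of each pixel, numbered 0 to K − 1, as a segmentation routine would return):

```python
import numpy as np

def superpixel_saliency(labels, segments):
    """Give each superpixel block saliency 1 if more than half of its
    pixels carry saliency label 1, else 0 (step 6b)."""
    K = segments.max() + 1
    n_l = np.bincount(segments.ravel(), minlength=K)   # block sizes
    ones = np.bincount(segments.ravel(), weights=labels.ravel(),
                       minlength=K)                    # label-1 counts
    return (ones > n_l / 2).astype(np.uint8)

segments = np.array([[0, 0, 1],
                     [0, 1, 1]])
labels = np.array([[1, 1, 0],
                   [0, 1, 0]])
# Block 0 has 2 of 3 pixels labeled 1 -> saliency 1;
# block 1 has 1 of 3 -> saliency 0.
print(superpixel_saliency(labels, segments))
```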
and 7) acquiring and outputting a final saliency image:
the saliency value of each of the K superpixel blocks is assigned to every pixel contained in that block to obtain a saliency map SM′, and the maximum connected domain in SM′ is taken as the final saliency image and output:
for each superpixel block, the saliency value of the block is assigned to every pixel point it contains as the saliency value of that pixel point, yielding the saliency map SM′. At this point SM′ contains the detection result of the salient target in the image to be detected together with some small non-target regions; the salient target is the object in the image most likely to attract attention. The small non-target regions can be removed by taking the maximum connected domain of the image, i.e. the connected region of largest area (8-connected) among all connected domains of the binary image. Selecting the maximum connected domain improves the accuracy of salient target detection, and the maximum connected domain of the saliency map SM′ is finally output as the detection result.
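The 8-connected maximum connected domain of step 7) can be extracted with a simple breadth-first search; the following Python sketch is illustrative only (`largest_component` is a hypothetical name):

```python
import numpy as np
from collections import deque

def largest_component(binary):
    """Keep only the largest 8-connected component of a binary map."""
    h, w = binary.shape
    visited = np.zeros((h, w), dtype=bool)
    best, best_size = np.zeros((h, w), dtype=np.uint8), 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not visited[sy, sx]:
                # BFS over the 8-neighborhood from this seed pixel.
                comp, q = [], deque([(sy, sx)])
                visited[sy, sx] = True
                while q:
                    y, x = q.popleft()
                    comp.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny, nx] and not visited[ny, nx]):
                                visited[ny, nx] = True
                                q.append((ny, nx))
                if len(comp) > best_size:        # keep the largest region
                    best_size = len(comp)
                    best = np.zeros((h, w), dtype=np.uint8)
                    for y, x in comp:
                        best[y, x] = 1
    return best

b = np.array([[1, 1, 0, 0],
              [0, 1, 0, 1],
              [0, 0, 0, 1]], dtype=np.uint8)
print(largest_component(b))   # the 3-pixel region survives, the 2-pixel one is removed
```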
The technical effects of the present invention will be further described with reference to simulation experiments.
1. Simulation conditions: the simulations of the invention were performed on a Windows 10 system using the Matlab R2014a platform.
2. Simulation content and result analysis.
Simulation 1:
the image to be detected adopted by the embodiment of the invention is shown in fig. 2, the salient object in the image to be detected is a flower, and the non-object area comprises a flower leaf and a flower branch. Fig. 3 includes a target result diagram (a) of an artificial marker in an image to be detected, which is adopted in a simulation experiment of the present invention, and a detection result simulation diagram (b) of the prior art and a detection result simulation diagram (c) of the present invention. By comparing the detection result graph (b) with the graph (c), the method can accurately detect the salient object in the image and has good effect of inhibiting the non-object area.
Simulation 2:
The average accuracy of the prior art and of the invention on the MSRA1K dataset is shown in the table below, from which it can be seen that the invention achieves a significant improvement in accuracy over the prior art.
|          | Prior art | The invention |
|----------|-----------|---------------|
| Accuracy | 0.803     | 0.847         |