Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a method and a system for quickly capturing a target in a production line video sequence, which can adapt to changes in the moving speed and direction of the production line and ensure that targets are neither missed nor falsely detected.
To achieve the above object, the present invention provides a method for quickly capturing a target in a production line video sequence, comprising the following steps:
(1) inputting an n-th frame of the original production line image, and carrying out target positioning in the image to obtain the target positioning state and the target front and rear boundary positions;
(2) if the target positioning state indicates that only the front boundary of the target is positioned, the step (3) is carried out; if the target positioning state indicates that the complete target is positioned, the step (4) is carried out; if the target positioning state indicates that the target is not found or is only positioned to the rear boundary of the target, the step (5) is carried out;
(3) if the target positioning states of the current frame and of the historical K-1 consecutive frames all indicate that only the front boundary of the target is positioned, estimating the moving speed and the moving direction of the target from the change of the target front boundary position over the K frames of images; otherwise, recording the target front boundary position and the frame number of the current frame, and turning to step (5), wherein K is larger than 1;
(4) if the target positioning state of the frame previous to the current frame indicates that a complete target was positioned, predicting the position of the previous target in the current frame image according to the target boundary position of the previous frame and the estimated target moving speed; if the predicted position matches the current target positioning, the current target has already been captured and is discarded; otherwise, the current target is taken as a new target, the current frame image is output, and the flow turns to step (5);
(5) adding 1 to the original image frame sequence number n; ending the processing flow if n is greater than N, otherwise turning to step (1) to wait for the next frame, wherein N is the total frame number of the production line video sequence.
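For orientation, the control flow of steps (1) to (5) can be sketched in Python as follows; locate_target(), estimate_speed() and is_repeated_target() are hypothetical placeholders standing in for the operations detailed in the embodiments below, so this is an illustrative sketch rather than a definitive implementation.

```python
# Control-flow sketch of steps (1)-(5); the three helpers below are hypothetical
# placeholders for the operations detailed in the embodiments that follow.
NO_TARGET, FRONT_ONLY, BACK_ONLY, FULL_TARGET = 1, 2, 3, 4

def locate_target(frame):                      # placeholder for step (1)
    return NO_TARGET, None, None               # (obj_state, obj_front, obj_back)

def estimate_speed(records):                   # placeholder for step (3)
    return None, None                          # (pixelv, obj_move)

def is_repeated_target(prev_back, pixelv, front):   # placeholder for step (4)
    return False

def capture_targets(frames, K=3):
    """Yield (frame number, frame) for every newly captured complete target."""
    records = []                               # (frame number, front boundary) history
    pixelv = obj_move = None
    prev_state = prev_back = None
    for n, frame in enumerate(frames, start=1):             # step (1)
        state, front, back = locate_target(frame)
        if state == FRONT_ONLY:                             # step (3)
            records.append((n, front))
            if len(records) >= K:
                pixelv, obj_move = estimate_speed(records[-K:])
        else:
            if (state == FULL_TARGET                        # step (4)
                    and not (prev_state == FULL_TARGET
                             and is_repeated_target(prev_back, pixelv, front))):
                yield n, frame                              # new target: output the frame
            records.clear()                                 # front-boundary history interrupted
        prev_state, prev_back = state, back                 # step (5): wait for the next frame
```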
According to a preferred embodiment, the specific implementation manner of the target location in step (1) is as follows:
the n-th frame production line image IMG_n acquired at the current moment has pixel gray values IMG_n(i, j), wherein subscripts i and j respectively represent the abscissa and ordinate of the pixel in the image, and img_row and img_col respectively represent the height and width of IMG_n;
performing binary segmentation and contour extraction on the current production line image frame to obtain an initial set of closed contours, and screening the closed contour set according to geometric features and the centroid positions;
when a target contour exists, carrying out contour filling and vertical projection on the target contour, and carrying out binary segmentation and column-by-column scanning on a vertical projection result;
when a target area is found in the scanning, the front and rear boundary positions of the target area are recorded as obj_front and obj_back, and the target positioning state obj_state is determined according to the front and rear boundary positions of the target area.
According to a preferred embodiment, the specific implementation manner of the target location in step (1) is as follows:
step 101: performing binary segmentation and contour extraction on the current production line image frame IMG_n;
the specific formula of binary segmentation is as follows:
wherein obj_th is the segmentation threshold of the current frame, with a value range of 0-255; obj_type is the segmentation direction, a value of 0 indicating forward segmentation and a value of 1 indicating reverse segmentation; && indicates that the two conditions on its left and right are satisfied simultaneously; the result is the segmented pixel value at row i, column j of the n-th frame image;
the operation process of contour extraction is as follows: selecting a structural element of height ele_m and width ele_n, performing binary dilation on the segmentation result for ele_t times to obtain the dilated binary image DILATE_n; performing contour extraction on DILATE_n to obtain an initial set of closed outer contours, the number of which is cont_size;
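A minimal sketch of the operations of step 101 using OpenCV is given below; the mapping of obj_type to the thresholding direction, the rectangular structural element, and the OpenCV 4.x findContours() return signature are assumptions, since the segmentation formula itself is not reproduced above.

```python
import cv2
import numpy as np

def segment_and_extract_contours(img, obj_th=55, obj_type=1,
                                 ele_m=1, ele_n=4, ele_t=1):
    """Step 101 sketch: binary segmentation, binary dilation with a structural
    element of height ele_m and width ele_n, and closed outer-contour
    extraction. obj_type = 0 is taken as forward (bright-object) segmentation
    and obj_type = 1 as reverse segmentation; this mapping is an assumption."""
    mode = cv2.THRESH_BINARY if obj_type == 0 else cv2.THRESH_BINARY_INV
    _, seg = cv2.threshold(img, obj_th, 255, mode)
    if ele_t > 0:
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (ele_n, ele_m))
        seg = cv2.dilate(seg, kernel, iterations=ele_t)
    contours, _ = cv2.findContours(seg, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return seg, contours                       # DILATE_n and the contour set

# Example on a synthetic single-channel frame (dark target on a bright background):
if __name__ == "__main__":
    frame = np.full((205, 245), 200, dtype=np.uint8)
    frame[60:140, 40:220] = 30
    _, conts = segment_and_extract_contours(frame)
    print("cont_size =", len(conts))           # 1
```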
step 102: screening the closed outer contours to reduce the influence of illumination and the background environment; the specific operation is as follows: calculating the area of each closed outer contour to obtain the maximum contour area cont_area_cmax, wherein cmax is the index of the maximum-area contour in the set; calculating the contour centroid position to obtain the row coordinate cont_ctr_cmax(x) and the column coordinate cont_ctr_cmax(y); the screening formula is as follows:
wherein area_rate_min and area_rate_max are respectively the lower and upper limits of the ratio of the contour area to the image area, and ctr_rate_min and ctr_rate_max are respectively the lower and upper limits of the ratio of the contour centroid position to the image height and width; if the contour area and the contour centroid position meet the formula conditions, the target contour is obtained and the flow goes to step 103; otherwise, it goes to step 105;
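A sketch of the screening of step 102 follows; the exact screening inequalities are assumptions reconstructed from the parameter descriptions above, since the formula itself is not reproduced.

```python
import cv2
import numpy as np

def screen_contours(contours, img_shape,
                    area_rate_min=0.1, area_rate_max=0.9,
                    ctr_rate_min=0.1, ctr_rate_max=0.9):
    """Step 102 sketch: keep the largest closed outer contour only if its area
    ratio and centroid position ratios fall inside the configured limits;
    the exact inequalities are assumptions."""
    if not contours:
        return None
    img_row, img_col = img_shape[:2]
    areas = [cv2.contourArea(c) for c in contours]
    cmax = int(np.argmax(areas))               # index of the maximum-area contour
    cont_area = areas[cmax]
    m = cv2.moments(contours[cmax])
    if m["m00"] == 0:
        return None
    ctr_row = m["m01"] / m["m00"]              # row coordinate of the centroid
    ctr_col = m["m10"] / m["m00"]              # column coordinate of the centroid
    area_ok = area_rate_min <= cont_area / (img_row * img_col) <= area_rate_max
    ctr_ok = (ctr_rate_min <= ctr_row / img_row <= ctr_rate_max
              and ctr_rate_min <= ctr_col / img_col <= ctr_rate_max)
    return contours[cmax] if (area_ok and ctr_ok) else None
```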
step 103: carrying out contour filling and vertical projection on the target contour: filling the target contour and its internal area with a gray value of 255 to obtain the target contour image CONT_n; carrying out vertical projection on CONT_n by accumulating the pixel values in each column to obtain the projection image CONT_PRJ_n; carrying out binary segmentation on CONT_PRJ_n with the projection image segmentation threshold obj_cont_th, where pixel values greater than obj_cont_th are set to 255 and all others to 0, obtaining the projection segmentation image CONT_PRJ_BIN_n; scanning CONT_PRJ_BIN_n column by column to find the longest continuous region whose pixel values are 255, obtaining the head and tail pixel positions prj_first and prj_end of the region; the longest continuous region is then compensated by the following formulas:
prj_first=func_max(1,prj_first-prj_comp)
prj_end=func_min(img_col,prj_end+prj_comp)
wherein prj_comp is the projection region compensation value, the func_max() function returns the larger of two numbers, and the func_min() function returns the smaller of two numbers;
step 104: judging the head and tail pixel positions of the longest continuous region: the target horizontal length is obj_length = prj_end - prj_first + 1; if obj_length is greater than prj_comp + 1, the target region is considered to exist, its head and tail pixel positions are assigned to obj_front and obj_back respectively, and the flow goes to step 106; otherwise, it goes to step 105;
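Steps 103 and 104 can be sketched with numpy as follows; the 1-based column indexing mirrors the text, and the default values of obj_cont_th and prj_comp are taken from the embodiment below.

```python
import numpy as np

def locate_projection_region(cont_img, obj_cont_th=765, prj_comp=5):
    """Steps 103-104 sketch: vertical projection of the filled contour image
    CONT_n, binary segmentation of the projection, search for the longest run
    of 255 and boundary compensation. Column positions are 1-based as in the
    text; returns (obj_front, obj_back, obj_length) or None."""
    img_col = cont_img.shape[1]
    prj = cont_img.astype(np.int64).sum(axis=0)          # CONT_PRJ_n (per-column sums)
    prj_bin = np.where(prj > obj_cont_th, 255, 0)        # CONT_PRJ_BIN_n
    best_len = best_first = best_end = 0
    run_start = None
    for j, v in enumerate(prj_bin, start=1):             # column-by-column scan
        if v == 255 and run_start is None:
            run_start = j
        if (v != 255 or j == img_col) and run_start is not None:
            run_end = j if v == 255 else j - 1
            if run_end - run_start + 1 > best_len:
                best_len, best_first, best_end = run_end - run_start + 1, run_start, run_end
            run_start = None
    if best_len == 0:
        return None                                      # no target area found
    prj_first = max(1, best_first - prj_comp)            # compensation of the longest run
    prj_end = min(img_col, best_end + prj_comp)
    obj_length = prj_end - prj_first + 1
    if obj_length > prj_comp + 1:                        # existence test of step 104
        return prj_first, prj_end, obj_length
    return None
```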
step 105: setting the value of obj_state to 1, indicating that no target is found, and turning to step 112;
step 106: screening the target front boundary position with the following screening formula:
1≤obj_front≤edge_extend*img_col
wherein edge_extend is the expansion ratio of the image boundary in the horizontal direction, its value being greater than 0 and less than 1; if obj_front satisfies the screening formula, the target is considered to be located on the left boundary of the image and not to appear completely, and the flow goes to step 107; otherwise, it goes to step 108;
step 107: setting the value of obj_state to 3, indicating that the target is located on the left boundary of the image and only the target rear boundary position is obtained, and turning to step 112;
step 108: screening the target rear boundary position with the following screening formula:
(1-edge_extend)*img_col≤obj_back≤img_col
if obj_back satisfies the screening formula, the target is considered to be located on the right boundary of the image and not to appear completely, and the flow goes to step 109; otherwise, the target front and rear boundaries are both valid and the flow goes to step 110;
step 109: setting the value of obj_state to 2, indicating that the target is located on the right boundary of the image and only the target front boundary position is obtained, and turning to step 112;
step 110: screening the length of the target in the horizontal direction, wherein the screening formula is as follows:
len_perL,len_perH∈(0,1)&&len_perL<len_perH
wherein len_perL and len_perH are respectively the lower and upper limits of the ratio of the target length to the target contour image width; if obj_length satisfies the screening formula, a complete target is considered to appear and the flow goes to step 111; otherwise, it goes to step 105;
step 111: setting the value of obj_state to 4, indicating that a complete target appears and the target front and rear boundary positions are obtained;
step 112: adjusting the target state and the target front and rear boundary positions according to the target moving direction obj_move; if obj_move is 1, no adjustment is made; otherwise, the following adjustments are made:
when obj_state is 1, no processing is performed;
when obj_state is 2, obj_state is reset to 3, and the values of obj_front and obj_back are exchanged;
when obj_state is 3, obj_state is reset to 2, and the values of obj_front and obj_back are exchanged;
when obj_state is 4, the values of obj_front and obj_back are exchanged.
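A sketch of the state decision of steps 105 to 112 follows; the exact length-screening inequality of step 110 is an assumption (the ratio of obj_length to the target contour image width compared against len_perL and len_perH).

```python
def classify_target(obj_front, obj_back, obj_length, img_col,
                    edge_extend=0.02, len_perL=0.55, len_perH=0.95, obj_move=1):
    """Steps 105-112 sketch: derive obj_state from the boundary positions and
    adjust the state and boundaries for the moving direction obj_move; the
    length-screening inequality of step 110 is an assumption."""
    if obj_front is None or obj_back is None:                 # step 105
        return 1, obj_front, obj_back
    if 1 <= obj_front <= edge_extend * img_col:               # steps 106-107
        obj_state = 3                                         # target on the left boundary
    elif (1 - edge_extend) * img_col <= obj_back <= img_col:  # steps 108-109
        obj_state = 2                                         # target on the right boundary
    elif len_perL <= obj_length / img_col <= len_perH:        # steps 110-111
        obj_state = 4                                         # complete target
    else:
        obj_state = 1                                         # back to step 105
    if obj_move != 1:                                         # step 112 adjustment
        if obj_state == 2:
            obj_state = 3
        elif obj_state == 3:
            obj_state = 2
        if obj_state in (2, 3, 4):
            obj_front, obj_back = obj_back, obj_front
    return obj_state, obj_front, obj_back

# Example with the fig. 4d values of the embodiment (img_col = 245):
# classify_target(10, 196, 187, 245) -> (4, 10, 196)
```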
According to a preferred embodiment, the image frame with the sequence number of the current frame n plus the frame loss number throw is taken as the next frame to be processed; the calculation formula of the number of lost frames throw is as follows:
throw=func_max(throw,0)+1
wherein obj_state is the target detection state flag bit: obj_state = 1 indicates that no target is found, obj_state = 2 indicates that the target is located at the front boundary of the image, obj_state = 3 or obj_state = 4 indicates that the target is located at the rear boundary of the image or is a complete target, and obj_state = 5 indicates a complete target that is about to enter the central area;
obj_front is the front boundary position of the target area, obj_back is the rear boundary position of the target area, pixelv is the target moving speed, img_ctr is the column center of the current image frame, obj_ctr is the column center of the target in the current image frame, and the func_max() function returns the larger of two numbers;
ctr_margin is the frame-loss margin of the complete target and is a positive integer; considering that the target rear boundary predicted value predict is related to the value of throw, predict is updated as follows:
predict=predict-pixelv*throw
predict=func_max(predict,0);
according to a preferred embodiment, the step (4) further determines whether the target is in the image center area, if so, the target is extracted, otherwise, it is predicted whether the target will appear in the image center area in the following mth frame according to the current position of the target and the estimated moving speed of the target, if so, the target is extracted as the effective target by waiting for the following mth frame, and if not, the target in the current frame is determined as the effective target.
According to a preferred embodiment, the specific implementation manner of the step (4) of determining whether the target is in the central region of the image is as follows:
calculating the column center img_ctr of the image frame and the column center obj_ctr of the target in the image frame:
wherein img_col is the image width, and obj_front and obj_back are the target front and rear boundary positions in the current frame image;
it is determined whether the following capture conditions are satisfied:
wherein ctr_frame is the central region frame number, && represents logical AND, pixelv is the target moving speed, obj_state is the target detection state flag bit, and predict is the target rear boundary predicted value; the distance between obj_ctr and img_ctr is converted into a required number of frames through pixelv, and the region within the ctr_frame range centered on img_ctr is defined as the central region;
formula (i) in the capturing conditions indicates that the current target is in the central area; formula (ii) indicates that, although the current target has not entered the central area, it will leave the central area in the next frame image; formula (iii) indicates that the current target has left the central area but the target rear boundary predicted value is 0, that is, the current target has not been captured; if capturing condition (i), (ii) or (iii) is met, the repeated-target judgment process is entered; otherwise the current target is discarded and the value of obj_state is set to 5, indicating that a complete target is about to enter the central region.
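The capture conditions themselves are not reproduced above; the following is a heavily hedged sketch of one possible reading, in which the signed distance between obj_ctr and img_ctr is converted into frames via pixelv, throw (the number of frames skipped before the next processed frame) is an assumed parameter, and the target moving direction is assumed to have been normalized as in step 112.

```python
def in_capture_window(obj_front, obj_back, img_col, pixelv, predict,
                      ctr_frame=2, throw=1):
    """Hedged sketch of capture conditions (i)-(iii) for the central region.

    frames_to_ctr is the signed distance from the target centre to the image
    centre, expressed in frames of movement at speed pixelv; the use of throw
    in condition (ii) is an assumption."""
    if pixelv <= 0:
        return False                                     # speed not yet estimated
    img_ctr = img_col / 2.0                              # column centre of the image frame
    obj_ctr = (obj_front + obj_back) / 2.0               # column centre of the target
    frames_to_ctr = (obj_ctr - img_ctr) / pixelv         # > 0: centre still ahead of the target
    cond_i = abs(frames_to_ctr) <= ctr_frame                       # (i) inside the central region
    cond_ii = (frames_to_ctr > ctr_frame and                       # (ii) not yet inside, but the next
               frames_to_ctr - throw < -ctr_frame)                 #      processed frame is already past it
    cond_iii = frames_to_ctr < -ctr_frame and predict == 0         # (iii) already past, never captured
    return cond_i or cond_ii or cond_iii
```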
According to a preferred embodiment, the production line image frames in step (1) are also down-sampled:
1≤k1+1≤down_row, 1≤k2+1≤down_col, dsize>0
wherein IMG_n is the production line image frame, whose gray value at the pixel in row i and column j of the n-th frame image is IMG_n(i, j); chan is the number of channels of IMG_n; DOWN_n is the down-sampled image, whose pixel gray values are DOWN_n(k1+1, k2+1); k1+1 and k2+1 respectively represent the abscissa and ordinate of the pixel in the down-sampled image; img_row and img_col respectively represent the height and width of the input image; dsize is the down-sampling coefficient; the height and width of the down-sampled image are down_row = [img_row/dsize] and down_col = [img_col/dsize], wherein [ ] indicates rounding down.
A system for capturing a target in a production line video sequence, comprising the following modules:
the first module is used for positioning a target in the production line image frame acquired at the current moment;
the second module is used for entering the third module if the target positioning state indicates that the target is positioned only to the front boundary of the target; if the target positioning state indicates that the complete target is positioned, entering a fourth module; if the target positioning state indicates that the target is not found or is only positioned to the rear boundary of the target, entering a fifth module;
the third module is used for estimating the moving speed and the moving direction of the target by solving the change of the position of the target front boundary in the K frame image if the target positioning states of the current frame and the historical continuous K-1 frames indicate that only the target front boundary is positioned; otherwise, recording the position of the target front boundary and the frame number of the current frame, and switching to a fifth module, wherein K is greater than 1;
a fourth module, predicting the position of the previous target in the current frame image according to the target boundary position of the previous frame and the estimated target moving speed if the target positioning state of the previous frame of the current frame indicates that the complete target is positioned, and giving up the target if the predicted position is matched with the current target positioning, which indicates that the current target is captured; otherwise, the current target is used as a new target, the current frame image is output, and the fifth module is switched to;
and the fifth module is used for waiting for entering the next frame and returning to the first module.
Through the technical scheme, compared with the prior art, the invention has the following beneficial effects:
the method obtains effective information by using the target positioning state, and estimates the moving speed and direction of the target by using historical multi-frame images when the target enters a view field, so that the method can adapt to the change of the moving speed and direction of a production line, and embodies better robustness and real-time response capability; when the target completely appears in the visual field, the position of the previous target in the current image is predicted by using the complete target captured before and the moving speed of the target, and then whether the current target is a new target or not is judged, so that the problem of capturing when the same target appears in the visual field for many times is solved, and the rapid capturing of a plurality of targets in the production line video sequence is realized.
According to a preferred embodiment, the method of binary segmentation, contour extraction, internal filling, vertical projection and the like is adopted in the step (1), and background interference is eliminated according to the geometric characteristics and the centroid coordinates of the candidate region, so as to obtain the front and rear boundary positions of the target. By the method, under the condition of strong interference, the invention can realize the quick separation of the target and the background.
According to a preferred embodiment, the target state and the target front and rear boundary positions are also adjusted in the specific implementation of step (1), so that targets moving in different directions can be handled by the same processing steps, ensuring the simplicity and adaptability of the algorithm.
According to a preferred embodiment, a frame loss number is adopted; it is calculated from the target state and the target front and rear boundary positions and is used to obtain the sequence number of the next image frame to be processed.
According to a preferred embodiment, the image area where the current target is located is judged, and the invention captures only the complete target in the image center area, thereby effectively reducing the influence of uneven illumination, background noise and camera imaging distortion, and improving the accuracy of target capture.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The following description uses several sets of video sequences captured with a JAI industrial camera, with a frame rate of 15 frames/second, an image size of 2448 x 2050, 3 channels, and a gray value range of 0-255 per channel. The moving direction of the target in the image is from right to left, that is, the target enters the field of view from the right boundary of the image, and the boundary of the target that first enters the field of view is the target front boundary.
Initializing parameters: the total frame number of the production line video sequence is N, N being a natural number, and the original production line image frame sequence number n is initialized to 1; the target positioning state obj_state and the previous-frame target positioning state obj_state_last are initialized to 1, the target positioning state having five values: 1 indicates that no target is found, 2 indicates that the target is located on the right boundary of the image, 3 indicates that the target is located on the left boundary of the image, 4 indicates a complete target, and 5 indicates a complete target that is about to enter the central area; the target moving direction obj_move is initialized to 1, the target moving direction having two values: 1 indicates that the target moves from right to left in the image, and 2 indicates that the target moves from left to right in the image; the target rear boundary predicted value predict and the target moving speed pixelv are initialized to 0, the unit of predict being pixels and the unit of pixelv being pixels per frame; all elements of the target front boundary position array upd_val_k and the image frame sequence number array upd_fra_k are removed, k being the index of the elements in the arrays and initialized to 0;
as shown in FIG. 2, the method for capturing targets in a production line video sequence of the invention comprises the following steps:
(1) target positioning: inputting the n-th frame original production line image IMG_n, whose pixel gray values are IMG_n(i, j), wherein subscripts i and j respectively represent the abscissa and ordinate of the pixel in the image, and img_row and img_col respectively represent the height and width of IMG_n. Firstly, binary segmentation and contour extraction are performed on the down-sampled image to obtain an initial set of closed contours, and the closed contour set is screened according to geometric features and centroid positions; secondly, when a target contour exists, contour filling and vertical projection are performed on the target contour, and binary segmentation and column-by-column scanning are performed on the vertical projection result; then, when a target area exists, the head and tail pixel positions of the target area are assigned to obj_front and obj_back respectively, and the target state flag obj_state is obtained according to the target front and rear boundary positions;
(2) if obj_state is 1, the flow goes to step (5); if obj_state is 2, only the target front boundary position is obtained, and the flow goes to step (3); if obj_state is 3, only the target rear boundary position is obtained, and the flow goes to step (5); if obj_state is 4, the target front and rear boundary positions are obtained, and the flow goes to step (4);
(3) estimating the target moving speed and direction: firstly, when obj_state_last is not equal to 2, all elements of upd_val_k and upd_fra_k are removed and k is set to 0; the target front boundary position and the sequence number of the current frame are then recorded according to the following formulas:
upd_val_k=obj_front
upd_fra_k=n
k=k+1
then, if k != upd_num, this step ends, wherein upd_num is the threshold of the quantity of update data; if k = upd_num, the target moving speed pixelv is calculated by the following formula:
wherein upd_dif_k is the average per-frame difference between adjacent update data; when upd_num > 5, k != max && k != min means that the maximum and minimum of upd_dif_k are removed and the arithmetic mean of all remaining upd_dif_k is calculated;
finally, if pixelv is greater than 0, the target moves from left to right in the image and the target moving direction obj_move is set to 2; if pixelv is less than 0, the target moves from right to left in the image, obj_move is set to 1, and pixelv is set to -pixelv; if pixelv is 0, the update data are considered erroneous, all elements of upd_val_k and upd_fra_k are removed, and k is set to 0;
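A sketch of the speed and direction estimation of step (3) follows; the per-frame difference upd_dif_k, the trimmed mean for upd_num > 5, and the sign convention follow the description above, while the simple mean for upd_num <= 5 and the default upd_num = 5 are assumptions.

```python
def estimate_speed(upd_val, upd_fra, upd_num=5):
    """Sketch of the pixelv / obj_move estimation of step (3).

    upd_val: recorded target front boundary positions; upd_fra: the
    corresponding frame sequence numbers. Returns (pixelv, obj_move) or (None, None)."""
    if len(upd_val) != upd_num or upd_num < 2:
        return None, None
    # average per-frame change of the front boundary between adjacent records
    upd_dif = [(upd_val[k] - upd_val[k - 1]) / (upd_fra[k] - upd_fra[k - 1])
               for k in range(1, upd_num)]
    if upd_num > 5:
        trimmed = sorted(upd_dif)[1:-1]        # remove the maximum and the minimum
        pixelv = sum(trimmed) / len(trimmed)
    else:
        pixelv = sum(upd_dif) / len(upd_dif)   # simple mean (assumption for upd_num <= 5)
    if pixelv > 0:
        return pixelv, 2                       # target moves from left to right
    if pixelv < 0:
        return -pixelv, 1                      # target moves from right to left
    return None, None                          # zero speed: the update data are discarded

# Example: front boundary moving left by roughly 40 pixels per frame
print(estimate_speed([230, 190, 151, 110, 70], [3, 4, 5, 6, 7]))   # (40.0, 1)
```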
(4) judging whether the current target is a captured repeated target: if a complete target appeared in the previously processed image frame, the position of the previous target in the current frame image is predicted according to the target rear boundary position recorded in the previous processing and the target moving speed; if the predicted position matches the current target, the current target has already been captured and is discarded; otherwise, the current target is considered a new target and the current frame image IMG_n is output;
specifically, it is judged whether predict > 0 && obj_front < predict - pixelv holds; if the condition is met, the current target is a repeated target, otherwise the current target is a new target; the target rear boundary predicted value is then updated as predict = obj_back - pixelv;
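A sketch of this repeated-target test and of the update of the rear boundary predicted value predict, as described above:

```python
def is_repeated_target(obj_front, obj_back, predict, pixelv):
    """Sketch of the repeated-target test of step (4): the current target is
    treated as already captured when its front boundary falls inside the
    region predicted from the previously captured target; also returns the
    updated rear boundary predicted value for the next frame."""
    repeated = predict > 0 and obj_front < predict - pixelv
    new_predict = obj_back - pixelv            # predict = obj_back - pixelv
    return repeated, new_predict
```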
(5) setting obj_state_last to obj_state; if obj_state is not 4, the value of predict is set to 0; adding 1 to the original image frame sequence number n; if n is greater than N, the processing flow ends; otherwise, the flow goes to step (1) to wait for the next frame.
According to a preferred embodiment, the specific implementation manner of the target location in step (1) is as follows:
step 101: performing binary segmentation and contour extraction on the current production line image frame IMG_n;
the specific formula of binary segmentation is as follows:
wherein obj_th is the segmentation threshold of the current frame, with a value range of 0-255; obj_type is the segmentation direction, a value of 0 indicating forward segmentation and a value of 1 indicating reverse segmentation; && indicates that the two conditions on its left and right are satisfied simultaneously; the result is the segmented pixel value at row i, column j of the n-th frame image;
the operation process of contour extraction is as follows: selecting a structural element of height ele_m and width ele_n, performing binary dilation on the segmentation result for ele_t times to obtain the dilated binary image DILATE_n; performing contour extraction on DILATE_n to obtain an initial set of closed outer contours, the number of which is cont_size;
processing fig. 4b with obj_th = 55, obj_type = 1, ele_m = 1, ele_n = 4 and ele_t = 1, the resulting closed outer contour set is shown in fig. 4c, with different contours distinguished by color;
processing fig. 5b with obj_th = 120, obj_type = 0, ele_m = 0, ele_n = 0 and ele_t = 0, the resulting closed outer contour set is shown in fig. 5c;
processing fig. 6b with obj_th = 62, obj_type = 1, ele_m = 0, ele_n = 0 and ele_t = 0, the resulting closed outer contour set is shown in fig. 6c;
step 102: screening the closed outer contours to reduce the influence of illumination and the background environment; the specific operation is as follows: calculating the area of each closed outer contour to obtain the maximum contour area cont_area_cmax, wherein cmax is the index of the maximum-area contour in the set; calculating the contour centroid position to obtain the row coordinate cont_ctr_cmax(x) and the column coordinate cont_ctr_cmax(y); the screening formula is as follows:
wherein area_rate_min and area_rate_max are respectively the lower and upper limits of the ratio of the contour area to the image area, and ctr_rate_min and ctr_rate_max are respectively the lower and upper limits of the ratio of the contour centroid position to the image height and width; if the contour area and the contour centroid position meet the formula conditions, the target contour is obtained and the flow goes to step 103; otherwise, it goes to step 105;
taking area_rate_min = 0.1, area_rate_max = 0.9, ctr_rate_min = 0.1 and ctr_rate_max = 0.9, with the height and width of the down-sampled image being down_row = 205 and down_col = 245, respectively;
in fig. 4c, the maximum contour area and the row and column coordinates of its centroid position are cont_area_max = 13440, cont_ctr_max(x) = 98 and cont_ctr_max(y) = 105, wherein the position of the "+" sign is the centroid position of the contour; the contour with the largest area and its centroid position both meet the formula conditions;
in fig. 5c, the maximum contour area and the row and column coordinates of its centroid position are cont_area_max = 17739, cont_ctr_max(x) = 81 and cont_ctr_max(y) = 105; the contour with the largest area and its centroid position both meet the formula conditions;
in fig. 6c, the maximum contour area and the row and column coordinates of its centroid position are cont_area_max = 15181, cont_ctr_max(x) = 89 and cont_ctr_max(y) = 128; the contour with the largest area and its centroid position both meet the formula conditions;
step 103: carrying out contour filling and vertical projection on the target contour: filling the target contour and its internal area with a gray value of 255 to obtain the target contour image CONT_n; carrying out vertical projection on CONT_n by accumulating the pixel values in each column to obtain the projection image CONT_PRJ_n, whose height is 1; carrying out binary segmentation on CONT_PRJ_n with the projection image segmentation threshold obj_cont_th, where pixel values greater than obj_cont_th are set to 255 and all others to 0, obtaining the projection segmentation image CONT_PRJ_BIN_n; scanning CONT_PRJ_BIN_n column by column to find the longest continuous region whose pixel values are 255, obtaining the head and tail pixel positions prj_first and prj_end of the region; the longest continuous region is then compensated by the following formulas:
prj_first=func_max(1,prj_first-prj_comp)
prj_end=func_min(img_col,prj_end+prj_comp)
wherein prj_comp is the projection region compensation value, the func_max() function returns the larger of two numbers, and the func_min() function returns the smaller of two numbers;
filling the target contour in fig. 4c yields the target contour image shown in fig. 4d; taking obj_cont_th = 765 and prj_comp = 5 gives prj_first = 10 and prj_end = 196;
filling the target contour in fig. 5c yields the target contour image shown in fig. 5d; taking obj_cont_th = 765 and prj_comp = 15 gives prj_first = 1 and prj_end = 215;
filling the target contour in fig. 6c yields the target contour image shown in fig. 6d; taking obj_cont_th = 765 and prj_comp = 5 gives prj_first = 36 and prj_end = 225;
step 104: judging the head and tail pixel positions of the longest continuous region: the target horizontal length is obj_length = prj_end - prj_first + 1; if obj_length is greater than prj_comp + 1, the target region is considered to exist, its head and tail pixel positions are assigned to obj_front and obj_back respectively, and the flow goes to step 106; otherwise, it goes to step 105;
in the target contour image shown in fig. 4d, the target horizontal length is obj_length = 187, which meets the target area existence condition, giving obj_front = 10 and obj_back = 196;
in the target contour image shown in fig. 5d, the target horizontal length is obj_length = 215, which meets the target area existence condition, giving obj_front = 1 and obj_back = 215;
in the target contour image shown in fig. 6d, the target horizontal length is obj_length = 190, which meets the target area existence condition, giving obj_front = 36 and obj_back = 225;
step 105: setting the value of obj_state to 1, indicating that no target is found, and turning to step 112;
step 106: screening the target front boundary position with the following screening formula:
1≤obj_front≤edge_extend*img_col
wherein edge_extend is the expansion ratio of the image boundary in the horizontal direction, its value being greater than 0 and less than 1; if obj_front satisfies the screening formula, the target is considered to be located on the left boundary of the image and not to appear completely, and the flow goes to step 107; otherwise, it goes to step 108;
the expansion ratio of the image boundary in the horizontal direction is taken as edge_extend = 0.02;
in the target contour image shown in fig. 4d, obj_front = 10 does not satisfy the screening condition, so the target is not considered to be located on the left boundary of the image;
in the target contour image shown in fig. 5d, obj_front = 1 satisfies the screening condition, so the target is considered to be located on the left boundary of the image;
in the target contour image shown in fig. 6d, obj_front = 36 does not satisfy the screening condition, so the target is not considered to be located on the left boundary of the image;
step 107: setting the value of obj_state to 3, indicating that the target is located on the left boundary of the image and only the target rear boundary position is obtained, and turning to step 112;
step 108: screening the target rear boundary position with the following screening formula:
(1-edge_extend)*img_col≤obj_back≤img_col
if obj_back satisfies the screening formula, the target is considered to be located on the right boundary of the image and not to appear completely, and the flow goes to step 109; otherwise, the target front and rear boundaries are both valid and the flow goes to step 110;
in the target contour image shown in fig. 4d, obj_back = 196 does not satisfy the screening condition, so the target is not considered to be located on the right boundary of the image;
in the target contour image shown in fig. 6d, obj_back = 225 does not satisfy the screening condition, so the target is not considered to be located on the right boundary of the image;
step 109: setting the value of obj_state to 2, indicating that the target is located on the right boundary of the image and only the target front boundary position is obtained, and turning to step 112;
step 110: screening the length of the target in the horizontal direction, wherein the screening formula is as follows:
len_perL,len_perH∈(0,1)&&len_perL<len_perH
wherein len_perL and len_perH are respectively the lower and upper limits of the ratio of the target length to the target contour image width; if obj_length satisfies the screening formula, a complete target is considered to appear and the flow goes to step 111; otherwise, it goes to step 105;
taking len_perL = 0.55 and len_perH = 0.95;
in the target contour image shown in fig. 4d, the ratio of the target length to the target contour image width is 0.763, which satisfies the screening formula, so a complete target is considered to appear; as shown in fig. 4e, the position of the "yellow line" in the image is the target horizontal position obtained after the algorithm processing;
in the target contour image shown in fig. 6d, the ratio of the target length to the target contour image width is 0.776, which satisfies the screening formula, so a complete target is considered to appear, as shown in fig. 6e;
step 111: setting the value of obj_state to 4, indicating that a complete target appears and the target front and rear boundary positions are obtained;
step 112: adjusting the target state and the target front and rear boundary positions according to the target moving direction obj_move; if obj_move is 1, no adjustment is made; otherwise, the following adjustments are made:
when obj_state is 1, no processing is performed;
when obj_state is 2, obj_state is reset to 3, and the values of obj_front and obj_back are exchanged;
when obj_state is 3, obj_state is reset to 2, and the values of obj_front and obj_back are exchanged;
when obj_state is 4, the values of obj_front and obj_back are exchanged.
According to a preferred embodiment, the image frame with the sequence number of the current frame n plus the frame loss number throw is taken as the next frame to be processed; the calculation formula of the number of lost frames throw is as follows:
throw=func_max(throw,0)+1
wherein obj_state = 1 indicates that no target is found, obj_state = 2 indicates that the target is located at the front boundary of the image, obj_state = 3 or 4 indicates that the target is located at the rear boundary of the image or is a complete target, and obj_state = 5 indicates that a complete target is about to enter the central region;
obj_front is the front boundary position of the target area, obj_back is the rear boundary position of the target area, pixelv is the target moving speed, img_ctr is the column center of the current image frame, obj_ctr is the column center of the target in the current image frame, and the func_max() function returns the larger of two numbers;
ctr_margin is the frame-loss margin of the complete target and is a positive integer; the frame-loss margin is taken as ctr_margin = 2; considering that the target rear boundary predicted value predict is related to the value of throw, predict is updated as follows:
predict=predict-pixelv*throw
predict=func_max(predict,0);
according to a preferred embodiment, step (4) captures only the target located in the central region of the image, and the specific capturing conditions are as follows:
calculating the column center img_ctr of the image frame and the column center obj_ctr of the target in the image frame:
wherein img_col is the image width, and obj_front and obj_back are the target front and rear boundary positions in the current frame image;
it is determined whether the following capture conditions are satisfied:
wherein ctr_frame is the central region frame number, && represents logical AND, pixelv is the target moving speed, obj_state is the target detection state flag bit, and predict is the target rear boundary predicted value; the distance between obj_ctr and img_ctr is converted into a required number of frames through pixelv, and the region within the ctr_frame range centered on img_ctr is defined as the central region; the central region frame number is taken as ctr_frame = 2.
Formula (i) in the capturing conditions indicates that the current target is in the central area; formula (ii) indicates that, although the current target has not entered the central area, it will leave the central area in the next frame image; formula (iii) indicates that the current target has left the central area but the target rear boundary predicted value is 0, that is, the current target has not been captured; if capturing condition (i), (ii) or (iii) is met, the repeated-target judgment process is entered; otherwise the current target is discarded and the value of obj_state is set to 5, indicating that a complete target is about to enter the central region.
According to a preferred embodiment, the production line image frames in step (1) are first down-sampled:
1≤k1+1≤down_row, 1≤k2+1≤down_col, dsize>0
wherein IMG_n is the production line image frame, whose gray value at the pixel in row i and column j of the n-th frame image is IMG_n(i, j); chan is the number of channels of IMG_n; DOWN_n is the down-sampled image, whose pixel gray values are DOWN_n(k1+1, k2+1); k1+1 and k2+1 respectively represent the abscissa and ordinate of the pixel in the down-sampled image; img_row and img_col respectively represent the height and width of the input image; dsize is the down-sampling coefficient; the height and width of the down-sampled image are down_row = [img_row/dsize] and down_col = [img_col/dsize], wherein [ ] indicates rounding down;
specifically, k1 and k2 are both 0 in the initial state; k1 and k2 are traversed in progressive-scan order, and the pixel value at position (1+k1*dsize, 1+k2*dsize) of the input image is assigned to position (k1+1, k2+1) of the down-sampled image; if the input image has multiple channels, the arithmetic mean of the pixel values of the channels is assigned; the parameters of the original production line image frame are img_row = 2050, img_col = 2448 and chan = 3, the down-sampling coefficient is dsize = 10, the size of the down-sampled image is down_row = 205 and down_col = 245, and the number of channels is 1.
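A minimal numpy sketch of this down-sampling, assuming that 0-based strided slicing reproduces the 1-based sampling grid described above:

```python
import numpy as np

def downsample(img, dsize=10):
    """Down-sampling sketch: the input pixel at (1 + k1*dsize, 1 + k2*dsize)
    (1-based, as described above) is copied to position (k1+1, k2+1) of the
    down-sampled image; multi-channel pixels are replaced by the arithmetic
    mean of their channels. 0-based numpy strided slicing is used here."""
    down = img[::dsize, ::dsize]               # every dsize-th pixel, starting at (1, 1)
    if down.ndim == 3:                         # chan > 1: average the channels
        down = down.mean(axis=2)
    return down.astype(np.uint8)

# Example with the embodiment parameters (img_row = 2050, img_col = 2448, chan = 3, dsize = 10):
if __name__ == "__main__":
    frame = np.zeros((2050, 2448, 3), dtype=np.uint8)
    print(downsample(frame).shape)             # (205, 245)
```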
A system for capturing a target in a production line video sequence, comprising the following modules:
the first module is used for positioning a target in the production line image frame acquired at the current moment;
the second module is used for entering the third module if the target positioning state indicates that the target is positioned only to the front boundary of the target; if the target positioning state indicates that the complete target is positioned, entering a fourth module; if the target positioning state indicates that the target is not found or is only positioned to the rear boundary of the target, entering a fifth module;
the third module is used for estimating the moving speed and the moving direction of the target by solving the change of the position of the target front boundary in the K frame image if the target positioning states of the current frame and the historical continuous K-1 frames indicate that only the target front boundary is positioned; otherwise, recording the position of the target front boundary and the frame number of the current frame, and switching to a fifth module, wherein K is greater than 1;
a fourth module, predicting the position of the previous target in the current frame image according to the target boundary position of the previous frame and the estimated target moving speed if the target positioning state of the previous frame of the current frame indicates that the complete target is positioned, and giving up the target if the predicted position is matched with the current target positioning, which indicates that the current target is captured; otherwise, the current target is used as a new target, the current frame image is output, and the fifth module is switched to;
and the fifth module is used for waiting for entering the next frame and returning to the first module.
In order to evaluate the performance of the invention, different models of mobile phone target video sequences are adopted for testing.
Fig. 4a is an original image acquired by the JAI camera; the model of the mobile phone target is Model1; it can be seen that there is strong interfering background noise, and the black area of the mobile phone target is used as the detection area. Fig. 4a is down-sampled to obtain fig. 4b; fig. 4b is subjected to binary segmentation and contour extraction to obtain fig. 4c; fig. 4c yields fig. 4d after contour geometry and centroid position screening and contour filling; fig. 4d is processed by vertical projection segmentation, column-by-column scanning and target area judgment to obtain the target front and rear boundary positions and the target state flag; fig. 4e is the target positioning effect diagram. The processing results of fig. 4a are as follows:
obj_front=10,obj_back=196,obj_state=4
Fig. 4f shows a mobile phone target of the same model as in fig. 4a, but without the mobile phone accessory that is installed on the target in fig. 4a; processing fig. 4f with the parameters of fig. 4a gives the target positioning effect diagram shown in fig. 4g, with the results obj_front = 46, obj_back = 233 and obj_state = 4.
The meanings of the images in figs. 5a to 5g are the same as those of figs. 4a to 4g, respectively; the model of the mobile phone target is Model2 and the white area of the mobile phone target is used as the detection area; considering the black border of the mobile phone target, a larger projection region compensation prj_comp = 15 is used, and the processing results of fig. 5a are as follows:
obj_front=1,obj_back=215,obj_state=1
processing fig. 5f using the parameters of fig. 5a yields the target positioning effect diagram shown in fig. 5g, with the results obj_front = 10, obj_back = 231 and obj_state = 4.
The meanings of the images in figs. 6a to 6g are the same as those of figs. 4a to 4g, respectively; the model of the mobile phone target is Model3; it can be seen that there are uneven illumination, background interference and multiple objects in the original image; the black area of the mobile phone target is used as the detection area, and the processing results of fig. 6a are as follows:
obj_front=36,obj_back=225,obj_state=4
processing fig. 6f using the parameters of fig. 6a yields the target positioning effect diagram shown in fig. 6g, with the results obj_front = 32, obj_back = 239 and obj_state = 4.
The experimental results show that the method can cope with strong background interference, uneven illumination and the appearance of multiple targets, can process different models of targets, and can adapt to targets of the same model in different processing states, exhibiting strong robustness. In addition, the large-ratio down-sampling, adaptive frame loss and adaptive updating of the moving speed and direction ensure the rapidity of target detection and meet the real-time requirements of the production line, while the adaptive updating of the boundary predicted value ensures that a complete target is not detected repeatedly; these cases are not illustrated one by one.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.