Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a method and a system for quickly capturing a target in a production line video sequence, which can adapt to changes in the moving speed and direction of the production line and ensure that targets are neither missed nor falsely detected.
To achieve the above object, the present invention provides a method for quickly capturing a target in a production line video sequence, comprising the following steps:
(1) inputting an n-th frame of the original production line image, and carrying out target positioning in the image to obtain the target positioning state and the target front and rear boundary positions;
(2) if the target positioning state indicates that only the front boundary of the target is positioned, the step (3) is carried out; if the target positioning state indicates that the complete target is positioned, the step (4) is carried out; if the target positioning state indicates that the target is not found or is only positioned to the rear boundary of the target, the step (5) is carried out;
(3) if the target positioning states of the current frame and of the historical K-1 consecutive frames all indicate that only the front boundary of the target is positioned, estimating the moving speed and the moving direction of the target from the change of the target front boundary position over the K frames of images; otherwise, recording the target front boundary position and the frame number of the current frame, and turning to step (5), wherein K is larger than 1;
(4) if the target positioning state of the frame previous to the current frame indicates that a complete target was positioned, predicting the position of the previous target in the current frame image according to the target boundary position of the previous frame and the estimated target moving speed; if the predicted position matches the current target positioning, the current target has already been captured and is discarded; otherwise, the current target is taken as a new target, the current frame image is output, and the flow turns to step (5);
(5) adding 1 to the original image frame sequence number n; ending the processing flow if n is greater than N, otherwise turning to step (1) to wait for the next frame, wherein N is the total frame number of the production line video sequence.
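For orientation, the control flow of steps (1) to (5) can be sketched in Python as follows; locate_target(), estimate_speed() and is_repeated_target() are hypothetical placeholders standing in for the operations detailed in the embodiments below, so this is an illustrative sketch rather than a definitive implementation.

```python
# Control-flow sketch of steps (1)-(5); the three helpers below are hypothetical
# placeholders for the operations detailed in the embodiments that follow.
NO_TARGET, FRONT_ONLY, BACK_ONLY, FULL_TARGET = 1, 2, 3, 4

def locate_target(frame):                      # placeholder for step (1)
    return NO_TARGET, None, None               # (obj_state, obj_front, obj_back)

def estimate_speed(records):                   # placeholder for step (3)
    return None, None                          # (pixelv, obj_move)

def is_repeated_target(prev_back, pixelv, front):   # placeholder for step (4)
    return False

def capture_targets(frames, K=3):
    """Yield (frame number, frame) for every newly captured complete target."""
    records = []                               # (frame number, front boundary) history
    pixelv = obj_move = None
    prev_state = prev_back = None
    for n, frame in enumerate(frames, start=1):             # step (1)
        state, front, back = locate_target(frame)
        if state == FRONT_ONLY:                             # step (3)
            records.append((n, front))
            if len(records) >= K:
                pixelv, obj_move = estimate_speed(records[-K:])
        else:
            if (state == FULL_TARGET                        # step (4)
                    and not (prev_state == FULL_TARGET
                             and is_repeated_target(prev_back, pixelv, front))):
                yield n, frame                              # new target: output the frame
            records.clear()                                 # front-boundary history interrupted
        prev_state, prev_back = state, back                 # step (5): wait for the next frame
```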
According to a preferred embodiment, the specific implementation manner of the target location in step (1) is as follows:
the n-th frame production line image IMG_n acquired at the current moment has pixel gray values IMG_n(i, j), wherein subscripts i and j respectively represent the abscissa and ordinate of the pixel in the image, and img_row and img_col respectively represent the height and width of IMG_n;
performing binary segmentation and contour extraction on the current production line image frame to obtain an initial set of closed contours, and screening the closed contour set according to geometric features and the centroid positions;
when a target contour exists, carrying out contour filling and vertical projection on the target contour, and carrying out binary segmentation and column-by-column scanning on a vertical projection result;
when a target area is found in the scanning, the front and rear boundary positions of the target area are recorded as obj_front and obj_back, and the target positioning state obj_state is determined according to the front and rear boundary positions of the target area.
According to a preferred embodiment, the specific implementation manner of the target location in step (1) is as follows:
step 101: performing binary segmentation and contour extraction on the current production line image frame IMG_n;
the specific formula of binary segmentation is as follows:
wherein obj_th is the segmentation threshold of the current frame, with a value range of 0-255; obj_type is the segmentation direction, a value of 0 indicating forward segmentation and a value of 1 indicating reverse segmentation; && indicates that the two conditions on its left and right are satisfied simultaneously; the result is the segmented pixel value at row i, column j of the n-th frame image;
the operation process of contour extraction is as follows: selecting a structural element of height ele_m and width ele_n, performing binary dilation on the segmentation result for ele_t times to obtain the dilated binary image DILATE_n; performing contour extraction on DILATE_n to obtain an initial set of closed outer contours, the number of which is cont_size;
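A minimal sketch of the operations of step 101 using OpenCV is given below; the mapping of obj_type to the thresholding direction, the rectangular structural element, and the OpenCV 4.x findContours() return signature are assumptions, since the segmentation formula itself is not reproduced above.

```python
import cv2
import numpy as np

def segment_and_extract_contours(img, obj_th=55, obj_type=1,
                                 ele_m=1, ele_n=4, ele_t=1):
    """Step 101 sketch: binary segmentation, binary dilation with a structural
    element of height ele_m and width ele_n, and closed outer-contour
    extraction. obj_type = 0 is taken as forward (bright-object) segmentation
    and obj_type = 1 as reverse segmentation; this mapping is an assumption."""
    mode = cv2.THRESH_BINARY if obj_type == 0 else cv2.THRESH_BINARY_INV
    _, seg = cv2.threshold(img, obj_th, 255, mode)
    if ele_t > 0:
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (ele_n, ele_m))
        seg = cv2.dilate(seg, kernel, iterations=ele_t)
    contours, _ = cv2.findContours(seg, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return seg, contours                       # DILATE_n and the contour set

# Example on a synthetic single-channel frame (dark target on a bright background):
if __name__ == "__main__":
    frame = np.full((205, 245), 200, dtype=np.uint8)
    frame[60:140, 40:220] = 30
    _, conts = segment_and_extract_contours(frame)
    print("cont_size =", len(conts))           # 1
```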
step 102: screening the closed outer contours to reduce the influence of illumination and the background environment; the specific operation is as follows: calculating the area of each closed outer contour to obtain the maximum contour area cont_area_cmax, wherein cmax is the index of the maximum-area contour in the set; calculating the contour centroid position to obtain the row coordinate cont_ctr_cmax(x) and the column coordinate cont_ctr_cmax(y); the screening formula is as follows:
wherein area_rate_min and area_rate_max are respectively the lower and upper limits of the ratio of the contour area to the image area, and ctr_rate_min and ctr_rate_max are respectively the lower and upper limits of the ratio of the contour centroid position to the image height and width; if the contour area and the contour centroid position meet the formula conditions, the target contour is obtained and the flow goes to step 103; otherwise, it goes to step 105;
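A sketch of the screening of step 102 follows; the exact screening inequalities are assumptions reconstructed from the parameter descriptions above, since the formula itself is not reproduced.

```python
import cv2
import numpy as np

def screen_contours(contours, img_shape,
                    area_rate_min=0.1, area_rate_max=0.9,
                    ctr_rate_min=0.1, ctr_rate_max=0.9):
    """Step 102 sketch: keep the largest closed outer contour only if its area
    ratio and centroid position ratios fall inside the configured limits;
    the exact inequalities are assumptions."""
    if not contours:
        return None
    img_row, img_col = img_shape[:2]
    areas = [cv2.contourArea(c) for c in contours]
    cmax = int(np.argmax(areas))               # index of the maximum-area contour
    cont_area = areas[cmax]
    m = cv2.moments(contours[cmax])
    if m["m00"] == 0:
        return None
    ctr_row = m["m01"] / m["m00"]              # row coordinate of the centroid
    ctr_col = m["m10"] / m["m00"]              # column coordinate of the centroid
    area_ok = area_rate_min <= cont_area / (img_row * img_col) <= area_rate_max
    ctr_ok = (ctr_rate_min <= ctr_row / img_row <= ctr_rate_max
              and ctr_rate_min <= ctr_col / img_col <= ctr_rate_max)
    return contours[cmax] if (area_ok and ctr_ok) else None
```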
step 103: carrying out contour filling and vertical projection on the target contour: filling the target contour and its internal area with a gray value of 255 to obtain the target contour image CONT_n; carrying out vertical projection on CONT_n by accumulating the pixel values in each column to obtain the projection image CONT_PRJ_n; carrying out binary segmentation on CONT_PRJ_n with the projection image segmentation threshold obj_cont_th, where pixel values greater than obj_cont_th are set to 255 and all others to 0, obtaining the projection segmentation image CONT_PRJ_BIN_n; scanning CONT_PRJ_BIN_n column by column to find the longest continuous region whose pixel values are 255, obtaining the head and tail pixel positions prj_first and prj_end of the region; the longest continuous region is then compensated by the following formulas:
prj_first=func_max(1,prj_first-prj_comp)
prj_end=func_min(img_col,prj_end+prj_comp)
wherein prj_comp is the projection region compensation value, the func_max() function returns the larger of two numbers, and the func_min() function returns the smaller of two numbers;
step 104: judging the head and tail pixel positions of the longest continuous region: the target horizontal length is obj_length = prj_end - prj_first + 1; if obj_length is greater than prj_comp + 1, the target region is considered to exist, its head and tail pixel positions are assigned to obj_front and obj_back respectively, and the flow goes to step 106; otherwise, it goes to step 105;
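Steps 103 and 104 can be sketched with numpy as follows; the 1-based column indexing mirrors the text, and the default values of obj_cont_th and prj_comp are taken from the embodiment below.

```python
import numpy as np

def locate_projection_region(cont_img, obj_cont_th=765, prj_comp=5):
    """Steps 103-104 sketch: vertical projection of the filled contour image
    CONT_n, binary segmentation of the projection, search for the longest run
    of 255 and boundary compensation. Column positions are 1-based as in the
    text; returns (obj_front, obj_back, obj_length) or None."""
    img_col = cont_img.shape[1]
    prj = cont_img.astype(np.int64).sum(axis=0)          # CONT_PRJ_n (per-column sums)
    prj_bin = np.where(prj > obj_cont_th, 255, 0)        # CONT_PRJ_BIN_n
    best_len = best_first = best_end = 0
    run_start = None
    for j, v in enumerate(prj_bin, start=1):             # column-by-column scan
        if v == 255 and run_start is None:
            run_start = j
        if (v != 255 or j == img_col) and run_start is not None:
            run_end = j if v == 255 else j - 1
            if run_end - run_start + 1 > best_len:
                best_len, best_first, best_end = run_end - run_start + 1, run_start, run_end
            run_start = None
    if best_len == 0:
        return None                                      # no target area found
    prj_first = max(1, best_first - prj_comp)            # compensation of the longest run
    prj_end = min(img_col, best_end + prj_comp)
    obj_length = prj_end - prj_first + 1
    if obj_length > prj_comp + 1:                        # existence test of step 104
        return prj_first, prj_end, obj_length
    return None
```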
step 105: setting the value of obj_state to 1, indicating that no target is found, and turning to step 112;
step 106: screening the target front boundary position with the following screening formula:
1≤obj_front≤edge_extend*img_col
wherein edge_extend is the expansion ratio of the image boundary in the horizontal direction, its value being greater than 0 and less than 1; if obj_front satisfies the screening formula, the target is considered to be located on the left boundary of the image and not to appear completely, and the flow goes to step 107; otherwise, it goes to step 108;
step 107: setting the value of obj_state to 3, indicating that the target is located on the left boundary of the image and only the target rear boundary position is obtained, and turning to step 112;
step 108: screening the target rear boundary position with the following screening formula:
(1-edge_extend)*img_col≤obj_back≤img_col
if obj_back satisfies the screening formula, the target is considered to be located on the right boundary of the image and not to appear completely, and the flow goes to step 109; otherwise, the target front and rear boundaries are both valid and the flow goes to step 110;
step 109: setting the value of obj_state to 2, indicating that the target is located on the right boundary of the image and only the target front boundary position is obtained, and turning to step 112;
step 110: screening the length of the target in the horizontal direction, wherein the screening formula is as follows:
len_perL,len_perH∈(0,1)&&len_perL<len_perH
wherein len_perL and len_perH are respectively the lower and upper limits of the ratio of the target length to the target contour image width; if obj_length satisfies the screening formula, a complete target is considered to appear and the flow goes to step 111; otherwise, it goes to step 105;
step 111: setting the value of obj_state to 4, indicating that a complete target appears and the target front and rear boundary positions are obtained;
step 112: adjusting the target state and the target front and rear boundary positions according to the target moving direction obj_move; if obj_move is 1, no adjustment is made; otherwise, the following adjustments are made:
when obj_state is 1, no processing is performed;
when obj_state is 2, obj_state is reset to 3, and the values of obj_front and obj_back are exchanged;
when obj_state is 3, obj_state is reset to 2, and the values of obj_front and obj_back are exchanged;
when obj_state is 4, the values of obj_front and obj_back are exchanged.
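A sketch of the state decision of steps 105 to 112 follows; the exact length-screening inequality of step 110 is an assumption (the ratio of obj_length to the target contour image width compared against len_perL and len_perH).

```python
def classify_target(obj_front, obj_back, obj_length, img_col,
                    edge_extend=0.02, len_perL=0.55, len_perH=0.95, obj_move=1):
    """Steps 105-112 sketch: derive obj_state from the boundary positions and
    adjust the state and boundaries for the moving direction obj_move; the
    length-screening inequality of step 110 is an assumption."""
    if obj_front is None or obj_back is None:                 # step 105
        return 1, obj_front, obj_back
    if 1 <= obj_front <= edge_extend * img_col:               # steps 106-107
        obj_state = 3                                         # target on the left boundary
    elif (1 - edge_extend) * img_col <= obj_back <= img_col:  # steps 108-109
        obj_state = 2                                         # target on the right boundary
    elif len_perL <= obj_length / img_col <= len_perH:        # steps 110-111
        obj_state = 4                                         # complete target
    else:
        obj_state = 1                                         # back to step 105
    if obj_move != 1:                                         # step 112 adjustment
        if obj_state == 2:
            obj_state = 3
        elif obj_state == 3:
            obj_state = 2
        if obj_state in (2, 3, 4):
            obj_front, obj_back = obj_back, obj_front
    return obj_state, obj_front, obj_back

# Example with the fig. 4d values of the embodiment (img_col = 245):
# classify_target(10, 196, 187, 245) -> (4, 10, 196)
```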
According to a preferred embodiment, the image frame with the sequence number of the current frame n plus the frame loss number throw is taken as the next frame to be processed; the calculation formula of the number of lost frames throw is as follows:
throw=func_max(throw,0)+1
wherein obj_state is the target detection state flag bit: obj_state = 1 indicates that no target is found, obj_state = 2 indicates that the target is located at the front boundary of the image, obj_state = 3 or obj_state = 4 indicates that the target is located at the rear boundary of the image or is a complete target, and obj_state = 5 indicates a complete target that is about to enter the central area;
obj_front is the front boundary position of the target area, obj_back is the rear boundary position of the target area, pixelv is the target moving speed, img_ctr is the column center of the current image frame, obj_ctr is the column center of the target in the current image frame, and the func_max() function returns the larger of two numbers;
ctr_margin is the frame-loss margin of the complete target and is a positive integer; considering that the target rear boundary predicted value predict is related to the value of throw, predict is updated as follows:
predict=predict-pixelv*throw
predict=func_max(predict,0);
according to a preferred embodiment, the step (4) further determines whether the target is in the image center area, if so, the target is extracted, otherwise, it is predicted whether the target will appear in the image center area in the following mth frame according to the current position of the target and the estimated moving speed of the target, if so, the target is extracted as the effective target by waiting for the following mth frame, and if not, the target in the current frame is determined as the effective target.
According to a preferred embodiment, the specific implementation manner of the step (4) of determining whether the target is in the central region of the image is as follows:
calculating the column center img_ctr of the image frame and the column center obj_ctr of the target in the image frame:
wherein img_col is the image width, and obj_front and obj_back are the target front and rear boundary positions in the current frame image;
it is determined whether the following capture conditions are satisfied:
wherein ctr_frame is the central region frame number, && represents logical AND, pixelv is the target moving speed, obj_state is the target detection state flag bit, and predict is the target rear boundary predicted value; the distance between obj_ctr and img_ctr is converted into a required number of frames through pixelv, and the region within the ctr_frame range centered on img_ctr is defined as the central region;
formula (i) in the capturing conditions indicates that the current target is in the central area; formula (ii) indicates that, although the current target has not entered the central area, it will leave the central area in the next frame image; formula (iii) indicates that the current target has left the central area but the target rear boundary predicted value is 0, that is, the current target has not been captured; if capturing condition (i), (ii) or (iii) is met, the repeated-target judgment process is entered; otherwise the current target is discarded and the value of obj_state is set to 5, indicating that a complete target is about to enter the central region.
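The capture conditions themselves are not reproduced above; the following is a heavily hedged sketch of one possible reading, in which the signed distance between obj_ctr and img_ctr is converted into frames via pixelv, throw (the number of frames skipped before the next processed frame) is an assumed parameter, and the target moving direction is assumed to have been normalized as in step 112.

```python
def in_capture_window(obj_front, obj_back, img_col, pixelv, predict,
                      ctr_frame=2, throw=1):
    """Hedged sketch of capture conditions (i)-(iii) for the central region.

    frames_to_ctr is the signed distance from the target centre to the image
    centre, expressed in frames of movement at speed pixelv; the use of throw
    in condition (ii) is an assumption."""
    if pixelv <= 0:
        return False                                     # speed not yet estimated
    img_ctr = img_col / 2.0                              # column centre of the image frame
    obj_ctr = (obj_front + obj_back) / 2.0               # column centre of the target
    frames_to_ctr = (obj_ctr - img_ctr) / pixelv         # > 0: centre still ahead of the target
    cond_i = abs(frames_to_ctr) <= ctr_frame                       # (i) inside the central region
    cond_ii = (frames_to_ctr > ctr_frame and                       # (ii) not yet inside, but the next
               frames_to_ctr - throw < -ctr_frame)                 #      processed frame is already past it
    cond_iii = frames_to_ctr < -ctr_frame and predict == 0         # (iii) already past, never captured
    return cond_i or cond_ii or cond_iii
```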
According to a preferred embodiment, the production line image frames in step (1) are also down-sampled:
1≤k1+1≤down_row, 1≤k2+1≤down_col, dsize>0
wherein IMG_n is the production line image frame, whose gray value at the pixel in row i and column j of the n-th frame image is IMG_n(i, j); chan is the number of channels of IMG_n; DOWN_n is the down-sampled image, whose pixel gray values are DOWN_n(k1+1, k2+1); k1+1 and k2+1 respectively represent the abscissa and ordinate of the pixel in the down-sampled image; img_row and img_col respectively represent the height and width of the input image; dsize is the down-sampling coefficient; the height and width of the down-sampled image are down_row = [img_row/dsize] and down_col = [img_col/dsize], wherein [ ] indicates rounding down.
A system for capturing a target in a production line video sequence, comprising the following modules:
the first module is used for positioning a target in the production line image frame acquired at the current moment;
the second module is used for entering the third module if the target positioning state indicates that the target is positioned only to the front boundary of the target; if the target positioning state indicates that the complete target is positioned, entering a fourth module; if the target positioning state indicates that the target is not found or is only positioned to the rear boundary of the target, entering a fifth module;
the third module is used for estimating the moving speed and the moving direction of the target by solving the change of the position of the target front boundary in the K frame image if the target positioning states of the current frame and the historical continuous K-1 frames indicate that only the target front boundary is positioned; otherwise, recording the position of the target front boundary and the frame number of the current frame, and switching to a fifth module, wherein K is greater than 1;
a fourth module, predicting the position of the previous target in the current frame image according to the target boundary position of the previous frame and the estimated target moving speed if the target positioning state of the previous frame of the current frame indicates that the complete target is positioned, and giving up the target if the predicted position is matched with the current target positioning, which indicates that the current target is captured; otherwise, the current target is used as a new target, the current frame image is output, and the fifth module is switched to;
and the fifth module is used for waiting for entering the next frame and returning to the first module.
Through the technical scheme, compared with the prior art, the invention has the following beneficial effects:
the method obtains effective information by using the target positioning state, and estimates the moving speed and direction of the target by using historical multi-frame images when the target enters a view field, so that the method can adapt to the change of the moving speed and direction of a production line, and embodies better robustness and real-time response capability; when the target completely appears in the visual field, the position of the previous target in the current image is predicted by using the complete target captured before and the moving speed of the target, and then whether the current target is a new target or not is judged, so that the problem of capturing when the same target appears in the visual field for many times is solved, and the rapid capturing of a plurality of targets in the production line video sequence is realized.
According to a preferred embodiment, the method of binary segmentation, contour extraction, internal filling, vertical projection and the like is adopted in the step (1), and background interference is eliminated according to the geometric characteristics and the centroid coordinates of the candidate region, so as to obtain the front and rear boundary positions of the target. By the method, under the condition of strong interference, the invention can realize the quick separation of the target and the background.
According to a preferred embodiment, the target state and the target front and rear boundary positions are also adjusted in the specific implementation of step (1), so that targets moving in different directions can be handled by the same processing steps, ensuring the simplicity and adaptability of the algorithm.
According to a preferred embodiment, a frame loss number is adopted; it is calculated from the target state and the target front and rear boundary positions and is used to obtain the sequence number of the next image frame to be processed.
According to a preferred embodiment, the image area where the current target is located is judged, and the invention captures only the complete target in the image center area, thereby effectively reducing the influence of uneven illumination, background noise and camera imaging distortion, and improving the accuracy of target capture.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The following description uses several sets of video sequences captured with a JAI industrial camera, with a frame rate of 15 frames/second, an image size of 2448 x 2050, 3 channels, and a gray value range of 0-255 per channel. The moving direction of the target in the image is from right to left, that is, the target enters the field of view from the right boundary of the image, and the boundary of the target that first enters the field of view is the target front boundary.
Initializing parameters: the total frame number of the production line video sequence is N, N being a natural number, and the original production line image frame sequence number n is initialized to 1; the target positioning state obj_state and the previous-frame target positioning state obj_state_last are initialized to 1, the target positioning state having five values: 1 indicates that no target is found, 2 indicates that the target is located on the right boundary of the image, 3 indicates that the target is located on the left boundary of the image, 4 indicates a complete target, and 5 indicates a complete target that is about to enter the central area; the target moving direction obj_move is initialized to 1, the target moving direction having two values: 1 indicates that the target moves from right to left in the image, and 2 indicates that the target moves from left to right in the image; the target rear boundary predicted value predict and the target moving speed pixelv are initialized to 0, the unit of predict being pixels and the unit of pixelv being pixels per frame; all elements of the target front boundary position array upd_val_k and the image frame sequence number array upd_fra_k are removed, k being the index of the elements in the arrays and initialized to 0;
as shown in FIG. 2, the method for capturing targets in a production line video sequence of the invention comprises the following steps:
(1) target positioning: inputting the n-th frame original production line image IMG_n, whose pixel gray values are IMG_n(i, j), wherein subscripts i and j respectively represent the abscissa and ordinate of the pixel in the image, and img_row and img_col respectively represent the height and width of IMG_n. Firstly, binary segmentation and contour extraction are performed on the down-sampled image to obtain an initial set of closed contours, and the closed contour set is screened according to geometric features and centroid positions; secondly, when a target contour exists, contour filling and vertical projection are performed on the target contour, and binary segmentation and column-by-column scanning are performed on the vertical projection result; then, when a target area exists, the head and tail pixel positions of the target area are assigned to obj_front and obj_back respectively, and the target state flag obj_state is obtained according to the target front and rear boundary positions;
(2) if obj_state is 1, the flow goes to step (5); if obj_state is 2, only the target front boundary position is obtained, and the flow goes to step (3); if obj_state is 3, only the target rear boundary position is obtained, and the flow goes to step (5); if obj_state is 4, the target front and rear boundary positions are obtained, and the flow goes to step (4);
(3) estimating the target moving speed and direction: firstly, when obj_state_last is not equal to 2, all elements of upd_val_k and upd_fra_k are removed and k is set to 0; the target front boundary position and the sequence number of the current frame are then recorded according to the following formulas:
upd_val_k=obj_front
upd_fra_k=n
k=k+1
then, if k != upd_num, this step ends, wherein upd_num is the threshold of the quantity of update data; if k = upd_num, the target moving speed pixelv is calculated by the following formula:
wherein upd_dif_k is the average per-frame difference between adjacent update data; when upd_num > 5, k != max && k != min means that the maximum and minimum of upd_dif_k are removed and the arithmetic mean of all remaining upd_dif_k is calculated;
finally, if pixelv is greater than 0, the target moves from left to right in the image and the target moving direction obj_move is set to 2; if pixelv is less than 0, the target moves from right to left in the image, obj_move is set to 1, and pixelv is set to -pixelv; if pixelv is 0, the update data are considered erroneous, all elements of upd_val_k and upd_fra_k are removed, and k is set to 0;
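A sketch of the speed and direction estimation of step (3) follows; the per-frame difference upd_dif_k, the trimmed mean for upd_num > 5, and the sign convention follow the description above, while the simple mean for upd_num <= 5 and the default upd_num = 5 are assumptions.

```python
def estimate_speed(upd_val, upd_fra, upd_num=5):
    """Sketch of the pixelv / obj_move estimation of step (3).

    upd_val: recorded target front boundary positions; upd_fra: the
    corresponding frame sequence numbers. Returns (pixelv, obj_move) or (None, None)."""
    if len(upd_val) != upd_num or upd_num < 2:
        return None, None
    # average per-frame change of the front boundary between adjacent records
    upd_dif = [(upd_val[k] - upd_val[k - 1]) / (upd_fra[k] - upd_fra[k - 1])
               for k in range(1, upd_num)]
    if upd_num > 5:
        trimmed = sorted(upd_dif)[1:-1]        # remove the maximum and the minimum
        pixelv = sum(trimmed) / len(trimmed)
    else:
        pixelv = sum(upd_dif) / len(upd_dif)   # simple mean (assumption for upd_num <= 5)
    if pixelv > 0:
        return pixelv, 2                       # target moves from left to right
    if pixelv < 0:
        return -pixelv, 1                      # target moves from right to left
    return None, None                          # zero speed: the update data are discarded

# Example: front boundary moving left by roughly 40 pixels per frame
print(estimate_speed([230, 190, 151, 110, 70], [3, 4, 5, 6, 7]))   # (40.0, 1)
```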
(4) judging whether the current target is a captured repeated target: if a complete target appeared in the previously processed image frame, the position of the previous target in the current frame image is predicted according to the target rear boundary position recorded in the previous processing and the target moving speed; if the predicted position matches the current target, the current target has already been captured and is discarded; otherwise, the current target is considered a new target and the current frame image IMG_n is output;
specifically, it is judged whether predict > 0 && obj_front < predict - pixelv holds; if the condition is met, the current target is a repeated target, otherwise the current target is a new target; the target rear boundary predicted value is then updated as predict = obj_back - pixelv;
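A sketch of this repeated-target test and of the update of the rear boundary predicted value predict, as described above:

```python
def is_repeated_target(obj_front, obj_back, predict, pixelv):
    """Sketch of the repeated-target test of step (4): the current target is
    treated as already captured when its front boundary falls inside the
    region predicted from the previously captured target; also returns the
    updated rear boundary predicted value for the next frame."""
    repeated = predict > 0 and obj_front < predict - pixelv
    new_predict = obj_back - pixelv            # predict = obj_back - pixelv
    return repeated, new_predict
```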
(5) setting obj_state_last to obj_state; if obj_state is not 4, the value of predict is set to 0; adding 1 to the original image frame sequence number n; if n is greater than N, the processing flow ends; otherwise, the flow goes to step (1) to wait for the next frame.
According to a preferred embodiment, the specific implementation manner of the target location in step (1) is as follows:
step 101: performing binary segmentation and contour extraction on the current production line image frame IMG_n;
the specific formula of binary segmentation is as follows:
wherein obj_th is the segmentation threshold of the current frame, with a value range of 0-255; obj_type is the segmentation direction, a value of 0 indicating forward segmentation and a value of 1 indicating reverse segmentation; && indicates that the two conditions on its left and right are satisfied simultaneously; the result is the segmented pixel value at row i, column j of the n-th frame image;
the operation process of contour extraction is as follows: selecting a structural element of height ele_m and width ele_n, performing binary dilation on the segmentation result for ele_t times to obtain the dilated binary image DILATE_n; performing contour extraction on DILATE_n to obtain an initial set of closed outer contours, the number of which is cont_size;
processing fig. 4b with obj_th = 55, obj_type = 1, ele_m = 1, ele_n = 4 and ele_t = 1, the resulting closed outer contour set is shown in fig. 4c, with different contours distinguished by color;
processing fig. 5b with obj_th = 120, obj_type = 0, ele_m = 0, ele_n = 0 and ele_t = 0, the resulting closed outer contour set is shown in fig. 5c;
processing fig. 6b with obj_th = 62, obj_type = 1, ele_m = 0, ele_n = 0 and ele_t = 0, the resulting closed outer contour set is shown in fig. 6c;
step 102: screening the closed outer contours to reduce the influence of illumination and the background environment; the specific operation is as follows: calculating the area of each closed outer contour to obtain the maximum contour area cont_area_cmax, wherein cmax is the index of the maximum-area contour in the set; calculating the contour centroid position to obtain the row coordinate cont_ctr_cmax(x) and the column coordinate cont_ctr_cmax(y); the screening formula is as follows:
wherein area_rate_min and area_rate_max are respectively the lower and upper limits of the ratio of the contour area to the image area, and ctr_rate_min and ctr_rate_max are respectively the lower and upper limits of the ratio of the contour centroid position to the image height and width; if the contour area and the contour centroid position meet the formula conditions, the target contour is obtained and the flow goes to step 103; otherwise, it goes to step 105;
taking area_rate_min = 0.1, area_rate_max = 0.9, ctr_rate_min = 0.1 and ctr_rate_max = 0.9, with the height and width of the down-sampled image being down_row = 205 and down_col = 245, respectively;
in fig. 4c, the maximum contour area and the row and column coordinates of its centroid position are cont_area_max = 13440, cont_ctr_max(x) = 98 and cont_ctr_max(y) = 105, wherein the position of the "+" sign is the centroid position of the contour; the contour with the largest area and its centroid position both meet the formula conditions;
in fig. 5c, the maximum contour area and the row and column coordinates of its centroid position are cont_area_max = 17739, cont_ctr_max(x) = 81 and cont_ctr_max(y) = 105; the contour with the largest area and its centroid position both meet the formula conditions;
in fig. 6c, the maximum contour area and the row and column coordinates of its centroid position are cont_area_max = 15181, cont_ctr_max(x) = 89 and cont_ctr_max(y) = 128; the contour with the largest area and its centroid position both meet the formula conditions;
step 103: carrying out contour filling and vertical projection on the target contour: filling the target contour and its internal area with a gray value of 255 to obtain the target contour image CONT_n; carrying out vertical projection on CONT_n by accumulating the pixel values in each column to obtain the projection image CONT_PRJ_n, whose height is 1; carrying out binary segmentation on CONT_PRJ_n with the projection image segmentation threshold obj_cont_th, where pixel values greater than obj_cont_th are set to 255 and all others to 0, obtaining the projection segmentation image CONT_PRJ_BIN_n; scanning CONT_PRJ_BIN_n column by column to find the longest continuous region whose pixel values are 255, obtaining the head and tail pixel positions prj_first and prj_end of the region; the longest continuous region is then compensated by the following formulas:
prj_first=func_max(1,prj_first-prj_comp)
prj_end=func_min(img_col,prj_end+prj_comp)
wherein prj_comp is the projection region compensation value, the func_max() function returns the larger of two numbers, and the func_min() function returns the smaller of two numbers;
filling the target contour in fig. 4c yields the target contour image shown in fig. 4d; taking obj_cont_th = 765 and prj_comp = 5 gives prj_first = 10 and prj_end = 196;
filling the target contour in fig. 5c yields the target contour image shown in fig. 5d; taking obj_cont_th = 765 and prj_comp = 15 gives prj_first = 1 and prj_end = 215;
filling the target contour in fig. 6c yields the target contour image shown in fig. 6d; taking obj_cont_th = 765 and prj_comp = 5 gives prj_first = 36 and prj_end = 225;
step 104: judging the head and tail pixel positions of the longest continuous region: the target horizontal length is obj_length = prj_end - prj_first + 1; if obj_length is greater than prj_comp + 1, the target region is considered to exist, its head and tail pixel positions are assigned to obj_front and obj_back respectively, and the flow goes to step 106; otherwise, it goes to step 105;
in the target contour image shown in fig. 4d, the target horizontal length is obj_length = 187, which meets the target area existence condition, giving obj_front = 10 and obj_back = 196;
in the target contour image shown in fig. 5d, the target horizontal length is obj_length = 215, which meets the target area existence condition, giving obj_front = 1 and obj_back = 215;
in the target contour image shown in fig. 6d, the target horizontal length is obj_length = 190, which meets the target area existence condition, giving obj_front = 36 and obj_back = 225;
step 105: setting the value of obj_state to 1, indicating that no target is found, and turning to step 112;
step 106: screening the target front boundary position with the following screening formula:
1≤obj_front≤edge_extend*img_col
wherein edge_extend is the expansion ratio of the image boundary in the horizontal direction, its value being greater than 0 and less than 1; if obj_front satisfies the screening formula, the target is considered to be located on the left boundary of the image and not to appear completely, and the flow goes to step 107; otherwise, it goes to step 108;
the expansion ratio of the image boundary in the horizontal direction is taken as edge_extend = 0.02;
in the target contour image shown in fig. 4d, obj_front = 10 does not satisfy the screening condition, so the target is not considered to be located on the left boundary of the image;
in the target contour image shown in fig. 5d, obj_front = 1 satisfies the screening condition, so the target is considered to be located on the left boundary of the image;
in the target contour image shown in fig. 6d, obj_front = 36 does not satisfy the screening condition, so the target is not considered to be located on the left boundary of the image;
step 107: setting the value of obj_state to 3, indicating that the target is located on the left boundary of the image and only the target rear boundary position is obtained, and turning to step 112;
step 108: screening the target rear boundary position with the following screening formula:
(1-edge_extend)*img_col≤obj_back≤img_col
if obj_back satisfies the screening formula, the target is considered to be located on the right boundary of the image and not to appear completely, and the flow goes to step 109; otherwise, the target front and rear boundaries are both valid and the flow goes to step 110;
in the target contour image shown in fig. 4d, obj_back = 196 does not satisfy the screening condition, so the target is not considered to be located on the right boundary of the image;
in the target contour image shown in fig. 6d, obj_back = 225 does not satisfy the screening condition, so the target is not considered to be located on the right boundary of the image;
step 109: setting the value of obj_state to 2, indicating that the target is located on the right boundary of the image and only the target front boundary position is obtained, and turning to step 112;
step 110: screening the length of the target in the horizontal direction, wherein the screening formula is as follows:
len_perL,len_perH∈(0,1)&&len_perL<len_perH
wherein len_perL and len_perH are respectively the lower and upper limits of the ratio of the target length to the target contour image width; if obj_length satisfies the screening formula, a complete target is considered to appear and the flow goes to step 111; otherwise, it goes to step 105;
taking len_perL = 0.55 and len_perH = 0.95;
in the target contour image shown in fig. 4d, the ratio of the target length to the target contour image width is 0.763, which satisfies the screening formula, so a complete target is considered to appear; as shown in fig. 4e, the position of the "yellow line" in the image is the target horizontal position obtained after the algorithm processing;
in the target contour image shown in fig. 6d, the ratio of the target length to the target contour image width is 0.776, which satisfies the screening formula, so a complete target is considered to appear, as shown in fig. 6e;
step 111: setting the value of obj_state to 4, indicating that a complete target appears and the target front and rear boundary positions are obtained;
step 112: adjusting the target state and the target front and rear boundary positions according to the target moving direction obj_move; if obj_move is 1, no adjustment is made; otherwise, the following adjustments are made:
when obj_state is 1, no processing is performed;
when obj_state is 2, obj_state is reset to 3, and the values of obj_front and obj_back are exchanged;
when obj_state is 3, obj_state is reset to 2, and the values of obj_front and obj_back are exchanged;
when obj_state is 4, the values of obj_front and obj_back are exchanged.
According to a preferred embodiment, the image frame with the sequence number of the current frame n plus the frame loss number throw is taken as the next frame to be processed; the calculation formula of the number of lost frames throw is as follows:
throw=func_max(throw,0)+1
wherein obj_state = 1 indicates that no target is found, obj_state = 2 indicates that the target is located at the front boundary of the image, obj_state = 3 or 4 indicates that the target is located at the rear boundary of the image or is a complete target, and obj_state = 5 indicates that a complete target is about to enter the central region;
obj_front is the front boundary position of the target area, obj_back is the rear boundary position of the target area, pixelv is the target moving speed, img_ctr is the column center of the current image frame, obj_ctr is the column center of the target in the current image frame, and the func_max() function returns the larger of two numbers;
ctr_margin is the frame-loss margin of the complete target and is a positive integer; the frame-loss margin is taken as ctr_margin = 2; considering that the target rear boundary predicted value predict is related to the value of throw, predict is updated as follows:
predict=predict-pixelv*throw
predict=func_max(predict,0);
according to a preferred embodiment, step (4) captures only the target located in the central region of the image, and the specific capturing conditions are as follows:
calculating the column center img_ctr of the image frame and the column center obj_ctr of the target in the image frame:
wherein img_col is the image width, and obj_front and obj_back are the target front and rear boundary positions in the current frame image;
it is determined whether the following capture conditions are satisfied:
wherein ctr_frame is the central region frame number, && represents logical AND, pixelv is the target moving speed, obj_state is the target detection state flag bit, and predict is the target rear boundary predicted value; the distance between obj_ctr and img_ctr is converted into a required number of frames through pixelv, and the region within the ctr_frame range centered on img_ctr is defined as the central region; the central region frame number is taken as ctr_frame = 2.
Formula (i) in the capturing conditions indicates that the current target is in the central area; formula (ii) indicates that, although the current target has not entered the central area, it will leave the central area in the next frame image; formula (iii) indicates that the current target has left the central area but the target rear boundary predicted value is 0, that is, the current target has not been captured; if capturing condition (i), (ii) or (iii) is met, the repeated-target judgment process is entered; otherwise the current target is discarded and the value of obj_state is set to 5, indicating that a complete target is about to enter the central region.
According to a preferred embodiment, the production line image frames in step (1) are first down-sampled:
1≤k1+1≤down_row, 1≤k2+1≤down_col, dsize>0
wherein IMG_n is the production line image frame, whose gray value at the pixel in row i and column j of the n-th frame image is IMG_n(i, j); chan is the number of channels of IMG_n; DOWN_n is the down-sampled image, whose pixel gray values are DOWN_n(k1+1, k2+1); k1+1 and k2+1 respectively represent the abscissa and ordinate of the pixel in the down-sampled image; img_row and img_col respectively represent the height and width of the input image; dsize is the down-sampling coefficient; the height and width of the down-sampled image are down_row = [img_row/dsize] and down_col = [img_col/dsize], wherein [ ] indicates rounding down;
specifically, k1 and k2 are both 0 in the initial state; k1 and k2 are traversed in progressive-scan order, and the pixel value at position (1+k1*dsize, 1+k2*dsize) of the input image is assigned to position (k1+1, k2+1) of the down-sampled image; if the input image has multiple channels, the arithmetic mean of the pixel values of the channels is assigned; the parameters of the original production line image frame are img_row = 2050, img_col = 2448 and chan = 3, the down-sampling coefficient is dsize = 10, the size of the down-sampled image is down_row = 205 and down_col = 245, and the number of channels is 1.
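A minimal numpy sketch of this down-sampling, assuming that 0-based strided slicing reproduces the 1-based sampling grid described above:

```python
import numpy as np

def downsample(img, dsize=10):
    """Down-sampling sketch: the input pixel at (1 + k1*dsize, 1 + k2*dsize)
    (1-based, as described above) is copied to position (k1+1, k2+1) of the
    down-sampled image; multi-channel pixels are replaced by the arithmetic
    mean of their channels. 0-based numpy strided slicing is used here."""
    down = img[::dsize, ::dsize]               # every dsize-th pixel, starting at (1, 1)
    if down.ndim == 3:                         # chan > 1: average the channels
        down = down.mean(axis=2)
    return down.astype(np.uint8)

# Example with the embodiment parameters (img_row = 2050, img_col = 2448, chan = 3, dsize = 10):
if __name__ == "__main__":
    frame = np.zeros((2050, 2448, 3), dtype=np.uint8)
    print(downsample(frame).shape)             # (205, 245)
```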
A system for capturing a target in a production line video sequence, comprising the following modules:
the first module is used for positioning a target in the production line image frame acquired at the current moment;
the second module is used for entering the third module if the target positioning state indicates that the target is positioned only to the front boundary of the target; if the target positioning state indicates that the complete target is positioned, entering a fourth module; if the target positioning state indicates that the target is not found or is only positioned to the rear boundary of the target, entering a fifth module;
the third module is used for estimating the moving speed and the moving direction of the target by solving the change of the position of the target front boundary in the K frame image if the target positioning states of the current frame and the historical continuous K-1 frames indicate that only the target front boundary is positioned; otherwise, recording the position of the target front boundary and the frame number of the current frame, and switching to a fifth module, wherein K is greater than 1;
a fourth module, predicting the position of the previous target in the current frame image according to the target boundary position of the previous frame and the estimated target moving speed if the target positioning state of the previous frame of the current frame indicates that the complete target is positioned, and giving up the target if the predicted position is matched with the current target positioning, which indicates that the current target is captured; otherwise, the current target is used as a new target, the current frame image is output, and the fifth module is switched to;
and the fifth module is used for waiting for entering the next frame and returning to the first module.
In order to evaluate the performance of the invention, different models of mobile phone target video sequences are adopted for testing.
Fig. 4a is an original image acquired by the JAI camera; the model of the mobile phone target is Model1; it can be seen that there is strong interfering background noise, and the black area of the mobile phone target is used as the detection area. Fig. 4a is down-sampled to obtain fig. 4b; fig. 4b is subjected to binary segmentation and contour extraction to obtain fig. 4c; fig. 4c yields fig. 4d after contour geometry and centroid position screening and contour filling; fig. 4d is processed by vertical projection segmentation, column-by-column scanning and target area judgment to obtain the target front and rear boundary positions and the target state flag; fig. 4e is the target positioning effect diagram. The processing results of fig. 4a are as follows:
obj_front=10,obj_back=196,obj_state=4
Fig. 4f shows a mobile phone target of the same model as in fig. 4a, but without the mobile phone accessory that is installed on the target in fig. 4a; processing fig. 4f with the parameters of fig. 4a gives the target positioning effect diagram shown in fig. 4g, with the results obj_front = 46, obj_back = 233 and obj_state = 4.
The meanings of the images in figs. 5a to 5g are the same as those of figs. 4a to 4g, respectively; the model of the mobile phone target is Model2 and the white area of the mobile phone target is used as the detection area; considering the black border of the mobile phone target, a larger projection region compensation prj_comp = 15 is used, and the processing results of fig. 5a are as follows:
obj_front=1,obj_back=215,obj_state=1
processing fig. 5f using the parameters of fig. 5a yields the target positioning effect diagram shown in fig. 5g, with the results obj_front = 10, obj_back = 231 and obj_state = 4.
The meanings of the images in figs. 6a to 6g are the same as those of figs. 4a to 4g, respectively; the model of the mobile phone target is Model3; it can be seen that there are uneven illumination, background interference and multiple objects in the original image; the black area of the mobile phone target is used as the detection area, and the processing results of fig. 6a are as follows:
obj_front=36,obj_back=225,obj_state=4
processing fig. 6f using the parameters of fig. 6a yields the target positioning effect diagram shown in fig. 6g, with the results obj_front = 32, obj_back = 239 and obj_state = 4.
The experimental results show that the method can cope with strong background interference, uneven illumination and the appearance of multiple targets, can process different models of targets, and can adapt to targets of the same model in different processing states, exhibiting strong robustness. In addition, the large-ratio down-sampling, adaptive frame loss and adaptive updating of the moving speed and direction ensure the rapidity of target detection and meet the real-time requirements of the production line, while the adaptive updating of the boundary predicted value ensures that a complete target is not detected repeatedly; these cases are not illustrated one by one.
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.