
CN107346538A - Object tracking method and device - Google Patents

Object tracking method and device

Info

Publication number
CN107346538A
Authority
CN
China
Prior art keywords: current frame, frame image, Bayesian, velocity vector, occluded
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610295085.XA
Other languages
Chinese (zh)
Inventor
诸加丹
王千
庞勃
王刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd filed Critical Ricoh Co Ltd
Priority to CN201610295085.XA
Publication of CN107346538A
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30232 Surveillance

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

An object tracking method and device are provided. The method includes: predicting the position of an object in the current frame image and the motion velocity vector at that position according to the historical motion information of the object; performing kernel correlation filtering along the motion velocity vector to detect the position of the object in the current frame image; judging whether the object is occluded in the current frame; if it is not occluded, taking the detected position as the position of the object in the current frame image; otherwise, performing Bayesian verification on the predicted position and the detection result of the kernel correlation filtering, and taking the Bayesian decision result as the position of the object in the current frame. When the object is occluded, the method and device predict its position from its historical motion information and correct the prediction through Bayesian verification, so that a more reliable object position is obtained even under occlusion, which improves tracking accuracy and enables long-term tracking.

Description

Object tracking method and device
Technical Field
The present disclosure relates generally to image processing, and more particularly to object tracking methods and apparatus.
Background
Object tracking is a basic function of image processing, with important applications in many areas such as video surveillance and human-computer interaction. In recent years, object tracking has advanced greatly with the development of technology. However, in practical applications it still faces many challenges, such as lighting changes in the scene, occlusion of objects, and changes in the appearance of the objects themselves.
KCF (Kernel Correlation Filter) is a relatively new single-target tracking method with high running speed and good tracking performance. However, in practical applications KCF still faces challenges such as changes in the size of the target object, occlusion of the target object, and tracking in very complicated scenes. For example, fig. 1 shows an example of tracking in a real scene with KCF. The target object to be tracked is outlined by a rectangular box in the leftmost image of fig. 1; in the rightmost image, the tracking result is wrong and the target object is lost. Thus, in a dense environment, KCF tracking fails when the target object is occluded and there are other objects nearby whose appearance is close to its own.
Disclosure of Invention
According to an embodiment of one aspect of the present disclosure, there is provided an object tracking method including: predicting the position of an object in the current frame image and the motion velocity vector at that position according to the historical motion information of the object; performing kernel correlation filtering along the motion velocity vector to detect the position of the object in the current frame image; judging whether the object is occluded in the current frame; if the object is not occluded in the current frame, taking the detected position as the position of the object in the current frame image; otherwise, performing Bayesian verification on the predicted position and the detection result of the kernel correlation filtering, and taking the Bayesian decision result as the position of the object in the current frame.
According to an embodiment of another aspect of the present disclosure, there is provided an object tracking apparatus including: a prediction unit configured to predict the position of an object in the current frame image and the motion velocity vector at that position according to the historical motion information of the object; a detection unit configured to perform kernel correlation filtering along the motion velocity vector to detect the position of the object in the current frame image; a judging unit configured to judge whether the object is occluded in the current frame; and a determining unit configured to take the detected position as the position of the object in the current frame image if the object is not occluded in the current frame, and otherwise to perform Bayesian verification on the predicted position and the detection result of the kernel correlation filtering and take the Bayesian decision result as the position of the object in the current frame.
According to an embodiment of another aspect of the present disclosure, there is provided an object tracking apparatus including: a processor; a memory; and computer program instructions stored in the memory. The computer program instructions, when executed by the processor, perform the steps of: predicting the position of the object in the current frame image and the motion velocity vector at that position according to the historical motion information of the object; performing kernel correlation filtering along the motion velocity vector to detect the position of the object in the current frame image; judging whether the object is occluded in the current frame; if the object is not occluded in the current frame, taking the detected position as the position of the object in the current frame image; otherwise, performing Bayesian verification on the predicted position and the detection result of the kernel correlation filtering, and taking the Bayesian decision result as the position of the object in the current frame.
The object tracking method and device adopt KCF to track the object; when the object is occluded, the position of the object is predicted from its historical motion information and the predicted position is corrected through Bayesian verification, so that a more reliable object position can be obtained even if the object is occluded, thereby improving tracking accuracy and enabling the object to be tracked for a long time.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 shows an example scenario for tracking in a real scene with KCF.
FIG. 2 shows a flow diagram of an object tracking method according to an embodiment of the present disclosure.
Fig. 3 illustrates a flowchart of KCF detection along a predicted motion velocity vector in an object tracking method according to an embodiment of the present disclosure.
Fig. 4 shows a confidence map representing exemplary detection results obtained by KCF detection.
Fig. 5 shows a flowchart of bayesian verification processing on a predicted location and a detection result of performing kernel correlation filtering in an object tracking method according to an embodiment of the present disclosure.
Fig. 6 shows an exemplary prior probability.
Fig. 7 shows a functional configuration block diagram of an object tracking apparatus according to an embodiment of the present disclosure.
FIG. 8 illustrates a block diagram of a computing device for implementing an exemplary object tracking device in accordance with embodiments of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
As described earlier, in a dense environment, when a target object is occluded and there are other objects with similar appearance beside it, KCF detection may be erroneous, resulting in tracking failure. Therefore, in the present disclosure, when the target object is occluded, the position detected by KCF is not employed as the tracking result; instead, Bayesian verification is performed on the position predicted from the historical motion information of the target object and the KCF detection result, and the Bayesian decision result is employed as the tracking result, so that a more reliable object position can be obtained even when the target object is occluded.
An object tracking method according to an embodiment of the present disclosure is described below with reference to fig. 2. FIG. 2 shows a flow diagram of an object tracking method according to an embodiment of the present disclosure.
As shown in fig. 2, in step S210, the position of the object in the current frame image and the motion velocity vector at the position are predicted according to the historical motion information of the object.
The prediction may be performed in this step by any suitable method in the art, such as Kalman filtering, particle filtering, and the like. For convenience of explanation, prediction using Kalman filtering is described as an example below.
A Kalman filter is used to model the motion of the object. As is well known, the working principle of a Kalman filter can be represented by the following equations:

X′(t|t-1) = A·X(t-1|t-1)                                  (1)
P(t|t-1) = A·P(t-1|t-1)·A^T + Q                           (2)
Kg(t) = P(t|t-1)·H^T·(H·P(t|t-1)·H^T + R)^(-1)            (3)
X(t) = X′(t|t-1) + Kg(t)·(Y(t) - H·X′(t|t-1))             (4)
P(t) = (I - Kg(t)·H)·P(t|t-1)                             (5)

where X represents the system state value, X = [x, y, vx, vy]^T, in which [x, y]^T is the position of the object in the image frame and [vx, vy]^T is the motion velocity vector of the object at [x, y]; Y denotes the measured value; Kg(t) is the error gain; and the subscripts t and t-1 denote time (in this disclosure, time t corresponds to the current frame image and t-1 to the previous frame image). A denotes the state transition matrix, P the estimate error covariance, H the observation model, and Q and R the variances of the system process noise and the measurement noise, respectively. The initial values of P, H, Q and R can be set empirically according to the specific application scenario.
Equation (1) above describes the prediction process of the Kalman filter, where X′(t|t-1) is the predicted system state value at time t; equations (2)-(5) describe the Kalman filter update process, where X(t) is the estimated system state value at time t.
It should be noted that the current frame image in this step is an image frame other than the first frame in the image frame sequence containing the tracked object. In the first image frame of the sequence, the initial position of the object may be determined by any object detection method or by manual specification. In addition, for convenience of description, in the present embodiment an object is represented by its circumscribed rectangular box in the image frame, and the coordinates of the center point of the box are taken as the position coordinates of the object. For example, assume that the rectangular box shown in the leftmost image of fig. 1 is the initial position of the object determined in the first frame of the image frame sequence.
Furthermore, the image frame sequence containing the tracked object may be obtained by shooting with a camera or by sensing with various other sensors.
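To make equations (1)-(5) concrete, the following is a minimal constant-velocity Kalman filter sketch in Python/NumPy. The class name and the initial values chosen for P, Q and R are illustrative assumptions only; as noted above, the disclosure leaves these to be set empirically for the application scenario.

```python
import numpy as np

class ConstantVelocityKalman:
    """Minimal constant-velocity Kalman filter over X = [x, y, vx, vy]^T,
    following equations (1)-(5). Illustrative sketch, not the patent's code."""

    def __init__(self, x, y, dt=1.0):
        self.X = np.array([x, y, 0.0, 0.0])             # state: position + velocity
        self.A = np.array([[1, 0, dt, 0],               # state transition matrix
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],                # observation model:
                           [0, 1, 0, 0]], dtype=float)  # only position is measured
        # Assumed initial values; the disclosure sets these empirically.
        self.P = np.eye(4) * 10.0                       # estimate error covariance
        self.Q = np.eye(4) * 0.01                       # process noise variance
        self.R = np.eye(2) * 1.0                        # measurement noise variance

    def predict(self):
        """Equations (1)-(2): predict the state for the current frame."""
        self.X = self.A @ self.X
        self.P = self.A @ self.P @ self.A.T + self.Q
        return self.X[:2].copy(), self.X[2:].copy()     # position, velocity vector

    def update(self, measured_xy):
        """Equations (3)-(5): correct the prediction with a measurement Y."""
        Y = np.asarray(measured_xy, dtype=float)
        S = self.H @ self.P @ self.H.T + self.R
        Kg = self.P @ self.H.T @ np.linalg.inv(S)       # error gain, equation (3)
        self.X = self.X + Kg @ (Y - self.H @ self.X)    # equation (4)
        self.P = (np.eye(4) - Kg @ self.H) @ self.P     # equation (5)
        return self.X[:2].copy()

# Example: predict the object's position and velocity for the current frame.
kf = ConstantVelocityKalman(x=100.0, y=50.0)
predicted_pos, predicted_vel = kf.predict()
```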
In step S220, kernel-dependent filtering is performed along the motion velocity vector to detect the position of the object in the current frame image.
As mentioned above, kernel correlation filtering (KCF) is a single-target tracking method; for a detailed description, refer to the article "High-Speed Tracking with Kernelized Correlation Filters" by João F. Henriques, Rui Caseiro, Pedro Martins and Jorge Batista, published in IEEE Transactions on Pattern Analysis and Machine Intelligence, November 2014, which is incorporated herein by reference in its entirety. To facilitate understanding, the KCF method is briefly described below. In general, the working principle of KCF can be expressed by the following expressions:

f(y_t) = F^(-1)(F(k(z, y_t)) ⊙ F(α))                      (6)
α = F^(-1)(α̂)                                             (7)
α̂ = F(b) / (F(k(z, z)) + λ)                               (8)

where ⊙ indicates the element-wise dot product, y_t is one or more samples taken from an image frame, and F and F^(-1) denote the discrete Fourier transform and the inverse discrete Fourier transform, respectively. The KCF detection result f(y_t) consists of multiple detected candidate values and the confidence of each candidate value.
Here z is the training sample, b is the regression objective function, and λ is a constant; k is the kernel matrix, whose elements are given, for a Gaussian kernel, by the formula:

k(n, n′) = exp(-(1/σ²)(‖n‖² + ‖n′‖² - 2·F^(-1)(F(n) ⊙ F*(n′))))   (9)

where n and n′ denote any two vectors and F*(·) denotes the complex conjugate of the discrete Fourier transform.
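To make expressions (6)-(9) concrete, the following NumPy sketch implements the standard KCF building blocks of the cited Henriques et al. paper: Gaussian kernel correlation computed via the FFT, ridge-regression training in the Fourier domain, and the detection response. The patch size and the σ and λ values are illustrative assumptions, not values from the disclosure.

```python
import numpy as np

def gaussian_correlation(n, n2, sigma=0.5):
    """Gaussian kernel correlation of two patches, expression (9)."""
    c = np.fft.ifft2(np.fft.fft2(n) * np.conj(np.fft.fft2(n2))).real
    d = (n ** 2).sum() + (n2 ** 2).sum() - 2.0 * c
    return np.exp(-np.maximum(d, 0) / (sigma ** 2 * n.size))

def train(z, b, lam=1e-4):
    """Ridge-regression training in the Fourier domain, expressions (7)-(8)."""
    k = gaussian_correlation(z, z)
    return np.fft.fft2(b) / (np.fft.fft2(k) + lam)     # alpha_hat

def detect(alpha_hat, z, patch):
    """Detection response over all cyclic shifts of the patch, expression (6)."""
    k = gaussian_correlation(patch, z)
    return np.fft.ifft2(np.fft.fft2(k) * alpha_hat).real  # confidence map

# Toy usage: train on one patch, then evaluate the response on a shifted copy.
rng = np.random.default_rng(0)
z = rng.standard_normal((64, 64))
b = np.zeros((64, 64)); b[0, 0] = 1.0                  # regression target peak
alpha_hat = train(z, b)
confidence = detect(alpha_hat, z, np.roll(z, 3, axis=0))
dy, dx = np.unravel_index(confidence.argmax(), confidence.shape)
```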
In this step S220, KCF detection is performed along the motion velocity vector predicted in step S210. An exemplary process of this step will be described below with reference to fig. 3.
As shown in fig. 3, in step S2201, at least one sample y_t is extracted along the motion velocity vector at predetermined intervals.
The motion velocity vector represents the direction of motion of the object. In this step, samples (i.e., rectangular boxes representing the object) are extracted along this direction, with points on the motion velocity vector taken as the center points of the rectangular boxes. The predetermined interval may be set arbitrarily; as an example, it may be set to 1/2 of the width of the rectangular box representing the object.
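A minimal sketch of this sampling step follows, assuming the predetermined interval is half the box width as suggested above; the sample count and the fallback for a zero velocity vector are illustrative choices, not part of the disclosure.

```python
import numpy as np

def sample_centers_along_velocity(pos, vel, box_w, n_samples=5):
    """Take sample-box center points along the motion velocity vector,
    spaced by half the width of the rectangular box representing the object."""
    v = np.asarray(vel, dtype=float)
    norm = np.linalg.norm(v)
    if norm == 0:                           # object predicted stationary:
        return [np.asarray(pos, dtype=float)]  # fall back to a single sample
    step = (box_w / 2.0) * v / norm         # predetermined interval = box_w / 2
    return [np.asarray(pos, dtype=float) + i * step for i in range(n_samples)]

# Example: centers starting at (100, 50), moving along velocity (2, 1).
centers = sample_centers_along_velocity((100.0, 50.0), (2.0, 1.0), box_w=32)
```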
In step S2202, kernel correlation filtering (KCF) is performed on the samples y_t to determine a plurality of candidate positions of the object in the current frame image and the confidence of each candidate position.
In this step, KCF is performed for each sample y_t as shown in expressions (6) to (9) above, thereby obtaining the KCF detection results, i.e., a plurality of candidate positions of the object in the current frame image and the confidence of each candidate position.
Fig. 4 shows a confidence map representing exemplary detection results obtained by KCF detection. As shown in fig. 4, the position of each pixel in the dashed box represents a detected candidate position, and the value of each pixel represents the confidence of the candidate position at that pixel: darker colors indicate higher confidence and lighter colors lower confidence.
In step S2203, the candidate position with the highest degree of confidence among the plurality of candidate positions is selected as the position of the detected object in the current frame image.
A higher confidence indicates a higher probability that the candidate position is the position of the object in the current frame image. Thus, in this step, the candidate position with the highest confidence is selected as the position of the detected object in the current frame image.
The process of step S220 has been described above in conjunction with figs. 3 and 4. Optionally, the candidate position with the highest confidence selected in step S2203 may be processed further, with the result taken as the position of the object in the current frame image. Specifically, the candidate position with the highest confidence may be used as the measurement value Y of the current frame, the Kalman filter may be updated as shown in equation (4), and the position [x, y]^T in the resulting system state vector X of the current frame may be used as the position of the object in the current frame image. This processing makes the tracking result smoother.
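In terms of the hypothetical ConstantVelocityKalman sketch given for step S210, this optional smoothing amounts to feeding the highest-confidence KCF candidate in as the measurement Y of the current frame; the candidate coordinates below are toy values.

```python
# Continuing the ConstantVelocityKalman sketch from step S210: use the
# highest-confidence KCF candidate as the measurement Y of the current frame,
# so equation (4) yields the smoothed position [x, y]^T of the state X.
best_candidate = (103.0, 51.5)            # toy value standing in for the KCF result
smoothed_xy = kf.update(best_candidate)
```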
In step S230, it is determined whether the object is occluded in the current frame.
Various suitable methods may be employed in this step to determine whether the object is occluded. For example, as a basic method, occlusion can be detected by checking whether the foreground area in the current frame has shrunk. In the present embodiment, as an example, whether the object is occluded is determined by the Peak-to-Sidelobe Ratio (PSR).
Specifically, a peak-to-side lobe ratio of a confidence map representing the detection result is calculated according to a plurality of candidate positions of the object detected in step S220 in the current frame image and the confidence of each candidate position; if the peak-to-side lobe ratio is greater than a predetermined threshold, determining that the object is occluded in the current frame, otherwise determining that the object is not occluded in the current frame.
The peak-to-sidelobe ratio is an image processing measure commonly used in the art, and can be calculated as in the following equation:

PSR = (g_max - μ_sl) / σ_sl                               (10)

where g_max is the maximum value in the confidence map, and μ_sl and σ_sl are respectively the mean and standard deviation of the sidelobe part of the confidence map. The sidelobe part is the remaining portion of the confidence map excluding the peak part, where the peak part is the portion centered at the maximum value whose energy amounts to a predetermined ratio (e.g., 80%) of the global energy.
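A sketch of the PSR computation of equation (10) follows. Here the peak part is grown as a square window around the maximum until it reaches the stated energy ratio, which is one plausible reading of the definition above; the window-growing strategy is an assumption of this sketch.

```python
import numpy as np

def peak_to_sidelobe_ratio(conf, energy_ratio=0.8):
    """PSR = (g_max - mu_sl) / sigma_sl, equation (10). The peak part is a
    square window centered on the maximum, grown until it contains
    `energy_ratio` of the confidence map's total energy."""
    conf = np.asarray(conf, dtype=float)
    g_max = conf.max()
    py, px = np.unravel_index(conf.argmax(), conf.shape)
    total_energy = (conf ** 2).sum()
    r = 1
    while True:
        win = conf[max(0, py - r):py + r + 1, max(0, px - r):px + r + 1]
        if (win ** 2).sum() >= energy_ratio * total_energy or win.size == conf.size:
            break
        r += 1
    sidelobe_mask = np.ones(conf.shape, dtype=bool)
    sidelobe_mask[max(0, py - r):py + r + 1, max(0, px - r):px + r + 1] = False
    sidelobe = conf[sidelobe_mask]
    if sidelobe.size == 0:                 # degenerate map: everything is "peak"
        return float("inf")
    return (g_max - sidelobe.mean()) / (sidelobe.std() + 1e-12)

# Toy usage: a sharp, isolated peak yields a very high PSR.
m = np.zeros((40, 40)); m[20, 20] = 1.0
psr_sharp = peak_to_sidelobe_ratio(m)
```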
In step S240, if the object is not occluded in the current frame, the detected position is taken as the position of the object in the current frame image; otherwise, Bayesian verification is carried out on the predicted position and the detection result of the kernel correlation filtering, and the Bayesian decision result is used as the position of the object in the current frame.
In this step, corresponding processing is performed according to the determination result of step S230.
If step S230 judges that the object is not occluded in the current frame, the result of the KCF detection is considered reliable, so the position detected by KCF in step S220 is taken as the position of the object in the current frame image.
When the object is not occluded in the current frame, besides directly taking the position detected by KCF as the position of the object in the current frame image, the Kalman filter and the kernel correlation filter used for KCF may be further updated.
Specifically, on one hand, the Kalman filter may be updated as shown in equations (2)-(5) to obtain updated model parameters and the system state value of the current frame, leading to more accurate tracking results in subsequent frames.
On the other hand, a sample may be extracted at the position [x, y]^T in the system state vector X of the current frame obtained by updating the Kalman filter through equation (4), a new kernel correlation filter may be trained with that sample using equations (7)-(8), and the kernel correlation filter model may then be updated as shown in the following equations:

α̂_t = (1 - η)·α̂_(t-1) + η·α̂_new                           (11)
z_t = (1 - η)·z_(t-1) + η·z_new                           (12)

where η is the update rate; in this example, η = 0.1.
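A one-function sketch of this model update follows; the linear-interpolation form of equations (11)-(12) is the standard KCF model update, assumed here from the surrounding description.

```python
import numpy as np

def update_kcf_model(alpha_hat, z, alpha_hat_new, z_new, eta=0.1):
    """Linear-interpolation model update of equations (11)-(12): blend the
    newly trained filter and template into the running model at rate eta."""
    alpha_hat = (1.0 - eta) * alpha_hat + eta * alpha_hat_new   # equation (11)
    z = (1.0 - eta) * z + eta * z_new                           # equation (12)
    return alpha_hat, z
```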
If it is determined in step S230 that the object is occluded in the current frame, the position detected by KCF is not used as the tracking result, since the KCF tracking may be erroneous as described above. Although Kalman filtering, for example, could be used to predict the position of the occluded object, in real scenes the motion of an object is variable and does not strictly follow a linear model, so prediction with a linear model such as a Kalman filter is also error-prone; moreover, the Kalman filter itself has noise, so the predicted value contains noise as well. Simply taking the Kalman prediction as the tracking result is therefore prone to error. Accordingly, in this embodiment, if it is determined that the object is occluded in the current frame, Bayesian verification is performed on the position predicted in step S210 and the KCF detection result, and the Bayesian decision result is used as the tracking result.
Bayesian verification is an image processing means commonly used in the art, and the bayesian verification process in this step S240 will be described below with reference to fig. 5. Fig. 5 illustrates an exemplary flow diagram of a bayesian verification process for a predicted location and detection results of performing kernel correlation filtering.
As shown in FIG. 5, in step S2401, the prior probability P(x_t) of the Bayesian verification is made to obey a Gaussian function of the predicted position.
In this step, a widely used Gaussian function is employed, whose mean is the predicted position information, as shown in the following equation:

P(x_t) ~ N(x′(t|t-1), σ)                                  (13)

where x′(t|t-1) is the prediction result at time t (i.e., for the current frame image). Note that when prediction is performed by Kalman filtering, since the prediction result includes both the position of the object in the current frame image and the motion velocity vector at that position, N(x′(t|t-1), σ) is a directional two-dimensional Gaussian function whose direction coincides with the predicted motion velocity vector. For example, fig. 6 shows an exemplary prior probability, in which the straight line with an arrow represents the motion velocity vector of the object.
In step S2402, the detection result of the kernel correlation filtering is taken as the conditional probability of the Bayesian verification.
In this step, the KCF detection result (i.e., the plurality of candidate positions of the object in the current frame image and the confidence of each candidate position) is used as the conditional probability, as shown in the following formula:

P(y_t | x_t) = KCF detection result                       (14)

where the KCF detection result is given by expression (6).
In step S2403, a posterior probability proportional to the product of the prior probability and the conditional probability is calculated:

P(x_t | y_t) ∝ P(x_t) · P(y_t | x_t)                      (15)

In step S2404, the position with the maximum posterior probability is selected as the Bayesian decision result:

x_b = arg max P(x_t | y_t)                                (16)

where x_b, the position with the maximum posterior probability, is the Bayesian decision result, i.e., the position of the object in the current frame.
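Putting steps S2401-S2404 together, the sketch below builds a directional two-dimensional Gaussian prior aligned with the predicted velocity, multiplies it by a KCF confidence map acting as the conditional probability, and takes the argmax of the posterior. The σ values, map sizes, and the random confidence map are illustrative assumptions.

```python
import numpy as np

def oriented_gaussian_prior(shape, mean, vel, sigma_along=8.0, sigma_across=3.0):
    """Directional 2-D Gaussian P(x_t) ~ N(x'(t|t-1), sigma), equation (13),
    with its major axis aligned to the predicted motion velocity vector."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    d = np.stack([xx - mean[0], yy - mean[1]], axis=-1)   # offsets from the mean
    v = np.asarray(vel, dtype=float)
    v = v / (np.linalg.norm(v) + 1e-12)                   # unit vector along motion
    u = np.array([-v[1], v[0]])                           # unit vector across motion
    along, across = d @ v, d @ u
    return np.exp(-0.5 * ((along / sigma_along) ** 2 + (across / sigma_across) ** 2))

def bayes_decision(prior, kcf_confidence):
    """Posterior ∝ prior * conditional, equations (14)-(15); argmax, equation (16)."""
    posterior = prior * kcf_confidence
    y, x = np.unravel_index(posterior.argmax(), posterior.shape)
    return (x, y)

# Toy usage: prediction at (40, 30) moving along (2, 1); the KCF confidence
# map is random here, standing in for the result of expression (6).
rng = np.random.default_rng(1)
conf = rng.random((60, 80))
prior = oriented_gaussian_prior((60, 80), mean=(40, 30), vel=(2, 1))
best_xy = bayes_decision(prior, conf)
```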
The object tracking method according to the embodiment of the present disclosure has been described above with reference to the accompanying drawings. The method tracks the object with KCF; when the object is occluded, it predicts the position of the object from the object's historical motion information and corrects the predicted position through Bayesian verification, so that a more reliable object position can be obtained even under occlusion, thereby improving tracking accuracy and enabling long-term tracking.
An object tracking device 700 according to an embodiment of the present disclosure is described below with reference to fig. 7. Fig. 7 shows a functional configuration block diagram of an object tracking apparatus according to an embodiment of the present disclosure. As shown in fig. 7, the object tracking apparatus 700 may include: a prediction unit 710, a detection unit 720, a judgment unit 730, and a determination unit 740. The specific functions and operations of the units are substantially the same as described above with respect to fig. 2-6, and thus, in order to avoid repetition, only a brief description of the apparatus will be provided hereinafter, while a detailed description of the same details will be omitted.
The prediction unit 710 is configured to predict a position of an object in a current frame image and a motion velocity vector at the position according to historical motion information of the object. The prediction unit 710 may perform the prediction by any suitable method in the art, such as Kalman filtering, particle filtering, and the like. For convenience of explanation, prediction using Kalman filtering is described as an example below. The working principle of modeling and predicting the motion model of the object by using the Kalman filter is shown in equations (1) to (5) in the foregoing, and will not be described in detail here.
It should be noted that the current frame image described herein is an image frame other than the first frame in the image frame sequence containing the tracked object. In the first image frame of the sequence, the initial position of the object may be determined by any object detection method or by manual specification. In addition, for convenience of description, in the present embodiment an object is represented by its circumscribed rectangular box in the image frame, and the coordinates of the center point of the box are taken as the position coordinates of the object.
The detection unit 720 is configured to perform kernel correlation filtering along the motion velocity vector to detect the position of the object in the current frame image.
As mentioned above, the operation principle of kernel correlation filtering (KCF) is represented by expressions (6) to (9) in the foregoing, and will not be described herein again.
The detection unit 720 performs KCF detection along the motion velocity vector predicted by the prediction unit 710. Specifically, the detection unit may include a sampling sub-unit, a KCF detection sub-unit, and a selection sub-unit.
The sampling sub-unit is configured to extract at least one sample along the motion velocity vector predicted by the prediction unit 710, at a predetermined interval. The motion velocity vector represents the direction of motion of the object; the sampling subunit extracts samples (i.e., rectangular boxes representing the object) along this direction, taking points on the motion velocity vector as the center points of the rectangular boxes. The predetermined interval may be set arbitrarily; as an example, it may be set to 1/2 of the width of the rectangular box representing the object.
The KCF detection subunit is configured to perform KCF on the sampling samples to determine a plurality of candidate positions of the object in the current frame image and confidence degrees of the candidate positions. Specifically, the KCF detection subunit performs KCF for each sample as shown in the above expressions (6) to (9), thereby obtaining KCF detection results, i.e., a plurality of candidate positions of the subject in the current frame image and the confidence degrees of the respective candidate positions.
The selecting subunit is configured to select a candidate position with the highest confidence level among the plurality of candidate positions as the position of the detected object in the current frame image. A higher confidence indicates a higher probability that the candidate position is the position of the object in the current frame image. Here, the selection subunit selects the candidate position with the highest confidence as the detected position of the object in the current frame image.
Optionally, the selecting subunit may further process the candidate position with the highest confidence coefficient, and use the result after further processing as the position of the object in the current frame image. Specifically, the candidate position with the highest confidence may be used as the measurement value Y of the current frame, the Kalman filter is updated as shown in equation (4), and the position [ X, Y ]' in the system state vector X of the current frame obtained thereby may be used as the position of the object in the current frame image. Through the above processing, the tracking result can be made smoother.
The judging unit 730 is configured to judge whether the object is occluded in the current frame. The judging unit 730 may employ various suitable methods to determine whether the object is occluded. For example, as a basic method, it can detect occlusion by checking whether the foreground area in the current frame has shrunk. In the present embodiment, as an example, the judging unit 730 determines whether the object is occluded by the Peak-to-Sidelobe Ratio (PSR).
Specifically, the determining unit 730 calculates a peak-to-side lobe ratio of a confidence map representing the detection result according to a plurality of candidate positions of the object detected by the detecting unit 720 in the current frame image and the confidence of each candidate position; if the peak-to-side lobe ratio is greater than a predetermined threshold, determining that the object is occluded in the current frame, otherwise determining that the object is not occluded in the current frame. The peak-to-side lobe ratio is a commonly used image processing means in the art and has been described above and will not be described further herein.
The determining unit 740 is configured to take the detected position as the position of the object in the current frame image if the object is not occluded in the current frame image; otherwise, Bayesian verification is carried out on the predicted position and the detection result of the kernel correlation filtering, and the Bayesian decision result is used as the position of the object in the current frame.
The determining unit 740 performs corresponding processing according to the judgment result of the judging unit 730.
Specifically, if the judging unit 730 judges that the object is not occluded in the current frame, the position detected by KCF is considered reliable, and the determining unit 740 therefore takes the position detected by the detection unit 720 as the position of the object in the current frame image.
It should be noted that, in the case where the object is not occluded in the current frame, the determination unit 740 may further update the Kalman filter and the kernel correlation filter that performs KCF, in addition to directly taking the position detected by KCF as the position of the object in the current frame image.
Specifically, on the one hand, the determining unit 740 may update the Kalman filter as shown in equations (2)-(5), obtaining updated model parameters and the system state value of the current frame, and thereby more accurate tracking results in subsequent frames. On the other hand, the determining unit 740 may extract a sample at the position [x, y]^T in the system state vector X of the current frame obtained by updating the Kalman filter through equation (4), train a new kernel correlation filter using the sample and equations (7)-(8), and then update the kernel correlation filter as shown in equations (11)-(12).
On the other hand, if the judging unit 730 determines that the object is occluded in the current frame, the position detected by KCF is not employed as the tracking result, since the KCF tracking may be erroneous as described above. In addition, although Kalman filtering, for example, could be used to predict the position of the occluded object, purely using the Kalman prediction as the tracking result is prone to error. Therefore, in this embodiment, if it is determined that the object is occluded in the current frame, the determining unit 740 performs Bayesian verification on the predicted position and the KCF detection result, and uses the Bayesian decision result as the tracking result.
Bayesian verification is an image processing means commonly used in the art, and the determining unit 740 may further include a prior subunit, a condition subunit, a posterior subunit, and a decision subunit, and performs bayesian verification as follows.
The prior subunit makes the prior probability P(x_t) of the Bayesian verification obey a Gaussian function of the predicted position. As an example, the prior subunit uses a directional two-dimensional Gaussian function whose mean is the predicted position information and whose direction coincides with the predicted motion velocity vector.
The conditional subunit takes the KCF detection result, i.e., the plurality of candidate positions of the object in the current frame image and the confidence of each candidate position, as the conditional probability of the Bayesian verification.
The posterior subunit calculates a posterior probability that is proportional to the product of the prior probability and the conditional probability.
The decision subunit selects the position of the maximum posterior probability in the posterior probabilities calculated by the posterior subunit as the bayesian decision result, i.e. the position of the object in the current frame.
An object tracking device 700 according to an embodiment of the present disclosure has been described above with reference to fig. 7. The object tracking apparatus 700 predicts the position of an object using historical motion information of the object when the object is occluded and corrects the predicted position through bayesian verification, so that a more reliable object position can be obtained even if the object is occluded, thereby improving the tracking accuracy and enabling the object to be tracked for a long time.
A block diagram of a computing device that may be used to implement an exemplary object tracking device of embodiments of the present disclosure is described below with reference to fig. 8.
As shown in fig. 8, the computing device 800 includes one or more processors 802, a storage device 804, a camera 806, and an output device 808, which are interconnected by a bus system 810 and/or another form of connection mechanism (not shown). It should be noted that the components and configuration of the computing device 800 shown in fig. 8 are exemplary only, not limiting; the computing device 800 may have other components and configurations as desired.
The processor 802 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the computing device 800 to perform desired functions.
The storage 804 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 802 to implement the functionality of the embodiments of the disclosure described above and/or other desired functionality. Various applications and various data, such as historical motion information of the subject, a predicted position and a motion velocity vector at the position, extracted sample samples, a detection result of KCF, a peak-to-side lobe ratio of a confidence map, a prior probability, a posterior probability, a conditional probability, various predetermined thresholds, and the like, may also be stored in the computer-readable storage medium.
The camera 806 is used to capture a sequence of image frames containing a target object and store the captured frame images in the storage 804 for use by other components. Of course, other external devices may be utilized to capture the sequence of image frames and transmit the captured frames of images to the computing device 800. In this case, the camera 806 may be omitted.
The output device 808 may output various information, such as a tracking result of a position of the target object in the current frame image, to the outside, and may include various display apparatuses such as a display, a projector, a television, and the like.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
The block diagrams of devices, apparatuses and systems referred to in this disclosure are only given as illustrative examples and are not intended to require or imply that the connections, arrangements and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses and systems may be connected, arranged and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as, but not limited to."
The flowchart of steps in the present disclosure and the above description of the methods are only given as illustrative examples and are not intended to require or imply that the steps of the various embodiments must be performed in the order given, some steps may be performed in parallel, independently of each other or in other suitable orders. Additionally, words such as "thereafter," "then," "next," etc. are not intended to limit the order of the steps; these words are only used to guide the reader through the description of these methods.
It is also noted that in the apparatus and methods of the present disclosure, the components or steps may be broken down and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (10)

1. An object tracking method, comprising:
predicting the position of the object in the current frame image and the motion velocity vector at the position according to the historical motion information of the object;
performing kernel correlation filtering along the motion velocity vector to detect the position of the object in the current frame image;
judging whether the object is occluded in the current frame; and
if the object is not occluded in the current frame, taking the detected position as the position of the object in the current frame image; otherwise, Bayesian verification is carried out on the predicted position and the detection result of the kernel correlation filtering, and the Bayesian decision result is used as the position of the object in the current frame.
2. The object tracking method according to claim 1, wherein the current frame image is an image frame other than a first frame in a sequence of image frames containing the object.
3. The object tracking method of claim 1, wherein performing kernel-dependent filtering along the motion velocity vector to detect the position of the object in the current frame image further comprises:
extracting at least one sample along the motion velocity vector at predetermined intervals;
performing kernel correlation filtering on the sampling samples to determine a plurality of candidate positions of the object in the current frame image and confidence degrees of the candidate positions;
and selecting the candidate position with the highest confidence degree in the plurality of candidate positions as the position of the detected object in the current frame image.
4. The object tracking method of claim 3, wherein the determining whether the object is occluded in the current frame further comprises:
calculating the peak-to-sidelobe ratio of a confidence map representing the confidences of the respective candidate positions;
if the peak-to-side lobe ratio is greater than a predetermined threshold, determining that the object is occluded in the current frame, otherwise determining that the object is not occluded in the current frame.
5. The object tracking method according to claim 1, wherein bayesian verifying the predicted location and the detection result of the kernel correlation filtering, and using the bayesian decision result as the location of the object in the current frame further comprises:
making the prior probability of the Bayesian verification obey a Gaussian function of the predicted position;
taking the detection result of the kernel correlation filtering as the conditional probability of Bayesian verification;
calculating a posterior probability, which is proportional to the product of the prior probability and the conditional probability;
and selecting the position with the maximum posterior probability as a Bayesian decision result.
6. The object tracking method according to claim 1, wherein the position of the object in the current frame image and the motion velocity vector at the position are predicted by Kalman filtering, and the object tracking method further comprises:
updating the Kalman filter if the object is not occluded in the current frame.
7. The object tracking method of claim 6, further comprising:
and if the object is not shielded in the current frame, extracting a sampling sample at the estimated position obtained by updating the Kalman filter, and updating the kernel correlation filter for performing the kernel correlation filtering by using the sampling sample.
8. The object tracking method of claim 6, further comprising:
an initial position of the object is detected or specified in a first image frame of a sequence of image frames comprising the object.
9. An object tracking device, comprising:
a prediction unit configured to predict a position of an object in a current frame image and a motion velocity vector at the position according to historical motion information of the object;
a detection unit configured to perform kernel-dependent filtering along the motion velocity vector to detect a position of the object in a current frame image;
a judging unit configured to judge whether the object is occluded in the current frame;
a determining unit configured to take the detected position as a position of the object in the current frame image if the object is not occluded in the current frame image; otherwise, Bayesian verification is carried out on the predicted position and the detection result of the kernel correlation filtering, and the Bayesian decision result is used as the position of the object in the current frame.
10. An object tracking device, comprising:
a processor;
a memory; and
computer program instructions stored in the memory, which when executed by the processor perform the steps of:
predicting the position of the object in the current frame image and the motion velocity vector at the position according to the historical motion information of the object;
performing kernel correlation filtering along the motion velocity vector to detect the position of the object in the current frame image;
judging whether the object is occluded in the current frame;
if the object is not occluded in the current frame, taking the detected position as the position of the object in the current frame image; otherwise, Bayesian verification is carried out on the predicted position and the detection result of the kernel correlation filtering, and the Bayesian decision result is used as the position of the object in the current frame.
CN201610295085.XA 2016-05-06 2016-05-06 Object tracking method and device Pending CN107346538A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610295085.XA CN107346538A (en) 2016-05-06 2016-05-06 Object tracking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610295085.XA CN107346538A (en) 2016-05-06 2016-05-06 Object tracking method and device

Publications (1)

Publication Number Publication Date
CN107346538A true CN107346538A (en) 2017-11-14

Family

ID=60254291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610295085.XA Pending CN107346538A (en) Object tracking method and device

Country Status (1)

Country Link
CN (1) CN107346538A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108537820A (en) * 2018-04-18 2018-09-14 清华大学 Dynamic prediction method, system and the equipment being applicable in
CN108805901A (en) * 2018-05-04 2018-11-13 北京航空航天大学 A kind of quick detecting and tracking parallel computation of sensation target based on multi-core DSP and fusion method
CN108986138A (en) * 2018-05-24 2018-12-11 北京飞搜科技有限公司 Method for tracking target and equipment
CN109902588A (en) * 2019-01-29 2019-06-18 北京奇艺世纪科技有限公司 A kind of gesture identification method, device and computer readable storage medium
CN110147750A (en) * 2019-05-13 2019-08-20 深圳先进技术研究院 A kind of image search method based on acceleration of motion, system and electronic equipment
CN110458861A (en) * 2018-05-04 2019-11-15 佳能株式会社 Object detection and tracking and equipment
CN110751671A (en) * 2018-07-23 2020-02-04 中国科学院长春光学精密机械与物理研究所 A Target Tracking Method Based on Kernel Correlation Filtering and Motion Estimation
CN111428642A (en) * 2020-03-24 2020-07-17 厦门市美亚柏科信息股份有限公司 Multi-target tracking algorithm, electronic device and computer readable storage medium
WO2020147348A1 (en) * 2019-01-17 2020-07-23 北京市商汤科技开发有限公司 Target tracking method and device, and storage medium
CN111832343A (en) * 2019-04-17 2020-10-27 北京京东尚科信息技术有限公司 Eye tracking method and device and storage medium
CN112840645A (en) * 2018-10-10 2021-05-25 联发科技股份有限公司 Method and apparatus for combining multiple predictors for block prediction in a video coding system
CN116228817A (en) * 2023-03-10 2023-06-06 东南大学 A real-time anti-occlusion and anti-shake single target tracking method based on correlation filtering

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101339655A (en) * 2008-08-11 2009-01-07 浙江大学 Visual Tracking Method Based on Object Features and Bayesian Filter
CN102063625A (en) * 2010-12-10 2011-05-18 浙江大学 Improved particle filtering method for multi-target tracking under multiple viewing angles
CN105469430A (en) * 2015-12-10 2016-04-06 中国石油大学(华东) Anti-shielding tracking method of small target in large-scale scene

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101339655A (en) * 2008-08-11 2009-01-07 浙江大学 Visual Tracking Method Based on Object Features and Bayesian Filter
CN102063625A (en) * 2010-12-10 2011-05-18 浙江大学 Improved particle filtering method for multi-target tracking under multiple viewing angles
CN105469430A (en) * 2015-12-10 2016-04-06 中国石油大学(华东) Anti-shielding tracking method of small target in large-scale scene

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIYANG YU et al.: "A Visual Tracker Based on Improved Kernel Correlation Filter", Proceedings of the 7th International Conference on Internet Multimedia Computing and Service *

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108537820A (en) * 2018-04-18 2018-09-14 清华大学 Dynamic prediction method, system and the equipment being applicable in
CN108805901A (en) * 2018-05-04 2018-11-13 北京航空航天大学 A kind of quick detecting and tracking parallel computation of sensation target based on multi-core DSP and fusion method
CN108805901B (en) * 2018-05-04 2022-02-22 北京航空航天大学 Visual target rapid detection tracking parallel computing and fusion method based on multi-core DSP
CN110458861B (en) * 2018-05-04 2024-01-26 佳能株式会社 Object detection and tracking method and device
CN110458861A (en) * 2018-05-04 2019-11-15 佳能株式会社 Object detection and tracking and equipment
CN108986138A (en) * 2018-05-24 2018-12-11 北京飞搜科技有限公司 Method for tracking target and equipment
CN110751671A (en) * 2018-07-23 2020-02-04 中国科学院长春光学精密机械与物理研究所 A Target Tracking Method Based on Kernel Correlation Filtering and Motion Estimation
CN112840645B (en) * 2018-10-10 2023-12-12 寰发股份有限公司 Method and apparatus for combining multiple predictors for block prediction in a video coding system
CN112840645A (en) * 2018-10-10 2021-05-25 联发科技股份有限公司 Method and apparatus for combining multiple predictors for block prediction in a video coding system
KR20200100792A (en) * 2019-01-17 2020-08-26 베이징 센스타임 테크놀로지 디벨롭먼트 컴퍼니 리미티드 Target tracking method and device, storage medium
KR102444769B1 (en) 2019-01-17 2022-09-19 베이징 센스타임 테크놀로지 디벨롭먼트 컴퍼니 리미티드 Target tracking method and device, storage medium
TWI715320B (en) * 2019-01-17 2021-01-01 大陸商北京市商湯科技開發有限公司 Method, device and storage medium for target tracking
WO2020147348A1 (en) * 2019-01-17 2020-07-23 北京市商汤科技开发有限公司 Target tracking method and device, and storage medium
CN109902588A (en) * 2019-01-29 2019-06-18 北京奇艺世纪科技有限公司 A kind of gesture identification method, device and computer readable storage medium
CN111832343A (en) * 2019-04-17 2020-10-27 北京京东尚科信息技术有限公司 Eye tracking method and device and storage medium
CN111832343B (en) * 2019-04-17 2024-04-09 北京京东乾石科技有限公司 Tracking method and device, and storage medium
CN110147750A (en) * 2019-05-13 2019-08-20 深圳先进技术研究院 A kind of image search method based on acceleration of motion, system and electronic equipment
CN111428642A (en) * 2020-03-24 2020-07-17 厦门市美亚柏科信息股份有限公司 Multi-target tracking algorithm, electronic device and computer readable storage medium
CN116228817A (en) * 2023-03-10 2023-06-06 东南大学 A real-time anti-occlusion and anti-shake single target tracking method based on correlation filtering
CN116228817B (en) * 2023-03-10 2023-10-03 东南大学 Real-time anti-occlusion anti-jitter single target tracking method based on correlation filtering

Similar Documents

Publication Publication Date Title
CN107346538A (en) Object tracking method and device
Lukežič et al. Now you see me: evaluating performance in long-term visual tracking
CN108470332B (en) Multi-target tracking method and device
CN110782483B (en) Multi-view and multi-target tracking method and system based on distributed camera network
CN108230352B (en) Target object detection method and device and electronic equipment
Soleimanitaleb et al. Single object tracking: A survey of methods, datasets, and evaluation metrics
CN106204638A (en) A kind of based on dimension self-adaption with the method for tracking target of taking photo by plane blocking process
US9911191B2 (en) State estimation apparatus, state estimation method, and integrated circuit with calculation of likelihood data and estimation of posterior probability distribution data
KR102465960B1 (en) Multi-Class Multi-Object Tracking Method using Changing Point Detection
US20150104067A1 (en) Method and apparatus for tracking object, and method for selecting tracking feature
JP2016058085A (en) Method and device for detecting shielding of object
CN109636828A (en) Object tracking methods and device based on video image
CN111488787B (en) Method and device for improving fault tolerance and fluctuation robustness under extreme conditions
US10185873B2 (en) Method and a device for tracking characters that appear on a plurality of images of a video stream of a text
KR101821242B1 (en) Method for counting vehicles based on image recognition and apparatus using the same
EP4038540B1 (en) Object detection
US20150371080A1 (en) Real-time head pose tracking with online face template reconstruction
CN112949352A (en) Training method and device of video detection model, storage medium and electronic equipment
CN113052019B (en) Target tracking method and device, intelligent equipment and computer storage medium
WO2022147655A1 (en) Positioning method and apparatus, spatial information acquisition method and apparatus, and photographing device
CN114239736A (en) Training method and device for optical flow estimation model
Walsh et al. Detecting tracking failures from correlation response maps
CN107767401B (en) Infrared target real-time tracking method and device based on nuclear correlation filtering
KR101483549B1 (en) Method for Camera Location Estimation with Particle Generation and Filtering and Moving System using the same
KR102581154B1 (en) Method and Apparatus for Object Detection Using Model Ensemble

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20171114