Space-time feature-based crowd personnel trajectory prediction method
Technical Field
The invention belongs to the technical field of video recognition, and in particular relates to a method for predicting crowd personnel trajectories based on space-time characteristics.
Background
With social and economic development and deepening urbanization, urban populations grow year by year, and crowd gathering and congestion occur easily. Crowd gathering is contingent and sudden, so predicting the trajectories of people in a crowd promptly and accurately is key to preventing crowd events. In recent years, with the rapid development of smart cities and artificial intelligence, cameras have been installed throughout city streets and alleys, and video monitoring now covers a large number of public areas such as streets, which provides the hardware foundation and data support for crowd personnel trajectory prediction. With computer vision (CV) algorithms, a computer can analyze video content automatically, so that massive volumes of video need not be watched manually. CV algorithms can locate and track multiple targets and predict their trajectories, which improves the ability of the relevant management departments to analyze, predict, and control group behavior and helps ensure urban order and safety.
At present, although multi-target tracking algorithms can identify people in a video and draw their trajectories, they do not consider the influence of specific space-time conditions (such as rush hours and bus stops) on those trajectories, nor the influence of special personnel (such as public figures) on crowd trajectories, and therefore cannot analyze crowd personnel trajectories for a specific scene.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is how to provide a method for predicting crowd personnel trajectories based on space-time characteristics.
(II) Technical solution
To solve the above technical problem, the invention provides a method for predicting crowd personnel trajectories based on space-time characteristics, comprising the following steps:
Step 1: multi-target trajectory identification;
Step 2: construction of the space-time weight vector;
Step 3: construction of the special personnel weight vector;
Step 4: multi-target trajectory prediction analysis.
Step 1, multi-target trajectory identification, comprises the following steps:
Step 11: target identification;
Step 12: target tracking;
Step 13: vectorized representation of the target.
Step 11: target identification.
In image recognition, algorithms based on convolutional neural networks currently achieve the best performance. This method focuses on recognizing people: the video stream is analyzed frame by frame through a multi-layer convolutional neural network to identify and locate the people in each video frame. An input video V is written $V = \{V_1, V_2, \ldots, V_t\}$, where $V_t$ is the frame at time t.
The target recognition model is built in two stages, model training and model testing. In the training stage, video data $V_{train}$ with labeling information is input. The labels are rectangular frames, each expressed as (x, y, h, w), where x is the abscissa of the upper-left corner of the labeling frame, y is the ordinate of the upper-left corner, h is the height, and w is the width. Once training has converged, the target recognition model is obtained. In the testing stage, video data $V_{test}$ without labeling information is input to test the model. If the accuracy requirement is met, model training is finished; otherwise, the parameters are adjusted and the model is retrained until the requirement is met.
Step 12: target tracking.
The target frame information of the video is arranged frame by frame in time order to obtain the target trajectory information. Because adjacent frames of a video differ little and frame-by-frame analysis is computationally too expensive, the video frames are sampled.
Standard video runs at 24 frames per second, and the method analyzes only the frames selected at a set sampling ratio.
Step 13: vectorized representation of the target.
For each target $O_i$, the state at the initial time $t_0$ is represented by a state variable, and the movement at each subsequent time t is represented by a displacement vector. The state variable gives the position of target $O_i$, and the displacement vector gives the direction and speed of its movement; together they describe the static and dynamic characteristics of the target.
Step 2, construction of the space-time weight vector, comprises the following steps:
Step 21: construction of the time weight vector;
Step 22: construction of the space weight vector;
Step 23: space-time vector fusion.
Step 21: construction of the time weight vector.
Because of people's daily work and rest routines, the movement of people flow often follows a periodic pattern; for example, people flow in the morning is generally higher than at night. Differences in people flow between time periods lead to different reference scales, so different weight parameters need to be trained for the people flow prediction models of different time periods.
The people flow characteristics are sampled continuously to obtain the total displacement vector at each sampling time t. The number of sampling times t and the interval between samples are determined by the sampling frequency; for example, if one sample is taken every 5 minutes, the sampling times are 00:00, 00:05, ..., 23:55, giving 288 samples per day.
In the formula for the time weight, ε is a small constant that prevents division by zero.
Step 22: construction of the space weight vector.
Specific places also influence people flow. For example, at places where people board public transport, such as bus stops, people tend to gather. If this effect is not eliminated, a personnel gathering event is likely to be misjudged.
For a specific observation point x, the number n of people appearing at x is counted, and the space weight of x is computed from it; in the formula for the space weight, ε is a small constant that prevents division by zero.
Step 23: space-time vector fusion.
For the different observation points x and different time periods t, the space-time weight vector is composed of elements $w_{i,j}$, where each $w_{i,j}$ is the product of the corresponding time weight and space weight.
Step 3: construction of the special personnel weight vector.
Special personnel (such as public figures) can greatly influence crowd trajectories: the crowd tends to move toward them, the density of people around them increases, and the moving speed falls. The appearance of special personnel in public can easily trigger a crowd event and poses a major security challenge. Travel by special personnel is a scenario that must be considered, but it is also a low-probability event, so this step applies only in special scenes.
Step 3 comprises the following steps:
Step 31: construction of the special personnel library.
Depending on actual conditions, a face library of special personnel is established for each observation point, or a single global face library is established. The face library stores, for each person, a unique ID, a face picture, face features, and a weight vector.
Step 32: construction of the special personnel weight vector.
Special personnel exert a certain attraction on the surrounding people, and an attraction weight is constructed according to the strength of that attraction. The attraction weight of a special person is estimated by classifying the special personnel. Because special personnel are relatively few, the workload of constructing the special personnel weight vector is low.
Step 4: multi-target trajectory prediction analysis.
The overall flow of step 4 is as follows.
First, the state variable and the displacement vector of each target $O_i$ are obtained through the multi-target trajectory identification algorithm based on a convolutional neural network. The state variable sequence and the displacement vector sequence are multiplied by the space-time weight vector and input into a sequence prediction model, which predicts the state variable sequence and the displacement vector sequence for the next period of time.
Then, a face recognition algorithm is used to recognize the faces in the video and check whether special personnel appear. The detection result falls into one of two cases.
Case one: special personnel are present. The original output is corrected according to the position of the special person at time t and the attraction weight w of that person: a correction vector is computed for the predicted position of target $O_i$ at time t, and the prediction result is the corrected state variable sequence and displacement vector sequence.
Case two: no special personnel are present. The prediction result is the state variable sequence and displacement vector sequence output by the sequence prediction model.
Model iteration in step 4 is performed as follows.
Because real crowds change dynamically, three model components of the method need to be iterated continuously on actual data to adapt to the change:
a. Trajectory prediction model
Data is sampled at intervals and added to the original training set for retraining, so that the training set keeps growing and the accuracy of the model improves.
b. Space-time vector weights
Data is sampled at intervals, and the space-time vectors are recalculated from the new data so that the model adapts to changing conditions.
c. Special personnel library
Depending on actual conditions, the special personnel library is continuously expanded and the levels of its entries are adjusted.
Deployment of step 4 is implemented as follows.
The method has two deployment scenarios:
a. Public place deployment
This deployment targets public areas such as streets, where people are highly mobile, so no special personnel library needs to be established. The data for the space-time vector comes mainly from the morning and evening peaks.
b. Internal area deployment
This deployment targets internal areas such as a campus, where the people are relatively fixed. A relatively complete internal personnel library can be established, from which key personnel are extracted and added to the special personnel library, and space-time vector modeling can be performed for specific places.
(III) Beneficial effects
The invention provides a method for predicting crowd personnel trajectories based on space-time characteristics. By constructing a space-time feature weight vector and a special personnel weight vector, the method reassigns the weights of the multi-target trajectory vectors and introduces the influence of specific times, places, and persons into the multi-target trajectory prediction algorithm. It analyzes the influence of individuals and the environment on individual trajectories, so that crowd events can be predicted in time and risks can be found and removed promptly.
Compared with the prior art, the invention has the following effects:
(1) Trajectory prediction for crowd personnel is realized, with personalized modeling for different times, spaces, and persons.
(2) Different deployment strategies are adopted for public areas and internal areas, so that crowd event prevention shifts from passive handling to active discovery, and the capabilities for risk prediction, early-warning analysis, and crowd dispersal are improved.
(3) Through data accumulation, the model is trained adaptively and its prediction accuracy improves continuously.
Drawings
FIG. 1 is a block diagram of the multi-target personnel trajectory prediction method.
Detailed Description
To make the objects, contents and advantages of the present invention more apparent, the following detailed description of the present invention will be given with reference to the accompanying drawings and examples.
To solve the above technical problem, the invention provides a method for predicting crowd personnel trajectories based on space-time characteristics, comprising the following steps:
Step 1: multi-target trajectory identification;
Step 2: construction of the space-time weight vector;
Step 3: construction of the special personnel weight vector;
Step 4: multi-target trajectory prediction analysis.
Step 1, multi-target trajectory identification, comprises the following steps:
Step 11: target identification;
Step 12: target tracking;
Step 13: vectorized representation of the target.
Step 11: target identification.
In image recognition, algorithms based on convolutional neural networks currently achieve the best performance. This method focuses on recognizing people: the video stream is analyzed frame by frame through a multi-layer convolutional neural network to identify and locate the people in each video frame. An input video V is written $V = \{V_1, V_2, \ldots, V_t\}$, where $V_t$ is the frame at time t.
The target recognition model is built in two stages, model training and model testing. In the training stage, video data $V_{train}$ with labeling information is input. The labels are rectangular frames, each expressed as (x, y, h, w), where x is the abscissa of the upper-left corner of the labeling frame, y is the ordinate of the upper-left corner, h is the height, and w is the width. Once training has converged, the target recognition model is obtained. In the testing stage, video data $V_{test}$ without labeling information is input to test the model. If the accuracy requirement is met, model training is finished; otherwise, the parameters are adjusted and the model is retrained until the requirement is met.
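As an illustrative, non-limiting sketch, the labeling frame (x, y, h, w) and one possible accuracy check can be written as follows. The use of intersection over union (IoU) as the accuracy measure, and the names `Box`, `iou`, `meets_accuracy`, `iou_threshold`, and `required`, are assumptions made for illustration; the text above does not specify how the accuracy requirement is evaluated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Labeling frame (x, y, h, w): upper-left corner, height, width."""
    x: float
    y: float
    h: float
    w: float

    def area(self) -> float:
        return self.h * self.w


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (assumed accuracy measure)."""
    left = max(a.x, b.x)
    top = max(a.y, b.y)
    right = min(a.x + a.w, b.x + b.w)
    bottom = min(a.y + a.h, b.y + b.h)
    inter = max(0.0, right - left) * max(0.0, bottom - top)
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def meets_accuracy(pred, truth, iou_threshold=0.5, required=0.9):
    """True when enough predicted boxes overlap their labeled boxes.

    Both thresholds are hypothetical values, not taken from the method.
    """
    if not truth:
        return False
    hits = sum(1 for p, t in zip(pred, truth) if iou(p, t) >= iou_threshold)
    return hits / len(truth) >= required
```

If `meets_accuracy` returns False on the test data, the parameters would be adjusted and the model retrained, as described above.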
Step 12: target tracking.
The target frame information of the video is arranged frame by frame in time order to obtain the target trajectory information. Because adjacent frames of a video differ little and frame-by-frame analysis is computationally too expensive, the video frames are sampled.
Standard video runs at 24 frames per second, and the method analyzes only the frames selected at a set sampling ratio.
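By way of example only, frame sampling and the time-ordered arrangement of target frames might be sketched as below. The parameter `ratio` is a hypothetical placeholder for the sampling ratio, and the sketch assumes that each detection already carries a target identifier; how detections are associated with targets is not described here.

```python
from collections import defaultdict


def sample_frames(frames, ratio):
    """Keep one frame out of every `ratio` frames (ratio is hypothetical)."""
    if ratio < 1:
        raise ValueError("ratio must be a positive integer")
    return frames[::ratio]


def build_tracks(detections):
    """Group (frame_index, target_id, box) detections into tracks.

    Each track lists the boxes of one target in time order.
    """
    tracks = defaultdict(list)
    for frame_index, target_id, box in sorted(detections, key=lambda d: d[0]):
        tracks[target_id].append((frame_index, box))
    return dict(tracks)
```

For instance, with 24 frames per second and `ratio=12`, two frames per second would be analyzed.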
Step 13: vectorized representation of the target.
For each target $O_i$, the state at the initial time $t_0$ is represented by a state variable, and the movement at each subsequent time t is represented by a displacement vector. The state variable gives the position of target $O_i$, and the displacement vector gives the direction and speed of its movement; together they describe the static and dynamic characteristics of the target.
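A minimal sketch of this representation follows. It assumes that the position of a target is taken as the centre of its labeling frame (x, y, h, w) and that the displacement vector is the difference between consecutive sampled positions; both choices, and the function names, are illustrative assumptions rather than definitions taken from the method.

```python
def box_center(box):
    """Position of a target, assumed here to be the centre of its box."""
    x, y, h, w = box
    return (x + w / 2.0, y + h / 2.0)


def displacements(positions):
    """Displacement between each pair of consecutive sampled positions.

    The direction of each vector is the direction of movement; its
    length per sampling interval reflects the speed.
    """
    return [
        (x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(positions, positions[1:])
    ]
```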
Step 2, construction of the space-time weight vector, comprises the following steps:
Step 21: construction of the time weight vector;
Step 22: construction of the space weight vector;
Step 23: space-time vector fusion.
Step 21: construction of the time weight vector.
Because of people's daily work and rest routines, the movement of people flow often follows a periodic pattern; for example, people flow in the morning is generally higher than at night. Differences in people flow between time periods lead to different reference scales, so different weight parameters need to be trained for the people flow prediction models of different time periods.
The people flow characteristics are sampled continuously to obtain the total displacement vector at each sampling time t. The number of sampling times t and the interval between samples are determined by the sampling frequency; for example, if one sample is taken every 5 minutes, the sampling times are 00:00, 00:05, ..., 23:55, giving 288 samples per day.
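The sampling schedule in the example above can be reproduced with a short sketch; the function name `sampling_times` is illustrative only.

```python
from datetime import datetime, timedelta


def sampling_times(interval_minutes):
    """Sampling times over one day, as HH:MM strings."""
    minutes_per_day = 24 * 60
    if interval_minutes <= 0 or minutes_per_day % interval_minutes != 0:
        raise ValueError("interval must divide one day evenly")
    start = datetime(2000, 1, 1)  # arbitrary date; only the time is used
    return [
        (start + timedelta(minutes=m)).strftime("%H:%M")
        for m in range(0, minutes_per_day, interval_minutes)
    ]
```

With a 5-minute interval this yields 288 sampling times, from 00:00 to 23:55.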
In the formula for the time weight, ε is a small constant that prevents division by zero.
Step 22: construction of the space weight vector.
Specific places also influence people flow. For example, at places where people board public transport, such as bus stops, people tend to gather. If this effect is not eliminated, a personnel gathering event is likely to be misjudged.
For a specific observation point x, the number n of people appearing at x is counted, and the space weight of x is computed from it; in the formula for the space weight, ε is a small constant that prevents division by zero.
Step 23: space-time vector fusion.
For the different observation points x and different time periods t, the space-time weight vector is composed of elements $w_{i,j}$, where each $w_{i,j}$ is the product of the corresponding time weight and space weight.
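The following sketch shows one way steps 21 to 23 could be combined. The reciprocal form of the two weights is an assumption made only for illustration, chosen because ε is described as guarding a division; the exact weight formulas are not reproduced here. Only the fusion rule, in which each element is the product of a time weight and a space weight, follows directly from the text. All function names are hypothetical.

```python
EPSILON = 1e-6  # small constant that prevents division by zero


def time_weights(total_displacements, eps=EPSILON):
    """One weight per sampling time (ASSUMED reciprocal form)."""
    return [1.0 / (d + eps) for d in total_displacements]


def space_weights(people_counts, eps=EPSILON):
    """One weight per observation point (ASSUMED reciprocal form)."""
    return [1.0 / (n + eps) for n in people_counts]


def fuse(t_weights, s_weights):
    """Space-time weights: w[i][j] = time weight i * space weight j."""
    return [[wt * ws for ws in s_weights] for wt in t_weights]
```

Under this assumed form, a time period or place that routinely shows heavy flow receives a smaller weight, so that routine gathering, for example at a bus stop, is less likely to be misjudged as a gathering event.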
Step 3: construction of the special personnel weight vector.
Special personnel (such as public figures) can greatly influence crowd trajectories: the crowd tends to move toward them, the density of people around them increases, and the moving speed falls. The appearance of special personnel in public can easily trigger a crowd event and poses a major security challenge. Travel by special personnel is a scenario that must be considered, but it is also a low-probability event, so this step applies only in special scenes.
Step 3 comprises the following steps:
Step 31: construction of the special personnel library.
Depending on actual conditions, a face library of special personnel is established for each observation point, or a single global face library is established. The face library stores, for each person, a unique ID, a face picture, face features, and a weight vector.
Step 32: construction of the special personnel weight vector.
Special personnel exert a certain attraction on the surrounding people, and an attraction weight is constructed according to the strength of that attraction. The attraction weight of a special person is estimated by classifying the special personnel. Because special personnel are relatively few, the workload of constructing the special personnel weight vector is low.
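Steps 31 and 32 might be sketched as a small in-memory library, as below. The category names and their weight values in `ATTRACTION_BY_CATEGORY` are hypothetical, as are the class and field names; the sketch only illustrates storing an ID, a face picture, and face features per person and deriving the attraction weight from a classification.

```python
from dataclasses import dataclass

# Hypothetical classification of special personnel and assumed weights.
ATTRACTION_BY_CATEGORY = {"high": 1.0, "medium": 0.5, "low": 0.2}


@dataclass
class SpecialPerson:
    person_id: str        # unique ID
    face_picture: bytes   # raw image data
    face_features: list   # feature vector from a face recognition model
    category: str         # key into ATTRACTION_BY_CATEGORY

    @property
    def attraction_weight(self) -> float:
        """Attraction weight estimated from the person's category."""
        return ATTRACTION_BY_CATEGORY[self.category]


class SpecialPersonLibrary:
    """Face library for one observation point, or a global library."""

    def __init__(self):
        self._people = {}

    def add(self, person: SpecialPerson) -> None:
        if person.person_id in self._people:
            raise ValueError("duplicate person ID")
        self._people[person.person_id] = person

    def get(self, person_id: str):
        """Return the stored person, or None if the ID is unknown."""
        return self._people.get(person_id)
```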
Step 4: multi-target trajectory prediction analysis.
The overall flow of step 4 is as follows.
First, the state variable and the displacement vector of each target $O_i$ are obtained through the multi-target trajectory identification algorithm based on a convolutional neural network. The state variable sequence and the displacement vector sequence are multiplied by the space-time weight vector and input into a sequence prediction model, which predicts the state variable sequence and the displacement vector sequence for the next period of time.
Then, a face recognition algorithm is used to recognize the faces in the video and check whether special personnel appear. The detection result falls into one of two cases.
Case one: special personnel are present. The original output is corrected according to the position of the special person at time t and the attraction weight w of that person: a correction vector is computed for the predicted position of target $O_i$ at time t, and the prediction result is the corrected state variable sequence and displacement vector sequence.
Case two: no special personnel are present. The prediction result is the state variable sequence and displacement vector sequence output by the sequence prediction model.
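The overall flow of step 4 can be illustrated with the sketch below, under two loudly stated assumptions. First, a simple mean-displacement extrapolation stands in for the sequence prediction model, whose type is not specified. Second, the correction vector in case one is assumed to be w times the vector from the predicted position to the special person, with w between 0 and 1; the actual correction formula is not reproduced here. All function names are hypothetical.

```python
def weight_sequence(displacements, weights):
    """Multiply each displacement vector by its space-time weight."""
    if len(displacements) != len(weights):
        raise ValueError("one weight is needed per displacement")
    return [(w * dx, w * dy) for (dx, dy), w in zip(displacements, weights)]


def predict_next(last_position, weighted_displacements, steps):
    """Stand-in predictor: extrapolate with the mean weighted displacement."""
    x, y = last_position
    n = len(weighted_displacements)
    if n == 0:
        return [(x, y)] * steps
    mean_dx = sum(dx for dx, _ in weighted_displacements) / n
    mean_dy = sum(dy for _, dy in weighted_displacements) / n
    predicted = []
    for _ in range(steps):
        x, y = x + mean_dx, y + mean_dy
        predicted.append((x, y))
    return predicted


def apply_special_person(predicted, special_position=None, w=0.0):
    """Case one corrects the prediction; case two returns it unchanged."""
    if special_position is None:  # case two: no special personnel
        return predicted
    px, py = predicted
    sx, sy = special_position
    # ASSUMED correction vector: pull towards the special person by w.
    correction = (w * (sx - px), w * (sy - py))
    return (px + correction[0], py + correction[1])
```

In this sketch, a larger attraction weight w moves the corrected position closer to the special person, which matches the described tendency of the crowd to approach such persons.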
Model iteration in step 4 is performed as follows.
Because real crowds change dynamically, three model components of the method need to be iterated continuously on actual data to adapt to the change:
a. Trajectory prediction model
Data is sampled at intervals and added to the original training set for retraining, so that the training set keeps growing and the accuracy of the model improves.
b. Space-time vector weights
Data is sampled at intervals, and the space-time vectors are recalculated from the new data so that the model adapts to changing conditions.
c. Special personnel library
Depending on actual conditions, the special personnel library is continuously expanded and the levels of its entries are adjusted.
Deployment of step 4 is implemented as follows.
The method has two deployment scenarios:
a. Public place deployment
This deployment targets public areas such as streets, where people are highly mobile, so no special personnel library needs to be established. The data for the space-time vector comes mainly from the morning and evening peaks.
b. Internal area deployment
This deployment targets internal areas such as a campus, where the people are relatively fixed. A relatively complete internal personnel library can be established, from which key personnel are extracted and added to the special personnel library, and space-time vector modeling can be performed for specific places.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.