
CN117953009A - A group personnel trajectory prediction method based on spatiotemporal characteristics

Info

Publication number: CN117953009A
Application number: CN202311864476.5A
Authority: CN
Inventors: 刘宏宇; 徐鑫; 陈�胜; 魏超凡; 彭一帆; 宋志雄
Original assignee: Aerospace Science And Engineering Intelligent Operation Research And Information Security Research Institute Wuhan Co ltd
Current assignee: Aerospace Science And Engineering Intelligent Operation Research And Information Security Research Institute Wuhan Co ltd
Priority date: 2023-12-28
Filing date: 2023-12-28
Publication date: 2024-04-30

Abstract

The present invention belongs to the field of video recognition technology, and specifically relates to a method for predicting the trajectory of a group of people based on spatiotemporal features. The method comprises: step 1: multi-target trajectory recognition; step 2: spatiotemporal weight vector construction; step 3: special personnel weight vector construction; step 4: multi-target trajectory prediction analysis. Compared with the prior art, the effects of the present invention are as follows: (1) The trajectory prediction of a group of people is realized, and personalized modeling is performed for different times, spaces, and personnel. (2) Different deployment and implementation strategies are adopted in public areas and internal areas to realize the transformation of mass incident prevention from passive disposal to active discovery, and improve risk prediction, early warning analysis, and crowd guidance capabilities. (3) Through data accumulation and adaptive training of models, the prediction accuracy of the model is continuously improved.

Description

A group personnel trajectory prediction method based on spatiotemporal features

Technical Field

The invention belongs to the technical field of video recognition, and in particular relates to a method for predicting the trajectories of a group of people based on spatiotemporal features.

Background

With socioeconomic development and deepening urbanization, urban populations grow year by year, and crowd gathering and congestion occur easily. Gatherings are often accidental and sudden, so predicting the trajectories of people in a crowd promptly and accurately is key to preventing mass incidents. In recent years, with the rapid development of smart cities and artificial intelligence, cameras have been installed throughout city streets and alleys, and video surveillance now covers a large number of public areas such as streets, providing the hardware foundation and data support for crowd trajectory prediction. With computer vision (CV) algorithms, a computer can analyze video content automatically, so massive volumes of video no longer need to be watched manually. CV algorithms can locate and track multiple targets and predict their trajectories, which improves the ability of the relevant management departments to analyze, predict, and control group behavior and helps ensure urban order and safety.

At present, although multi-target tracking algorithms can identify people in a video and draw their trajectories, they do not consider the influence of specific spatiotemporal conditions (such as rush hours or bus stops) on those trajectories, nor the influence of special personnel (such as public figures) on crowd trajectories, and so they cannot analyze crowd trajectories for a specific scene.

Disclosure of Invention

(I) Technical problem to be solved

The technical problem to be solved by the invention is how to provide a method for predicting the trajectories of a group of people based on spatiotemporal features.

(II) Technical scheme

To solve the above technical problem, the invention provides a method for predicting the trajectories of a group of people based on spatiotemporal features, comprising the following steps:

Step 1: multi-target trajectory recognition;

Step 2: spatiotemporal weight vector construction;

Step 3: special personnel weight vector construction;

Step 4: multi-target trajectory prediction analysis.

Step 1 performs multi-target trajectory recognition and comprises:

Step 11: target recognition;

Step 12: target tracking;

Step 13: vectorized representation of the target.

Step 11 performs target recognition;

In image recognition, algorithms based on convolutional neural networks perform best; this method focuses on recognizing people. A multi-layer convolutional neural network analyzes the video stream frame by frame and identifies and locates the people in each video frame. The input video is V = {V_1, V_2, ..., V_t}, where the subscript t indexes the frames at different times;

The target recognition model is built in two stages, model training and model testing. In the training stage, video data V_train with annotation information is input; the annotations are rectangular boxes, each expressed as (x, y, h, w), where x is the abscissa of the top-left corner of the box, y is the ordinate of the top-left corner, h is the height of the box, and w is its width. Once training has converged, the target recognition model is obtained. In the testing stage, video data V_test without annotation information is input to test the model's performance. If the accuracy requirement is met, model training is finished; otherwise, the parameters are adjusted and the model is retrained until the accuracy requirement is met.
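As a minimal sketch of the annotation format just described, the box (x, y, h, w) can be held in a small Python structure. The class name `Box` and the helpers `center` and `area` are illustrative assumptions and do not appear in the source.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Annotation box in the (x, y, h, w) convention used in the text."""
    x: float  # abscissa of the top-left corner
    y: float  # ordinate of the top-left corner
    h: float  # height of the box
    w: float  # width of the box

    def center(self) -> Tuple[float, float]:
        # Hypothetical helper: centre point of the box in image coordinates.
        return (self.x + self.w / 2, self.y + self.h / 2)

    def area(self) -> float:
        # Hypothetical helper: box area in square pixels.
        return self.h * self.w


box = Box(x=100.0, y=50.0, h=180.0, w=60.0)
print(box.center())  # (130.0, 140.0)
```

Note that the order is (x, y, h, w), with height before width, which differs from the (x, y, w, h) order used by many detection toolkits.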

Step 12 performs target tracking;

The target box information from the video frames is arranged in chronological order to obtain the target trajectory information. Because the image changes little between adjacent frames, analyzing every frame is unnecessarily expensive, so the video frames are sampled;

Standard video has 24 frames per second; in this method, the sampling ratio is taken as ...
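The frame sampling idea can be sketched as follows. The sampling ratio is not given in this text, so `step` is left as a parameter, and the value used in the example is arbitrary. The function name `sample_frames` is an assumption made for illustration.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_frames(frames: Sequence[T], step: int) -> List[T]:
    """Keep every `step`-th frame, starting with the first one."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(frames[::step])


# One second of 24 fps video, with frames represented by their indices.
one_second = list(range(24))
# step=6 is an arbitrary example value, not the ratio used by the method.
sampled = sample_frames(one_second, step=6)
print(sampled)  # [0, 6, 12, 18]
```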

Step 13 performs vectorized representation of the target;

For each target O_i, the state at time t_0 is represented by a state variable, and the movement at a subsequent time t by a displacement vector. The state variable represents the position of target O_i, and the displacement vector represents its direction and speed of movement; together they describe the static and dynamic characteristics of the target.
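One possible reading of this representation is sketched below. It assumes that the state is a 2-D position (for instance a box centre) and that the displacement vector is the difference between consecutive sampled positions. The exact definitions are not preserved in this text, so both choices are assumptions, as is the function name `displacement_vectors`.

```python
import numpy as np


def displacement_vectors(positions: np.ndarray) -> np.ndarray:
    """Differences between consecutive sampled positions, shape (T-1, 2).

    ASSUMPTION: displacement is taken between consecutive samples; the
    source does not preserve the exact formula.
    """
    positions = np.asarray(positions, dtype=float)
    return np.diff(positions, axis=0)


# Hypothetical track of one target: one (x, y) position per sampled frame.
track = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0], [4.0, 1.0]])

state_t0 = track[0]                   # static feature: position at t_0
disp = displacement_vectors(track)    # dynamic feature: direction of movement
speed = np.linalg.norm(disp, axis=1)  # distance moved per sampling interval

print(disp[-1], speed[-1])
```

The direction of each displacement row gives the direction of movement, and its length divided by the sampling interval gives the speed.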

Step 2 constructs the spatiotemporal weight vector and comprises:

Step 21: time weight vector construction;

Step 22: spatial weight vector construction;

Step 23: spatiotemporal vector fusion.

Step 21 constructs the time weight vector;

Because of people's daily work and rest routines, the movement of pedestrian flow often follows a periodic pattern; for example, the flow of people in the morning is generally larger than at night. The difference in pedestrian flow between time periods means that the baseline differs from period to period; therefore, different weight parameters need to be trained for the pedestrian flow prediction models of different time periods to account for this variability;

The pedestrian flow characteristics are sampled continuously to obtain the total displacement vector. The number of samples t and the time interval between samples are determined by the sampling frequency; for example, if one sample is taken every 5 minutes, the sampling times are 00:00, 00:05, ..., 23:55, and the number of samples is 288;
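The count of 288 follows from 24 × 60 / 5. A short sketch that enumerates the sampling times for a given interval is shown below; the function name `sampling_times` is an assumption made for illustration.

```python
from datetime import datetime, timedelta
from typing import List


def sampling_times(interval_minutes: int) -> List[str]:
    """Return the HH:MM sampling times of one day for a fixed interval."""
    minutes_per_day = 24 * 60
    if interval_minutes <= 0 or minutes_per_day % interval_minutes != 0:
        raise ValueError("interval must be positive and divide 1440 evenly")
    start = datetime(2000, 1, 1)  # arbitrary date; only the time of day is used
    count = minutes_per_day // interval_minutes
    return [
        (start + timedelta(minutes=i * interval_minutes)).strftime("%H:%M")
        for i in range(count)
    ]


slots = sampling_times(5)
print(len(slots), slots[0], slots[1], slots[-1])  # 288 00:00 00:05 23:55
```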

The time weight is ..., where ε is a small constant that prevents division by zero;

Step 22: spatial weight vector construction;

Specific places also influence pedestrian flow; for example, at public transport facilities such as bus stops, people tend to gather. If this effect is not eliminated, a personnel gathering event is likely to be misjudged;

For a specific observation point x, the number n of people appearing at that point is counted; the spatial weight of observation point x is then ..., where ε is a small constant that prevents division by zero;
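The spatial weight formula itself is not preserved in this text. Purely as an illustration of how ε guards a division, the sketch below assumes a reciprocal form, 1 / (n + ε), so that places where many people normally appear receive a smaller weight. This form is an assumption and should not be read as the formula of the method; the function name `spatial_weight` is also hypothetical.

```python
def spatial_weight(n_people: int, eps: float = 1e-6) -> float:
    """Illustrative spatial weight for one observation point.

    ASSUMPTION: the reciprocal form 1 / (n + eps) is a stand-in; the
    source formula is not preserved. eps keeps the result finite when
    no people were counted (n_people == 0).
    """
    if n_people < 0:
        raise ValueError("n_people must be non-negative")
    return 1.0 / (n_people + eps)


# A busy bus stop versus a quiet corner (hypothetical counts).
print(spatial_weight(200) < spatial_weight(5))  # True
```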

Step 23: spatiotemporal vector fusion;

For the different observation points x in the different time periods t, the spatiotemporal weight vector is ...

where w_{i,j} is the product of the time weight vector and the spatial weight vector.
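The fusion can be sketched as an outer product, under the assumption that i indexes the time period and j indexes the observation point, so that w_{i,j} is the time weight of period i multiplied by the spatial weight of point j. The weight values below are made up for illustration.

```python
import numpy as np

# Hypothetical weights: 3 time periods and 2 observation points.
time_w = np.array([0.5, 1.0, 2.0])
space_w = np.array([1.0, 0.25])

# W[i, j] = time_w[i] * space_w[j]
W = np.outer(time_w, space_w)

print(W.shape)  # (3, 2)
print(W[2, 1])  # 0.5
```

Storing the fused weights as a matrix makes it cheap to look up the weight for any (time period, observation point) pair at prediction time.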

Step 3 constructs the special personnel weight vector;

Special personnel (such as public figures) can strongly influence the trajectories of a crowd: the crowd tends to move toward the public figure, the density of people around the special person increases, and the moving speed decreases. Travel by special personnel can easily trigger a mass incident and poses a great challenge to security. It is a scenario that must be considered, but it is also a low-probability event and applies only in particular scenes;

Step 3 comprises:

Step 31: special personnel library construction;

According to actual conditions, a face library of special personnel can be established at each observation point, or a global face library can be established. The face library stores each person's unique ID, face picture, face features, and weight vector;
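A minimal in-memory sketch of such a library is given below. The four stored fields follow the text; the class names, the field names, and the choice to store the picture as a file path are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SpecialPerson:
    person_id: str              # unique ID of the person
    face_image_path: str        # location of the face picture (hypothetical field)
    face_features: List[float]  # feature vector produced by some face model
    weight_vector: List[float]  # attraction weights for this person


class FaceLibrary:
    """Library keyed by person ID: one instance per observation point,
    or a single shared instance for a global library."""

    def __init__(self) -> None:
        self._records: Dict[str, SpecialPerson] = {}

    def add(self, person: SpecialPerson) -> None:
        if person.person_id in self._records:
            raise ValueError(f"duplicate ID: {person.person_id}")
        self._records[person.person_id] = person

    def get(self, person_id: str) -> Optional[SpecialPerson]:
        return self._records.get(person_id)

    def __len__(self) -> int:
        return len(self._records)


library = FaceLibrary()
library.add(SpecialPerson("P001", "faces/P001.jpg", [0.1, 0.9], [0.8]))
print(len(library), library.get("P001").weight_vector)  # 1 [0.8]
```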

Step 32: special personnel weight vector construction;

Special personnel exert a certain attraction on the surrounding crowd, and attraction weights are constructed according to the strength of that attraction. The attraction weight of a special person is estimated by grading the special personnel into levels; because special personnel are relatively few, the workload of constructing the special personnel weight vector is low.
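The grading idea can be sketched as a lookup from level to attraction weight. The number of levels and the weight values below are invented for illustration; the text does not specify them.

```python
# Hypothetical mapping from level to attraction weight.
# Neither the three levels nor the values come from the source.
ATTRACTION_BY_LEVEL = {1: 0.2, 2: 0.5, 3: 0.8}


def attraction_weight(level: int) -> float:
    """Return the attraction weight estimated for a given level."""
    try:
        return ATTRACTION_BY_LEVEL[level]
    except KeyError:
        raise ValueError(f"unknown level: {level}") from None


print(attraction_weight(3))  # 0.8
```

Because the weight is attached to the level rather than to each individual, adjusting one table entry updates every person at that level.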

Step 4 performs multi-target trajectory prediction analysis;

The overall flow of Step 4 is as follows:

First, the state variable and displacement vector of each target O_i are obtained by the multi-target trajectory recognition algorithm based on a convolutional neural network. The sequence of state variables and the sequence of displacement vectors are multiplied by the spatiotemporal weight vector and input into a sequence prediction model, which predicts the state variable sequence and the displacement vector sequence for the next period of time;
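The weighting-then-prediction flow can be sketched as follows. The text does not name the sequence prediction model, so a constant-velocity extrapolation is used here purely as a stand-in. Applying a single scalar weight to the displacement sequence is also an assumption, as is the function name `predict_positions`.

```python
import numpy as np


def predict_positions(positions, st_weight: float, horizon: int) -> np.ndarray:
    """Weight the displacement sequence, then extrapolate `horizon` steps.

    STAND-IN: mean-displacement extrapolation replaces the unspecified
    sequence prediction model. `st_weight` plays the role of the
    spatiotemporal weight for the current time period and place.
    """
    positions = np.asarray(positions, dtype=float)
    if positions.ndim != 2 or len(positions) < 2:
        raise ValueError("need at least two 2-D positions")
    weighted = np.diff(positions, axis=0) * st_weight  # weighted displacements
    mean_step = weighted.mean(axis=0)
    steps = np.arange(1, horizon + 1).reshape(-1, 1)
    return positions[-1] + steps * mean_step


history = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
future = predict_positions(history, st_weight=0.5, horizon=2)
print(future)  # rows: (2.5, 0.0) and (3.0, 0.0)
```

In practice the stand-in would be replaced by a trained sequence model; the sketch only shows where the weight enters the pipeline.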

Then, a face recognition algorithm is used to recognize faces in the video and check whether special personnel appear. Depending on the detection result, there are two cases;

Case 1: special personnel are present. The original output is corrected according to the position of the special person at time t and the special person's attraction weight w: the predicted position of target O_i at time t is adjusted by a correction vector, and the prediction result is the corrected state variable sequence and displacement vector sequence;

Case 2: no special personnel are present. The prediction result is the state variable sequence and displacement vector sequence output by the sequence prediction model.
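The two cases can be sketched as below. The correction formula is not preserved in this text, so the sketch assumes that the correction vector pulls the predicted position toward the special person in proportion to the attraction weight w, that is, correction = w × (special position − predicted position). This form and the function names are assumptions.

```python
from typing import Optional, Sequence, Tuple

import numpy as np


def correct_prediction(
    predicted: Sequence[float], special_pos: Sequence[float], w: float
) -> Tuple[np.ndarray, np.ndarray]:
    """Return (corrected position, correction vector).

    ASSUMPTION: linear pull toward the special person, scaled by w.
    """
    p = np.asarray(predicted, dtype=float)
    s = np.asarray(special_pos, dtype=float)
    correction = w * (s - p)
    return p + correction, correction


def final_prediction(
    predicted: Sequence[float],
    special_pos: Optional[Sequence[float]] = None,
    w: float = 0.0,
) -> np.ndarray:
    # Case 2: no special personnel detected, keep the model output as is.
    if special_pos is None:
        return np.asarray(predicted, dtype=float)
    # Case 1: special personnel present, apply the correction vector.
    corrected, _ = correct_prediction(predicted, special_pos, w)
    return corrected


print(final_prediction([0.0, 0.0], special_pos=[10.0, 0.0], w=0.3))
print(final_prediction([0.0, 0.0]))
```

With w = 0 the corrected position equals the original prediction, and with w = 1 it coincides with the special person's position, so w interpolates between the two.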

Step 4 also includes model iteration;

Because real crowds change dynamically, three model components in the method need to be iterated continuously on actual data to adapt to the change;

a. Trajectory prediction model

At intervals, data is sampled and the new data is added to the original training set for retraining; the training set grows continuously, which improves the accuracy of the model;

b. Spatiotemporal vector weights

At intervals, data is sampled and the spatiotemporal vectors are recalculated from the new data so that the model can adapt to changing conditions;

c. Special personnel library

According to actual conditions, the special personnel library is continuously expanded and the levels of its members are adjusted;

Step 4 is deployed and implemented as follows:

The method has two deployment scenarios;

a. Public place deployment

This deployment targets public areas such as streets, where the population is highly mobile and no special personnel library needs to be established; the data for the spatiotemporal vectors comes mainly from the morning and evening rush hours;

b. Internal area deployment

This deployment targets internal areas such as industrial parks, where personnel are relatively fixed. A relatively complete internal personnel library can be established, and key personnel can be selected from it and added to the special personnel library. In addition, spatiotemporal vector modeling can be performed for specific places.

(III) Beneficial effects

The invention provides a method for predicting the trajectories of a group of people based on spatiotemporal features. By constructing a spatiotemporal feature weight vector and a special personnel weight vector, the method reallocates the weights of the multi-target trajectory vectors, introduces the influence of specific times, places, and people into the multi-target trajectory prediction algorithm, and analyzes the influence of individuals and the environment on individual trajectories, so that mass incidents can be predicted in time and risks found and eliminated promptly.

Compared with the prior art, the invention has the following effects:

(1) Trajectory prediction for a group of people is realized, with personalized modeling for different times, spaces, and personnel.

(2) Different deployment and implementation strategies are adopted in public areas and internal areas, shifting mass incident prevention from passive handling to active discovery and improving risk prediction, early-warning analysis, and crowd dispersal capabilities.

(3) Through data accumulation and adaptive training, the prediction accuracy of the model is continuously improved.

Drawings

FIG. 1 is a block diagram of the multi-target personnel trajectory prediction method.

Detailed Description

To make the objects, contents and advantages of the present invention more apparent, the following detailed description of the present invention will be given with reference to the accompanying drawings and examples.

The embodiment carries out the four steps set out in the technical scheme above, in order: Step 1, multi-target trajectory recognition (Steps 11 to 13); Step 2, spatiotemporal weight vector construction (Steps 21 to 23); Step 3, special personnel weight vector construction (Steps 31 and 32); and Step 4, multi-target trajectory prediction analysis, including model iteration and the two deployment scenarios (public places and internal areas).

The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.

Claims

1. A method for predicting the trajectory of a group of people based on spatiotemporal features, characterized in that the method includes:

Step 1: Multi-target trajectory recognition;

Step 2: Constructing the spatiotemporal weight vector;

Step 3: Constructing weight vectors for special personnel;

Step 4: Multi-target trajectory prediction analysis.

2. The method for predicting the trajectory of a group of people based on spatiotemporal features as described in claim 1, characterized in that, in step 1, multi-target trajectory identification is performed; step 1 includes:

Step 11: Target identification;

Step 12: Target tracking;

Step 13: Vectorized representation of the target.

3. The method for predicting the trajectory of a group of people based on spatiotemporal features as described in claim 2, characterized in that, in step 11, target identification is performed;

In image recognition, convolutional neural network-based algorithms perform best. In this method, the focus is on the identification of people. This method uses a multi-layer convolutional neural network to analyze the video stream frame by frame and to identify and locate people in the video frames. The input video is V = {V_1, V_2, ..., V_t}, where the subscript t indexes the frames at different times.

The target recognition model construction consists of two phases: model training and model testing. In the training phase, the input is video data V_train with labeled information. The labels consist of several rectangular boxes, each represented as (x, y, h, w), where x is the x-coordinate of the top-left corner of the box, y is the y-coordinate of the top-left corner, h is the height of the box, and w is the width of the box. Once training has converged, the target recognition model is obtained. In the testing phase, the input is video data V_test without labeled information to test the model's performance. If the accuracy requirement is met, the model training is complete; otherwise, the parameters are adjusted, and the model is retrained until the accuracy requirement is met.

4. The method for predicting the trajectory of a group of people based on spatiotemporal features as described in claim 3, characterized in that, in step 12, target tracking is performed;

The target bounding box information of each frame of the video is arranged in chronological order to obtain the target trajectory information. Since the image changes little between adjacent frames in the video, analyzing every frame is unnecessarily expensive, so the video frames are sampled.

Standard video is 24 frames per second; in this method, the sampling ratio is taken as...

5. The method for predicting the trajectory of a group of people based on spatiotemporal features as described in claim 4, characterized in that, in step 13, the target is represented in vectorized form;

For each target O_i, the state at time t_0 is represented by a state variable, and the movement at a subsequent time t by a displacement vector. The state variable represents the position of target O_i, and the displacement vector represents its direction and speed of movement; together they describe the static and dynamic characteristics of the target.

6. The method for predicting the trajectory of a group of people based on spatiotemporal features as described in claim 5, characterized in that, in step 2, a spatiotemporal weight vector is constructed;

Step 2 includes:

Step 21: Constructing the time weight vector;

Step 22: Constructing the spatial weight vector;

Step 23: Spatiotemporal vector fusion.

7. The method for predicting the trajectory of a group of people based on spatiotemporal features as described in claim 6, characterized in that, in step 21, a time weight vector is constructed;

Considering people's daily routines, the movement of people often follows a periodic pattern; the flow of people in the morning is generally larger than at night. The difference in people flow at different times means that the baseline differs from period to period; therefore, it is necessary to train different weight parameters for the people flow prediction model at different times to take this difference into account.

By continuously sampling the characteristics of the pedestrian flow, the total displacement vector can be obtained. The number of samples t and the time interval between samples are determined based on the sampling frequency; if one sample is taken every 5 minutes, then the sampling times are 00:00, 00:05, ..., 23:55, and the number of samples is 288.

The time weight is ..., where ε is a small constant to prevent division by zero;

Step 22: Constructing the spatial weight vector;

Specific locations can also affect the flow of people; if this effect is not eliminated, it could lead to misjudgments of gatherings of people.

For a specific observation point x, the number n of people appearing at that point is counted; the spatial weight is then ..., where ε is a small constant to prevent division by zero;

Step 23: Spatiotemporal vector fusion;

For different observation points x at different time periods t, the spatiotemporal weight vector is:

where w_{i,j} is the product of the time weight vector and the spatial weight vector.

8. The method for predicting the trajectory of a group of people based on spatiotemporal features as described in claim 7, characterized in that, in step 3, a special personnel weight vector is constructed;

Special individuals can significantly impact crowd movement, manifesting as a tendency for crowds to gravitate towards public figures, increased crowd density around them, and slower movement speed. Their travels can easily trigger mass incidents, posing a significant challenge to security. While such travel is a scenario that must be considered, it is also a low-probability event and should only be applied in specific situations.

Step 3 includes:

Step 31: Constructing a special personnel database;

Depending on the actual situation, a special personnel face database can be established at different observation points, or a global face database can be established; the face database should store the unique ID of each person, face image, face features, and weight vector;

Step 32: Constructing the weight vector for special personnel;

Special individuals have a certain attraction effect on the surrounding population. Based on the strength of the attraction effect, attraction weights are constructed. By classifying special individuals into different levels, the attraction weights of special individuals are estimated. Since the number of special individuals is relatively small, the workload of constructing the special individual weight vector is low.

9. The method for predicting the trajectory of a group of people based on spatiotemporal features as described in claim 8, characterized in that, in step 4, multi-target trajectory prediction analysis is performed;

The overall process of step 4 is as follows:

First, the state variable and displacement vector of each target O_i are obtained through a multi-target trajectory recognition algorithm based on a convolutional neural network. The sequence of state variables and the sequence of displacement vectors are multiplied by the spatiotemporal weight vector and input into the sequence prediction model to predict the state variable sequence and the displacement vector sequence over the next period of time.

Then, a face recognition algorithm is used to recognize faces in the video and check whether special personnel appear; based on the detection results, two scenarios are distinguished.

Scenario 1: Special personnel are present. The original output is corrected based on the special person's position at time t and the special person's attraction weight w: the predicted position of target O_i at time t is adjusted by a correction vector, and the prediction result is the corrected state variable sequence and displacement vector sequence.

Scenario 2: No special personnel are present. The prediction result is the state variable sequence and displacement vector sequence output by the sequence prediction model.

10. The method for predicting the trajectory of a group of people based on spatiotemporal features as described in claim 9, characterized in that, in step 4, model iteration is performed;

Since the actual population is dynamic, three model components in this method need to be continuously iterated based on actual data to adapt to changes;

a. Trajectory prediction model

Every so often, data is sampled, and new data is added to the original training set for retraining. The size of the training set is continuously expanded, thereby improving the accuracy of the model.

b. Spatiotemporal vector weights

Data is sampled periodically; spatiotemporal vectors are calculated based on the new data to ensure that the model can adapt to changing conditions.

c. Special Personnel Database

Based on actual circumstances, the pool of special personnel will be continuously expanded and their levels adjusted.

The deployment and implementation of step 4 are as follows:

This method has two deployment scenarios;

a. Deployment in public places

For public areas such as streets, where there is high population mobility, there is no need to establish a special personnel database; the spatiotemporal vector data mainly comes from morning and evening rush hours.

b. Internal area deployment

For internal areas such as industrial parks, where personnel are relatively fixed, a relatively complete internal personnel database can be established, and key personnel can be extracted and added to a special personnel database; furthermore, spatiotemporal vector modeling can be performed based on specific locations.