
CN120954052A - Passenger Flow Statistics System and Method Based on Deep Learning - Google Patents

Passenger Flow Statistics System and Method Based on Deep Learning

Info

Publication number
CN120954052A
CN120954052A (Application CN202511068910.8A)
Authority
CN
China
Prior art keywords
passenger flow
deep learning
spatiotemporal
feature
flow statistics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202511068910.8A
Other languages
Chinese (zh)
Inventor
贾正旭
王家栋
张卿瑜
杨凯
陈林
张运飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanyang Power Supply Co of State Grid Henan Electric Power Co Ltd
Original Assignee
Nanyang Power Supply Co of State Grid Henan Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanyang Power Supply Co of State Grid Henan Electric Power Co Ltd filed Critical Nanyang Power Supply Co of State Grid Henan Electric Power Co Ltd
Priority to CN202511068910.8A priority Critical patent/CN120954052A/en
Publication of CN120954052A publication Critical patent/CN120954052A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/042Knowledge-based neural networks; Logical representations of neural networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/0442Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/096Transfer learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of computer vision and data analysis, and specifically relates to a deep-learning-based passenger flow statistics system and method. The method remedies the shortcomings of existing passenger flow statistics approaches in accuracy, environmental adaptability, density sensitivity, and prediction-control misalignment.

Description

Passenger flow statistical system and statistical method based on deep learning
Technical Field
The invention belongs to the field of computer vision and data analysis, and particularly relates to a passenger flow statistical system and a passenger flow statistical method based on deep learning.
Background
In today's society, accurate passenger flow statistics are important for public places of all kinds, such as power-utility business halls, shopping malls, stations, and scenic spots. Conventional approaches based on infrared sensing, pressure sensing, and the like have clear limitations. Infrared sensing is easily disturbed by the environment and tends to miscount when several people pass through the sensing area at once; pressure sensing imposes strict requirements on the installation position and floor condition and cannot distinguish individuals, so statistical accuracy is hard to guarantee.
With the development of computer vision, passenger flow statistics based on video images gradually emerged. Early video-based methods relied on hand-designed features such as HOG (histogram of oriented gradients) and LBP (local binary patterns), combined with traditional classifiers such as the SVM (support vector machine) for pedestrian detection. These methods, however, perform poorly in complex scenes: detection accuracy drops sharply under severe illumination changes or heavy pedestrian occlusion, because manually designed features cannot fully and accurately describe the complex appearance of pedestrians and the generalization ability of traditional classifiers is limited.
Deep learning has brought a new breakthrough to passenger flow statistics: it automatically learns more representative features from large amounts of data, improving both detection and counting accuracy. Current deep-learning-based passenger flow algorithms still have problems, however. On the one hand, model training requires large amounts of labeled data, and manual labeling is time-consuming, labor-intensive, and error-prone; on the other hand, existing models still need improvement in multi-target tracking and real-time performance in complex scenes. In high-density settings such as a mall promotion, for example, existing algorithms may lose targets or count them repeatedly, failing to meet the high-precision requirements of practical applications.
In summary, existing passenger flow statistics systems face three technical bottlenecks:
1. Poor environmental adaptability: the false detection rate of traditional visible-light schemes in low-illumination scenes exceeds 40%;
2. Density sensitivity: the ID switching rate of fixed-parameter algorithms in crowded scenes (ρ > 4 persons/m²) reaches 35%;
3. Prediction-control misalignment: the delay between statistical results and emergency decision responses exceeds 8 seconds.
The invention patent published as CN113591876A only realizes static-scene counting and does not address algorithm switching under dynamic density; the invention patent published as CN110675443A discloses a thermal-imaging fusion scheme but does not involve a feature-level weighting mechanism.
Therefore, the development of the passenger flow statistics method based on deep learning, which can more accurately and efficiently carry out passenger flow statistics and is suitable for complex scenes, has important practical significance.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a deep-learning-based passenger flow statistics system and method that remedy the shortcomings of existing approaches in accuracy, environmental adaptability, density sensitivity, and prediction-control misalignment.
To achieve the above object, the deep-learning-based passenger flow statistics system of the present invention comprises:
a multimodal feature extraction module, which adopts a parallel dual-channel convolutional neural network (CNN), the first channel processing the visible-light video stream and the second channel processing infrared thermal-imaging data, and outputs an enhanced pedestrian feature map through a feature-level fusion layer;
a dynamic target tracking module, which generates initial motion-trajectory values by the optical flow method, fuses the Kalman-filter predictions of the SORT algorithm, and dynamically adjusts tracking weights through an appearance feature matching function;
a spatiotemporal passenger flow analysis module, which performs spatiotemporal modeling of pedestrian trajectories with a spatiotemporal graph convolutional network (STGCN) and outputs a regional passenger-flow density heat map and a future flow prediction curve;
a feedback optimization module, which adaptively adjusts the receptive-field size of the CNN feature extraction layers and the search-window size of the optical flow method according to the tracking loss rate in occluded scenes.
Specifically, the feature-level fusion layer is computed as:
F_fusion = α·σ(Wv·F_RGB) + (1 − α)·ReLU(Wt·F_thermal);
α = 1 − e^(−β·I)
where F_RGB is the visible-light feature map, F_thermal is the thermal-imaging feature map, Wv and Wt are trainable weight matrices, σ is the sigmoid activation, α is the illumination-adaptive coefficient, I is the ambient illumination intensity, and β is a decay factor. At low illumination α approaches 0, the system relies mainly on thermal-imaging features, and the false detection rate falls by 32%.
Compared with data-level fusion (direct fusion of raw data), feature-level fusion takes place after the key features of each data source have been extracted, so it removes redundant information while retaining the core features useful for the task, improving fusion efficiency. Extracting features before fusing also avoids the high cost of directly processing massive raw data and gives subsequent models (e.g. the classifier and predictor) a more compact input, reducing the overall computation load.
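As a sketch, the fusion rule can be written out numerically. The example below treats the feature maps as flat vectors and the weight matrices as per-element scalars, which is a simplification of the formula above; the β value used is an assumed constant, not one given in the text.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def fusion_alpha(beta, illum):
    # alpha = 1 - exp(-beta * I): near 0 in darkness, near 1 in bright light,
    # so the thermal branch dominates when illumination is low
    return 1.0 - math.exp(-beta * illum)

def fuse_features(f_rgb, f_thermal, w_v, w_t, beta, illum):
    # F_fusion = alpha * sigmoid(Wv*F_RGB) + (1 - alpha) * ReLU(Wt*F_thermal)
    alpha = fusion_alpha(beta, illum)
    return [alpha * sigmoid(w_v * r) + (1.0 - alpha) * relu(w_t * t)
            for r, t in zip(f_rgb, f_thermal)]
```

In total darkness (illum = 0) the fused map reduces to the ReLU of the thermal branch, matching the claim that the system leans on thermal imaging at low illumination.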
Specifically, the appearance feature matching function is defined as:
S_match = λ·IoU(B_t, B_{t−1}) + (1 − λ)·‖φ(f_t) − φ(f_{t−1})‖₂
where λ is a dynamic adjustment coefficient tuned according to the crowd density ρ, ρ_max is the maximum crowd density, φ(f_t) and φ(f_{t−1}) are the appearance feature vectors of the current and previous frames extracted by the CNN, ‖·‖₂ is the L2 norm, and IoU(B_t, B_{t−1}) is the intersection-over-union of the target bounding boxes in consecutive frames. In dense crowds appearance features dominate the matching, and the ID switching rate falls by 41%.
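The text does not give the explicit form of λ(ρ), so the sketch below assumes a simple clipped linear decay; `density_lambda` and its `lam_min` floor are hypothetical stand-ins, while the IoU and L2 terms follow the formula directly.

```python
import math

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def density_lambda(rho, rho_max, lam_min=0.3):
    # hypothetical schedule: weight shifts from geometry (IoU) toward
    # appearance features as crowd density rises
    return max(lam_min, 1.0 - rho / rho_max)

def s_match(box_t, box_prev, feat_t, feat_prev, rho, rho_max):
    # S_match = lambda * IoU + (1 - lambda) * ||phi_t - phi_{t-1}||_2
    lam = density_lambda(rho, rho_max)
    l2 = math.sqrt(sum((x - y) ** 2 for x, y in zip(feat_t, feat_prev)))
    return lam * iou(box_t, box_prev) + (1.0 - lam) * l2
```

At zero density the score is pure IoU; as ρ approaches ρ_max the appearance term takes over, which is the behavior the text describes for crowded scenes.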
Specifically, the feedback optimization module performs the following operations:
when the tracking loss rate L_track > 15%, the convolution kernel size of the last two layers of the CNN backbone is increased by 50%, raising occlusion-scene recall by 28%;
when the average inter-frame displacement D_move < 5 px, the optical-flow search window is reduced from 32×32 to 16×16, cutting computation time by 45%.
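The two feedback rules amount to a small threshold check per cycle. In this sketch the 15% and 5 px thresholds come from the text, while the base kernel and window sizes are illustrative assumptions:

```python
def adjust_tracking_parameters(track_loss_rate, mean_disp_px,
                               kernel_size=5, search_window=32):
    """Apply the two feedback rules to the current parameter pair."""
    if track_loss_rate > 0.15:
        # enlarge the receptive field of the last layers by ~50% to
        # recover occluded targets
        kernel_size = kernel_size + kernel_size // 2
    if mean_disp_px < 5:
        # small inter-frame motion: halve the optical-flow search window
        # (e.g. 32x32 -> 16x16), cutting its area to a quarter
        search_window = max(16, search_window // 2)
    return kernel_size, search_window
```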
Specifically, the spatiotemporal passenger flow analysis module includes:
a spatiotemporal graph construction unit, which builds a spatiotemporal graph with pedestrian trajectory points as nodes and motion relationships as edges;
a gated graph convolution unit, which aggregates spatiotemporal neighborhood information through a gating mechanism and updates node states;
a spatiotemporal prediction unit, which predicts the passenger flow distribution after time T through a spatiotemporal attention mechanism based on the final-layer node state h_i^(L). The prediction MAE is as low as 6.8% (versus about 22% for traditional approaches).
A deep-learning-based passenger flow statistics method, applied to the above system, comprises the following steps:
S1. Synchronously collect the visible-light and thermal-imaging video streams and align them in time;
S2. Extract multimodal features through the dual-channel CNN and fuse them according to the feature-level fusion formula;
S3. Initialize pedestrian detection boxes from the fused features and perform cross-frame target association using the appearance feature matching function;
S4. Dynamically optimize the feature extraction and tracking parameters through the feedback optimization module;
S5. Generate the regional passenger-flow heat map and future prediction curve with the spatiotemporal passenger flow analysis module as the passenger-flow prediction result.
Specifically, in step S2, fusion is trained with a transfer learning strategy:
the visible light channel is loaded with COCO pre-training weight, and the thermal imaging channel is loaded with FLIR pre-training weight;
The fusion layer weights are updated by end-to-end joint fine tuning.
Specifically, the passenger-flow prediction result of S5 is used to trigger control instructions:
when the regional density exceeds the safe density, a diversion alarm is generated, dispersing the crowd or traffic in the region through alerts and avoiding the risks of excessive density;
when the predicted flow exceeds the maximum capacity, an entry-restriction command is generated; this is a direct intervention that limits the inflow so the flow never exceeds the load limit of the system or region.
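The two trigger conditions reduce to simple threshold comparisons; the sketch below is illustrative, and the command names and threshold values are placeholders rather than anything specified in the text:

```python
def control_actions(region_density, safe_density, predicted_flow, max_capacity):
    """Map the S5 prediction outputs to control commands."""
    actions = []
    if region_density > safe_density:
        actions.append("DIVERSION_ALARM")      # disperse the crowd by alerting
    if predicted_flow > max_capacity:
        actions.append("ENTRY_RESTRICTION")    # directly limit inflow
    return actions
```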
The deep-learning-based passenger flow statistics method has significant technical effects in several respects:
By adopting a deep-learning passenger flow algorithm, high-precision person detection and tracking can be performed on real-time video in complex scenes. The spatial features of each frame are extracted by a convolutional neural network (CNN), and the system automatically detects person targets in the video and tracks and counts them using motion information between consecutive frames. Compared with traditional image processing algorithms, the deep-learning algorithm copes effectively with occlusion, lighting changes, and other complex conditions, ensuring the accuracy and robustness of the statistics, with full-scene robustness: low-illumination (I < 20 lux) accuracy above 93% and high-density (ρ > 5 persons/m²) ID switching rate below 1.5%.
Person detection relies on the CNN, and continuous tracking combines the optical flow method with a target tracking algorithm (e.g. Kalman filtering or SORT). The system runs every frame of the video stream through the CNN model to identify and locate pedestrians, then tracks each target's motion trajectory across frames to prevent repeated or missed counts. By incorporating the time dimension, the system analyzes passenger-flow trends over different periods, providing data support for flow control and management, with a prediction-to-control-instruction delay under 1 second.
The proposed system and method use dynamic illumination-thermal fusion (the α function) to solve the environmental-adaptability defect, a density-driven matching mechanism (dynamic λ adjustment) to solve high-density tracking, loss-rate feedback optimization to self-adjust algorithm parameters, and a spatiotemporal-graph prediction engine to close the statistics-to-decision loop. Good results are achieved across scenes, effectively improving the accuracy, adaptability, and real-time performance of passenger flow statistics, with broad application prospects and practical value.
Drawings
FIG. 1 is a flow chart of a deep learning-based passenger flow statistics system of the present invention;
FIG. 2 is a flow chart of the operation of the space-time passenger flow analysis module.
Detailed Description
The technical scheme of the present invention is described in detail below with reference to the accompanying drawings and the detailed description.
Example 1
This embodiment provides a deep-learning-based passenger flow statistics system, comprising:
a multimodal feature extraction module, which adopts a parallel dual-channel convolutional neural network (CNN), the first channel processing the visible-light video stream and the second channel processing infrared thermal-imaging data, and outputs an enhanced pedestrian feature map through a feature-level fusion layer;
a dynamic target tracking module, which generates initial motion-trajectory values by the optical flow method, fuses the Kalman-filter predictions of the SORT algorithm, and dynamically adjusts tracking weights through an appearance feature matching function;
a spatiotemporal passenger flow analysis module, which performs spatiotemporal modeling of pedestrian trajectories with a spatiotemporal graph convolutional network (STGCN) and outputs a regional passenger-flow density heat map and a future flow prediction curve;
a feedback optimization module, which adaptively adjusts the receptive-field size of the CNN feature extraction layers and the search-window size of the optical flow method according to the tracking loss rate in occluded scenes.
In this embodiment, the feature-level fusion layer is computed as:
F_fusion = α·σ(Wv·F_RGB) + (1 − α)·ReLU(Wt·F_thermal);
α = 1 − e^(−β·I)
where F_RGB is the visible-light feature map, F_thermal is the thermal-imaging feature map, Wv and Wt are trainable weight matrices, σ is the sigmoid activation, α is the illumination-adaptive coefficient, I is the ambient illumination intensity, and β is a decay factor. At low illumination (I < 50 lux), α is close to 0, the system relies on thermal-imaging features, and the false detection rate falls by 32%.
Compared with data-level fusion (direct fusion of raw data), feature-level fusion takes place after the key features of each data source have been extracted, so it removes redundant information while retaining the core features useful for the task, improving fusion efficiency. Extracting features before fusing also avoids the high cost of directly processing massive raw data and gives subsequent models (e.g. the classifier and predictor) a more compact input, reducing the overall computation load.
In this embodiment, the appearance feature matching function is defined as:
S_match = λ·IoU(B_t, B_{t−1}) + (1 − λ)·‖φ(f_t) − φ(f_{t−1})‖₂
where λ is a dynamic adjustment coefficient tuned according to the crowd density ρ, ρ_max is the maximum crowd density, φ(f_t) and φ(f_{t−1}) are the appearance feature vectors of the current and previous frames extracted by the CNN, ‖·‖₂ is the L2 norm, and IoU(B_t, B_{t−1}) is the intersection-over-union of the target bounding boxes in consecutive frames. At high density (ρ = 4 persons/m²), λ = 0.4, appearance features dominate the matching, and the ID switching rate falls by 41%.
Further, the feedback optimization module performs the following operations:
When the tracking loss rate L_track > 15%, the current network's feature extraction for the target is probably insufficient (e.g. feature capture degraded by occlusion or posture changes). Increasing the convolution kernel size of the last two layers of the CNN backbone by 50% enlarges the receptive field and strengthens the capture of the target's global features and context, which can reduce the loss rate; occlusion-scene recall rises by 28%.
When the average inter-frame displacement D_move < 5 px, the target's motion amplitude is small (slow-moving or nearly stationary) and the displacement can be captured without a large search range. Reducing the optical-flow search window from 32×32 to 16×16 preserves accurate small-displacement estimation while cutting the computation (the window area drops to a quarter), reducing computation time by 45%. This adaptive strategy balances precision and speed and suits optical-flow-dependent tasks such as video target tracking and motion analysis.
In this embodiment, the spatiotemporal passenger flow analysis module includes:
a spatiotemporal graph construction unit, which builds a spatiotemporal graph with pedestrian trajectory points as nodes and motion relationships as edges;
a gated graph convolution unit, which aggregates spatiotemporal neighborhood information through a gating mechanism and updates node states;
a spatiotemporal prediction unit, which predicts the passenger flow distribution after time T through a spatiotemporal attention mechanism based on the final-layer node state h_i^(L). The prediction MAE is as low as 6.8% (versus about 22% for traditional approaches). The specific implementation steps are shown in FIG. 2.
In this embodiment, the gated graph convolution unit (Gated Graph Convolutional Unit, GGCU) combines a graph convolutional network (GCN) with a gating mechanism to aggregate spatiotemporal neighborhood information effectively and predict the passenger flow distribution at the future time T. The relevant formulas and operations are as follows:
1. Graph convolution operation
Assume a graph structure G = (V, E), where V is the set of nodes and E the set of edges. For each node i, its feature at layer l is denoted h_i^(l).
The graph convolution operation can be expressed as:
m_i^(l) = Σ_{j∈N(i)} (1/c_ij)·W^(l)·h_j^(l) + b^(l)
where N(i) is the set of neighbor nodes of node i; c_ij is a normalization constant, typically the degree d_i of node i or a more complex normalization based on the graph Laplacian (e.g. c_ij = √(d_i·d_j), common in GCNs with a symmetrically normalized Laplacian); W^(l) is the layer-l weight matrix, used to linearly transform the features of neighboring nodes; and b^(l) is the layer-l bias vector.
2. Gating mechanism
A gating mechanism controls the information flow through an update gate z_i^(l) and a reset gate r_i^(l), computed as:
update gate: z_i^(l) = σ(W_z^(l)·[m_i^(l), h_i^(t−1)] + b_z^(l))
reset gate: r_i^(l) = σ(W_r^(l)·[m_i^(l), h_i^(t−1)] + b_r^(l))
where:
σ is the sigmoid activation, which compresses its input into the interval (0, 1) to serve as a gate weight;
W_z^(l) and W_r^(l) are the weight matrices of the update gate and the reset gate, respectively;
b_z^(l) and b_r^(l) are their bias vectors;
[m_i^(l), h_i^(t−1)] denotes the concatenation of the graph-convolution output with the feature of node i at the previous time step.
3. Hidden state update
A candidate hidden state h̃_i^(l) is computed from the reset gate and the graph-convolution result, and the node's hidden state is then updated through the update gate:
candidate hidden state: h̃_i^(l) = tanh(W′^(l)·(r_i^(l) ⊙ m_i^(l)) + b′^(l))
where tanh is the hyperbolic tangent activation, mapping its input into (−1, 1); W′^(l) is the weight matrix for the candidate hidden state and b′^(l) the corresponding bias vector; ⊙ denotes element-wise multiplication, i.e. the reset gate r_i^(l) and the graph-convolution result m_i^(l) are multiplied element by element.
hidden state update: h_i^(l) = (1 − z_i^(l)) ⊙ h_i^(t−1) + z_i^(l) ⊙ h̃_i^(l)
i.e. the current hidden state is a weighted combination of the previous hidden state h_i^(t−1) and the candidate state h̃_i^(l), with the weights determined by the update gate z_i^(l).
4. Prediction of the passenger flow distribution at future time T
After the multi-layer gated graph convolution, the final hidden state h_i^(L) of each node is obtained (L is the total number of layers). A fully connected layer then maps the hidden state to the passenger-flow prediction, with prediction function f:
ŷ_i^(T) = f(h_i^(L)) = W_out·h_i^(L) + b_out
where W_out is the weight matrix and b_out the bias vector of the fully connected layer, and ŷ_i^(T) is the predicted passenger flow of node i at the future time T. The predictions can be rendered graphically for inspection.
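The four steps above can be condensed into a single update function. The sketch below uses scalar node states and scalar weights in place of the matrices W, W_z, W_r, W′, so it only illustrates the GGCU data flow under those assumptions, not a trainable implementation; all parameter values are placeholders.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def ggcu_step(h_prev, neighbors, p):
    """One gated graph-convolution update over scalar node states.
    h_prev: previous state per node; neighbors: adjacency list;
    p: dict of scalar stand-ins for the weight matrices and biases."""
    h_new = []
    for i, nbrs in enumerate(neighbors):
        # 1. graph convolution: degree-normalized neighbor aggregation m_i
        m = (sum(h_prev[j] for j in nbrs) / max(len(nbrs), 1)) * p["w"] + p["b"]
        # 2. update and reset gates over the pair (m_i, h_i(t-1))
        z = sigmoid(p["wz"] * m + p["uz"] * h_prev[i] + p["bz"])
        r = sigmoid(p["wr"] * m + p["ur"] * h_prev[i] + p["br"])
        # 3. candidate state from the reset-gated convolution result,
        #    then the gated interpolation with the previous state
        cand = math.tanh(p["wc"] * (r * m) + p["bc"])
        h_new.append((1.0 - z) * h_prev[i] + z * cand)
    return h_new

def predict_flow(h_final, w_out, b_out):
    # 4. fully connected readout: y_i = W_out * h_i(L) + b_out
    return [w_out * h + b_out for h in h_final]
```

With all parameters zero the gates sit at 0.5 and the candidate at 0, so each state simply halves per step, which is a quick sanity check of the gating arithmetic.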
Example 2
This embodiment provides a deep-learning-based passenger flow statistics method, applied to the system of Embodiment 1, comprising the following steps:
S1. Synchronously collect the visible-light and thermal-imaging video streams and align them in time;
S2. Extract multimodal features through the dual-channel CNN and fuse them according to the feature-level fusion formula;
S3. Initialize pedestrian detection boxes from the fused features and perform cross-frame target association using the appearance feature matching function;
S4. Dynamically optimize the feature extraction and tracking parameters through the feedback optimization module;
S5. Generate the regional passenger-flow heat map and future prediction curve with the spatiotemporal passenger flow analysis module as the passenger-flow prediction result.
In step S2 of this embodiment, fusion is trained with a transfer learning strategy, specifically:
the visible-light channel loads COCO pre-training weights, and the thermal-imaging channel loads FLIR pre-training weights;
the fusion-layer weights are updated by end-to-end joint fine-tuning.
In this embodiment, the passenger-flow prediction result of S5 is used to trigger control instructions:
when the regional density exceeds the safe density, a diversion alarm is generated, dispersing the crowd or traffic in the region through alerts and avoiding the risks of excessive density;
when the predicted flow exceeds the maximum capacity, an entry-restriction command is generated; this is a direct intervention that limits the inflow so the flow never exceeds the load limit of the system or region.
Example 3:
This embodiment controls subway morning-peak passenger flow, with the following specific steps:
1. Data acquisition:
a visible-light camera and a thermal imager acquire synchronously at 30 fps.
2. Feature fusion:
ambient illuminance I = 30 lux → α = 0.2, thermal-imaging weight 80%;
the COCO + FLIR pre-training weights are loaded.
3. Tracking and optimization:
detected density ρ = 5.2 persons/m² → λ = 0.38;
tracking loss rate L_track = 18% → the last two ResNet convolution kernels are enlarged to 9×9.
4. Prediction and control:
the STGCN predicts the passenger flow of region A after 5 min;
gate current limiting is triggered: R = 120 × (1 − 6.8/8) = 30 persons/min.
Effect comparison:

Index                        Traditional scheme    The invention
Counting accuracy            76.3%                 98.1%
Prediction MAE               22.7%                 6.8%
Instruction response delay   8.5 s                 0.3 s
From the above example, full-scene robustness follows:
low-illumination (I < 20 lux) accuracy > 93%;
high-density (ρ > 5 persons/m²) ID switching rate < 1.5%.
The passenger flow statistics method based on deep learning can obtain good effects in different scenes, effectively improves the accuracy, adaptability and instantaneity of passenger flow statistics, and has wide application prospects and practical values.
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the present invention can be modified or substituted for part of the technical features thereof without departing from the spirit and scope of the claimed technical solution.

Claims (8)

1. A deep-learning-based passenger flow statistics system, characterized by comprising:
a multimodal feature extraction module: adopting a parallel dual-channel convolutional neural network (CNN), in which the first channel processes the visible-light video stream and the second channel processes infrared thermal-imaging data, and outputting an enhanced pedestrian feature map through a feature-level fusion layer;
a dynamic target tracking module: generating initial motion trajectories based on the optical-flow method, fusing them with the Kalman-filter predictions of the SORT algorithm, and dynamically adjusting the tracking weights through an appearance feature matching function;
a spatiotemporal passenger flow analysis module: performing spatiotemporal modeling of pedestrian trajectories with a spatiotemporal graph convolutional network (STGCN), and outputting a regional passenger flow density heat map and future flow prediction curves;
a feedback optimization module: adaptively adjusting the receptive field size of the CNN feature extraction layers and the search window size of the optical-flow method according to the tracking loss rate in occluded scenes.
2. The deep-learning-based passenger flow statistics system according to claim 1, characterized in that the feature-level fusion layer is computed as:
F_fusion = α·σ(W_v·F_RGB) + (1−α)·ReLU(W_t·F_thermal);
α = 1 − e^(−β·I)
where F_RGB is the visible-light feature map, F_thermal is the thermal-imaging feature map, W_v and W_t are trainable weight matrices, σ is the sigmoid activation, α is the illumination-adaptive coefficient, I is the ambient illuminance, and β is the attenuation factor.
3. The deep-learning-based passenger flow statistics system according to claim 1, characterized in that, in the data preprocessing step, the appearance feature matching function is defined as:
S_match = λ·IoU(B_t, B_{t−1}) + (1−λ)·‖φ(f_t) − φ(f_{t−1})‖₂
where λ is a dynamic adjustment coefficient adjusted according to the crowd density ρ (with ρ_max the maximum crowd density), φ(f_t) and φ(f_{t−1}) are the appearance feature vectors of the current and previous frames extracted by the CNN, ‖·‖₂ is the L2 norm, and IoU(B_t, B_{t−1}) is the intersection-over-union of the target bounding boxes in the two consecutive frames.
4. The deep-learning-based passenger flow statistics system according to claim 1, characterized in that, in the model construction step, the feedback optimization module performs the following operations:
when the tracking loss rate L_track > 15%, the convolution kernel size of the last two layers of the CNN backbone is increased by 50%;
when the average displacement between consecutive frames D_move < 5 px, the search window of the optical-flow method is reduced from 32×32 to 16×16.
5. The deep-learning-based passenger flow statistics system according to claim 1, characterized in that the spatiotemporal passenger flow analysis module comprises:
a spatiotemporal graph construction unit: constructing a spatiotemporal graph with pedestrian trajectory points as nodes and motion relations as edges;
a gated graph convolution unit: aggregating spatiotemporal neighborhood information through a gating mechanism and updating node states;
a spatiotemporal prediction unit: predicting the passenger flow distribution after time T through a spatiotemporal attention mechanism, based on the final-layer node states h_i^(L).
6. A deep-learning-based passenger flow statistics method, applied to the system of any one of claims 1-5, characterized by comprising the steps of:
S1. synchronously acquiring and time-aligning the visible-light and thermal-imaging video streams;
S2. extracting multimodal features through the dual-channel CNN and fusing them according to the computation formula of the feature-level fusion layer;
S3. initializing pedestrian detection boxes based on the fused features and performing cross-frame target association according to the appearance feature matching function;
S4. dynamically optimizing the feature extraction and tracking parameters according to the feedback optimization module;
S5. generating the regional passenger flow heat map and future prediction curves as the passenger flow prediction result through the spatiotemporal passenger flow analysis module.
7. The deep-learning-based passenger flow statistics method according to claim 6, characterized in that a transfer-learning strategy is used for fusion in step S2:
the visible-light channel loads COCO pre-trained weights, and the thermal-imaging channel loads FLIR pre-trained weights;
the fusion-layer weights are updated by end-to-end joint fine-tuning.
8. The deep-learning-based passenger flow statistics method according to claim 6, characterized in that, after the passenger flow statistics step, the passenger flow prediction result in S5 is used to trigger control instructions:
when the regional density > the safe density, a diversion alarm signal is generated;
when the predicted flow > the maximum capacity, a flow-limiting control command is generated.
CN202511068910.8A 2025-07-31 2025-07-31 Passenger Flow Statistics System and Method Based on Deep Learning Pending CN120954052A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202511068910.8A CN120954052A (en) 2025-07-31 2025-07-31 Passenger Flow Statistics System and Method Based on Deep Learning


Publications (1)

Publication Number Publication Date
CN120954052A true CN120954052A (en) 2025-11-14

Family

ID=97606774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202511068910.8A Pending CN120954052A (en) 2025-07-31 2025-07-31 Passenger Flow Statistics System and Method Based on Deep Learning

Country Status (1)

Country Link
CN (1) CN120954052A (en)

Similar Documents

Publication Publication Date Title
Yin et al. Recurrent convolutional network for video-based smoke detection
Tsakanikas et al. Video surveillance systems-current status and future trends
CN111126325B (en) Intelligent personnel security identification statistical method based on video
US10735694B2 (en) System and method for activity monitoring using video data
Li et al. Statistical modeling of complex backgrounds for foreground object detection
Wang et al. Robust video-based surveillance by integrating target detection with tracking
US8457401B2 (en) Video segmentation using statistical pixel modeling
Choudhury et al. An evaluation of background subtraction for object detection vis-a-vis mitigating challenging scenarios
Tomar et al. Crowd analysis in video surveillance: A review
CN109101888A A kind of tourist's flow of the people monitoring and early warning method
Deng et al. Deep learning in crowd counting: A survey
CN111666860A (en) Vehicle track tracking method integrating license plate information and vehicle characteristics
Jeyabharathi et al. Vehicle Tracking and Speed Measurement system (VTSM) based on novel feature descriptor: Diagonal Hexadecimal Pattern (DHP)
Usha Rani et al. Real-time human detection for intelligent video surveillance: an empirical research and in-depth review of its applications
Zhang et al. Faster R-CNN based on frame difference and spatiotemporal context for vehicle detection
CN113627383A (en) Pedestrian loitering re-identification method for panoramic intelligent security
Eng et al. Robust human detection within a highly dynamic aquatic environment in real time
Lalonde et al. A system to automatically track humans and vehicles with a PTZ camera
CN110363100A (en) A video object detection method based on YOLOv3
Bou et al. Reviewing ViBe, a popular background subtraction algorithm for real-time applications
Hanif et al. Performance analysis of vehicle detection techniques: a concise survey
Kapoor A video surveillance detection of moving object using deep learning
Wang et al. An end-to-end traffic vision and counting system using computer vision and machine learning: the challenges in real-time processing
Jiang et al. A smartly simple way for joint crowd counting and localization
Wang et al. Research, applications and prospects of event-based pedestrian detection: a survey

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination