A Multitask Cascading CNN with MultiScale Infrared Optical Flow Feature Fusion-Based Abnormal Crowd Behavior Monitoring UAV †
Figure 1. Schematic diagram of the monitoring unmanned aerial vehicle (UAV). (a) UAV with an infrared camera; (b) remote control; (c) representative image in our new infrared-based abnormal crowd behavior dataset; (d) FLIR TAU2-336 infrared thermal imager; (e) ground control station.
Figure 2. Framework of the proposed UAV system.
Figure 3. Schematic diagram of the MC-CNN. (a) Input infrared image; (b) shared CNN; (c) first stage: crowd-count classification; (d) second stage: crowd-density estimation; (e) crowd-density estimate map corresponding to the input infrared image.
Figure 4. Representative feature maps in the MC-CNN. (a) Shared CNN; (b) crowd-count classification stage; (c) first stage of crowd-density estimation.
Figure 5. Density map obtained by geometric adaptation of the Gaussian kernel. (a) Input image; (b) corresponding density map.
Figure 6. Crowd-motion estimate.
Figure 7. Three-level pyramids of two consecutive frames.
Figure 8. Result of motion-corner detection and tracking.
Figure 9. Representative images of our new crowd dataset. Rows 1 to 4 show scenes near buildings, at intersections, in squares, and on roads, respectively.
Figure 10. Crowd behavior recognition. (a) Crowd-aggregating motion detection; (b) crowd-escaping motion detection.
Figure 11. Crowd-status information of the crossroad scene. (a) Normal; (b) abnormal.
Figure 12. Crowd-status information of the scene close to the building. (a) Normal; (b) abnormal.
Abstract
1. Introduction
1.1. Motivation
1.2. Literature Review
1.3. Contributions
1.4. Organization
2. Algorithm Research and System Design
2.1. System Architecture
2.2. MC-CNN-Based Crowd-Density Estimation
2.2.1. Crowd-Count Classification
2.2.2. Density-Map Estimation
2.2.3. The Training of MC-CNN
2.3. MIR-OF-Based Crowd-Motion Estimate
2.3.1. Corners Detection and Multiscale Analysis and Tracking
- (1) Two consecutive frames, H and I, were obtained. Corner detection was applied to frame H, and the successfully detected corners, C1, were used as the initial points for pyramid L–K optical flow tracking;
- (2) The corners successfully tracked into frame I were recorded as C2;
- (3) The velocity amplitude, denoted mag, was calculated for each pair of corresponding corners in C1 and C2;
- (4) The velocity amplitude of each corner in mag was compared with the small-motion threshold. If it exceeded the threshold, the corner's velocity information was retained; otherwise, it was discarded.
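Steps (3) and (4) can be illustrated with a minimal sketch. It assumes that steps (1) and (2) have already produced two matched lists of corner coordinates, for example via OpenCV's `cv2.goodFeaturesToTrack` and `cv2.calcOpticalFlowPyrLK`; that choice of library is an assumption and is not specified here. The function name `filter_small_motion`, the sample coordinates, and the threshold value are hypothetical.

```python
import math


def filter_small_motion(c1, c2, threshold):
    """Keep tracked corners whose displacement magnitude exceeds `threshold`.

    c1, c2: equal-length lists of (x, y) positions of the same corners in
    frames H and I. Returns a list of (start_point, velocity, magnitude).
    """
    if len(c1) != len(c2):
        raise ValueError("c1 and c2 must contain matched corners")
    kept = []
    for (x1, y1), (x2, y2) in zip(c1, c2):
        vx, vy = x2 - x1, y2 - y1      # displacement in pixels per frame
        mag = math.hypot(vx, vy)       # step (3): velocity amplitude
        if mag > threshold:            # step (4): drop small motions
            kept.append(((x1, y1), (vx, vy), mag))
    return kept


# Hypothetical corner positions in frames H (c1) and I (c2).
c1 = [(10.0, 10.0), (50.0, 40.0), (80.0, 20.0)]
c2 = [(10.2, 10.1), (53.0, 44.0), (80.0, 26.0)]
moving = filter_small_motion(c1, c2, threshold=1.0)
print([round(m, 2) for _, _, m in moving])  # [5.0, 6.0]
```

The first corner moves by roughly 0.22 pixels and is discarded as jitter, while the other two are retained. The magnitudes here are in pixels per frame; converting them to a physical speed would require the frame rate and the ground resolution, neither of which is assumed in this sketch.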
2.3.2. Average Velocity
2.4. Decision Flow for Crowd Abnormal Behavior
3. Experimental Results and Validation
3.1. Our Self-Built Data Set: IR-Flying Dataset
3.2. Experiments for Crowd Abnormal Behavior Monitoring
4. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
CNN | Convolutional neural network |
L–K | Lucas–Kanade |
MC-CNN | Multitask cascading CNN |
MIR-OF | Multiscale infrared optical flow |
UAV | Unmanned aerial vehicle |
TP | True positive |
TN | True negative |
FP | False positive |
FN | False negative |
Velocity Factor | Density Factor | Normal/Abnormal |
---|---|---|
Becomes larger | Becomes smaller | Abnormal |
Becomes smaller | Becomes larger | Abnormal |
Becomes larger | Becomes larger | Abnormal |
Becomes larger | Constant | Abnormal |
Constant | Becomes larger | Abnormal |
Becomes smaller | Becomes smaller | Normal |
Becomes smaller | Constant | Normal |
Constant | Becomes smaller | Normal |
Constant | Constant | Normal |
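Reading the table with its first five rows as abnormal and its last four as normal, all nine combinations appear to reduce to one rule: the state is abnormal whenever either factor becomes larger. The sketch below encodes that reading. The function name `classify_crowd_state` and the trend labels `"larger"`, `"smaller"`, and `"constant"` are hypothetical, and how a trend is derived from raw velocity and density values is not covered.

```python
VALID_TRENDS = {"larger", "smaller", "constant"}


def classify_crowd_state(velocity_trend, density_trend):
    """Return "abnormal" or "normal" for a pair of factor trends.

    Each trend must be one of "larger", "smaller", or "constant".
    Assumed rule, inferred from the decision table: any increase in
    either factor marks the crowd state as abnormal.
    """
    for trend in (velocity_trend, density_trend):
        if trend not in VALID_TRENDS:
            raise ValueError(f"unknown trend: {trend!r}")
    if "larger" in (velocity_trend, density_trend):
        return "abnormal"
    return "normal"


print(classify_crowd_state("larger", "smaller"))    # abnormal
print(classify_crowd_state("smaller", "constant"))  # normal
```

Writing the rule as a single condition rather than a nine-entry lookup keeps the sketch short, but a lookup table would be the safer choice if the decision rows were ever changed independently.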
Attribute | Value |
---|---|
Resolution (pixels) | 336 × 256 |
Number of scenes | 4 |
Number of images | 970 |
Number of persons | 16,000 |
Frame rate (fps) | 7 |
Type of Behavior | Scenarios |
---|---|
#1: Aggregating | Traffic congestion; demonstration; trampled underfoot; fight |
#2: Escaping | Terrorist attack; fire alarm; earthquake |
Prediction Result | Meaning |
---|---|
TP | Predicted abnormal; actually abnormal. |
TN | Predicted normal; actually normal. |
FP | Predicted abnormal; actually normal. |
FN | Predicted normal; actually abnormal. |
Scene | TP | TN | FN | FP | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|---|---|---|---|
#1: Intersections | 27 | 1103 | 6 | 7 | 98.86% | 79.41% | 81.81% | 80.59% |
#2: Buildings | 40 | 1443 | 5 | 15 | 98.66% | 72.72% | 88.88% | 79.99% |
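The four scores in each row can be recomputed from the confusion counts. The sketch below uses the standard definitions of accuracy, precision, recall, and F1-score, which is an assumption about how the table was produced; the function name `detection_metrics` is hypothetical, and the code assumes non-zero denominators.

```python
def detection_metrics(tp, tn, fn, fp):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    total = tp + tn + fn + fp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of abnormal predictions that were right
    recall = tp / (tp + fn)      # share of actual abnormal cases that were found
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Confusion counts taken from the two table rows.
scenes = {
    "#1: Intersections": (27, 1103, 6, 7),
    "#2: Buildings": (40, 1443, 5, 15),
}
for name, counts in scenes.items():
    scores = detection_metrics(*counts)
    print(name, {k: f"{100 * v:.2f}%" for k, v in scores.items()})
```

The recomputed values agree with the table to within about 0.01 percentage point. The remaining differences in the last digit are consistent with the tabulated percentages having been truncated rather than rounded.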
Scene | Detection Time (s) |
---|---|
#1: Intersections | 0.224 |
#2: Buildings | 0.219 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Shao, Y.; Li, W.; Chu, H.; Chang, Z.; Zhang, X.; Zhan, H. A Multitask Cascading CNN with MultiScale Infrared Optical Flow Feature Fusion-Based Abnormal Crowd Behavior Monitoring UAV. Sensors 2020, 20, 5550. https://doi.org/10.3390/s20195550