Fall Detection System Based on Deep Learning
and Image Processing in Cloud Environment
Leixian Shen, Qingyun Zhang, Guoxu Cao, and He Xu(✉)
School of Computer Science, Nanjing University of Posts
and Telecommunications, Nanjing 210023, China
{b16041522,b16041523,b16041529,xuhe}@njupt.edu.cn
Abstract. The safety of elderly people living alone has drawn increasing
attention in China. Driven by the need for early warning of falls and by the
application of the Internet of Things, fall detection systems based on wearable
devices and environmental sensors have entered the market, but they suffer from
high invasiveness, low precision, poor robustness and strong sensitivity to the
environment. This paper presents a fall detection system based on deep learning
and image processing in a cloud environment, which does not rely on wearable
devices or sensors. High-frequency images taken by the camera are transmitted to
the server, which detects the key points of the human body with the DeepCut
neural network model. The key point data are then fed into a deep neural network
that judges whether a fall has occurred through the softmax function; the
network is trained in advance on key point distributions collected from human
bodies in a wide variety of postures. The relatives are also informed through
relevant communication means. Experimental tests show that the proposed method
can effectively detect falls across different fall states and different body
postures.
1 Introduction
In China, the number of elderly people over the age of 60 is rapidly increasing,
and the number of “empty-nesters” who live alone is also growing at an
unprecedented rate. According to the “Circular of the State Council on Issuing
the Plan for the Development of National Old Age and Building the Pension System
in the 13th Five-Year Plan Period, Guo Fa [2017] No. 13”, the number of elderly
people over the age of 60 in China is estimated to increase to 255 million by
2020, accounting for about 17.8% of the total population; seniors will increase
to 29 million. As the parents of the first generation of only children gradually
enter old age and the traditional Chinese family structure gradually
disintegrates, relevant research estimates that the number of empty-nesters in
China will increase to more than 200 million by 2030, accounting for 90% of
China’s elderly population [1, 2]. On the one hand, the elderly living alone
lack mental comfort. On the other hand, their mobility is limited, and if an
accident occurs it is difficult for them to inform their children or relatives
in time. In recent years, accidents among elderly people living alone have often
been fatal.
Relevant studies have shown that if the fall of an elderly person is promptly
detected and help is provided, the risk of death can be reduced by 80% and the
risk of long-term hospitalization by 26%, so the consequences of such incidents
can be largely avoided or mitigated.
© Springer International Publishing AG, part of Springer Nature 2019
L. Barolli et al. (Eds.): CISIS 2018, AISC 772, pp. 590–598, 2019.
https://doi.org/10.1007/978-3-319-93659-8_53
After analyzing a large number of actual cases, we found that the deaths of
elderly people living alone were mostly caused by sudden illness or accidental
falls that went unnoticed and could not be dealt with promptly. What these cases
have in common is that the elderly person moves from a normal walking state to a
lying state after the accident occurs. In this paper, we present a fall
detection method based on deep learning and image processing in a cloud
environment. Using the existing webcam in the house or an additionally installed
camera, the status of the elderly person is transmitted to the server in real
time as high-frequency images. Once the fall detection algorithm on the server
detects that the elderly person has fallen, the server sends an alert message
and the picture taken by the camera to a specific social network account such as
WeChat. The system also provides an SMS reminder service in case users do not
view the social network message in time.
2 Related Research
Research on camera-based fall detection in the Internet of Things environment is
still relatively rare. The mainstream designs for fall detection are based on
wearable devices and related sensors: data are processed on the wearable device,
and different types of sensing devices and related algorithms are used to
construct fall detection systems for specific applications. The mainstream fall
detection methods are briefly analyzed as follows:
Wearable fall detection systems usually embed the fall detection unit into a
mobile phone, clothes, a belt, jewelry and the like, collect the physical
parameters of the human body in real time, and determine whether a fall has
occurred through a detection algorithm. Wearable devices are generally highly
accurate and easy to operate, but their high degree of invasiveness is a
drawback [3]. Moreover, the wearable device greatly increases the manufacturing
cost and the inconvenience for the elderly, and the system fails if the user
forgets or is unable to wear the device in a specific situation.
Environment-based fall detection systems are usually embedded in the human
activity area and use wireless sensor networks, infrared sensors, sound sensors,
pressure sensors, radar and other sensors to capture body movements and posture
information. Most environment-based devices use pressure sensors to detect and
track objects [4–7]. Because such a sensor picks up all pressure changes around
the object, it is prone to false alarms, which reduces the accuracy of fall
detection.
Video fall detection systems need surveillance cameras installed in the human
activity area; they acquire images of human motion and use image analysis to
judge falls. The main features used in image analysis include head movement
speed, body length and so on [8–10]. This type of image-based fall detection
does reduce costs, but it still has shortcomings in accuracy and processing
speed.
From the above research, we can see that fall detection systems based on
wearable devices and environmental sensors have the disadvantages of high
invasiveness, low precision, poor robustness, and sensitivity to the
environment. In recent years, with the continuous development of computer vision
and digital image processing technology, vision-based fall detection has become
an important method. With the rapid development of artificial intelligence and
machine learning, deep learning has been studied in depth and has shown obvious
advantages [11], which makes more intelligent fall detection possible. Compared
with wearable devices and sensors, the method based on deep learning and image
processing is less intrusive to users and has higher accuracy and robustness.
Deep learning is a machine learning method based on learning representations of
data. An image, as an observation, can be represented in many ways, such as a
vector of intensity values for each pixel, or more abstractly as a series of
edges, regions of a particular shape, and the like. Some representations make it
easier to learn tasks from examples. This paper presents a fall detection method
based on deep learning and image processing in a cloud environment. Using the
existing webcam in the house or additionally installed cameras, the video is cut
into high-frequency images. After the server preprocesses the images, human body
key points are detected with the DeepCut neural network model. The resulting key
point data are then input to a deep neural network, which makes the fall
judgment through softmax classification using a model trained in advance on the
distributions of human body key points under various circumstances, and the
relatives are informed through the relevant means of communication. Experimental
tests on different fall states and various other body postures show that the
proposed method can effectively detect fall events.
3 System Design
3.1 System Overview
The whole system consists of a camera for collecting image data, a cloud server
for processing the data and a smartphone for receiving messages. The three parts
exchange data over the network to realize the system functions. The overall
system framework is shown in Fig. 1.
3.2 Camera
The camera is based on the Raspberry Pi 3 Model B development board. The
mjpeg-stream auxiliary module is installed to monitor and transmit the image
data [12]. The supporting software for the Raspberry Pi is as follows:
Putty: logs in to the Raspberry Pi via its IP address.
Xrdp: provides remote desktop login to the Raspberry Pi.
mjpeg-stream: compiled with cmake and then run on the Raspberry Pi to stream
image data. Devices in the same LAN can access the port number set by the user
under the current IP, which makes it easy to share the picture data.
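On the receiving side, the streamed bytes have to be split back into individual
images before they can be analyzed. The following is a minimal sketch of that
step, assuming the camera serves a Motion-JPEG byte stream in which every frame
is a complete JPEG image. The function name is hypothetical and not part of the
described system, and the marker scan is deliberately naive (a JPEG with an
embedded thumbnail, for instance, would confuse it).

```python
from typing import List, Tuple

JPEG_START = b"\xff\xd8"  # JPEG start-of-image marker
JPEG_END = b"\xff\xd9"    # JPEG end-of-image marker


def extract_jpeg_frames(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split a byte buffer into complete JPEG frames.

    Returns the complete frames found and the unconsumed tail, which should
    be prepended to the next chunk read from the stream.
    """
    frames = []
    pos = 0
    while True:
        start = buffer.find(JPEG_START, pos)
        if start == -1:
            # No frame starts in the rest of the buffer. Keep a trailing 0xFF
            # because it may be the first half of a start marker.
            tail = buffer[-1:] if buffer.endswith(b"\xff") else b""
            return frames, tail
        end = buffer.find(JPEG_END, start + 2)
        if end == -1:
            # The frame has started but is not complete yet; keep it.
            return frames, buffer[start:]
        frames.append(buffer[start:end + 2])
        pos = end + 2
```

A caller would read chunks from the camera's port in a loop, pass the previous
tail plus the new chunk to `extract_jpeg_frames`, and forward each returned
frame to the server for key point detection.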
Fig. 1. System overall framework (cameras 01, 02, … in the elderly home or
nursing home send data to the cloud server, which notifies the elderly person's
family or care workers via SMS and social network)
3.3 Cloud Server
3.3.1 Body Detection
DeeperCut first proposes candidate regions for human body parts. Each candidate
region acts as a node, all the nodes form a densely connected graph, and the
connections between nodes serve as edge weights. Grouping the parts (nodes) that
belong to the same person into one class can then be treated as an optimization
problem [13–15]. The DeeperCut model is used to detect 14 key points of the
human body and thereby locate the person. The distribution of the key points is
shown in Fig. 2, and the actual detection results are shown in Fig. 3.
Fig. 2. Key point distribution
Fig. 3. Actual effect
3.3.2 Classification
We pre-recorded 44 videos, of which 42 were used as training data and 2 as test
data. The videos cover two cameras, three angles and three scenes, and include
falls at various angles and in various poses as well as dozens of non-fall
poses, including fall-like actions such as squatting, bending over and sitting
down. The videos were cut frame by frame into more than ten thousand pictures,
of which about 2–3 thousand show a fallen state. Running these pictures through
the DeeperCut model yielded more than ten thousand records of human body key
points, each of which contains the positions of 14 points and their confidences,
for a total of 42 values. Keras is used to train a deep neural network model on
these records. The Dense layer is a commonly used fully connected layer whose
operation is output = activation(dot(input, kernel) + bias), where activation is
an element-wise activation function, kernel is the weight matrix of the layer,
and bias is the bias vector. The first five layers use tanh as the activation
function, while the last layer uses the softmax function for classification,
which is shown in Fig. 4. The structure of the deep neural network is shown in
Fig. 5.
\[ \tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}} \tag{1} \]
\[ h_{\theta}(x) = \frac{1}{1 + \exp(-\theta^{T} x)} \tag{2} \]
Fig. 4. Softmax function image
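The classification step can be illustrated with a dependency-free sketch of the
forward pass. It flattens 14 key points into one feature vector and applies six
fully connected layers with the sizes listed in Fig. 5, using tanh for the first
five and softmax for the last. Several things here are assumptions made for
illustration only: each key point is taken to be an (x, y, confidence) triple,
the weights are random placeholders rather than trained values, the order of the
two output labels is a guess, and all function names are hypothetical. The text
gives 42 input values while Fig. 5 lists a feature dimension of 48, so the input
size is left as a parameter.

```python
import math
import random

NUM_KEYPOINTS = 14
LAYER_SIZES = [64, 32, 16, 8, 4, 2]       # Dense_1 ... Dense_6 (Fig. 5)
LABELS = ["Normal", "Someone Falls"]      # label order is an assumption


def flatten_keypoints(keypoints):
    """Turn 14 (x, y, confidence) triples into one flat list of 42 values."""
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError("expected 14 key points")
    return [value for point in keypoints for value in point]


def tanh(values):
    return [math.tanh(v) for v in values]


def softmax(values):
    shift = max(values)                   # subtract the max for stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, kernel, bias, activation):
    """output = activation(dot(input, kernel) + bias) for a single sample."""
    pre_activation = [
        sum(x * kernel[i][j] for i, x in enumerate(inputs)) + bias[j]
        for j in range(len(bias))
    ]
    return activation(pre_activation)


def init_layers(input_dim, seed=0):
    """Random placeholder weights; a real system would load trained ones."""
    rng = random.Random(seed)
    layers, fan_in = [], input_dim
    for units in LAYER_SIZES:
        kernel = [[rng.uniform(-0.1, 0.1) for _ in range(units)]
                  for _ in range(fan_in)]
        layers.append((kernel, [0.0] * units))
        fan_in = units
    return layers


def classify(features, layers):
    """Run the six layers and return (label, class probabilities)."""
    hidden = features
    for kernel, bias in layers[:-1]:
        hidden = dense(hidden, kernel, bias, tanh)
    kernel, bias = layers[-1]
    probs = dense(hidden, kernel, bias, softmax)
    return LABELS[probs.index(max(probs))], probs
```

In Keras, the same stack would correspond to six `Dense` layers with the
matching `units` and `activation` arguments. With only two output classes, the
softmax output reduces to the logistic form of Eq. (2) applied to the difference
of the two inputs.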
After about 150 training iterations, the test set accuracy stabilized at about
98%. The model obtained after 200 iterations was selected as the final model. In
the actual test phase, almost every test produced the correct result, except in
situations that even the naked eye cannot distinguish. The change in training
accuracy is shown in Fig. 6.
Layer       Feature  Dense_1  Dense_2  Dense_3  Dense_4  Dense_5  Dense_6
Dimension   48       64       32       16       8        4        2
Activation  -        tanh     tanh     tanh     tanh     tanh     softmax
Output classes: Normal, Someone Falls
Fig. 5. Deep neural network structure diagram
Fig. 6. Changes in accuracy
3.4 Smartphone
The smartphone terminal is used to receive alarm messages. The system provides
two channels: SMS and social network.
3.4.1 SMS
SMS communication is implemented through the Python API of the Twilio cloud
communication service; the effect is shown in Fig. 7.
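A minimal sketch of such an SMS alert is given below. It relies on the
`messages.create(body=..., from_=..., to=...)` call of the Twilio Python helper
library; the function names, the message wording, the camera identifier and the
phone numbers are all hypothetical and chosen only for illustration. The client
is passed in as a parameter so that the logic can be exercised without network
access or credentials.

```python
from datetime import datetime


def build_alert_text(camera_id: str, when: datetime) -> str:
    """Compose the SMS body for a detected fall (wording is illustrative)."""
    return (f"Fall detected by camera {camera_id} at "
            f"{when:%Y-%m-%d %H:%M:%S}. Please check immediately.")


def send_fall_sms(client, from_number: str, to_number: str,
                  camera_id: str, when: datetime):
    """Send the alert through a Twilio-style client.

    `client` is expected to expose messages.create(body=, from_=, to=),
    as twilio.rest.Client does.
    """
    return client.messages.create(
        body=build_alert_text(camera_id, when),
        from_=from_number,
        to=to_number,
    )


# With the real library (requires `pip install twilio` and valid credentials):
#   from twilio.rest import Client
#   client = Client(account_sid, auth_token)
#   send_fall_sms(client, "+15550000001", "+15550000002",
#                 "camera-01", datetime.now())
```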
3.4.2 WeChat
The program can send reminders and real-time images to users through the web
WeChat API when needed. Users only need to add the platform's official WeChat
account and bind the corresponding camera to receive reminders. The system
supports binding multiple cameras to multiple users; the effect is shown in
Fig. 8.
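The multi-camera, multi-user binding can be pictured as a many-to-many mapping
between cameras and users. The sketch below is one possible in-memory
representation, not the system's actual data model: the class, the function and
the identifiers are hypothetical, and the `send` callback stands in for whatever
messaging call delivers the reminder.

```python
from collections import defaultdict


class CameraBindings:
    """Many-to-many mapping between camera IDs and user IDs."""

    def __init__(self):
        self._users_by_camera = defaultdict(set)

    def bind(self, user_id, camera_id):
        self._users_by_camera[camera_id].add(user_id)

    def unbind(self, user_id, camera_id):
        self._users_by_camera[camera_id].discard(user_id)

    def recipients(self, camera_id):
        # Sorted for a deterministic order; .get avoids creating new entries.
        return sorted(self._users_by_camera.get(camera_id, ()))


def notify_fall(bindings, camera_id, send):
    """Call send(user_id, camera_id) for every bound user; return the count."""
    users = bindings.recipients(camera_id)
    for user_id in users:
        send(user_id, camera_id)
    return len(users)
```

One user can thus watch several cameras, and a fall seen by one camera reaches
every user bound to it.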
Fig. 7. SMS notification
Fig. 8. Social network notification
4 System Tests
Table 1 shows the test results for the various postures, and some test cases are
shown in Fig. 9. The results show that the presented method and the implemented
system perform well in detecting falls.
Table 1. Final test result
Posture     Test samples  Correct samples  Accuracy (%)
Upright     222           221              99.55
Fall        324           324              100.00
Squat       73            70               95.89
Sit down    256           255              99.61
Blocked     64            59               92.19
Bend over   234           225              96.15
Edge        56            51               91.07
Total       1229          1205             98.05
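The accuracy column and the totals in Table 1 can be recomputed from the sample
counts. The short sketch below does so; the counts are copied from the table,
while the variable and function names are chosen here for illustration.

```python
# (test samples, correct samples) per posture, copied from Table 1
RESULTS = {
    "Upright": (222, 221),
    "Fall": (324, 324),
    "Squat": (73, 70),
    "Sit down": (256, 255),
    "Blocked": (64, 59),
    "Bend over": (234, 225),
    "Edge": (56, 51),
}


def accuracy_percent(total, correct):
    """Accuracy in percent, rounded to two decimals as in the table."""
    return round(100.0 * correct / total, 2)


per_posture = {name: accuracy_percent(total, correct)
               for name, (total, correct) in RESULTS.items()}
total_samples = sum(total for total, _ in RESULTS.values())
total_correct = sum(correct for _, correct in RESULTS.values())
overall = accuracy_percent(total_samples, total_correct)
errors = total_samples - total_correct
```

The recomputed values match the table: 1229 samples, 1205 correct, 98.05%
overall. All 24 misclassified samples come from the non-fall categories, since
every one of the 324 fall samples was classified correctly.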
Fig. 9. Some detection examples
5 Conclusion
Because fall detection systems based on wearable devices and environmental
sensors suffer from high invasiveness, low accuracy, poor robustness and
sensitivity to the environment, this paper presents a fall detection method
based on deep learning and image processing in a cloud environment. The
high-frequency images taken by the camera are transmitted to the server, and the
server detects the key points of the human body using the DeepCut neural network
model. The detected key point data are passed into a deep neural network, which
makes the fall judgment through softmax classification using a model trained in
advance on the distributions of human body key points under various
circumstances, and the relatives are informed through the relevant means of
communication. Tests on different fall states and various body postures show the
effectiveness of the method. This method is also a specific application of deep
learning, and it provides a new approach to the realization and application of
fall detection.
Acknowledgments. This work is financially supported by the National Natural Science
Foundation of P. R. China (Nos. 61572260, 61572261, 61672296, 61602261), Natural Science
Foundation of Jiangsu Province (No. BK20140888), Jiangsu Natural Science Foundation for
Excellent Young Scholars (BK20160089), Scientific & Technological Support Project of Jiangsu
Province (Nos. BE2015702, BE2016185, BE2016777), China Postdoctoral Science Foundation
(No. 2014M561696), Jiangsu Planned Projects for Postdoctoral Research Funds
(No. 1401005B), Postgraduate Research and Practice Innovation Program of Jiangsu
Province (No. KYCX17_0798) and NUPT STITP (No. SZDG2018014).
References
1. Ye, C., Guo, X., Liang, G., et al.: Comprehensive comparison between empty nest and non-
empty nest elderly: a cross-sectional study among rural populations in Northeast China. Int.
J. Environ. Res. Public Health 13, 857 (2016)
2. Iio, T., Shiomi, M., Kamei, K., et al.: Social acceptance by senior citizens and caregivers of a
fall detection system using range sensors in a nursing home. Adv. Robot. 30, 190–205
(2016)
3. Özdemir, A.T.: An analysis on sensor locations of the human body for wearable fall
detection devices: principles and practice. Sensors 16, 1161 (2016)
4. Lee, C.K., Lee, V.Y.: Fall detection system based on kinect sensor using novel detection and
posture recognition algorithm. In: International Conference on Smart Homes and Health
Telematics, pp. 238–244. Springer, Berlin (2013)
5. Cheng, W.C., Jhan, D.M.: Triaxial accelerometer-based fall detection method using a self-
constructing cascade-AdaBoost-SVM classifier. IEEE J. Biomed. Health Inform. 17, 411–
419 (2013)
6. Feldwieser, F., Gietzelt, M., Goevercin, M., et al.: Multimodal sensor-based fall detection
within the domestic environment of elderly people. Z. Gerontol. Geriatr. 47, 661–665 (2014)
7. Liu, Z.X., Liu, Q., Yuan, Y.Z., et al.: Location scheme in wireless sensor networks based on
Bayesian estimation, virtual force and genetic algorithm. Control Decis. 28, 899–903 (2013)
8. Hu, X., Qu, X.: Pre-impact fall detection. Biomed. Eng. Online 15, 1–16 (2016)
9. Chua, J.L., Chang, Y.C., Lim, W.K.: A simple vision-based fall detection technique for
indoor video surveillance. Signal Image Video Process. 9, 623–633 (2015)
10. Yang, L., Ren, Y., Hu, H., et al.: New fast fall detection method based on spatio-temporal
context tracking of head by using depth images. Sensors 15, 23004–23019 (2015)
11. Chotitham, S., Wongwanich, S., Wiratchai, N.: Deep learning and its effects on achievement.
Proc. Soc. Behav. Sci. 116, 3313–3316 (2014)
12. Khot, S.B., Gaikwad, M.S.: Development of cloud-based light intensity monitoring system
for green house using Raspberry Pi. In: International Conference on Computing
Communication Control and Automation. IEEE (2017)
13. Insafutdinov, E., Pishchulin, L., Andres, B., et al.: DeeperCut: a deeper, stronger, and faster
multi-person pose estimation model, pp. 34–50 (2016)
14. Insafutdinov, E., Andriluka, M., Pishchulin, L., et al.: ArtTrack: Articulated Multi-Person
Tracking in the Wild (2016)
15. Pishchulin, L., Insafutdinov, E., Tang, S., et al.: DeepCut: Joint Subset Partition and
Labeling for Multi Person Pose Estimation, pp. 4929–4937 (2015)