Human Pose Estimation using Machine Learning

A Project Report

submitted in partial fulfillment of the requirements

AICTE Internship on AI: Transformative Learning

with
TechSaksham – A joint CSR initiative of Microsoft & SAP

Baji Baba Shaik, bajishaikh18@gmail.com

Under the Guidance of

Aditya Prashant Ardak

Master Trainer, Edunet Foundation

ACKNOWLEDGEMENT

Words cannot express my gratitude to my trainers, Mr. Aditya Prashant Ardak and the
Edunet Team. They were an integral part of all the help and guidance I received while
carrying out my project, Human Pose Estimation using Machine Learning. Through the
knowledge Mr. Ardak imparted in machine learning and computer vision, my own
understanding of the field grew considerably. His guidance was extremely helpful,
especially because he explained complex topics in a simple way and advised me on the
research.

I also thank Mr. Aditya Prashant Ardak and the entire Edunet Team for their
uninterrupted support throughout the tenure of the project. It would not have been
possible to complete the work without their guidance and help, and Mr. Ardak's
direction kept me on track at every stage.

This project has been an enriching experience, and without the support of Mr. Ardak and
the Edunet Team, I do not think I would have made it this far. Their commitment to
teaching and developing their students made a real difference to my learning experience.
I look forward to applying the skills and techniques I learned under their tutelage.

Finally, I would like to express my gratitude to the trainers and the Edunet Team for
the continuous motivation they have provided. Their professionalism and devotion to the
achievement of their students have made an indelible impression on me. Their time and
attention made a major contribution to the success of this project, for which I am
deeply and sincerely appreciative.

ABSTRACT
Human pose estimation, often referred to simply as pose estimation, is the process of
locating and tracking human poses in images or videos, and has applications in fields such
as computer vision, robotics, and sports analysis. The main aim of this project was to
train an efficient machine learning model that can accurately estimate human pose in real
time from visual input. The complexity of human movement, the variety of body shapes, and
changing environmental conditions make pose estimation an interesting and difficult
problem. To address it, this project uses convolutional neural networks and other deep
learning models, which have proven effective at extracting spatial features from images.
The model was trained on a large dataset of annotated human pose landmarks. A key element
was using pre-trained models such as OpenPose and PoseNet and fine-tuning them for
specific pose-estimation tasks. The model was evaluated for accuracy, robustness, and
speed using both qualitative and quantitative metrics. The results showed a significant
improvement in pose accuracy, particularly for key body joints such as the elbows, knees,
and wrists, even under difficult occlusions and varying poses. The model also runs at
near real-time speed, which makes it suitable for live applications. This project showed
that machine learning techniques are a promising way to solve the problem of human pose
estimation. The potential impact of this work includes applications in gesture
recognition, augmented reality, and human-computer interaction in real-time systems.
Future work will continue to research and optimize the model to broaden its applications
in this field.

TABLE OF CONTENTS

Abstract ............................................................................................................... I

Chapter 1. Introduction .........................................................................................1

1.1 Problem Statement ...............................................................................1
1.2 Motivation .............................................................................................2
1.3 Objectives ..............................................................................................3
1.4 Scope of the Project ..............................................................................4
Chapter 2. Literature Survey ................................................................................5
2.1 Relevant Literature ..............................................................................5
2.2 Existing Models, Techniques or Methodologies ...................................5
2.3 Gaps and Limitations ............................................................................8
Chapter 3. Proposed Methodology .....................................................................11
3.1 System Design .....................................................................................11
3.2 Requirement Specification ..................................................................12
Chapter 4. Implementation and Results ............................................................19
4.1 Snap Shots of the Result .....................................................................19
4.2 GitHub Link for Code .........................................................................21
Chapter 5. Discussion and Conclusion ..............................................................22
5.1 Future Work & Model Improvements .................................................22
5.2 Summary of Overall Impact and Contribution....................................23
References ..................................................................................................................25

LIST OF FIGURES

Figure No.   Figure Caption                              Page No.

Figure 1     A Man Standing                              11
Figure 2     A Woman Running                             19
Figure 3     Kid Playing Football                        20
Figure 4     A Man Running                               20
Figure 5     A Woman Standing and Smiling at Camera      21

CHAPTER 1
Introduction
1.1 Problem Statement:
Human Pose Estimation (HPE) is the process of identifying and localizing the positions
of limbs, joints, and other prominent landmarks of the human body in an image or a
video. The technology is key to domains such as healthcare, sports analytics,
entertainment, human-computer interaction, and security. The central challenge is to
develop a robust system that can determine the pose of any person in any real-world
scenario, where results are strongly affected by environmental conditions and by
people's movements.

The problem this project addresses is estimating diverse and dynamic human poses
reliably and efficiently. Human poses are complex and dynamic because of differences in
body shape, posture, and movement. These variations often create occlusions, where
parts of the body are hidden from the camera, and ambiguous poses, where two or more
different poses look very similar under visual inspection. Background clutter, lighting
conditions, and the requirement for real-time processing add to the complexity. For
example, a person can be partly or totally occluded by objects, or can be posed at an
unusual angle with respect to the camera. Such scenarios make it difficult for
traditional computer vision techniques to obtain the pose precisely without more
advanced models.

Human pose estimation is important due to its wide range of applicability across
several industries. In healthcare, accurate pose estimation can be helpful in physical
therapy, monitoring the movements of patients, and helping them in rehabilitation.
In sports analytics, it can be used to track and analyze athletes' movements to
optimize performance and reduce risks of injury. Moreover, in entertainment and
gaming, pose estimation can contribute to creating more immersive and interactive
experiences by enabling gesture recognition and motion capture for virtual
characters. In security and surveillance, it can be applied to detecting unusual
behavior, identifying individuals, and monitoring activities in crowds.

The problem is even more significant in human-computer interaction (HCI), where human
gestures and body language play crucial roles in interacting intuitively with devices.
For more advanced interfaces, such as augmented reality (AR) and virtual reality (VR),
human pose estimation is fundamental to an effective user experience. Additional
applications of precise human pose estimation include autonomous systems, such as
robots or self-driving vehicles, that need to perceive people and navigate human
environments safely.

This project has focused on developing an advanced human pose estimation model
using state-of-the-art machine learning techniques for optimizing accuracy and
efficiency under real-world conditions. The project aims to improve the methods in
handling variability in human pose, addressing occlusion, and dealing with
environmental conditions, providing a solution with the potential to transform
industries and their applications.

1.2 Motivation:
The motivation behind this project is the increasing importance of human pose
estimation in various fields, which is driven by the advancement of computer vision
and artificial intelligence. Human pose estimation allows machines to understand and
interpret human body movements, making it a very important technology for
applications in healthcare, sports, entertainment, and human-computer interaction.

In healthcare, pose estimation helps monitor patients during rehabilitation to ensure
that exercises are done correctly and support recovery. In sports, detailed performance
analysis helps athletes optimize movements and reduce the risk of injuries. Similarly,
in entertainment, HPE enhances VR and gaming experiences by translating real-time human
movements into digital environments, making interactions more immersive.

Pose estimation also plays a significant role in human-computer interaction by enabling
gesture-based controls and intuitive touchless interfaces. In addition, advances in AI
and deep learning provide new approaches to the accuracy, occlusion, and real-time
processing challenges addressed in this project, which makes the work timely and
impactful. This project aims to contribute an efficient and precise pose estimation
system that can help industries such as healthcare and entertainment deliver more
engaging and accessible human experiences.

1.3 Objective:
This project aims to build a robust and precise Human Pose Estimation (HPE) model
that applies machine learning algorithms to detect and analyze human body movements
in real time from visual data, such as images or videos. The specific objectives of
the project are outlined below:

Development of a Machine Learning Model: The first objective is to develop a
machine learning model, utilizing deep learning techniques, such as convolutional
neural networks (CNNs), to perform human pose estimation. This model should be
capable of identifying key body landmarks (e.g., joints, limbs) and predicting
human poses accurately across a variety of conditions.

Dataset Preparation and Model Training: To train the model effectively, a large,
diverse dataset containing annotated human pose landmarks is used. The dataset
helps the model learn and generalize across different poses, occlusions, and
environmental conditions.

Real-time Performance Optimization: The project aims to achieve real-time pose
estimation, where visual data is processed without introducing significant latency.
This matters for applications in healthcare, gaming, and human-computer interaction,
which require very low latency.

Model Evaluation: The model will be evaluated with performance metrics covering
accuracy, robustness, and speed. Special focus will be placed on improving the model's
capability to handle challenges like occlusions, varying body shapes, and different
viewing angles.

Demonstration of Application: Lastly, the project intends to show how the model
applies in practice to real-world use in healthcare, sports analytics, and virtual
reality, illustrating its potential to transform these industries.
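
The evaluation objective above names accuracy as a metric without fixing a formula. One
common choice in the pose estimation literature is the Percentage of Correct Keypoints
(PCK). The sketch below is a minimal illustration under that assumption; the function
name pck, the alpha value, and the sample coordinates are hypothetical and are not taken
from the project code.

```python
import math

def pck(predicted, ground_truth, reference_length, alpha=0.5):
    """Fraction of keypoints predicted within alpha * reference_length of the truth.

    predicted, ground_truth: lists of (x, y) pairs in pixels, in the same joint order.
    A ground-truth entry of None means the joint was not annotated and is skipped.
    reference_length: normalizing length in pixels (e.g. head or torso size).
    """
    threshold = alpha * reference_length
    correct = 0
    total = 0
    for pred, truth in zip(predicted, ground_truth):
        if truth is None:
            continue
        total += 1
        distance = math.hypot(pred[0] - truth[0], pred[1] - truth[1])
        if distance <= threshold:
            correct += 1
    return correct / total if total else 0.0

# Hypothetical example: three joints, reference length of 20 pixels.
predicted = [(10, 10), (52, 50), (90, 95)]
ground_truth = [(10, 12), (50, 50), (70, 95)]
score = pck(predicted, ground_truth, reference_length=20)
print(f"PCK@0.5 = {score:.3f}")
```

In this example two of the three joints fall within the 10-pixel threshold, so the score
is two thirds. A stricter alpha gives a more demanding accuracy measure.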

1.4 Scope of the Project:

Scope: This project develops a Human Pose Estimation model using machine learning to
detect and track human body poses from visual data, including images and videos. The
primary scope includes the following:

Human Pose Estimation: Identify and track important body landmarks like joints (for
example, elbows, knees, wrists) and limbs in both static and dynamic environments.

Machine Learning Integration: The work uses deep learning techniques, such as CNNs, to
improve the accuracy of pose prediction. It further fine-tunes the pre-trained models
OpenPose and PoseNet to improve performance.

Real-Time Processing: One of the primary goals is for the model to perform pose estimation
in real time, which is useful for applications in healthcare, sports, gaming, and
human-computer interaction.

Evaluation and Optimization: The project will evaluate the performance of the model in
terms of accuracy, speed, and robustness, especially under changing conditions such as
occlusions or different body angles.

Limitations: Despite its objectives, the project faces several limitations:

Dataset Limitations: The quality and diversity of the dataset used to train the model limit its
performance. Limited datasets with insufficient representations of various body types,
movements, and environmental conditions can impair the generalization of the model.

Occlusion and Viewpoint Variations: The model may fail in situations involving
occlusions, where parts of the body are not visible, and under extreme variations of
body posture and viewpoint, which cause a decrease in accuracy.

Computational Resources: Real-time pose estimation takes up many computational
resources and could become a performance limiter in resource-constrained environments or
devices.

CHAPTER 2
Literature Survey

2.1 Relevant Literature

2.1.1 Human Pose Estimation Using Deep Learning: A Systematic Literature Review [1]

Samkari, E., Arif, M., Alghamdi, M., & Al Ghamdi, M. A. (2023). Human pose
estimation using deep learning: a systematic literature review. Machine Learning and
Knowledge Extraction, 5(4), 1612-1659.

2.1.2 Deep Learning-based Human Pose Estimation: A Survey [2]

Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., ... & Shah, M. (2023). Deep
learning-based human pose estimation: A survey. ACM Computing Surveys, 56(1), 1-
37.

2.1.3 Human Activity Recognition Using Pose Estimation and Machine Learning Algorithm [3]

Gupta, A., Gupta, K., Gupta, K., & Gupta, K. (2021). Human Activity Recognition Using
Pose Estimation and Machine Learning Algorithm. In ISIC (Vol. 21, pp. 25-27).

2.2 Existing Models, Techniques, or Methodologies Related to the Problem

2.2.1 OpenPose (Cao et al., 2017)
OpenPose is one of the most widely known models for human pose estimation. It uses a
multi-stage convolutional neural network (CNN) architecture to detect human body
keypoints (e.g., joints) in both single-person and multi-person settings. The network has
two branches: one predicts part confidence maps, which mark likely joint locations, and
the other predicts part affinity fields (PAFs), which encode how joints connect into
limbs. Successive stages refine both outputs to produce the final joint locations.
OpenPose was a breakthrough in real-time multi-person pose estimation, enabling efficient
and accurate detection of keypoints even in challenging scenarios such as occlusions or
varying poses. It can also detect facial landmarks and hand poses, making it
a comprehensive framework.
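
To illustrate how a part confidence map turns into a joint location, the following is a
minimal sketch that takes the highest-scoring pixel of each map. It assumes a
single-person image and a NumPy array of shape (num_keypoints, height, width); the
function name heatmap_peaks and the threshold value are hypothetical, and a real
multi-person pipeline would instead look for several local maxima per map.

```python
import numpy as np

def heatmap_peaks(heatmaps, threshold=0.1):
    """Return one (x, y, score) tuple per keypoint map, or None if the peak is weak."""
    keypoints = []
    for channel in heatmaps:
        # Index of the maximum in the flattened map, converted back to (row, col).
        row, col = np.unravel_index(np.argmax(channel), channel.shape)
        score = float(channel[row, col])
        if score >= threshold:
            keypoints.append((int(col), int(row), score))
        else:
            keypoints.append(None)  # joint treated as not detected
    return keypoints

# Hypothetical example: two 5x6 confidence maps.
maps = np.zeros((2, 5, 6))
maps[0, 3, 4] = 0.9   # strong response at x=4, y=3
maps[1, 1, 2] = 0.05  # weak response, below the threshold
print(heatmap_peaks(maps))
```

The first joint is reported at x=4, y=3, and the second is dropped because its peak is
below the threshold.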

2.2.2 PoseNet (Google Research, 2017)

PoseNet is a lightweight model designed for real-time human pose estimation. It uses a
single neural network for both pose detection and localization. Unlike OpenPose,
PoseNet operates in a more streamlined fashion and is optimized for mobile devices,
which makes it suitable for applications requiring low-latency performance, such as
augmented reality (AR) and robotics. PoseNet can perform pose estimation on both
images and video streams, providing a trade-off between accuracy and processing
speed. While it might not achieve the same level of accuracy as OpenPose, its
efficiency in real-time applications is a major advantage.

2.2.3 Convolutional Pose Machines (CPM) (Wei et al., 2016)

Convolutional Pose Machines is an architecture designed to detect human pose in a
progressive manner. Unlike traditional methods that directly predict keypoint locations,
CPM refines predictions across multiple stages. In each stage, the model improves the
pose estimation by considering contextual information from previous stages. This
approach enhances the accuracy of pose predictions, especially in cases where there are
occlusions or complex poses. CPM has proven effective in handling single-person pose
estimation and is often used in research applications.

2.2.4 HRNet (Sun et al., 2019)

HRNet (High-Resolution Network) maintains high-resolution representations throughout
the network rather than recovering them from low-resolution features. It is known for
high accuracy in human pose estimation, especially in scenarios with occlusions or small
body parts. HRNet runs parallel branches at different resolutions and repeatedly
exchanges features among them to enhance the final pose estimate. This approach set new
benchmarks for pose estimation, demonstrating superior performance in human pose
detection tasks. HRNet is particularly effective at capturing fine-grained details and
has been recognized as one of the state-of-the-art methods.

2.2.5 AlphaPose (Fang et al., 2017)

AlphaPose is another notable model for multi-person human pose estimation. It is
based on a two-stage architecture: the first stage detects human bounding boxes, and
the second stage identifies keypoints within those boxes. AlphaPose achieves high
accuracy in detecting poses in crowded scenes, making it one of the best models for
applications in surveillance or public spaces. AlphaPose is known for its robustness
and ability to track multiple people simultaneously, handling occlusions better than
many other models.
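
In a two-stage (top-down) pipeline of this kind, keypoints are predicted inside each
resized person crop and then mapped back to full-image coordinates. The sketch below
shows only that mapping step; it is not AlphaPose code, and the function name
crop_to_image_coords, the box format, and the sample values are assumptions made for
illustration.

```python
def crop_to_image_coords(keypoints, box, crop_size):
    """Map keypoints from a resized person crop back to full-image pixels.

    keypoints: list of (x, y) positions in crop pixels.
    box: (x_min, y_min, width, height) of the detected person in the image.
    crop_size: (crop_width, crop_height) the box was resized to before inference.
    """
    x_min, y_min, box_w, box_h = box
    crop_w, crop_h = crop_size
    scale_x = box_w / crop_w
    scale_y = box_h / crop_h
    return [(x_min + x * scale_x, y_min + y * scale_y) for x, y in keypoints]

# Hypothetical example: a 96x256 person box resized to a 192x256 crop.
mapped = crop_to_image_coords(
    [(0, 0), (96, 128)], box=(100, 50, 96, 256), crop_size=(192, 256)
)
print(mapped)
```

Because each person is processed in an own crop, errors in the first-stage bounding box
carry over directly to the keypoints, which is one reason box quality matters in
crowded scenes.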

2.2.6 Integral Pose Regression (Sun et al., 2018)

Integral Pose Regression connects heatmap-based and regression-based pose estimation.
Instead of taking the location of the maximum of each heatmap, which is non-differentiable
and limited to the heatmap grid, it computes each joint coordinate as the expected
location under the normalized heatmap (an "integral", or soft-argmax, operation). This
makes the whole pipeline trainable end to end with a loss on the joint coordinates. While
it may not offer the same staged refinement as multi-stage methods (like CPM or
OpenPose), it can tolerate lower-resolution heatmaps, which reduces computation and makes
it suitable for real-time applications with lower computational requirements.
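
The integral operation can be sketched in a few lines of NumPy as the expected pixel
coordinate under a softmax-normalized heatmap. This is an illustrative sketch of the
general soft-argmax idea, not the implementation from the cited paper; the function name
soft_argmax_2d and the sample heatmap are hypothetical.

```python
import numpy as np

def soft_argmax_2d(heatmap):
    """Expected (x, y) location under a softmax-normalized 2D heatmap."""
    height, width = heatmap.shape
    # Softmax over all pixels; subtracting the max keeps exp() numerically stable.
    probs = np.exp(heatmap - heatmap.max())
    probs /= probs.sum()
    xs = np.arange(width)
    ys = np.arange(height)
    # Marginalize over rows/columns, then take the expectation of each coordinate.
    x = float((probs.sum(axis=0) * xs).sum())
    y = float((probs.sum(axis=1) * ys).sum())
    return x, y

# Hypothetical example: two equally strong responses at x=1 and x=3 on row 2.
heatmap = np.full((5, 7), -50.0)
heatmap[2, 1] = 50.0
heatmap[2, 3] = 50.0
x, y = soft_argmax_2d(heatmap)
print(round(x, 3), round(y, 3))
```

Here the estimate lands between the two peaks, at roughly x=2, y=2, which a hard argmax
could not express. The same property gives sub-pixel joint locations on coarse heatmaps.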

2.2.7 Mask R-CNN (He et al., 2017)

Although primarily known for object detection, Mask R-CNN has been extended to
human pose estimation through modifications that enable the detection of keypoints.
Mask R-CNN combines region-based convolutional neural networks (R-CNN) with a
segmentation mask and keypoint detection to output both object segmentation and
human pose estimation. It is effective in detecting keypoints in both static and dynamic
environments, and its ability to handle complex scenes and occlusions makes it
versatile for pose estimation in various real-world applications.

2.3 Gaps or Limitations in Existing Solutions and How This Project
Addresses Them
Despite the substantial progress made in Human Pose Estimation by models like OpenPose,
PoseNet, and HRNet, several gaps and limitations still prevent these solutions from being
widely deployed in real applications and from reaching better performance. The key
limitations of current solutions, along with how this project addresses them, are as
follows:

2.3.1 Handling of Occlusions and Overlapping Poses

Existing Limitation: One of the greatest challenges in human pose estimation is
identifying accurate keypoints for all parts of the body when some parts are occluded
(covered by objects or other body parts) or when multiple people appear in a frame.
Current state-of-the-art models such as OpenPose and AlphaPose manage partial occlusion
reasonably well but perform poorly on heavily occluded people and overlapping poses in
crowded environments.

Project Contribution: This project focuses on enhancing the robustness of pose
estimation models through improved techniques for handling occlusions and overlapping
poses. By using more advanced model architectures, such as HRNet, that preserve
high-resolution features, this project attempts to minimize keypoint detection errors
when parts of the body are obscured.

2.3.2 Real-Time Performance with High Accuracy

Existing Limitation: While models like PoseNet offer real-time pose estimation, they
sacrifice some level of accuracy, especially in challenging conditions such as extreme body
angles, varying poses, or different lighting conditions. OpenPose, although more accurate,
requires significant computational resources and is not ideal for real-time applications on
resource-constrained devices.

Project Contribution: This project aims to balance accuracy and real-time performance.
Model optimization will be used to speed up processing with minimal loss of accuracy,
mainly through model pruning, which reduces computation at inference time, and transfer
learning, which reuses pre-trained features to cut training cost. Together these improve
efficiency, especially for real-time or mobile applications.

2.3.3 Generalization to Diverse Human Poses and Body Types

Existing Limitation: Many existing models struggle to generalize well across diverse
human body types, ages, and poses. For instance, models trained on limited datasets might
not perform well with unusual poses or on datasets that contain people of various body
shapes, ethnicities, or in non-ideal conditions (e.g., poor lighting, low-quality video).

Project Contribution: This project will focus on generalizing the model by using
diversified and extensive datasets for training. In addition, data augmentation techniques
will be applied to increase variability and robustness in model performance, making sure
that the system can handle different body types, movements, and challenging
environmental conditions.
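
As one example of the data augmentation mentioned above, horizontal flipping is widely
used for pose data. The keypoint coordinates must be mirrored and the left/right joint
labels swapped, otherwise a flipped "left wrist" would be labeled on the wrong side. The
sketch below assumes a small hypothetical keypoint order; it does not reflect the actual
dataset layout used in the project.

```python
# Hypothetical keypoint order, used only for this illustration.
KEYPOINT_NAMES = ["nose", "left_wrist", "right_wrist", "left_knee", "right_knee"]
FLIP_PAIRS = [("left_wrist", "right_wrist"), ("left_knee", "right_knee")]

def flip_keypoints(keypoints, image_width):
    """Mirror (x, y) keypoints horizontally and swap left/right joint labels."""
    # Mirror the x coordinate around the vertical center line of the image.
    flipped = [(image_width - 1 - x, y) for x, y in keypoints]
    # After mirroring, the subject's left side appears where the right side was.
    for left, right in FLIP_PAIRS:
        i = KEYPOINT_NAMES.index(left)
        j = KEYPOINT_NAMES.index(right)
        flipped[i], flipped[j] = flipped[j], flipped[i]
    return flipped

pose = [(50, 10), (30, 40), (70, 42), (40, 80), (60, 82)]
flipped = flip_keypoints(pose, image_width=100)
print(flipped)
```

The image itself would be flipped with the same convention (for example with
`numpy.fliplr`), so that pixels and labels stay aligned.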

2.3.4 Scalability in Multi-Person Scenarios

Existing Limitation: Multi-person pose estimation, especially in crowded scenes, is still a
challenging task. Models like OpenPose and AlphaPose can handle multiple people, but
performance degrades with increasing numbers of individuals, especially when people are
tightly packed or in close proximity. This limitation affects the application of pose
estimation in areas like surveillance, sports team analysis, and crowded public spaces.

Project Contribution: This project contributes to the scalability of multi-person pose
estimation. By using part affinity fields (PAFs) and optimizing the network to detect the
poses of multiple individuals at once, the system should be more effective in
handling dense crowds and should provide better tracking for multiple people.
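
To show how PAFs help assign joints to the right person, the sketch below scores a
candidate limb by sampling the field along the segment between two joint candidates and
averaging its alignment with the segment direction, following the general idea described
for OpenPose (Cao et al., 2017). The nearest-pixel sampling, the function name paf_score,
and the synthetic field are simplifying assumptions for illustration.

```python
import numpy as np

def paf_score(paf, joint_a, joint_b, num_samples=10):
    """Average alignment between a PAF and the segment from joint_a to joint_b.

    paf: array of shape (height, width, 2) holding an (x, y) vector per pixel.
    joint_a, joint_b: (x, y) candidate joint positions in pixels.
    """
    a = np.asarray(joint_a, dtype=float)
    b = np.asarray(joint_b, dtype=float)
    length = np.linalg.norm(b - a)
    if length == 0:
        return 0.0
    direction = (b - a) / length  # unit vector along the candidate limb
    total = 0.0
    for t in np.linspace(0.0, 1.0, num_samples):
        # Nearest pixel to the sample point on the segment.
        x, y = np.rint(a + t * (b - a)).astype(int)
        total += float(paf[y, x] @ direction)
    return total / num_samples

# Synthetic field in which every vector points along +x.
field = np.zeros((10, 10, 2))
field[..., 0] = 1.0
print(paf_score(field, (1, 5), (8, 5)))  # horizontal candidate, aligned with field
print(paf_score(field, (5, 1), (5, 8)))  # vertical candidate, orthogonal to field
```

A candidate pair whose connecting segment follows the field scores close to 1, while a
pair that cuts across it scores close to 0, so pairs belonging to different people tend
to be rejected.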

2.3.5 Latency and Processing Speed

Existing Limitation: Low latency is a requirement in applications like AR or VR for a
seamless user experience. However, current pose estimation models that rely on
high-resolution networks or multiple stages of refinement (such as OpenPose and HRNet)
tend to suffer from high processing latency.

Project Contribution: This project centers on reducing latency by optimizing the
architecture for faster inference without losing too much accuracy. Techniques such as
model quantization, knowledge distillation, and backend optimization are possible ways to
make pose estimation feasible in real-time applications for AR/VR and
robotics.
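
As an illustration of model quantization, the sketch below applies symmetric per-tensor
int8 quantization to a small weight array in NumPy. It is a minimal sketch of the
general idea and not the project's implementation; the function names quantize_int8 and
dequantize and the sample weights are assumptions. Deep learning frameworks provide
their own quantization tools, which would be used in practice.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8."""
    max_abs = float(np.abs(weights).max())
    # One scale for the whole tensor, chosen so the largest weight maps to 127.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from int8 values."""
    return quantized.astype(np.float32) * scale

weights = np.array([-1.0, -0.5, 0.0, 0.25, 1.0], dtype=np.float32)
quantized, scale = quantize_int8(weights)
error = float(np.abs(dequantize(quantized, scale) - weights).max())
print(quantized, scale, error)
```

Each weight is stored in one byte instead of four, and the rounding error per weight is
at most half a quantization step (scale / 2), which is the accuracy cost traded for
smaller models and faster integer arithmetic.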

2.3.6 Training and Model Deployment Complexity

Existing Limitation: Most of the existing pose estimation models require large
computational resources for training and inference. This makes them challenging to deploy
in low-resource environments such as mobile devices or edge computing platforms.

Project Contribution: The project addresses this limitation by implementing
lightweight, efficient versions of the model that can be deployed on mobile or edge
devices. This includes simplifying the network architecture and using transfer learning
from pre-trained models so that the system can work on devices with lower
computational power.

CHAPTER 3
Proposed Methodology

3.1 System Design

Image Description and Pose Estimation Overview

The image shows a man standing upright in a neutral pose, probably taken in a
controlled environment. The human figure is well defined, and body parts such as the
head, shoulders, elbows, wrists, hips, knees, and ankles form the keypoints that are
essential for pose estimation. The algorithm has located each of these landmarks and
outlined the skeletal structure of the body using visual markers: a dot at each joint
position and lines connecting the joints.

For pose estimation this is an ideal case: the body is fully visible with no
occlusions, so the algorithm can reliably predict the position of all major joints and
limbs. The pose estimation system has probably utilized deep learning approaches, such
as CNNs, to identify the posture of the person and then translate it into a digital
skeleton representation. The accuracy of the system can be seen in the placement of
joints and limbs: each keypoint, such as the nose, elbows, knees, and wrists, is
correctly placed and connected to form the complete skeleton.

The pose detected by the system reflects alignment and symmetry in the body. Since the
subject is standing, the pose can be considered neutral: the limbs are relaxed and the
weight of the body is evenly distributed. Such a pose is useful in a wide variety of
applications, including physical therapy for posture analysis, surveillance, and sports
performance analysis for body alignment. The model has tracked the subject's pose with a
high degree of accuracy, which demonstrates the effectiveness of a machine learning
system in identifying human body keypoints, with accurate bounding boxes, for such a
static pose.
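
For posture and alignment analysis of the kind mentioned above, a detected skeleton is
often reduced to joint angles. The following is a minimal sketch that computes the angle
at one joint from three keypoints; the function name joint_angle and the sample
coordinates are hypothetical and are not part of the project code.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at joint b formed by the points a-b-c, each an (x, y) pair."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("keypoints must be distinct")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))

# Hypothetical keypoints for a straight, standing leg (hip above knee above ankle).
hip, knee, ankle = (100, 200), (100, 300), (100, 400)
straight = joint_angle(hip, knee, ankle)
print(straight)
```

A knee angle near 180 degrees corresponds to the straight leg of a neutral standing
pose, while smaller values indicate a bent knee.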

3.2 Requirement Specification

3.2.1 Hardware Requirements:

1. CPU (Processor)
• Recommended: Intel Core i5 or i7 (or equivalent AMD Ryzen)
  o For image processing and running machine learning models (especially when
    using frameworks like OpenCV), a multi-core processor helps to efficiently
    handle the parallel processing of images.
  o Pose estimation models often involve heavy computation, so a multi-core
    processor will speed up data manipulation and model inference.
• Minimum: Intel Core i3 or equivalent AMD processor
  o This can work for lighter tasks or less complex models, but performance may
    degrade with larger models or datasets.

2. GPU (Graphics Processing Unit)
• Recommended: NVIDIA GPU with at least 6GB VRAM (e.g., NVIDIA GTX 1060,
  1660, RTX 2060, or better)
  o For deep learning tasks like pose estimation using models like OpenPose,
    HRNet, or PoseNet, having a dedicated GPU is crucial for speeding up the
    model's training and inference times.
  o A CUDA-enabled GPU is necessary to utilize GPU acceleration,
    significantly improving performance when using deep learning libraries like
    TensorFlow, PyTorch, or OpenCV.
• Minimum: NVIDIA GTX 1050 Ti, 4GB VRAM (or equivalent)
  o If you're working with smaller models or a pre-trained model (without
    fine-tuning), this GPU should still allow you to run pose estimation with
    decent performance. However, for large datasets or real-time processing, this
    might be slower.

3. RAM (Memory)
• Recommended: 16 GB or more
  o Pose estimation algorithms, especially when dealing with high-resolution
    images or video data, require a good amount of RAM to load and process
    data efficiently. For deep learning tasks (model inference or training), more
    memory ensures smooth processing and faster performance.
• Minimum: 8 GB
  o While 8 GB RAM can work for basic image processing tasks, you might
    experience slower performance or memory-related issues when working with
    more complex models, large datasets, or real-time applications.

4. Storage (Hard Drive)
• Recommended: SSD (Solid State Drive) with at least 256 GB (preferably 512 GB
  or more)
  o An SSD will improve the speed of data loading and model inference
    significantly compared to traditional hard drives. SSDs allow faster access to
    your images, datasets, and models.
  o If you plan to store large video datasets or process real-time streams, a larger
    SSD would be beneficial.
• Minimum: HDD with at least 1 TB (or SSD with 120 GB)
  o A traditional HDD might suffice for small datasets or offline processing, but
    it will be much slower in data access and may negatively impact overall
    performance. If you're working on large datasets, an SSD is highly
    recommended.

5. Operating System
• Recommended: Linux (Ubuntu or other distributions) or Windows 10 (64-bit)
  o Linux is often preferred for machine learning tasks due to better compatibility
    with various libraries, packages, and faster overall performance. It also
    provides better support for GPU acceleration through CUDA.
  o Windows 10 is also fine for pose estimation and works well with frameworks
    like TensorFlow and OpenCV, but Linux can sometimes offer better
    performance and ease of use for deep learning models.
• Minimum: Windows 10 (64-bit) or macOS
  o These operating systems are suitable for development and can support most
    machine learning tools, though Linux is generally preferred for training
    models and handling large-scale data.

6. Other Peripheral Devices
• Webcam (Optional for real-time applications): If you plan to perform pose
  estimation in real-time (e.g., for webcam-based human pose tracking), you’ll need a
  good quality webcam. A 720p or 1080p webcam is sufficient for real-time pose
  estimation.
• External Storage (Optional): If you are working with large datasets, you may want
  external storage (such as an external HDD or SSD) to store raw image/video data,
  model checkpoints, or results.

Summary of Average Hardware Requirements:

Component | Recommended                                   | Minimum
CPU       | Intel i5/i7 or AMD Ryzen 5/7                  | Intel Core i3 or AMD equivalent
GPU       | NVIDIA GTX 1060, 1660, or RTX 2060 (6GB VRAM) | NVIDIA GTX 1050 Ti (4GB VRAM)
RAM       | 16 GB or more                                 | 8 GB
Storage   | 256 GB SSD or more                            | 1 TB HDD or 120 GB SSD
OS        | Linux (Ubuntu) or Windows 10 (64-bit)         | Windows 10 (64-bit) or macOS
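
The summary table can also be read as a simple decision rule. The following is a minimal sketch, not part of the project code: the `MachineSpec` class and `classify` function are hypothetical names, the CPU row is left out because model names do not compare numerically, and the thresholds are copied from the table above.

```python
from dataclasses import dataclass


@dataclass
class MachineSpec:
    """Hypothetical description of a development machine."""
    ram_gb: float
    gpu_vram_gb: float      # use 0 when there is no dedicated GPU
    storage_gb: float
    storage_is_ssd: bool


def classify(spec: MachineSpec) -> str:
    """Return 'recommended', 'minimum', or 'below minimum' per the table."""
    meets_recommended = (
        spec.ram_gb >= 16
        and spec.gpu_vram_gb >= 6
        and spec.storage_is_ssd
        and spec.storage_gb >= 256
    )
    if meets_recommended:
        return "recommended"

    # Minimum storage is either a 1 TB HDD or a 120 GB SSD.
    storage_ok = spec.storage_gb >= (120 if spec.storage_is_ssd else 1000)
    meets_minimum = (
        spec.ram_gb >= 8 and spec.gpu_vram_gb >= 4 and storage_ok
    )
    return "minimum" if meets_minimum else "below minimum"


if __name__ == "__main__":
    laptop = MachineSpec(ram_gb=8, gpu_vram_gb=4, storage_gb=1000,
                         storage_is_ssd=False)
    print(classify(laptop))
```

A machine that falls below the minimum tier can still run small offline experiments, so the result is better treated as a hint than as a hard gate.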

3.2.2 Software Requirements:

1. Python (Programming Language)

 Version: 3.7 or later (3.8, 3.9, or 3.10 are also fine, but make sure to check
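compatibility with the pinned library versions listed below, since older pins
such as numpy==1.18.5 may not install cleanly on the newest Python releases).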
 Description: Python is the primary programming language for machine learning and
computer vision tasks. It's widely used in the development of pose estimation models
due to its simplicity and the extensive support provided by libraries like OpenCV,
TensorFlow, PyTorch, NumPy, etc.

2. Python Libraries
The following Python libraries are essential for the project:
1. opencv_python_headless==4.5.1.48

o OpenCV is used for computer vision tasks such as image reading,
manipulation, and video processing. The headless version is ideal for
environments where no graphical interface is needed.
2. streamlit==0.76.0
o Streamlit enables you to create interactive web applications for data science
projects with minimal effort. It’s useful for visualizing results like the pose
estimation output.
3. numpy==1.18.5
o NumPy is used for numerical computations, particularly for array
manipulations. It’s crucial for handling image data and performing the
necessary mathematical operations for pose estimation.
4. matplotlib==3.3.2
o Matplotlib is a plotting library that allows you to visualize images, graphs,
and results of your pose estimation. It’s essential for displaying the pose
estimation results in a comprehensible way.
5. Pillow==8.1.2
o Pillow is a library for image processing, enabling you to read, edit, and save
images in various formats. It’s useful for image loading and preprocessing
before running pose estimation algorithms.
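
Because the versions above are pinned exactly, it can help to confirm that the active environment matches them before running the application. The sketch below is illustrative only: `parse_pins` and `check_pins` are hypothetical helper names, and the pin list is copied from the list above using the hyphenated package name that pip accepts.

```python
from importlib import metadata

# Pins copied from the library list above.
PINNED = """
opencv-python-headless==4.5.1.48
streamlit==0.76.0
numpy==1.18.5
matplotlib==3.3.2
Pillow==8.1.2
"""


def parse_pins(text: str) -> dict:
    """Parse 'name==version' lines into a {name: version} dictionary."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins: dict) -> dict:
    """Map each package to 'ok', 'missing', or a mismatch description."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if found == wanted:
            report[name] = "ok"
        else:
            report[name] = f"installed {found}, pinned {wanted}"
    return report


if __name__ == "__main__":
    for package, status in check_pins(parse_pins(PINNED)).items():
        print(f"{package}: {status}")
```

A mismatch is not necessarily an error, but it is a useful first thing to check when an import fails.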

3. Deep Learning Frameworks (Optional, Depending on Model Choice)

If you're using a deep learning-based model like OpenPose, HRNet, or others,
you may need deep learning frameworks for building and training the models.
 TensorFlow (Recommended: version 2.x)
o A comprehensive open-source framework for machine learning and deep
learning tasks. TensorFlow is commonly used for training and deploying
machine learning models, including human pose estimation.
 PyTorch
o Another popular framework for machine learning that’s widely used for deep
learning research and production systems. PyTorch is preferred by many
researchers due to its dynamic computation graph and ease of use.
 Keras (if you're using TensorFlow)

o Keras is a high-level neural networks API, written in Python, running on top
of TensorFlow. It simplifies the process of building and training deep
learning models.
 ONNX (Optional)
o Open Neural Network Exchange (ONNX) is a format that allows models to
be transferred across different frameworks (e.g., TensorFlow to PyTorch). If
your project involves working with different deep learning frameworks,
ONNX can be beneficial.

4. Package Management
 pip (Python Package Installer)
o Use pip to install, upgrade, or remove Python libraries and packages from the
Python Package Index (PyPI).
o Command example:
pip install opencv-python-headless numpy streamlit matplotlib Pillow
 virtualenv or conda (Optional)
o virtualenv or conda (Anaconda) helps you create isolated environments for
Python projects. This is useful when you need specific library versions or
want to avoid conflicts with system-wide packages.
For virtualenv (shown here with venv, the equivalent module built into
Python 3):
o Create a virtual environment: python -m venv your_project_name
o Activate it (Linux/macOS): source your_project_name/bin/activate
o Activate it (Windows): your_project_name\Scripts\activate
For conda:
o Create a new environment:
conda create -n your_project_name python=3.8
o Activate it: conda activate your_project_name
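
After activating an environment, a quick check from inside Python can confirm that the interpreter really is isolated and recent enough. This is a small sketch under stated assumptions: the function names are hypothetical, the venv test relies on `sys.prefix` differing from `sys.base_prefix`, and the conda test relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation.

```python
import os
import sys


def in_virtual_environment() -> bool:
    """True when running inside a venv or virtualenv environment."""
    # Old virtualenv releases set sys.real_prefix instead of base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    base = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base


def active_conda_environment():
    """Name of the active conda environment, or None if there is none."""
    return os.environ.get("CONDA_DEFAULT_ENV")


def meets_python_requirement(minimum=(3, 7)) -> bool:
    """True when the interpreter is at least the given (major, minor)."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("venv active:", in_virtual_environment())
    print("conda env:", active_conda_environment())
    print("Python 3.7+:", meets_python_requirement())
```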

5. Additional Tools & Dependencies
 CUDA & cuDNN (for NVIDIA GPUs)
o If you're using an NVIDIA GPU for acceleration, you will need to install
CUDA and cuDNN to take advantage of GPU computing. These are essential
for running models efficiently in frameworks like TensorFlow or PyTorch.
o CUDA: A parallel computing platform and programming model that enables
software to use GPU hardware for general-purpose computing.
o cuDNN: A GPU-accelerated library for deep neural networks, useful for
faster training and inference of models.
These can be installed by following the official guidelines provided by NVIDIA
for setting up CUDA and cuDNN with your chosen deep learning framework
(TensorFlow or PyTorch).
 Jupyter Notebook (Optional)
o Jupyter Notebook is an interactive environment where you can run Python
code, visualize outputs, and create documents with code and results together.
This can be useful during the development phase for experimenting with code
and visualizing intermediate results.
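
Once CUDA and cuDNN are installed, it is worth confirming that the chosen framework can actually see the GPU. The following sketch assumes nothing about which framework the project uses: `gpu_report` is a hypothetical helper that looks for the `nvidia-smi` tool on the PATH and then asks PyTorch and TensorFlow, if either is installed, whether a GPU is available.

```python
import shutil


def gpu_report() -> dict:
    """Collect basic GPU visibility information.

    Framework entries are None when that framework is not installed
    (or is installed but cannot be imported).
    """
    report = {"nvidia_smi_on_path": shutil.which("nvidia-smi") is not None}

    try:
        import torch
        report["pytorch_cuda"] = bool(torch.cuda.is_available())
    except Exception:  # not installed, or installed but unusable
        report["pytorch_cuda"] = None

    try:
        import tensorflow as tf
        gpus = tf.config.list_physical_devices("GPU")
        report["tensorflow_gpu"] = len(gpus) > 0
    except Exception:  # not installed, or installed but unusable
        report["tensorflow_gpu"] = None

    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

If `nvidia_smi_on_path` is true but a framework reports no GPU, the usual cause is a mismatch between the framework build and the installed CUDA or cuDNN version.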

6. Operating System
The operating system you use will play a role in determining how you install
and use the above software packages.
 Recommended: Linux (Ubuntu, CentOS)
o Linux is the most widely used OS for deep learning tasks because of its
compatibility with deep learning libraries, good package management, and
support for GPU acceleration (CUDA).
 Alternative: Windows 10 (64-bit)
o While Windows is perfectly capable of running most software, Linux is
typically preferred for machine learning tasks due to its better support for
certain libraries and GPU frameworks.
 macOS
o macOS can also be used, but it may not be as well-suited for GPU-accelerated
deep learning tasks (unless you're using Apple's M1 chip, which has growing
support for machine learning).

CHAPTER 4
Implementation and Result

4.1 Snapshots of Results:

Fig 1

The image shows a woman running, with her movement tracked by the pose estimation
model. The model detects and marks the important body joints, such as the head,
shoulders, elbows, hips, knees, and ankles, and connects them to map out her
running posture. This mapping reflects her motion closely enough for the system
to follow her changing pose and analyze her gait. Pose estimation on images like
this one can be applied in fitness analysis, motion capture, or health
monitoring. The tracked points follow her fluid movement as she runs.

Fig 2

The image shows a child dribbling a football, and the movement looks lively and
energetic. The child's body posture suggests concentration and enthusiasm. Using
pose estimation technology, body joints such as the feet, knees, hips, and torso
are tracked to analyze stance and motion. This can show how the child
coordinates balance and movement to avoid falling while playing. Capturing
active play in this way is useful for sports training, injury prevention, or
simply studying how children move during activities like football.

Fig 3

The image shows a man running, with his body posture indicating speed and strength.
His legs are in full stride, and his arms are likely swinging to maintain balance. Pose
estimation technology can track the key points of his body, such as his head,
shoulders, elbows, knees, and ankles, to analyze his running form. The data from
these key points can provide insights into his running efficiency, posture, and
biomechanics, helping in areas like athletic performance improvement, injury
prevention, or even providing feedback for optimizing running techniques. The man's
dynamic motion is captured as he speeds ahead.
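
As an illustration of how such key points can feed a running-form analysis, the sketch below computes the angle at a joint (for example the knee) from three 2D key points. It is a minimal example, not the project's code: `joint_angle` is a hypothetical helper, and the coordinates in the usage lines are made-up pixel positions.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, between segments b->a and b->c.

    Each point is an (x, y) pair in image coordinates. For a knee angle,
    pass the hip, knee, and ankle key points in that order.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("key points must not coincide with the joint")
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    hip, knee, ankle = (120, 200), (130, 300), (110, 390)
    print(f"knee angle: {joint_angle(hip, knee, ankle):.1f} degrees")
```

A fully extended leg gives an angle close to 180 degrees, and the angle decreases as the knee bends, so tracking it across video frames gives a simple view of stride mechanics.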

Fig 4

The image shows a woman standing with her weight distributed evenly between both
legs. Her posture can be read in terms of how calm and stable she appears. With
pose estimation technology, her key points, from the head through the shoulders,
elbows, wrists, hips, knees, and ankles, are mapped so that her posture and
alignment can be studied further. Such analysis may help in examining body
posture or ergonomic health, and can help detect imbalances or poor postures
that could undermine effective communication. Compared with the running and
dribbling examples, this is a still pose with far less movement to track.
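
One simple way to quantify the alignment mentioned above is to measure how far the shoulder line and the hip line deviate from horizontal. The sketch below is an assumption-laden illustration: the key point names, the `tilt_degrees` and `posture_report` helpers, and the 5 degree threshold are all hypothetical and are not taken from the project.

```python
import math


def tilt_degrees(left, right):
    """Tilt of the line left->right from image horizontal, in [0, 90]."""
    dx = abs(right[0] - left[0])
    dy = abs(right[1] - left[1])
    if dx == 0 and dy == 0:
        raise ValueError("key points must be distinct")
    return math.degrees(math.atan2(dy, dx))


def posture_report(keypoints, max_tilt=5.0):
    """Report shoulder and hip tilt and whether each is within max_tilt.

    keypoints maps names such as 'left_shoulder' to (x, y) pixel pairs.
    max_tilt is an arbitrary illustrative threshold in degrees.
    """
    report = {}
    for part in ("shoulder", "hip"):
        tilt = tilt_degrees(keypoints[f"left_{part}"],
                            keypoints[f"right_{part}"])
        report[part] = {"tilt_deg": tilt, "level": tilt <= max_tilt}
    return report


if __name__ == "__main__":
    standing = {
        "left_shoulder": (100, 200), "right_shoulder": (200, 203),
        "left_hip": (115, 400), "right_hip": (185, 402),
    }
    for part, result in posture_report(standing).items():
        print(part, result)
```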

4.2 GitHub Link for Code:

You can explore my projects and contributions on GitHub:

[https://github.com/bajishaikh18/AICTE-Internship]
CHAPTER 5
Discussion and Conclusion

5.1 Future Work & Model Improvements

Accuracy Boost: Although the current pose estimation model gives good
results, its precision can still be improved. In particular, occlusions,
overlapping body parts, and unusual poses reduce its accuracy. Advanced
techniques such as multi-scale pose estimation and transformer-based models
might improve the overall performance of the system in such cases.

Real-time Performance: The model currently does not work optimally for
real-time applications, especially for video streams. Optimizing the model for
faster inference times or employing lightweight architectures such as
MobileNetV2 or EfficientNet can enable smoother real-time tracking for
mobile devices or edge devices.

Pose Estimation for Multiple People: Extending the model to handle multiple
people in the same frame would broaden its application in environments like
crowded sports events or group fitness training. Multi-person pose estimation
techniques, which detect and track several individuals at the same time,
would be valuable here.

Data Augmentation: To improve the robustness of the model, more diverse
training data can be incorporated through data augmentation techniques.
Augmenting with images and videos from different environments, lighting
conditions, and people of varying body types could help the model generalize
to varied real-world conditions.

Integration with Other Sensors: Combining pose estimation with other sensors,
such as depth cameras or IMUs, could improve the accuracy of spatial
information. This would be especially valuable for virtual reality, fitness,
and rehabilitation use cases.

User Feedback Loop: Adding a user feedback mechanism might allow the pose
estimation model to be refined under real-world conditions. Users could report
how accurate the results are, and this information could be used for further
refinement over time.

Application scope: The application currently focuses on pose tracking. Future
work may expand its scope to include gesture recognition, action recognition,
or emotion analysis based on pose information for a deeper understanding of
human movement.
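
As a concrete instance of the data augmentation item above, the sketch below shows horizontal flipping applied to pose annotations. It is a hypothetical illustration, not the project's training code: `flip_keypoints` and the `left_`/`right_` naming scheme are assumptions. The point it demonstrates is that mirroring an image also requires swapping left and right joint labels, otherwise the flipped sample teaches the model the wrong side.

```python
def flip_keypoints(keypoints, image_width):
    """Mirror 2D key points horizontally and swap left/right labels.

    keypoints maps joint names to (x, y) pixel coordinates, where x is a
    column index in the range 0 .. image_width - 1.
    """
    flipped = {}
    for name, (x, y) in keypoints.items():
        if name.startswith("left_"):
            new_name = "right_" + name[len("left_"):]
        elif name.startswith("right_"):
            new_name = "left_" + name[len("right_"):]
        else:
            new_name = name  # midline joints such as the nose keep their name
        flipped[new_name] = (image_width - 1 - x, y)
    return flipped


if __name__ == "__main__":
    pose = {"nose": (50, 20), "left_wrist": (10, 80), "right_wrist": (90, 85)}
    print(flip_keypoints(pose, image_width=100))
```

The image itself would be mirrored with the same transform, so that each flipped key point still lands on the joint it labels.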

5.2 Summary of Overall Impact and Contribution

The Human Pose Estimation project is a useful development in the field of
computer vision, especially in tracking and analyzing human movement. Its major
contribution is an effective and efficient method to detect and track human poses
in both images and videos, which can be applied in many domains, from fitness
tracking and rehabilitation to entertainment, sports, and augmented reality.

At the heart of this work is the potential for accurate real-time pose estimation.
Deep learning models such as CNNs, along with established pose estimation
frameworks like OpenPose or MediaPipe, are applied to identify the main body
landmarks and track movement precisely. Users can then inspect posture, gestures,
and body alignment. In fitness applications this helps ensure proper form during
exercises, which is crucial for avoiding injuries and maximizing effectiveness.
For an individual undergoing physical therapy, the model can also keep track of
progress and suggest corrections to movements, which gives it a significant role
in rehabilitation.

Support for both image and video pose estimation is one of the main contributions
of the project. This versatility makes the system adaptable to many cases, from
still-image analysis of sports actions and artistic poses to dynamic video
tracking for surveillance, sports performance analysis, or virtual training
scenarios. The ability to process feeds in real time extends its use to live
settings such as sports events, fitness classes, and interactive games.

The model can follow the human pose in less controlled environments, such as
varying lighting or complex backgrounds, which suggests that it is reasonably
robust and reliable. This broadens the range of applications in which it can work
efficiently. In addition, because the app accepts both image and video inputs, it
is an accessible tool for a wide range of users, from fitness enthusiasts and
athletes to healthcare providers and developers in need of pose data.

The project is a tool for the analysis of human movement. By showing the power of
AI and deep learning in understanding and interpreting human posture, it
contributes meaningfully to practical applications in healthcare, fitness,
entertainment, and other industry-specific uses. By improving human-computer
interaction, it has the potential to support personalized health, sports, and
interactive technologies.

REFERENCES

[1] Samkari, E., Arif, M., Alghamdi, M., & Al Ghamdi, M. A. (2023). Human pose
estimation using deep learning: a systematic literature review. Machine Learning
and Knowledge Extraction, 5(4), 1612-1659.

[2] Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., ... & Shah, M.
(2023). Deep learning-based human pose estimation: A survey. ACM Computing
Surveys, 56(1), 1-37.

[3] Gupta, A., Gupta, K., Gupta, K., & Gupta, K. (2021). Human Activity
Recognition Using Pose Estimation and Machine Learning Algorithm. In ISIC
(Vol. 21, pp. 25-27).
