Faiz Khan et al, International Journal of Computer Science and Mobile Computing, Vol.9 Issue.5, May- 2020, pg. 35-45
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IMPACT FACTOR: 7.056
IJCSMC, Vol. 9, Issue. 5, May 2020, pg.35 – 45
Computer Vision Based Mouse
Control Using Object Detection
and Marker Motion Tracking
Faiz Khan 1; Basit Halim 2; Asifur Rahman 3
1,2,3 Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, India
1 faizk6797@gmail.com; 2 basithalim@gmail.com; 3 asifhasiba@gmail.com
Abstract— There have been many developments in Human-Computer Interaction (HCI), and many modules have been built to help the physical world interact with the digital world. This paper proposes a new approach for controlling mouse movement using colored-object detection and marker motion tracking. The project mainly targets mouse cursor movement and click events based on object detection and marker identification. The software is developed in Python, with OpenCV for image processing and PyAutoGUI for the mouse functions. We use colored objects to perform actions such as moving the mouse and triggering click events. The method focuses on using a web camera to build a virtual human-computer interaction device in a cost-effective manner.
Keywords— Color Detection, HCI, PyAutoGUI, Marker Motion Identification, Object Detection
I. INTRODUCTION
The importance of computers is increasing constantly, and computers are used for many purposes. We often use hardware devices such as the mouse and keyboard to interact with them. In today's world, technologies evolve day by day; one example is the Human-Computer Interface (HCI). A wired mouse has a limited range and must be carried everywhere; a wireless mouse requires Bluetooth support or a Bluetooth dongle in the system; and touch screens are somewhat expensive for average users. An alternative is to create the HCI device virtually.
In recent years, research efforts seeking to provide more natural, human-centered means of interacting with computers have gained growing interest, and there are various opportunities to apply sensor technologies to heighten the user's experience with the computer. The Computer Vision-Based Mouse is a system to control the cursor of a computer without using any physical device, not even a mouse.
Our system uses image processing, object detection and marker motion tracking to control mouse activities such as cursor movement, left-click, right-click and double-click. The user simply holds a colored object of his choice, places it in the viewing area of the webcam inside the ROI, and thereby controls the mouse through color detection; the required workspace is also reduced. One colored tape is used for controlling the movement, while other colored tapes are used for the click events. Our project will provide a new experience for HCI.
© 2020, IJCSMC All Rights Reserved
The rest of the paper is organized as follows. In Section 2, we present similar work done by other researchers. In Section 3, we provide an overview of the system, followed by Section 4, where we present the system setup and technologies with an overview of the functional block diagram. Sections 5 and 6 conclude the paper with the results and the conclusion respectively.
II. LITERATURE REVIEW
Many researchers have tried to interact with the computer through video devices and web cameras, each using a different approach to move the mouse cursor without any hardware.
One method, by Erdem et al. [1], controls the mouse cursor using image segmentation and gesture control. In reference [2], K. Pendke et al. used hand-gesture technology to interact with the computer, applying an intuitive method to detect hand gestures. In their approach, the EmguCV SDK was used instead of the more traditional Matlab in order to achieve higher accuracy and more control over the source code. The idea of Abhirup Ghosh et al. [3] was based on color detection, but colored objects in the background may cause problems, such as an erroneous response.
Our project was inspired by a paper by Ankur Yadav et al. [4], who used a web camera and colored tapes to control the mouse cursor, but they used Matlab for image processing and mouse action events. Deeksha Verma et al. [5] also used Matlab for image processing in their work. Here we propose a system that uses the OpenCV library for image processing and PyAutoGUI, a Python GUI automation module, to control the mouse cursor and its click events.
III. SYSTEM OVERVIEW
The Computer Vision-Based Mouse is a system to control the cursor of a computer without using any physical device, not even a mouse. The user essentially holds a colored object in the hand; the video of the palm's motion is captured by the web camera, which acts as a sensor. The colored objects are tracked, and their motion is used to control the mouse cursor. In our work, we use 4 colors for 4 typical mouse actions: yellow for cursor movement, green for left-click, orange for right-click and blue for double-click.
To operate the system, the user simply presents the respective colors within the viewing area of the camera; the colored objects should be placed in the Region of Interest (ROI). The video from the camera is analyzed using image processing, and the cursor moves or fires its click events according to the color movements, as detailed below. Mouse handling is performed by PyAutoGUI, the Python GUI automation module.
IV. METHODS AND TECHNOLOGIES USED
The proposed system is a real-time computer vision application. It uses OpenCV for image acquisition and image processing, and PyAutoGUI for handling mouse control, in order to replace the actual mouse with a colored object. The basic block diagram of the computer vision pipeline is shown in Figure 1.
Fig 1 Block Diagram of Computer Vision Pipeline
Each step is further divided into smaller steps, depending on the application and the project.
A. Image Acquisition
The image or frame detected by the webcam is acquired as a digital representation of the visual characteristics of the physical world. An image sensor, here the webcam, detects and captures the information required to form an image.
B. Image Processing
The acquired images are then processed. The signals in the acquired images are filtered to remove noise and any irrelevant frequencies. If needed, the images are padded and transformed into a different space to make them ready for the actual analysis.
C. Image Analysis
The processed image is analyzed to extract useful information. This step involves many important image-analysis tasks such as pattern identification, color recognition, object recognition, feature extraction, motion tracking and image segmentation.
D. Decision Making
The high-dimensional data obtained from the above steps are reduced to meaningful numerical information, which leads to the decisions.
The detailed working of the proposed system is shown in the following flowchart.
Fig 2 Flowchart of the proposed work
A. Capture video
In the proposed architecture, the input is the colored object presented to the web camera, which reads the frames of the video. The web camera is used as a sensor that helps detect and track the object. A video is a sequence of images captured at a given frame rate, ranging from 6-8 frames per second for old mechanical cameras to 120 or more frames per second for professional cameras.
Video can be captured by the OpenCV function:
cam = cv2.VideoCapture(index)    (1)
where VideoCapture() is the OpenCV function used to capture images from the webcam, index is the number of the camera we want to use, and cam is the capture object that the function returns.
B. Read Frame from the video
After the video is captured by the webcam, frames are read from it by the function cam.read(), where cam is the capture object. The frame obtained from OpenCV is mirrored: if we move the object to the right, the pointer on screen moves to the left, and vice versa. It is similar to standing in front of a mirror, where left appears as right and right as left. To avoid this, we need to flip the image horizontally. In OpenCV, the image can be conveniently flipped by the function cv2.flip(frame, 1), where the flip code 1 flips the image around its vertical axis. The image is in BGR format.
C. Image Smoothing
We need to smooth, i.e., blur, the image. Image smoothing removes high-frequency content such as noise and fine edges from the image and is very effective in reducing pixelation. Smoothing replaces each pixel value with the average of the values of the pixels in its neighborhood. It makes use of a kernel, a small matrix used to apply the smoothing operation to the image.
Fig 3 Original vs Blur
D. ROI Definition and Grid Generation
After acquiring the image, smoothing it and removing the noise, we need to create a Region of Interest (ROI) and a grid. This increases the accuracy of marker identification and tracking, and the grid helps translate the position of the marker into cursor movement. The ROI is where all mouse operations are performed and where the sensor detects the marker. We set the ROI in the top-left corner of the frame; the dimension of the grid is 3x3 and the color of its outline is red. After generating the ROI, we need to convert the image from BGR format to HSV format, because colored objects are easier to identify in HSV space, using the function:
cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
After converting the image to HSV, we threshold it to isolate the objects within a specific color range. This technique converts the image into a binary image based on some criterion and is mainly used for object detection.
Fig 4 ROI and Grid Generation
Once all the objects of the specific color have been separated using color recognition, we identify the object of interest. Now that the image has been successfully converted into a binary image, we find the contours of the colored objects in it.
E. Marker Identification
Marker identification consists of three steps:
Fig 5 Stages of Marker Identification
Marker identification selects the marker from all the objects recognized in each frame of the image.
1) Contours of the colored object: A contour is a curve joining all the continuous points along a boundary that share the same color or intensity. OpenCV returns a Python list of all the contours in the image; each contour is a NumPy array of the (x, y) coordinates of the boundary points of an object. We find the contours of the colored object with cv2.findContours(image, mode, method). This function returns the contours and their hierarchy (older OpenCV versions also return a modified image). The hierarchy represents the relationships between the contours found in an image.
2) Contour with maximum area: The contour with the maximum area acts as the contour of interest and is identified as the object. We ignore the smaller contours, as they might be due to errors in setting the color bounds or to extraneous pixels that do not belong to the specific object.
3) Features of the contour with maximum area: We use contour features such as the centroid and the bounding rectangle to track the motion of the object. The identified object can be easily tracked from its centroid and bounding box; the object is bounded with a rectangle and a circle as shown below.
Fig 6 Contours Identification
F. Marker Motion Tracker
Once the markers in each frame are identified, their motion is tracked across frames by locating the centroid of each marker in the Region of Interest (ROI). We make use of the centroid location relative to the grid; contour features are used to find the contour around the markers on the palm. After computing the centroid of the marker, we determine its location in the grid that we created. It can fall in any of the 9 regions shown below:
TABLE I
LOCATION OF THE CENTROID IN GRID
Top-Left      Top-Center      Top-Right
Mid-Left      Mid-Center      Mid-Right
Bottom-Left   Bottom-Center   Bottom-Right
Using TABLE I, we can determine in which of the 9 regions the object's centroid lies. If it is in the Top-Left region, the cursor moves towards the top-left part of the screen; if the marker is in the Top-Center region, the cursor moves towards the top of the screen, and so on. If the marker is held in the central region, the cursor does not move and stays still. This is how we control the cursor based on the marker's motion.
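The translation from the centroid's grid region to a cursor step can be sketched in plain Python; the ROI size and step size below are illustrative values, not taken from the paper:

```python
# Map the marker centroid (cx, cy) inside the ROI to one of the 9 grid
# regions, and then to a relative cursor step (dx, dy).
ROI_W, ROI_H, STEP = 300, 300, 10  # illustrative ROI size and step size

def region_to_step(cx, cy):
    col = min(cx * 3 // ROI_W, 2)  # 0 = left, 1 = center, 2 = right
    row = min(cy * 3 // ROI_H, 2)  # 0 = top,  1 = middle, 2 = bottom
    dx = (col - 1) * STEP          # left -> -STEP, center -> 0, right -> +STEP
    dy = (row - 1) * STEP
    return dx, dy

print(region_to_step(50, 50))    # Top-Left   -> (-10, -10)
print(region_to_step(150, 150))  # Mid-Center -> (0, 0): cursor stays still
```

The returned (dx, dy) pair is exactly what a relative mouse move such as PyAutoGUI's moveRel expects.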
V. RESULT AND EVALUATION
In this paper, we improve the interaction between humans and the machine by controlling the mouse cursor without any physical device. The proposed system controls the mouse pointer by detecting a yellow colored object for cursor movement, an orange object for right-click, and green and blue objects for left-click and double-click respectively.
A. Edge Detection
Fig 7 Canny Edge Detection
Figure 7 shows edge detection on the object using the Canny edge detector, a very powerful edge detection algorithm. As we can see, everything is suppressed except the edges of the object.
B. Object Tracking
Fig 8 Object Tracking
As stated earlier, we chose yellow colored objects for cursor movement and set the required HSV range for the yellow color. The detected object is enclosed by a blue bounding rectangle. To be detected, however, the object must be placed within the viewing area of the webcam.
C. Mouse cursor movement and click-events
Instead of Matlab, we use PyAutoGUI to move the cursor based on the marker's location in the grid and to click based on the marker's color (in this case, green for left-click, orange for right-click and blue for double-click). The mouse functions of PyAutoGUI use the x and y coordinates of the pixels on the screen, just like any image, so the screen resolution, which determines how many pixels wide and tall the screen is, is a significant parameter.
1) pyautogui.size(): This function returns a two-integer tuple of the screen's width and height in pixels. Depending on the screen resolution, the return value may differ. The width and height returned by this function can be stored in variables, such as width and height, for later use in the program.
2) pyautogui.moveRel(x, y): This function moves the mouse cursor by a given offset relative to its current position on the screen. We use it to move the cursor based on the yellow marker. The integer x and y offsets form the function's first and second arguments respectively. Once the offsets have been determined, they are applied to the cursor, which places itself in the required position. As the user moves the colored object across the ROI, the mouse cursor moves across the screen.
3) pyautogui.click(): This function performs the mouse click actions. The sensor has to detect the specified color to perform the corresponding mouse event: we set blue for double-click, green for left-click and orange for right-click.
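The mapping from detected marker color to mouse action can be sketched in Python. The dictionary and function names here are illustrative (not from the paper), and the actual PyAutoGUI calls are shown only as comments so the sketch runs without a GUI session:

```python
# Dispatch a detected marker color to the corresponding mouse action.
# In the real system each entry would trigger the PyAutoGUI call shown
# in the comment; here we only return the action name.
ACTIONS = {
    "green":  "left click",   # pyautogui.click()
    "orange": "right click",  # pyautogui.click(button="right")
    "blue":   "double click", # pyautogui.doubleClick()
}

def handle_marker(color):
    if color == "yellow":
        # pyautogui.moveRel(dx, dy) would move the cursor relative to
        # its current position, with (dx, dy) taken from the grid region.
        return "move"
    return ACTIONS.get(color, "ignore")

print(handle_marker("yellow"))  # -> move
print(handle_marker("orange"))  # -> right click
```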
Fig 9 Cursor Movement
Fig 10 Left Click Event
Fig 11 Right Click Event
VI. CONCLUSION
The proposed system architecture could change the way people use the computer system. This project eliminates the need for a mouse or any other physical device for cursor control.
The use of object detection, marker motion tracking and PyAutoGUI in the implementation of our proposed work proved successful, and mouse movement and click events are achieved with high precision. This also leads to better Human-Computer Interaction (HCI). The proposed work has many advantages. The technology can be useful for patients who cannot control their limbs [5]; they can simply use colored objects on their fingertips. It also has wide applications in modern gaming, augmented reality and computer graphics. In computer graphics and gaming, this technology has been applied in modern gaming consoles to create interactive games where a person's motions are tracked and interpreted as commands [4]. The main aim is to reduce the use of physical devices and rely only on the web camera that is readily available with a laptop, so no additional cost is required.
Though this project has many advantages, it does suffer from some drawbacks. If the background clashes with the specified color, the system can give an erroneous response and may not work properly [3], so it is advisable to use this technology where the background lighting does not mix with the color in hand. The system may also run slowly on computers with low computational capability, and a high-resolution camera can likewise slow it down; this problem can be mitigated by reducing the resolution of the captured image.
REFERENCES
[1] A. Erdem, E. Yardimci, Y. Atalay, V. Cetin, "Computer Vision Based Mouse", Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2002.
[2] K. Pendke, P. Khuje, S. Narnaware, S. Thool, S. Nimje, "Computer Cursor Control Mechanism by Using Hand Gesture Recognition", International Journal of Computer Science and Mobile Computing, Vol. 4, Issue 3, March 2015, pg. 293-300.
[3] A. Banerjee, A. Ghosh, K. Bharadwaj, H. Saikia, "Mouse Control Using a Web Camera Based on Colour Detection", International Journal of Computer Trends and Technology (IJCTT), Volume 9, Number 1, March 2014.
[4] A. Yadav, A. Pandey, A. Singh, A. Kumar, "Computer Mouse Implementation Using Object Detection and Image Processing", International Journal of Computer Applications (IJCA), Volume 69, Number 21, 2013.
[5] D. Verma, et al., "Vision Based Computer Mouse Controlling Using Hand Gestures", International Journal of Engineering Sciences & Research Technology (IJESRT), 7(6), June 2018.
[6] Hojoon Park, "A Method for Controlling the Mouse Movement using a Real Time Camera", Department of Computer Science, Brown University, Providence, RI, USA, 2013.
[7] Chu-Feng Lien, "Portable Vision-Based HCI – A Realtime Hand Mouse System on Handheld Devices", Computer Science and Information Engineering Department, National Taiwan University.
[8] Kamran Niyazi, Vikram Kumar, Swapnil Mahe, Swapnil Vyawahare, "Mouse Simulation Using Two Coloured Tapes", Department of Computer Science, University of Pune, India, International Journal of Information Sciences and Techniques (IJIST), Vol. 2, No. 2, March 2012.
© 2020, IJCSMC All Rights Reserved
45
Faiz Khan et al, International Journal of Computer Science and Mobile Computing, Vol.9 Issue.5, May- 2020, pg. 35-45
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IMPACT FACTOR: 7.056
IJCSMC, Vol. 9, Issue. 5, May 2020, pg.35 – 45
Computer Vision Based Mouse
Control Using Object Detection
and Marker Motion Tracking
Faiz Khan1; Basit Halim2; Asifur Rahman3
1,2,3
Department of Computer Science and Engineering, Sikkim Manipal Institute of Technology, India
1
faizk6797@gmail.com; 2 basithalim@gmail.com; 3 asifhasiba@gmail.com
Abstract— There have been a lot of developments towards the Humans Computers Interaction (HCI). Many modules have
been developed to help the physical world interact with the digital world. Here, the proposed paper serves to be a new
approach for controlling mouse movement using Colored object and marker motion tracking. The project mainly aims at
mouse cursor movements and click events based on the object detection and marker identification. The software is
developed in Python Language and OpenCV and PyAutoGUI for mouse functions. We have used colored object to perform
actions such as movement of mouse and click events. This method mainly focuses on the use of a Web Camera to develop a
virtual human computer interaction device in a cost effective manner.
Keyword--- Color Detection, HCI, PyAutoGUI, Marker Motion Identification, Object Detection
I. INTRODUCTION
The importance of computers is increasing constantly. Computer can be used for many purposes. We often used hardware
devices such as mouse and keyboard to interact with the computers. In today‟s world, technologies are evolving day by day.
One of the example is the Human-Computer Interface (HCI). In a wired mouse there is no scope to extend limit and have to
carry it everywhere. In the wireless mouse, one should have a Bluetooth installed in the system with Bluetooth dongle. Touch
screen is little expensive for average users. An alternative way can be the creation of HCI device virtually.
In recent years, research efforts seeking to provide more natural, human-centered means of interacting with computer
have gained growing interest. There are various opportunities to apply sensor technologies to heightened the user and computer
experience. Computer Vision-Based Mouse is a system to control the cursor of our computer without using any physical device
even a mouse.
Our system basically used image processing, object detection and marker motion tracking to control the mouse activities
such as its movement, left-click, right-click and double click. The user will simply hold the colored object of his choice, place
it in the viewing area of the webcam and ROI, and thus controlling the mouse by color detection also the workspace required
© 2020, IJCSMC All Rights Reserved
35
Faiz Khan et al, International Journal of Computer Science and Mobile Computing, Vol.9 Issue.5, May- 2020, pg. 35-45
could be reduced. One tape will be used for controlling the movement while other colored tape can be used for click events.
Our project will provide a new experience for HCI.
The rest of the paper is as follows. In Section 2, we present similar works done by the other researcher. In Section 3, we
provide the system overview of what this project is about followed by Section 4 where will present system setup and
technologies with an overview of the functional block diagram. Section 5 and 6 concludes the paper with results and
conclusion respectively.
II. LITERATURE REVIEW
Many researchers have tried to interact with computer through video devices and web camera. Each of them used different
approaches to make a mouse cursor move without any hardware.
One method by Erdem[1] et al have control of mouse cursor using image segmentation and gesture control. In reference
[2] K. Pendke et al used hand gesture technology to interact with the computer. They have used an intuitive method to detect
hand gesture. In this approach, EmguCV SDK technology was used instead of more traditional Matlab in order to achieve more
accuracy and have a more control over the source code. [3] Abhirup Ghosh et al idea was based on color detection. But the
coloured objects in the background may cause a problem like give an erroneous response.
Our project was inspired by a paper of Ankur Yadav[4] et al where they used web camera and colored tapes to control
mouse cursor.But they used Matlab for image processing and mouse action events. Deeksha Verma[5] et al in their journal also
used Matlab for image processing. But here we propose a system to use OpenCV library for image processing and Python‟s
own GUI module PyAutoGUI to control the mouse cursor and its clicking events.
III. SYSTEM OVERVIEW
Computer Vision-Based Mouse is a system to control the cursor of our computer without using any physical device even a
mouse. Here we will essentially have a colored object in our hand. The video of the motion of our palm has been captured by
the web-camera which acts as a sensor. The colored objects are tracked and using their motion, the cursor of the mouse is
controlled. In our work, we have used 4 colors for 4 typical actions of the mouse. Yellow color for mouse cursor movement,
green color for left click, orange color for right-click and blue color for double click.
In order for it to work, we will simply use respective colors within the viewing area of the camera. The colored objects
should be placed in the Region of Interest(ROI). The video generated by the camera is detected and analyzed using image
processing and the computer cursor moves or displays its click events according to color movements, of which all the details
are provided below. Mouse handling activities are achieved by PyAutoGUI which is the python GUI module.
IV. METHODS AND TECHNOLOGIES USED
The proposed system is a computer vision application that is based on real time application system. It makes the use of
OpenCV for image processing and image acquisition and PyAutoGUI for handling mouse control in order to replace the actual
mouse with the colored object. The basic block diagram of computer vision pipeline is shown in figure 1.
Fig 1 Block Diagram of Computer Vision Pipeline
© 2020, IJCSMC All Rights Reserved
36
Faiz Khan et al, International Journal of Computer Science and Mobile Computing, Vol.9 Issue.5, May- 2020, pg. 35-45
Each Steps are further divided into smaller steps, based on the application and the project.
A. Image Acquisition
Image or the frame that is detected by the web-cam is acquired as the digital representation of the visual characteristics of
the physical world. An image sensor or the web-cam is used to detect and capture the information required to make an image.
B. Image Processing
Image acquired are then processed in the next step. The signals in the acquired images are filtered to remove the noise or
any irreverent frequencies. If needed, images are padded and transformed into a different space, so to make them ready for the
actual analysis.
C. Image Analysis
The processed image is analyzed to extract useful information. This step involves many important image properties like
pattern identification, color recognition, object recognition, feature extraction, motion tracking, and image segmentation.
D. Decision Making
High dimensional data obtained from all the above steps are used to produce meaningful numerical information, which
leads to making decisions.
The detailed working of the proposed system is shown in the following flowchart.
Fig 2 Flowchart of the proposed work
© 2020, IJCSMC All Rights Reserved
37
Faiz Khan et al, International Journal of Computer Science and Mobile Computing, Vol.9 Issue.5, May- 2020, pg. 35-45
A. Capture video
In the proposed architecture, the input is the colored object which is given to the web-camera. The camera reads the frame
of the video. Here, the web-camera is used as a sensor which will help in detecting and tracking the object. Video is a
collection of multiple visual images and divided into Frame Rate which ranges from 6 or 8 per second for the old mechanical
camera and 120 or more frames per second for a more professional camera.
Video can be captured by OpenCV function:
cam = cv2.VideoCapture(index)
(1) Where: VideoCapture() is a function of OpenCV which is used to capture the image from the web cam.
Index is a no. of argument or no. of camera that we want to set.
cam is output where that function will be stored.
B. Read Frame from the video
After the video is captured by the webcam, frames are read from video by the function „cam.read()‟, where the cam is the
output where the video is captured. After the frames are obtained by the OpenCV function, the image that we get is inverted.
That means if we move our object in the right direction, the image of the pointer will move towards the left and vice-versa. It is
similar when we stand in front of the mirror where left is detected as right and right is detected as left. To avoid it, we need to
vertically flip the Image[].In OpenCV, the image can be conveniently flipped vertically by the function. cv 2. flip(0) .Here 0 is
the index because we want to flip our image vertically. The image is in RGB format.
C. Image Smoothing
We need to smooth our image i.e, blur an image. Image Smoothing is useful for removing high frequencies of contents
like noise and edges from the image. Image Smoothing is very effective in reducing the Pixelation of an image. Smoothing
replaces a pixel value with the average of values of the pixels in its neighborhood. It makes use of Kernel which is a small
matric used to apply the smoothing algorithm on the image.
Fig 3 Original vs Blur
© 2020, IJCSMC All Rights Reserved
38
Faiz Khan et al, International Journal of Computer Science and Mobile Computing, Vol.9 Issue.5, May- 2020, pg. 35-45
D. ROI Definition and Grid Generation
After acquiring the image, applying image smoothing and removing the noise, we need to create a Region of Interest(ROI)
and a grid. This will increase the accuracy of the marker identification and tracking. The grid will help in translating the
position of the marker to the cursor movement. ROI is where we will perform all the operations of the mouse and where our
sensor will detect the marker. We have set the ROI in the top left corner of the screen. The dimension of the grid is 3x3 and the
color of the outline is red. After the generation of ROI, We need to convert the image in BGR format to HSV format because it
is easier to identify Coloured objects in the HSV space by using the function:
cv 2. cvtColor(image, cv 2. COLOR_BGR 2 HSV) .
After converting the image from HSV, we will threshold the image to identify the objects of the specific color range. This
technique converts a grayscale image into a binary image, based on some criteria. It is mainly used for object detection.
Fig 4 ROI and Grid Generation
Once all the objects of the specified color are separated using color recognition, we identify the object of our interest. With the image successfully converted into a binary image, we find the contours of the colored objects in it.
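The per-pixel test behind the HSV thresholding step can be sketched with the standard-library colorsys module. OpenCV stores hue in [0, 179] and saturation/value in [0, 255]; the yellow bounds below are assumptions for illustration, not the exact ranges used in the paper, and in_range mirrors what cv2.inRange does for a single pixel.

```python
import colorsys

# Assumed yellow range on OpenCV's HSV scale (H: 0-179, S/V: 0-255).
YELLOW_LOWER = (20, 100, 100)
YELLOW_UPPER = (35, 255, 255)

def rgb_to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to OpenCV-scaled HSV."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (h * 179, s * 255, v * 255)

def in_range(hsv, lower, upper):
    """Per-pixel equivalent of cv2.inRange: True if every channel is inside."""
    return all(lo <= c <= hi for c, lo, hi in zip(hsv, lower, upper))

# Thresholding keeps yellow pixels and discards others:
print(in_range(rgb_to_opencv_hsv(255, 255, 0), YELLOW_LOWER, YELLOW_UPPER))  # True
print(in_range(rgb_to_opencv_hsv(0, 0, 255), YELLOW_LOWER, YELLOW_UPPER))    # False
```

Applying this test to every pixel yields the binary mask from which the contours are extracted.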
E. Marker Identification
Marker identification consists of three steps:
Fig 5 Stages of Marker Identification
Marker identification picks out the marker from all the objects recognized in each frame.
1) Contours for the colored object: A contour is a curve joining all the continuous points along a boundary that have the same color or intensity. cv2.findContours(image, mode, method) returns a Python list of all the contours in the image, where each contour is a NumPy array of the (x, y) coordinates of the boundary points of the object. We found the contours of the colored object with this function, which outputs a modified image, the contours, and the hierarchy; the hierarchy represents the relationship between the contours found in the image.
2) Contour with maximum area: The contour with the maximum area acts as the contour of interest and is identified as the object. We ignore the smaller contours, as they may be caused by errors in setting the color bounds or by extraneous pixels that do not belong to the object.
3) Features of the contour with maximum area: We use contour features such as the centroid and the bounding rectangle to track the motion of the object. The identified object can be easily tracked from its centroid and bounding box; the object is enclosed with a rectangle and a circle as shown below.
Fig 6 Contours Identification
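The centroid and bounding-rectangle features above can be sketched without OpenCV. cv2.moments gives the centroid as (m10/m00, m01/m00), which for a binary mask reduces to the mean of the on-pixel coordinates; the toy 4x4 mask below is an assumption for illustration, and the bounding box follows cv2.boundingRect's (x, y, width, height) convention.

```python
def centroid_and_bbox(mask):
    """Return ((cx, cy), (x, y, w, h)) for the on-pixels of a binary mask."""
    pts = [(x, y) for y, row in enumerate(mask)
           for x, v in enumerate(row) if v]
    # Centroid: mean of on-pixel coordinates (m10/m00, m01/m00).
    cx = sum(x for x, _ in pts) / len(pts)
    cy = sum(y for _, y in pts) / len(pts)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    x0, y0 = min(xs), min(ys)
    # Bounding rectangle in cv2.boundingRect style: x, y, width, height.
    return (cx, cy), (x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1)

mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
c, box = centroid_and_bbox(mask)
# c == (1.5, 1.5), box == (1, 1, 2, 2)
```

Tracking then amounts to recomputing the centroid every frame and observing how it moves.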
F. Marker Motion Tracker
Once the markers in each frame are identified, their motion is tracked across frames by locating the centroid of each marker within the Region of Interest (ROI). We make use of the centroid's location relative to the grid; contour features are used to find the contour around the markers on the palm. After obtaining the centroid of the marker, we determine its location in the grid we created. It can fall in any of the 9 regions shown below:
TABLE I
LOCATION OF THE CENTROID IN THE GRID
Top-Left    | Top-Center    | Top-Right
Mid-Left    | Mid-Center    | Mid-Right
Bottom-Left | Bottom-Center | Bottom-Right
TABLE I above shows the nine grid regions in which the centroid of the object can be located. If the centroid is in the Top-Left region, the cursor should move toward the top-left part of the screen; if the marker is in the Top-Center region of the grid, the cursor should move toward the top of the screen, and so on. If the marker is held in the central region, the cursor stays still. This is how the cursor is controlled based on the marker motion.
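The region-to-motion rule above can be sketched as a pair of small functions. The ROI dimensions and the per-frame step of 10 pixels are assumptions, since the paper does not specify exact values; in the real system the resulting (dx, dy) offsets would be passed to pyautogui.moveRel(dx, dy).

```python
ROI_W, ROI_H = 300, 300   # assumed ROI dimensions in pixels
STEP = 10                 # assumed cursor step per frame, in pixels

def grid_cell(cx, cy, w=ROI_W, h=ROI_H):
    """Map a centroid inside the ROI to a (col, row) cell of the 3x3 grid."""
    col = min(int(cx * 3 // w), 2)
    row = min(int(cy * 3 // h), 2)
    return col, row

def cell_to_offset(col, row, step=STEP):
    """Columns 0/1/2 mean left/still/right; rows 0/1/2 mean up/still/down."""
    return (col - 1) * step, (row - 1) * step

print(cell_to_offset(*grid_cell(20, 20)))     # Top-Left   -> (-10, -10)
print(cell_to_offset(*grid_cell(150, 150)))   # Mid-Center -> (0, 0)
```

The Mid-Center cell maps to a zero offset, which implements the "hold the marker in the central region to keep the cursor still" behavior.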
V. RESULT AND EVALUATION
In this paper, we tried to improve the interaction between humans and machines by controlling the mouse cursor without any physical device. The proposed system controls the functions of the mouse pointer by detecting a yellow-colored object for cursor movement, an orange object for right-click, and green and blue objects for left-click and double-click respectively.
A. Edge Detection
Fig 7 Canny Edge Detection
Figure 7 displays the edge detection of the object by the Canny algorithm, a very powerful edge detector. As we can see, everything except the edges of the object is suppressed.
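Canny edge detection is built on intensity gradients: edges are where neighboring pixel values change sharply. The toy 1-D example below illustrates only that core idea, not the full algorithm (which adds Gaussian smoothing, non-maximum suppression and hysteresis thresholding); the row of values is an assumption for illustration.

```python
def gradient_magnitudes(row):
    """Absolute intensity difference between adjacent pixels."""
    return [abs(b - a) for a, b in zip(row, row[1:])]

row = [10, 10, 10, 200, 200, 200]   # dark region, then bright region
grads = gradient_magnitudes(row)
edge_index = max(range(len(grads)), key=grads.__getitem__)
# grads == [0, 0, 190, 0, 0]; the edge sits between pixels 2 and 3
```

In the flat dark and bright regions the gradient is zero, which is why everything except the object boundary disappears in Figure 7.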
B. Object Tracking
Fig 8 Object Tracking
As stated earlier, we have chosen yellow-colored objects for cursor movement and set the required HSV range for yellow. The detected object is enclosed by a blue bounding rectangle. To be detected, the object must be placed within the viewing area of the webcam.
C. Mouse cursor movement and click-events
Instead of MATLAB, we have used PyAutoGUI to move the cursor based on the marker's location in the grid and to click based on the color of the marker (in this case, green for left-click, orange for right-click and blue for double-click). The mouse functions of PyAutoGUI use the x and y coordinates of pixels on the screen, just as in an image, so the screen resolution, which determines how many pixels wide and tall the screen is, is a significant parameter.
1) pyautogui.size(): This function returns a two-integer tuple of the screen's width and height in pixels. Depending on the screen resolution, the return value may differ. The width and height can be stored in variables from this function for use in the program.
2) pyautogui.moveRel(x, y): This function instantly moves the mouse cursor from its current position by the given x and y offsets. We use it to move the cursor based on the yellow-colored marker. Integer values of the x and y offsets make up the function's first and second arguments respectively. Once the offsets have been determined, they are applied to the cursor, which moves to the required position. As the user moves the colored object across the ROI, the mouse cursor moves across the screen.
3) pyautogui.click(): The control actions of the mouse are performed by this function. The sensor has to detect the specified color to perform a mouse event: blue for double-click, green for left-click and orange for right-click.
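The color-to-click dispatch described above can be sketched as a simple lookup. The mapping follows the paper (green = left click, orange = right click, blue = double click); dispatch_click and the action names are hypothetical stand-ins, and in the real system each entry would invoke pyautogui.click(), pyautogui.rightClick() or pyautogui.doubleClick().

```python
# Hypothetical mapping from detected marker color to mouse event.
CLICK_ACTIONS = {
    "green": "left_click",     # would call pyautogui.click()
    "orange": "right_click",   # would call pyautogui.rightClick()
    "blue": "double_click",    # would call pyautogui.doubleClick()
}

def dispatch_click(color):
    """Return the mouse event for a detected marker color, if any."""
    return CLICK_ACTIONS.get(color)  # yellow (movement only) maps to None

print(dispatch_click("green"))   # left_click
print(dispatch_click("yellow"))  # None
```

Keeping yellow out of the table means the movement marker never triggers a click, matching the separation of roles described in the results.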
Fig 9 Cursor Movement
Fig 10 Left Click Event
Fig 11 Right Click Event
VI. CONCLUSION
The proposed system architecture changes the way people use the computer system: this project eliminates the need for a mouse or any physical device for cursor control.
The use of object detection, marker motion tracking and PyAutoGUI in our proposed work proved successful, and mouse movement and click events are achieved with high accuracy. This also leads to better Human-Computer Interaction (HCI). The proposed work has many advantages. This technology can be useful for patients who cannot control their limbs [5]; they can simply use colored objects on their fingertips. It also has wide applications in modern gaming, augmented reality and computer graphics. In the case of computer graphics and gaming, this technique has been applied in modern gaming consoles to create interactive games where a person's motions are tracked and interpreted as commands [4]. The main aim is to reduce the use of any physical device and make use of only the webcam that is readily available with a laptop, so no additional costs are required.
Though this project has many advantages, it does suffer from some drawbacks. If the background clashes with the specified colors, the system can give erroneous responses and may not work properly [3], so it is advisable to use this technology where the background lighting and colors do not mix with the colors of the markers in hand. The system may run slowly on computers with low resolution and limited computational capability, and a high-resolution camera can also slow it down; this problem can be mitigated by reducing the resolution of the image.
REFERENCES
[1] A. Erdem, E. Yardimci, Y. Atalay, V. Cetin, A. E. Cetin, "Computer Vision Based Mouse", Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2002.
[2] K. Pendke, P. Khuje, S. Narnaware, S. Thool, S. Nimje, "Computer Cursor Control Mechanism by Using Hand Gesture Recognition", International Journal of Computer Science and Mobile Computing, Vol. 4, Issue 3, March 2015, pg. 293-300.
[3] A. Banerjee, A. Ghosh, K. Bharadwaj, H. Saikia, "Mouse Control Using a Web Camera Based on Colour Detection", International Journal of Computer Trends and Technology (IJCTT), Volume 9, Number 1, March 2014.
[4] A. Yadav, A. Pandey, A. Singh, A. Kumar, "Computer Mouse Implementation Using Object Detection and Image Processing", International Journal of Computer Applications (IJCA), Volume 69, Number 21, 2013.
[5] D. Verma et al., "Vision Based Computer Mouse Controlling Using Hand Gestures", International Journal of Engineering Sciences & Research Technology (IJESRT), 7(6), June 2018.
[6] Hojoon Park, "A Method for Controlling the Mouse Movement using a Real Time Camera", Department of Computer Science, Brown University, Providence, RI, USA, 2013.
[7] Chu-Feng Lien, "Portable Vision-Based HCI – A Real-time Hand Mouse System on Handheld Devices", Computer Science and Information Engineering Department, National Taiwan University.
[8] Kamran Niyazi, Vikram Kumar, Swapnil Mahe, Swapnil Vyawahare, "Mouse Simulation Using Two Coloured Tapes", Department of Computer Science, University of Pune, India, International Journal of Information Sciences and Techniques (IJIST), Vol. 2, No. 2, March 2012.