Personal Voice Assistant

“PERSONAL VOICE ASSISTANT”
by
Pradeepa J Moolya
“VI” Semester
“Bachelor of Computer Applications”
Reg. No: BCA21034
Project Report submitted to the University of Mysore in partial fulfillment of the requirements of
“VI Semester” “Bachelor of Computer Applications” degree examinations “2024”
University of Mysore, Manasagangothri, Mysore – 570006
DECLARATION
I, PRADEEPA J MOOLYA, hereby declare that the project, entitled “PERSONAL VOICE
ASSISTANT”, submitted to the University of Mysore in partial fulfillment of the requirements
for the award of the Degree of BACHELOR OF COMPUTER APPLICATIONS is
submitted to the Directorate of Outreach and Online Programmes, University of Mysore. It has
not formed the basis for awarding any Degree/ Fellowship or Other similar titles to any
candidate of any University.
Place: Udupi
Date: 07-08-2024
Signature of the Student:

ACKNOWLEDGMENT
A successful task makes everyone happy, but that happiness is gold without glitter if we do not
name the people who supported us in making it a success. Success crowns those who made it a
reality, but the people whose constant guidance and encouragement made it possible deserve to
be crowned first.
This acknowledgment goes beyond formality: I would like to express my deep gratitude and
respect to all those people behind the scenes who guided, inspired, and helped me complete
this project work.
I consider myself lucky to have worked on such a good project. It will be an asset to my
academic profile.
ABSTRACT
The adoption of social network sites and the use of smartphones with several sensors have
digitized users’ activities in real time. Smartphone applications such as calendars, email, and
notes contain a lot of user information and provide a view into the user’s activities. In addition,
sensors such as GPS can be used to find information about the user passively. Beyond this user
and device data, these devices have access to the Internet, which can be leveraged to build
powerful applications.
Personal voice assistant software (a smart agent) can be used as an interface to the digital world
to make the consumption of this information timely and efficient for the user’s specific tasks.
The goal of the thesis is to design personal assistant software that understands the semantics of
a task, can decompose the task into multiple subtasks within the context of the user, and can plan
these subtasks for the user. It will be designed using semantic web technologies and knowledge
databases to understand the relations between the tasks. The agent will be integrated with online
web services to combine the data available online with the data available on the device and help
the user manage his or her tasks.

TABLE OF CONTENTS
1. INTRODUCTION
1.1 Introduction
1.2 Problem Statement
1.3 Background
1.4 Objectives
2. LITERATURE SURVEY
2.1. Related Work
3. METHODOLOGY
3.1. Existing system
3.2. Proposed system
3.3. Objective of the Project
3.4. Software and Hardware requirements
3.4.1. Software requirement
3.4.2. Hardware requirement
3.4.3. Libraries
3.5. Programming Languages
3.5.1. Python
3.5.2. Domain
3.6. System Architecture
3.6.1 System Architecture Figure
3.7. Algorithms Used
3.7.1. Speech Recognition Module
3.8. System Design
3.8.1 Component diagram
3.8.2 Sequence diagram
3.8.3 Sequence diagram answering user
3.9 Feasibility Study
3.10. Types of operation
4. PERSONAL ASSISTANT SOFTWARE IN THE MARKET
4.1 Goals of Personal Assistant Software
4.2 Different Types of Personal Assistant Software
4.2.1 Voice Recognition as Input Entry Medium
4.2.2 Voice Recognition-Based Task Automation or Information Retrieval
4.2.3 Planning
4.3 History of Voice Assistants
4.4 What are Intelligent Personal Assistants or Automated Personal Assistants?
4.5 How do Artificial Intelligence Assistants Interact with People?
5. IMPLEMENTATION
5.1 Building a Personal Voice Assistant
5.2 Dependencies and Requirements
5.3 Let’s Start Building Our Voice Assistant Using Python
5.3.1 Screenshot
5.4 Flow-chart
5.5 Data Flow Diagram
6. RESULT AND ANALYSIS
6.1 Working Result
6.2 Pros
6.3 Cons
6.4 Advantages of Artificial Intelligence in Personal Voice Assistant
6.5 Disadvantages of Artificial Intelligence in Personal Voice Assistant
7. CONCLUSION
8. FUTURE ENHANCEMENTS
9. BIBLIOGRAPHY
CHAPTER 1
INTRODUCTION
1.1 INTRODUCTION
The first voice-activated product, Radio Rex, was released in 1922. This simple toy featured a dog that
would remain inside a doghouse until the user exclaimed its name, "Rex." At that point, the dog would
jump out. This was achieved through an electromagnet tuned to a frequency close to that of the vowel
sound in "Rex," predating modern computers by over 20 years.
In the 21st century, human interaction is increasingly being replaced by automation. Performance is a
key driver of this change, which represents a significant shift in technology rather than a mere
advancement. Today, we train machines to perform tasks autonomously or to think like humans using
technologies such as Machine Learning and Neural Networks. In our current era, virtual assistants
allow us to communicate with machines through voice commands.
Virtual assistants are software programs designed to ease daily tasks, such as showing weather reports,
providing daily news, and searching the internet. These assistants can take voice commands, activated
by an invoking or wake word, followed by the user's command. Examples of popular virtual assistants
include Apple’s Siri, Amazon’s Alexa, and Microsoft’s Cortana. The development and success of these
assistants inspired our project.
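The wake-word pattern described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the project's implementation: the wake word "assistant" and the function name extract_command are hypothetical, and the input is assumed to be text that a speech recognizer has already produced.

```python
from typing import Optional

# Hypothetical wake word; the report does not name the one used by the project.
WAKE_WORD = "assistant"


def extract_command(transcript: str, wake_word: str = WAKE_WORD) -> Optional[str]:
    """Return the words that follow the wake word, or None if there is no command."""
    # Lower-case the text and strip simple punctuation from each word.
    tokens = [word.strip(",.!?") for word in transcript.lower().split()]
    tokens = [word for word in tokens if word]
    if wake_word not in tokens:
        return None  # The assistant was not addressed; ignore the utterance.
    # Everything after the first occurrence of the wake word is the command.
    command = " ".join(tokens[tokens.index(wake_word) + 1:])
    return command or None


if __name__ == "__main__":
    print(extract_command("Hey Assistant, what's the weather?"))
```

Utterances that do not contain the wake word return None, so the caller can keep listening without acting on background speech.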
Our system is designed for efficient use on desktops. Voice assistants are programs on digital devices
that listen and respond to verbal commands. For example, a user can ask, “What's the weather?” and the
voice assistant will provide the weather report for that day and location.
Voice assistants are artificial intelligence (AI) systems designed to facilitate user interaction with
digital devices. The basic idea behind this project is to create a simple, stand-alone application that
helps less tech-savvy individuals use computers without feeling ignorant or computer illiterate. Over
time, computers have become especially important and less expensive, making accessibility crucial.
Our application functions similarly to Siri or Google Assistant but is primarily designed to interact
with the computer itself. The user interface (UI) of the application is self-explanatory and minimal.
Currently, it takes text as input, as most people may not be comfortable with speaking commands.
Mobile technology has become renowned for its user experience, allowing easy access to applications
and services from any location. Well-known and commonly used mobile operating systems include
Android, iOS, Windows, and BlackBerry. These operating systems provide a plethora of
applications and services. For instance, contact applications store user contact details and facilitate
calls or SMS. Similar applications are available worldwide via the App Store and Play Store. These
features have led to the implementation of various sensors and functionalities in mobile devices.
The most famous application on the iPhone is “Siri,” which allows end users to communicate with
their mobile devices using voice commands. Similarly, Google developed “Google Voice Search” for
Android phones. However, this application requires an internet connection for most of its functions.
Our proposed
system, named Personal Assistant with Voice Recognition Intelligence, can function with or without
internet connectivity. It accepts user input in the form of voice or text, processes it, and returns the
output in various forms, such as performing an action or dictating a search result to the end user.
One of the goals of artificial intelligence is the realization of natural dialogue between humans and
machines. In recent years, dialogue systems, also known as interactive conversational systems, have
become one of the fastest-growing areas in AI. Many companies have used dialogue system technology
to establish various kinds of Virtual Personal Assistants (VPAs), such as Microsoft’s Cortana, Apple’s
Siri, Amazon Alexa, Google Assistant, and Facebook’s.
In this proposal, we have used a single-modal dialogue system that processes a user input mode such
as speech to design the next generation of personal voice assistants (PVAs). This new model aims to
increase interaction between humans and machines using technologies such as gesture recognition,
image/video recognition, speech recognition, and vast dialogue and conversational knowledge bases.
Additionally, the new PVA system can be applied in various areas, including education assistance,
medical assistance, personal voice assistants for vehicles, systems for people with disabilities, home
automation, and security access control. Digitization offers new possibilities to facilitate the
activities of our daily lives through assistive technology. This is the new way to connect with
technology. Today, a voice assistant is especially useful to an individual user because it completes
tasks in less time. With the help of the voice assistant,
we can attend to other work and save time. Voice assistants are a significant innovation that can
change people's lives in many ways. The voice assistant was first introduced on smartphones and, on
the back of their popularity, became widely accepted. It can be used conveniently by all age groups.
Speech recognition is the process of converting speech into text, and it is typically used by
voice assistants such as Alexa and Siri. In Python, the SpeechRecognition library can be used to
convert audio, including large or long audio files, to text for further processing. Users can give
commands in spoken or written form. The user can open an application (if it is installed on the
system), search for queries on Google, Wikipedia, and YouTube, or solve mathematical questions with
a voice command. This project uses the Google Speech Recognition API and Google Text-to-Speech for
voice input and output, respectively. In addition, the Wolfram Alpha API can be used to evaluate
formulas.
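As a rough sketch of how a recognized command might be routed to these actions, the example below maps text to either a search URL or a spoken result. It rests on several assumptions: the command phrasing ("search ... for ...", "calculate ..."), the function names, and the URL templates are illustrative only. A small local arithmetic evaluator stands in for the Wolfram Alpha call, and the speech input and output steps are left out.

```python
import ast
import operator
from urllib.parse import quote_plus

# Assumed URL templates for the three search targets named in the text.
SEARCH_URLS = {
    "google": "https://www.google.com/search?q={}",
    "wikipedia": "https://en.wikipedia.org/w/index.php?search={}",
    "youtube": "https://www.youtube.com/results?search_query={}",
}

_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def safe_eval(expression: str):
    """Evaluate +, -, *, / arithmetic without calling eval() on user input."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expression, mode="eval").body)


def route(command: str):
    """Map a text command to an (action, payload) pair for the caller to execute."""
    text = command.lower().strip()
    if text.startswith("calculate "):
        return ("say", str(safe_eval(text[len("calculate "):])))
    for site, template in SEARCH_URLS.items():
        prefix = f"search {site} for "
        if text.startswith(prefix):
            return ("open_url", template.format(quote_plus(text[len(prefix):])))
    return ("unknown", text)


if __name__ == "__main__":
    print(route("search Wikipedia for Alan Turing"))
    print(route("calculate 2 + 3 * 4"))
```

Returning an (action, payload) pair instead of opening a browser directly keeps the routing logic easy to test; the caller could pass the URL to webbrowser.open and the text to a text-to-speech engine.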
1.2 PROBLEM STATEMENT
Identifying the Need
In today's technology-driven world, the demand for more intuitive, efficient, and accessible ways to
interact with digital devices is ever-increasing. Personal voice assistants (PVAs) have emerged as a
pivotal solution to meet this demand, providing users with a hands-free, efficient, and natural way to
perform tasks and access information. Here are some key reasons why PVAs are needed:
1. Efficiency and Productivity: PVAs help users complete tasks quickly without needing to navigate
through multiple menus or type out commands. For instance, setting
reminders, sending messages, or making calls can be done swiftly through voice commands, saving
time and effort.
2. Accessibility: PVAs are crucial for people with disabilities, such as visual impairments or mobility
issues, as they provide an alternative method to interact with technology without relying on traditional
input devices like keyboards or touch screens.
3. Multitasking: In a fast-paced world, the ability to multitask is invaluable. PVAs allow users to
perform various tasks simultaneously, such as checking the weather while cooking or sending emails
while driving, thereby enhancing overall productivity.
4. Integration with Smart Devices: As smart home technology becomes more prevalent, PVAs act as
central hubs for controlling various smart devices. They can manage home automation systems, control
lighting, adjust thermostats, and even lock doors, contributing to a more connected and efficient living
environment.
5. Personalization: PVAs can learn user preferences and habits over time, providing personalized
responses and suggestions. This level of customization enhances user experience by making
interactions more relevant and tailored to individual needs.
Challenges
Despite these benefits, users face several challenges where personal voice assistants are not widely
adopted:
1. Limited Interaction: Without PVAs, users are confined to traditional methods of interaction like
typing and clicking, which can be slower and less intuitive. This limitation can hinder productivity and
the overall user experience.
2. Accessibility Barriers: For individuals with disabilities, the absence of PVAs can significantly
impact their ability to use technology effectively. Traditional interfaces may not be suitable for
everyone, creating barriers to access.
3. Increased Cognitive Load: Navigating through menus and interfaces to perform simple tasks can
increase cognitive load, leading to frustration and decreased efficiency. PVAs simplify this process by
understanding and executing voice commands.
4. Lack of Integration: Without PVAs, managing multiple smart devices individually can be
cumbersome. PVAs streamline this process by providing a unified interface to control various devices,
making smart home management more seamless.
5. Privacy Concerns: While PVAs raise privacy concerns, the absence of sophisticated voice
recognition systems can lead to security vulnerabilities in other forms of digital interactions. PVAs,
when designed with robust security measures, can enhance privacy by ensuring secure and
authenticated access to sensitive information.
Scope of the Project
This project aims to develop a comprehensive personal voice assistant that addresses the identified
needs and challenges. The scope of the project includes:
1. Development of Core Functionalities: The project will focus on building essential features such as
voice recognition, natural language processing, and task execution. This includes functionalities like
setting reminders, sending messages, controlling smart devices, and retrieving information.
2. Integration with Smart Devices: The PVA will be designed to seamlessly integrate with various
smart home devices, enabling users to control their environment through voice commands. This
includes compatibility with popular smart home ecosystems and devices.
3. User Interface Design: A user-friendly interface will be developed to facilitate easy interaction with
the PVA. This includes designing both voice and visual interfaces to ensure a seamless user experience.
4. Security and Privacy Measures: The project will incorporate robust security protocols to protect user
data and ensure privacy. This includes implementing encryption, authentication, and secure data
storage mechanisms.
5. Personalization and Learning Capabilities: The PVA will include machine learning algorithms to
learn from user interactions and preferences, providing personalized responses and suggestions over
time.
6. Accessibility Features: Special attention will be given to making the PVA accessible to users with
disabilities. This includes voice commands that cater to specific needs and interface adjustments for
better usability.
7. Performance and Scalability: The PVA will be designed to handle multiple users and high volumes
of interactions efficiently. Performance optimization and scalability will be key considerations during
development.
8. Testing and Evaluation: The project will involve rigorous testing to ensure functionality, reliability,
and user satisfaction. User feedback will be gathered and analysed to make iterative improvements.
By addressing these areas, the project aims to develop a personal voice assistant that enhances user
experience, improves accessibility, and integrates seamlessly with modern smart home ecosystems.
1.3 BACKGROUND
Historical Evolution
Voice recognition technology has roots dating back to the mid-20th century when researchers began
experimenting with electronic speech synthesis and recognition. Early systems, such as the "Audrey"
system developed in the 1950s, laid foundational principles for converting spoken language into
machine-readable format. These systems were rudimentary compared to modern standards, relying on
simple pattern matching and acoustic modelling.
The evolution accelerated in the 1970s and 1980s with the development of more sophisticated
techniques, including Hidden Markov Models (HMM) and Dynamic Time Warping (DTW). These
methods improved accuracy and enabled broader applications, albeit still limited by computational
power and data availability.
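To illustrate one of the techniques named above, the following is a minimal sketch of Dynamic Time Warping. It compares two one-dimensional sequences of numbers that may be stretched in time; real recognizers of that period compared sequences of acoustic feature vectors, so the scalar inputs and the function name dtw_distance are simplifying assumptions.

```python
def dtw_distance(a, b):
    """Return the Dynamic Time Warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed moves:
            # advance in a only, advance in b only, or advance in both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    # The second sequence repeats one sample, as a slower speaker might.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because a sample in one sequence may align with several samples in the other, the same word spoken at different speeds can still produce a small distance.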
By the late 1990s and early 2000s, advances in statistical machine learning had made large-vocabulary
dictation practical, and systems like Dragon NaturallySpeaking brought continuous speech-to-text
conversion to consumers. The later adoption of deep neural networks significantly enhanced accuracy
and usability.
Today, modern voice assistants like Siri, Google Assistant, and Amazon Alexa represent the
culmination of decades of research and development. They integrate advanced algorithms, vast
datasets, and cloud computing to provide seamless voice interaction across various devices and
applications.
Technological Advancements
Key technological milestones in voice recognition include:
Neural Networks: The adoption of deep learning techniques has dramatically improved accuracy by
enabling systems to recognize complex patterns in speech data.
Big Data and Cloud Computing: Access to large-scale datasets and cloud-based processing has enabled
more accurate and efficient voice recognition, transcending the limitations of local hardware.
Natural Language Processing (NLP): Integration with NLP allows voice assistants to understand
context, intent, and even emotions, making interactions more natural and intuitive.
Multimodal Integration: Combining voice with other input modalities (such as text and gestures)
enhances user experience and expands functionality.
Current Trends
Contemporary trends in AI and voice assistant technology include:

Personalization: Voice assistants are becoming more personalized, learning user preferences and
adapting responses accordingly.
Integration into IoT: Voice control is increasingly integrated with Internet of Things (IoT) devices,
enabling smart homes and connected environments.
Privacy and Security: Heightened concerns over data privacy and security have driven advancements
in secure voice authentication and data encryption.
Multilingual Support: Efforts are underway to support multiple languages and dialects, making voice
technology accessible to a global audience.
Domain-specific Applications: Voice assistants are expanding into specialized domains like healthcare,
finance, and education, offering tailored solutions and improving efficiency in various sectors.
Siri. Siri is Apple Inc.’s cloud software that can answer users’ various questions and give
recommendations, thanks to its voice processing mechanisms. When in use, Siri studies the user’s
preferences (much as contextual advertising does) to provide each person with an entirely individual
approach. This software solution is also useful for developers; an API called SiriKit
provides smooth integration with new applications developed for the iOS and watchOS platforms.
Ok, Google. Ok, Google is an Android-based voice recognition application, which users launch by
uttering the command of the same name. This software features advanced functions including web
search, route optimization, and memo scheduling that can collectively help users solve a wide array
of daily tasks. As with Siri, the creators of Ok, Google offer an API, the Google Voice Interaction
API. This interface
can become a truly indispensable tool in the development of mobile applications for the Android
platform.
Cortana. A virtual intelligent assistant with the function of voice recognition and AI elements, Cortana
was developed for such platforms as Windows, iOS, Android, and Xbox One. It can predict users’
wants and needs based on their search requests, e-mails, etc. One of Cortana’s distinguishing features
is her sense of humour. “She” can sing, make jokes, and speak to users.
Amazon Echo. Amazon Echo combines hardware and software that can search the web, help with
scheduling upcoming tasks, and play various sound files, all based on voice recognition. A small
speaker equipped with sound sensors, the device can be activated automatically by saying “Alexa.”
Nina. Nina is software with AI elements whose main goal is to reduce the amount of physical effort
spent on daily tasks (web search, scheduling, etc.). Thanks to elaborate analytical
mechanisms, Nina becomes “smarter” with every day of personal use.
Bixby. Samsung’s Bixby application is another successful implementation of the AI concept. It also
builds a unique user approach, based on interests and habits. Bixby features advanced voice recognition
mechanisms and uses the camera to identify images, based on markers and GPS.
1.4 OBJECTIVES
The primary goal of this project is to demonstrate the feasibility of developing personal voice assistant
software, referred to as a smart agent, using Python and leveraging various data sources available on
the web, user-generated content, knowledge databases, and inference technologies from Web 3.0.
Design and Functionality
1. Contextual Understanding:
The smart agent will gather contextual information about the user, such as location, current time,
calendar appointments, relationships between tasks, task decomposition, and past task history. It will
also consider user interests and preferences (e.g., likes and dislikes).
This contextual understanding will enable the agent to interpret tasks more accurately and decompose
them into actionable steps based on sequences stored in its knowledge base.
2. Task Management and Planning:
The agent's core functionality will involve managing and planning tasks. It will optimize task
management by grouping related tasks that can be completed simultaneously and in proximity, thus
enhancing resource efficiency.
By leveraging data gathered about the user and environment, the agent will improve productivity by
suggesting optimal sequences for completing tasks and allocating resources effectively.
3. Feedback Loop:
A feedback mechanism will be integrated to allow the user to provide input and validate decisions
made by the agent, especially in scenarios where multiple paths are possible or when the agent lacks
sufficient information.
This feedback loop will help refine the agent's decision-making process over time, enhancing its ability
to adapt to user preferences and evolving contexts.
4. Assumptions, Limitations, and Constraints:
The thesis will identify and discuss assumptions, limitations, and constraints inherent in the solution.
For instance, limitations may include data availability and accuracy from web sources, while
constraints might involve computational resources required for real-time decision-making.
5. Additional Infrastructure:
Any additional infrastructure necessary to complement the smart agent system, such as specific APIs
for data retrieval, integration with existing applications, or computational resources for intensive
processing tasks, will be identified and discussed.
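The decomposition and grouping ideas in points 1 and 2 above can be sketched as follows. This is an illustration under stated assumptions rather than the agent's actual design: the knowledge base is a hypothetical hard-coded dictionary, each step carries a single location label, and the names decompose and plan are invented for the example.

```python
from collections import defaultdict

# Hypothetical knowledge base: each task maps to an ordered list of
# (step, location) pairs. A real agent would load this from a knowledge store.
KNOWLEDGE_BASE = {
    "host dinner": [
        ("buy groceries", "supermarket"),
        ("buy candles", "supermarket"),
        ("cook meal", "home"),
    ],
    "send parcel": [
        ("buy packaging", "supermarket"),
        ("post parcel", "post office"),
    ],
}


def decompose(task):
    """Look up the stored steps for a task; unknown tasks stay as a single step."""
    return KNOWLEDGE_BASE.get(task, [(task, "unknown")])


def plan(tasks):
    """Group the steps of all tasks by location so one trip can cover several steps."""
    grouped = defaultdict(list)
    for task in tasks:
        for step, location in decompose(task):
            grouped[location].append(step)
    return dict(grouped)


if __name__ == "__main__":
    for location, steps in plan(["host dinner", "send parcel"]).items():
        print(location, "->", steps)
```

Grouping by location alone ignores ordering constraints between steps (packaging must be bought before the parcel is posted), so a fuller planner would also need to track dependencies and time windows.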
Expected Outcomes
Demonstration of Feasibility: The project will provide evidence and practical implementation of how
Python, web data sources, and inference technologies can be integrated to build a functional smart
agent.
Improvement in Productivity: Success will be measured by the agent's ability to optimize task
management, improve productivity, and provide valuable insights based on user interactions and
feedback.
Identification of Future Directions: The project will highlight potential future directions for enhancing
the smart agent's capabilities, such as integrating more advanced AI models or expanding into new
domains of application.
Customization and Learning: The assistant should allow for the customization of responses and
preferences. It should have the ability to learn from user interactions to improve its responses and
functionality over time.
Task Automation: The assistant should be able to handle a range of tasks, such as setting reminders,
scheduling appointments, sending emails, or fetching information from the web. It should integrate
with various APIs (e.g., Google Calendar, Email services) to perform these tasks.
CHAPTER 2
LITERATURE SURVEY
2.1 RELATED WORK
Nivedita Singh et al. (2021) proposed a voice assistant using a Python speech-to-text (STT) module
together with API calls and system calls. The resulting assistant allows the user to run any type of
command through voice, without using the keyboard, and can also run on hybrid platforms. However,
the paper has gaps; for example, system calls are not well supported (Agrawal et al., 2021).
Abeed Sayyed et al. (2021) presented a paper on a Desktop Assistant AI using Python with Internet of
Things (IoT) features; it uses Artificial Intelligence (AI) features along with an SQLite database.
The project has a database connection and a query framework but lacks API call and system call
features (Sayyed et al., 2021).
Krishnagar et al. (2021) presented a project on Portable Voice Recognition with GUI Automation. This
system uses Google's online speech recognition system, along with Python, to convert speech input to
text. The project has a GUI and a portable framework. The accuracy of its text-to-speech (TTS) engine
is comparatively low, and it lacks IoT features (Krishna raj et al., 2021).
Rajdip Paul et al. (2021) presented a project named A Novel Python-based Voice Assistance System
for Reducing the Hardware Dependency of Modern Age Physical Servers. The authors propose
an assistant with Python as the backend, supporting system calls, API calls, and various features.
The project is quite responsive to API calls but needs improvement in understanding and
reliability (Paul & Mukhopadhyay, 2021).
V. Geetha et al. (2021) presented a project named The Voice-Enabled Personal Assistant for PC using
Python. The authors propose an assistant with Python as the backend, in which features like
turning the PC off, restarting it, or reciting the latest news are just one voice command away. The
project has a well-supported library, but not every API can convert raw JSON data into text, and
there is a delay in processing request calls (Geetha et al., 2021).
Dilawar Shah Zwakman et al. (2021) proposed the Usability Evaluation of Artificial Intelligence-Based
Voice Assistants, which can give a proper response to the user's request. It also has a feature for
making an appointment, through voice, with a person mentioned by the user, but it lacks API calls
(Zwakman et al., 2021).
Philipp Sprengholz et al. (2021) proposed OK Google: Using virtual assistants for data collection
in psychological and behavioural research, in which they developed Survey Mate, an extension of the
Google Assistant, and used it to check the reliability and validity of the data it collects.
Answers and synonyms are defined for every type of question, so, as a psychological and behavioural
research assistant, it can be used to analyse the behaviour of an individual
(Sprengholz & Betsch, 2021).
Dimitrios Buhalis et al. (2021) proposed a paper on In-room Voice-Based AI Digital Assistants
Transforming On-Site Hotel Services and Guests’ Experiences, in which a voice assistant is used for
hotel services. It is especially useful in the COVID-19 era: human touch is considered a danger
during this time, so with a voice assistant the loss of human touch is no longer considered a
disadvantage. It can also be used to control room temperature and lighting, but it needs
complex integration and staff training (Buhalis & Moldavska, 2021).
Benedict D. C et al. (2020) studied consumer decisions with artificially intelligent voice assistants,
noting that consumers have stronger psychological reactions to a system's human-like appearance and
behaviours. The assistant has Internet of Things features and can order items that the user wants.
There are some drawbacks, however: the voice assistant relies on the speaker’s ability to represent
the decision alternatives in voice dialogues, and it lacks system calls (Dellaert et al., 2020).
CHAPTER 3
METHODOLOGY
3.1. EXISTING SYSTEM
Existing projects often rely heavily on speech recognition augmented by emotional networks. While
these systems achieve a certain level of accuracy, their practical application and suitability for real-
world use are limited. They primarily employ basic methods, among which context-aware computing
stands out. Context-aware computing encompasses programs capable of sensing their physical
environment and adjusting their responses accordingly. In the realm of speech recognition, this
capability allows systems to identify words spoken by individuals with varying accents, tones, or
speech patterns. Moreover, context-aware systems can also correct words that may have been
mispronounced, ensuring more accurate transcription and understanding of spoken language. This
adaptive approach not only enhances the robustness of speech recognition systems but also improves
their usability across diverse user demographics and environments.
3.2. PROPOSED SYSTEM
The conceptual model that describes a system's structure, behaviour, and other aspects is called system
architecture. A formal description and representation of a system that is set up to facilitate analysis of
its structures and behaviours is called an architecture description. System architecture can comprise
designed subsystems and system components that cooperate to implement the entire system. This
section gives a succinct summary of our findings after analysing and comparing our suggested work.
We have used Python, machine learning, and AI to implement this concept. Our primary goal is to
enable users to carry out their tasks using voice commands. This can be accomplished in two steps.
First, the Voice Recognition API turns the user's audio input into an English sentence.
1) The system will continuously listen for commands, and the length of time it listens can be
adjusted according to user preferences.
2) If the system cannot extract a command from the input, it will ask the user to repeat the input,
up to a configurable number of times.
3) The user's preferences determine whether the system uses a male or a female voice.
4) The current version supports features including playing music, sending emails and texts, searching
Wikipedia, opening system-installed programs, and accessing any website.
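The listening and retry behaviour in the list above can be sketched as below. This is only an outline under stated assumptions: the settings class, its field names, and the recognize callable are hypothetical, and the callable stands in for whatever speech recognition call the project uses, so the retry logic can be run without a microphone.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AssistantSettings:
    """User preferences named in the text; field names and defaults are assumptions."""
    listen_seconds: float = 5.0   # how long each listening window lasts
    max_attempts: int = 3         # how many times to ask before giving up
    voice: str = "female"         # preferred voice for spoken output


def listen_for_command(
    recognize: Callable[[float], Optional[str]],
    settings: AssistantSettings,
    prompt: Callable[[str], None] = print,
) -> Optional[str]:
    """Listen up to max_attempts times and return the first non-empty transcript."""
    for attempt in range(1, settings.max_attempts + 1):
        text = recognize(settings.listen_seconds)
        if text:
            return text
        # Nothing usable was heard, so ask the user to repeat the input.
        prompt(f"Sorry, I did not catch that (attempt {attempt} of "
               f"{settings.max_attempts}). Please repeat.")
    return None


if __name__ == "__main__":
    # Fake recognizer for demonstration: fails once, then succeeds.
    answers = iter([None, "open notepad"])
    print(listen_for_command(lambda seconds: next(answers), AssistantSettings()))
```

Passing the recognizer and the prompt function in as arguments keeps the loop independent of any particular speech library, which also makes it straightforward to test.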
3.3. OBJECTIVE OF THE PROJECT
The main objective of developing personal assistant software, or virtual assistant, is to leverage
semantic data sources from the web, user-generated content, and knowledge databases to effectively
answer user queries. This intelligent virtual assistant serves various purposes, such as providing
customer support on business websites through chat interfaces or offering mobile-based services where
users interact via voice commands. By automating responses to user inquiries, virtual assistants
significantly reduce the time spent on manual online research and report preparation, thereby enhancing
productivity and efficiency. The objective of this project is to show the feasibility of building
personal voice assistant software (a smart agent) in Python, using data sources available on the web,
user-generated content, and knowledge from knowledge databases, as well as the inference technologies
of Web 3.0.
The aim is to design a smart agent that has contextual information about the user and helps in managing
and planning tasks, using Python, web technologies, and open data available on the Internet. Contextual
information about the user can include location, current time, calendar appointments, relations between
tasks, decomposition of tasks, history of tasks, user interests, likes, etc. The agent can use data gathered
about the user as well as environment data to better understand what each task means, decompose
the tasks based on a sequence of steps stored in its knowledge base, and then plan individual tasks.
The planning part of the agent will strive to optimize resources and try to improve the productivity of
the user. It can be used as a time management application as well as a task management application.
By combining related tasks that can be completed at the same time and around the same
location, the agent will optimize the user’s resources to complete these tasks.
A feedback loop from the user will help the agent to make decisions when there are multiple paths, and
the agent does not have sufficient information to make those decisions.
Assumptions, limitations, and constraints in the solution will be highlighted and any additional
infrastructure necessary as a complement to the system will be identified.
3.4. SOFTWARE AND HARDWARE REQUIREMENTS
REQUIRED FEATURES OF SYSTEM
Usability
The system is designed around a largely automated process, so little or no user
intervention is needed.
Reliability
The system is reliable because of the qualities it inherits from the chosen platform,
Python.
Performance
The system is developed in a high-level language and uses advanced front-end and back-end
technologies, so it responds to the end user on the client system within a very short
time.
Supportability
The system is designed to be cross-platform. It is supported on a wide range
of hardware and on any software platform that has a Python 3 interpreter installed.
Implementation
The system is implemented in Python 3 using the PyCharm IDE, with Windows as the
platform.
REQUIRED SOFTWARE SPECIFICATION
• OS: Windows 10 or earlier, down to Windows 7
• Python 3.x
• PyCharm IDE 2019.1.3
REQUIRED HARDWARE SPECIFICATION
• Processor :- Intel Core i3 or above
• RAM :- 4 GB
• Hard Disk :- 500GB
• Monitor :- Colour monitor
• Keyboard :- 104 keys
• Mouse :- Any pointing device
3.4.3. LIBRARIES
Pyttsx3- It is a text-to-speech conversion library in Python that converts the text passed to it into
speech. It is compatible with Python 2 and 3. An application invokes the pyttsx3.init() factory
function to get a reference to a pyttsx3.Engine instance. It is a very easy-to-use tool that converts the
entered text into speech. On Windows, the voices are provided by "sapi5"; typically both a male and a
female voice are installed, and the order in which they are listed depends on the machine.
Command to install: - pip install pyttsx3
It supports three TTS engines: -
sapi5 - SAPI5 on Windows
nsss - NSSpeechSynthesizer on Mac OS X
espeak - eSpeak on every other platform
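The following is a minimal sketch of how the voice preference described above could be applied with pyttsx3. The helper names pick_voice and speak, the default preference "zira", and the stand-in voice names are assumptions made for illustration; they are not taken from the project code. The voices actually available depend on the machine.

```python
from types import SimpleNamespace


def pick_voice(voices, preference):
    """Return the id of the first voice whose name contains `preference`.

    Falls back to the first listed voice when nothing matches.
    """
    for voice in voices:
        if preference.lower() in voice.name.lower():
            return voice.id
    return voices[0].id


def speak(text, preference="zira", rate=150):
    """Speak `text` aloud. pyttsx3 is imported lazily so that pick_voice
    can be used on a machine without a speech engine."""
    import pyttsx3  # third-party: pip install pyttsx3

    engine = pyttsx3.init()                  # selects sapi5, nsss, or espeak
    voices = engine.getProperty("voices")    # voices installed on this machine
    engine.setProperty("voice", pick_voice(voices, preference))
    engine.setProperty("rate", rate)         # speaking rate in words per minute
    engine.say(text)
    engine.runAndWait()                      # block until speech has finished


# Stand-in voice objects (hypothetical names); real ones come from the engine.
demo_voices = [
    SimpleNamespace(id="voice-0", name="Microsoft David Desktop"),
    SimpleNamespace(id="voice-1", name="Microsoft Zira Desktop"),
]
print(pick_voice(demo_voices, "zira"))
```

Selecting a voice by name rather than by list position avoids relying on the order in which the operating system reports its voices.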
Speech recognition- It allows computers to understand human language. Speech recognition is a
machine's ability to listen to spoken words and identify them. We can then use speech recognition in
Python to convert the spoken words into text, make a query, or give a reply. The SpeechRecognition
library supports several engines and APIs, including the Google Web Speech API and the Google Cloud
Speech API.
Command to install: - pip install SpeechRecognition
WolframAlpha- Wolfram Alpha is a service that can compute expert-level answers using Wolfram's
algorithms, knowledgebase, and AI technology. It is made possible by the Wolfram Language. The
Wolfram Alpha API is a web-based API that allows the computational and presentation
capabilities of Wolfram Alpha to be integrated into web, mobile, and desktop applications.
Command to install: - pip install wolframalpha
Randfacts- randfacts is a Python library that generates random facts. We can use
randfacts.get_fact() to return a random fun fact.
Command to install: - pip install randfacts
Pyjokes- pyjokes is a Python library that provides one-line jokes for programmers. Informally, it can
also be referred to as a fun Python library which is simple to use.
Command to install: - pip install pyjokes
Datetime- This module is used to get the date and time for the user. It is a built-in module, so there
is no need to install it separately. The datetime module supplies classes to work with dates
and times. date and datetime values are objects in Python, so when we manipulate them, we are
manipulating objects and not strings or timestamps.
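As a small illustration of working with datetime objects rather than strings, the sketch below formats a time for the assistant to read aloud and shifts it by half an hour. The function name spoken_time and the sentence wording are assumptions for illustration only.

```python
from datetime import datetime, timedelta


def spoken_time(now):
    """Format a datetime as a sentence the assistant could read aloud."""
    return now.strftime("It is %I:%M %p on %A, %d %B %Y")


moment = datetime(2024, 8, 7, 23, 50)
print(spoken_time(moment))

# Arithmetic is done on the object itself; the date rolls over correctly.
later = moment + timedelta(minutes=30)
print(later.day, later.hour, later.minute)
```

In the assistant itself, datetime.now() would be passed in place of the fixed example value.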
Random2- Python 2 has a module named "random". The random2 package provides a Python 3 port
of Python 2.7's random module. It has also been backported to work in Python 2.6. In Python
3, the implementation of randrange() was changed, so that even with the same seed you get different
sequences in Python 2 and 3.
Math- This is a built-in module that is used to perform mathematical tasks. For example, math.cos()
returns the cosine of a number, and math.log() returns the natural logarithm of a number or its
logarithm to a given base.
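The two functions mentioned above can be tried as follows. This is only a short illustration; the variable names are arbitrary.

```python
import math

cosine_of_zero = math.cos(0)        # cosine of 0 radians
natural_log = math.log(math.e)      # natural logarithm (base e)
log_base_two = math.log(8, 2)       # logarithm of 8 to base 2, close to 3

print(cosine_of_zero, natural_log, log_base_two)
```

Because the results are floating-point numbers, comparisons are best made with math.isclose() rather than with ==.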
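Warnings- The warnings module is a built-in module for issuing warning messages. The Warning class
used for these messages is a subclass of Exception, which is a built-in class in Python. A warning in a
program is distinct from an error: an error is critical, whereas a warning is not. A warning shows a
message, but the program keeps running.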
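The difference between a warning and an error can be sketched as below. The function set_speech_rate and the limit of 300 are hypothetical and serve only to show that the program continues after the warning is issued.

```python
import warnings


def set_speech_rate(rate):
    """Clamp a hypothetical speech-rate setting, warning instead of failing."""
    if rate > 300:
        warnings.warn("rate above 300 is hard to follow; using 300", UserWarning)
        return 300
    return rate


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")     # make sure the warning is recorded
    result = set_speech_rate(500)       # execution continues past the warning

print(result, len(caught), caught[0].category.__name__)
```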
OS- The os module is a built-in module that provides functions for interacting with the operating
system while the program is running. It offers a portable way of using operating
system-dependent functionality. It includes functions with which the program can open a file that
is named in the program.
Serial- This module encapsulates the access for the serial port. It provides backends for Python running
on Windows, OSX, Linux, BSD, and IronPython. The module named "serial" automatically selects
the appropriate backend.
Command to install: - pip install pyserial
Wikipedia- wikipedia is a Python library that makes it easy to access and parse data from Wikipedia,
a multilingual online encyclopaedia. It can search Wikipedia, get article summaries, get data like
links and images from a page, and more.
Command to install: - pip install wikipedia
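A minimal sketch of a Wikipedia lookup is given below. The trigger phrases ("search wikipedia for", "wikipedia") and the helper names are assumptions for illustration; wikipedia.summary() is the library call that fetches the article summary and needs an Internet connection.

```python
def extract_topic(command):
    """Strip the trigger words from a spoken command, leaving the topic."""
    text = command.lower()
    text = text.replace("search wikipedia for", "").replace("wikipedia", "")
    return " ".join(text.split())       # collapse leftover whitespace


def wikipedia_answer(command, sentences=2):
    """Return a short summary for the topic named in the command."""
    import wikipedia  # third-party: pip install wikipedia

    return wikipedia.summary(extract_topic(command), sentences=sentences)


print(extract_topic("Search Wikipedia for Alan Turing"))
```

Keeping the text clean-up separate from the network call makes it possible to check the command parsing without Internet access.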
Selenium WebDriver- The Selenium module is used to automate web browser interaction from Python.
Several browsers/drivers are supported (Firefox, Chrome, Internet Explorer), as well as the Remote
protocol. The supported Python versions are Python 3.5 and above.
Command to install: - pip install selenium
Requests- The requests module allows you to send HTTP requests using Python. The HTTP request
returns a Response object with all the response data. With it, we can add content like headers, form
data, multipart files, and parameters to a request using simple Python objects, and access the
response data in the same way.
Command to install: - pip install requests
Webbrowser- The webbrowser module is a convenient web browser controller. It provides a high-level
interface that allows displaying Web-based documents to users. webbrowser can also be used as
a CLI tool. It accepts a URL as the argument with the following optional parameters: -n opens the URL
in a new browser window, if possible, and -t opens the URL in a new browser tab. This is a built-in
module, so installation is not required.
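The sketch below shows one way the webbrowser module could be used to open a search in a new tab. The helper names and the choice of a Google search URL are assumptions for illustration.

```python
import webbrowser
from urllib.parse import quote_plus


def search_url(query):
    """Build a search URL, escaping spaces and special characters."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def open_search(query):
    """Open the search in a new browser tab; returns True on success."""
    return webbrowser.open_new_tab(search_url(query))


print(search_url("python voice assistant"))
```

Calling open_search("python voice assistant") would open the page in the default browser.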
3.5. PROGRAMMING LANGUAGES
Programming languages are fundamental tools used by developers to write, edit, and execute software
programs and applications. These languages serve as structured methods for communicating
instructions to computers, enabling them to perform specific tasks or operations. Each programming
language has its own syntax, rules, and capabilities tailored to diverse types of applications and
environments. For instance, high-level languages like Python and JavaScript prioritize readability and
ease of use, making them ideal for rapid development and web applications. In contrast, lower-level
languages such as C and C++ offer greater control over hardware and system resources, crucial for
developing performance-critical software like operating systems and embedded systems. Additionally,
domain-specific languages like SQL facilitate efficient database management, while functional
languages like Haskell emphasize mathematical functions and immutable data. The choice of
programming language depends on factors such as project requirements, performance goals, and
developer preferences, each offering unique strengths in software development landscapes.
3.5.1 PYTHON
What is Python?
Python is a popular programming language. It was created by Guido van Rossum and released in
1991.
It is used for:
web development (server-side),
software development,
mathematics,
system scripting.
What can Python do?
Python can be used on a server to create web applications.
Python can be used alongside software to create workflows.
Python can connect to database systems. It can also read and modify files.
Python can be used to handle big data and perform complex mathematics.
Python can be used for rapid prototyping or production-ready software development.
Why Python?
Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc).
Python has a simple syntax similar to the English language.
Python has a syntax that allows developers to write programs with fewer lines than some other
programming languages.
Python runs on an interpreter system, meaning that code can be executed as soon as it is written.
This means that prototyping can be very quick.
Python can be treated procedurally, in an object-orientated way, or in a functional way.
Good to know
The most recent major version of Python is Python 3, which is used in this project.
Python 2 reached its end of life in 2020 and no longer receives updates, although some legacy
code still uses it.
In this project, Python is written in the PyCharm IDE. It is also possible to write Python in a text
editor or in another Integrated Development Environment, such as Thonny, NetBeans, or Eclipse.
IDEs are particularly useful when managing larger collections of Python files.
Python Syntax compared to other programming languages
Python was designed for readability and has some similarities to the English language with
influence from mathematics.
Python uses new lines to complete a command, as opposed to other programming languages which
often use semicolons or parentheses.
Python relies on indentation, using whitespace, to define scope; such as the scope of loops,
functions, and classes. Other programming languages often use curly brackets for this purpose.
Python is an interpreted, high-level, general-purpose programming language. Created by Guido
van Rossum and first released in 1991, Python's design philosophy emphasizes code readability
with its notable use of significant whitespace. Its language constructs and object-oriented
approach aim to help programmers write clear, logical code for small and large-scale projects.
Python is dynamically typed, and garbage collected. It supports multiple programming paradigms,
including procedural, object-oriented, and functional programming. Python is often described as
a "batteries included" language due to its comprehensive standard library.
Python was conceived in the late 1980s as a successor to the ABC language. Python 2.0, released in
2000, introduced features like list comprehensions and a garbage collection system capable of
collecting reference cycles. Python 3.0, released in 2008, was a major revision of the language that
is not completely backward compatible, and much Python 2 code does not run unmodified on
Python 3. Due to concerns about the amount of code written for Python 2, support for Python 2.7
(the last release in the 2.x series) was extended to 2020. Language developer Guido van Rossum
shouldered sole responsibility for the project until July 2018 but now shares his leadership as a
member of a five-person steering council.
Python interpreters are available for many operating systems. A global community of
programmers develops and maintains CPython, an open-source reference implementation. A
non-profit organization, the Python Software Foundation, manages and directs resources for Python
and Python development.
Python is an easy-to-learn, powerful programming language. It has efficient high-level data
structures and a simple but effective approach to object-oriented programming. Python’s elegant
syntax and dynamic typing, together with its interpreted nature, make it an ideal language for
scripting and rapid application development in many areas on most platforms.
The Python interpreter and the extensive standard library are freely available in source or binary
form for all major platforms from the Python Web site, https://www.python.org/, and may be
freely distributed. The same site also contains distributions of and pointers to many free third-
party Python modules, programs and tools, and additional documentation.
The Python interpreter is easily extended with new functions and data types implemented in C or
C++ (or other languages callable from C). Python is also suitable as an extension language for
customizable applications.
This tutorial introduces the reader informally to the basic concepts and features of the Python
language and system. It helps to have a Python interpreter handy for hands-on experience, but all
examples are self-contained, so the tutorial can be read offline as well.
For a description of standard objects and modules, see The Python Standard Library. The Python
Language Reference gives a more formal definition of the language. To write extensions in C or
C++, read Extending and Embedding the Python Interpreter and Python/C API Reference Manual.
There are also several books covering Python in depth.
This tutorial does not attempt to be comprehensive and cover every single feature, or even every
commonly used feature. Instead, it introduces many of Python’s most noteworthy features and
will give you a good idea of the language’s flavor and style. After reading it, you will be
able to read and write Python modules and programs, and you will be ready to learn more about the
various Python library modules described in The Python Standard Library.
History
Python was conceived in the late 1980s by Guido van Rossum at Centrum Wiskunde &
Informatica (CWI) in the Netherlands as a successor to the ABC language (itself inspired by
SETL), capable of exception handling and interfacing with the Amoeba operating system. Its
implementation began in December 1989. Van Rossum continued as Python's lead developer until
July 12, 2018, when he announced his "permanent vacation" from his responsibilities as Python's
Benevolent Dictator For Life, a title the Python community bestowed upon him to reflect his long-
term commitment as the project's chief decision-maker.[36] In January 2019, active Python core
developers elected Brett Cannon, Nick Coghlan, Barry Warsaw, Carol Willing, and Van Rossum
to a five-member "Steering Council" to lead the project.
Python 2.0 was released on 16 October 2000 with many major new features, including a cycle-
detecting garbage collector and support for Unicode.
Python 3.0 was released on 3 December 2008. It was a major revision of the language that is not
completely backward compatible. Many of its major features were backported to Python 2.6.x[40]
and 2.7.x version series. Releases of Python 3 include the 2to3 utility, which automates (at least
partially) the translation of Python 2 code to Python 3.
Python 2.7's end-of-life date was initially set at 2015 then postponed to 2020 out of concern that
a large body of existing code could not easily be forward-ported to Python 3. In January 2017,
Google announced work on a Python 2.7 to Go transcompiler to improve performance under
concurrent workloads.
Features and philosophy
Python is a multi-paradigm programming language. Object-oriented programming and structured
programming are fully supported, and many of its features support functional programming and
aspect-oriented programming (including by metaprogramming [45] and metaobjects (magic
methods)). Many other paradigms are supported via extensions, including design by contract
and logic programming.
Python uses dynamic typing, and a combination of reference counting and a cycle-detecting
garbage collector for memory management. It also features dynamic name resolution (late
binding), which binds method and variable names during program execution.
Python's design offers some support for functional programming in the Lisp tradition. It has filter,
map, and reduce functions; list comprehensions, dictionaries, sets, and generator expressions. The
standard library has two modules (itertools and functools) that implement functional tools
borrowed from Haskell and Standard ML.
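The functional tools named above can be illustrated with a short example. The word list is arbitrary sample data chosen for illustration.

```python
from functools import reduce
from itertools import count, islice

words = ["play", "music", "now"]

lengths = list(map(len, words))                          # length of each word
long_words = list(filter(lambda w: len(w) > 3, words))   # words longer than 3 letters
total = reduce(lambda a, b: a + b, lengths)              # fold the lengths into a sum
squares = [n * n for n in islice(count(1), 4)]           # first four square numbers

print(lengths, long_words, total, squares)
```

Note that in Python 3, reduce lives in the functools module, while map and filter remain built-in functions.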
The language's core philosophy is summarized in the document The Zen of Python (PEP 20),
which includes aphorisms such as:
Beautiful is better than ugly
Explicit is better than implicit
Simple is better than complex
Complex is better than complicated
Readability counts
Rather than having all its functionality built into its core, Python was designed to be highly
extensible. This compact modularity has made it particularly popular as a means of adding
programmable interfaces to existing applications. Van Rossum's vision of a small core language
with a large standard library and easily extensible interpreter stemmed from his frustrations with
ABC, which espoused the opposite approach. Python strives for a simpler, less cluttered syntax
and grammar while giving developers a choice in their coding methodology. In contrast to Perl's
"there is more than one way to do it" motto, Python embraces a "there should be one—and
preferably only one—obvious way to do it" design philosophy. Alex Martelli, a Fellow at the
Python Software Foundation and Python book author, writes that "To describe something as
'clever' is not considered a compliment in the Python culture."
Python's developers strive to avoid premature optimization and reject patches to non-critical parts
of the Python reference implementation that would offer marginal increases in speed at the cost
of clarity. When speed is important, a Python programmer can move time-critical functions to
extension modules written in languages such as C, or use PyPy, a just-in-time compiler. Cython
is also available, which translates a Python script into C and makes direct C-level API calls into
the Python interpreter.
An important goal of Python's developers is to keep it fun to use. This is reflected in the language's
name, a tribute to the British comedy group Monty Python, and in occasionally playful
approaches to tutorials and reference materials, such as examples that refer to spam and eggs
(from a famous Monty Python sketch) instead of the standard foo and bar.
A common neologism in the Python community is pythonic, which can have a wide range of
meanings related to program style. To say that code is pythonic is to say that it uses Python idioms
well, that it is natural or shows fluency in the language, and that it conforms with Python's
minimalist philosophy and emphasis on readability. In contrast, code that is difficult to understand
or reads like a rough transcription from another programming language is called unpythonic.
Users and admirers of Python, especially those considered knowledgeable or experienced, are
often referred to as Pythonists, Pythonistas, and Pythoneers.
5.3 PYCHARM IDE 2019.1.3
PyCharm is a dedicated Python and Django IDE providing a wide range of essential tools for
Python developers, tightly integrated together to create a convenient environment for productive
Python development and Web development.
PyCharm is available in three editions: Professional, Community, and Educational (Edu). The
Community and Edu editions are open-source projects, and they are free, but they have fewer
features. PyCharm Edu provides courses and helps you learn programming with Python. The
Professional edition is commercial and provides an outstanding set of tools and features. For
details, see the editions comparison matrix.
PYCHARM FEATURES
Intelligent Coding Assistance
PyCharm provides smart code completion, code inspections, on-the-fly error highlighting and
quick fixes, along with automated code refactorings and rich navigation capabilities.
Intelligent Code Editor
PyCharm’s smart code editor provides first-class support for Python, JavaScript, CoffeeScript,
TypeScript, CSS, popular template languages and more. Take advantage of language-aware code
completion, error detection, and on-the-fly code fixes!
Smart Code Navigation
Use smart search to jump to any class, file, or symbol, or even any IDE action or tool window. It
only takes one click to switch to the declaration, super method, test, usages, implementation, and
more.
Fast and Safe Refactorings
Refactor your code the intelligent way, with safe Rename and Delete, Extract Method, Introduce
Variable, Inline Variable or Method, and other refactorings. Language and framework-specific
refactorings help you perform project-wide changes.
Built-in Developer Tools
PyCharm’s massive collection of tools out of the box includes an integrated debugger and test
runner; Python profiler; a built-in terminal; integration with major VCS and built-in database
tools; remote development capabilities with remote interpreters; an integrated ssh terminal; and
integration with Docker and Vagrant.
Debugging, Testing, and Profiling
Use the powerful debugger with a graphical UI for Python and JavaScript. Create and run your
tests with coding assistance and a GUI-based test runner. Take full control of your code with
Python Profiler integration.
VCS, Deployment, and Remote Development
Save time with a unified UI for working with Git, SVN, Mercurial, or other version control
systems. Run and debug your application on remote machines. Easily configure automatic
deployment to a remote host or VM and manage your infrastructure with Vagrant and Docker.
Database tools
Access Oracle, SQL Server, PostgreSQL, MySQL, and other databases right from the IDE. Rely
on PyCharm’s help when editing SQL code, running queries, browsing data, and altering schemas.
Web Development
In addition to Python, PyCharm provides first-class support for various Python web development
frameworks, specific template languages, JavaScript, CoffeeScript, TypeScript, HTML/CSS,
AngularJS, Node.js, and more.
Python Web frameworks
PyCharm offers great framework-specific support for modern web development frameworks such
as Django, Flask, Google App Engine, Pyramid, and web2py, including Django templates
debugger, manage.py and appcfg.py tools, special autocompletion and navigation, just to name a
few.
JavaScript & HTML
PyCharm provides first-class support for JavaScript, CoffeeScript, TypeScript, HTML, and CSS,
as well as their modern successors. The JavaScript debugger is included in PyCharm and is
integrated with the Django server run configuration.
Live Edit
Live Editing Preview lets you open a page in the editor and the browser and see the changes being
made in code instantly in the browser. PyCharm auto-saves your changes, and the browser smartly
updates the page on the fly, showing your edits.
Scientific Tools
PyCharm integrates with IPython Notebook, has an interactive Python console, and supports
Anaconda as well as multiple scientific packages including Matplotlib and NumPy.
Interactive Python console
You can run a REPL Python console in PyCharm which offers many advantages over the standard
one: on-the-fly syntax check with inspections, brace and quote matching, and of course code
completion.
Scientific Stack Support
PyCharm has built-in support for scientific libraries. It supports Pandas, NumPy, Matplotlib, and
other scientific libraries, offering you best-in-class code intelligence, graphs, array viewers, and
much more.
Conda Integration
Keep your dependencies isolated by having separate Conda environments per project. PyCharm
makes it easy for you to create and select the right environment.
Debug Your Python Code with PyCharm
Visual Debugging
Some coders still debug using print statements because the concept is hard and pdb is intimidating.
PyCharm’s Python debugging GUI makes it easy to use a debugger by putting a visual face on
the process. Getting started is simple and moving on to the major debugging features is easy.
Debug Everywhere
Of course, PyCharm can debug code that you are running on your local computer, whether it is
your system Python, a virtualenv, Anaconda, or a Conda env. In PyCharm Professional Edition
you can also debug code you are running inside a Docker container, within a VM, or on a remote
host through SSH.
Debug Inside Templates (Professional Edition only)
When you are working with templates, sometimes a bug sneaks into them. These can be extremely
hard to resolve if you cannot see what is going on inside them. PyCharm’s debugger enables you
to put a breakpoint in Django and Jinja2 templates to make these problems easy to fix.
Note: to debug templates, first configure the template language.
JavaScript (Professional Edition only)
Any modern web project involves JavaScript; therefore, any modern Python IDE needs to be able
to debug JavaScript as well. PyCharm Professional Edition comes with the highly capable
JavaScript debugger from WebStorm. Both in-browser JS and NodeJS are
supported by the JavaScript debugger.
Debugging During TDD
Test-driven development, or TDD, involves exploration while writing tests. Use the debugger to
help explore by setting breakpoints in the context you are investigating.
This investigation can be in your test code or in the code being tested, which is extremely helpful
for Django integration tests (Django support is available only in PyCharm Professional Edition).
Use a breakpoint to find out what is coming from a query in a test case.
No Code Modification Necessary
PDB is a great tool, but it requires you to modify your code, which can lead to accidentally checking
`pdb.set_trace()` calls into your git repo.
See What Your Code Does
Breakpoints
All debuggers have breakpoints, but only some debuggers have highly versatile breakpoints. Have
you ever clicked ‘continue’ many times until you finally get to the loop iteration where your bug
occurs? No need for that with PyCharm’s conditional breakpoints.
Sometimes all you want to do is see what a certain variable’s value is throughout code execution.
You can configure PyCharm’s breakpoints to not suspend your code, but only log a message for
you.
Exceptions can ruin your day, that’s why PyCharm’s debugger can break on exceptions, even if
you are not entirely sure where they are coming from.
To help you stay in control of your debugging experience, PyCharm has an overview window
where you can see all your breakpoints, as well as disable some by checkbox. You can also
temporarily mute all your breakpoints until you need them.
See Variable Values at a Glance
As soon as PyCharm hits a breakpoint, you will see all your variable values inline in your code.
To make it easy to see what values have changed since the last time you hit the breakpoint,
changed values are highlighted.
Watches
Customize your variable view by adding watches. Whether they are simple or complex, you will
be able to see exactly what you want to see.
Control Your Code
Visually Step Through Your Code
If you want to know where your code goes, you do not need to put breakpoints everywhere. You
can step through your code and keep track of exactly what happens.
Run Custom Code
In some cases, the easiest way to reproduce something is to force a variable to a certain value.
PyCharm offers both `evaluate expression` to quickly change something, and
a console if you would like more control. The console can even use the IPython shell if it is
installed.
Speed
Faster Than PDB
For Python 3.6 debugging, PyCharm’s debugger is the fastest debugger on the market. Even faster
than PDB. What this means is that you can simply always run your code under the debugger while
developing, and easily add breakpoints when you need them. Just make sure to click ‘install’ when
PyCharm asks whether to install the Python speedups.
3.5.2. DOMAIN
The domain of personal voice assistants encompasses the development and deployment of intelligent
software agents capable of understanding and responding to voice commands. These assistants,
powered by advancements in natural language processing (NLP) and artificial intelligence (AI),
perform a variety of tasks ranging from managing schedules and setting reminders to retrieving
information and controlling smart home devices. They operate across multiple platforms, including
smartphones, smart speakers, and computers, providing seamless and intuitive user interactions.
Personal voice assistants leverage vast databases, user-generated content, and contextual information
to deliver personalized and context-aware responses. By automating routine tasks and offering
hands-free operation, they enhance user productivity, convenience, and accessibility, becoming indispensable
tools in both personal and professional settings.
The domain of a personal voice assistant encompasses the specific area or field in which the assistant
operates, providing tailored functionalities and services to users. In the context of a personal voice
assistant, the domain typically involves:
1. Task Management: Managing and organizing tasks such as scheduling appointments, setting
reminders, creating to-do lists, and managing daily routines.
2. Information Retrieval: Accessing and retrieving information from various sources such as the web,
knowledge databases, and user-specific data to answer questions and provide updates.
3. Automation: Automating routine tasks and processes to improve efficiency and productivity, such
as sending emails, controlling smart home devices, and performing online transactions.
4. Personalization: Adapting responses and actions based on user preferences, historical interactions,
and contextual information such as location and time.
5. Communication: Serving as an interface for users to interact with digital systems and services
through natural language input, voice commands, or text-based interfaces.
6. Integration: Integrating with other applications, platforms, and services to enhance functionality and
provide seamless user experiences across different devices and environments.
7. Support and Assistance: Providing support and assistance to users by offering guidance,
recommendations, and insights based on data analysis and user interactions.
The domain of a personal voice assistant is dynamic and evolving, incorporating advancements in
artificial intelligence, natural language processing, and machine learning to continuously enhance its
capabilities and utility for users in both personal and professional contexts.
3.6. SYSTEM ARCHITECTURE
Fig 3.1 System architecture figure
The system architecture of a personal AI assistant typically comprises several key components: the
user interface, natural language processing (NLP) engine, knowledge base, and backend services. The
user interface facilitates interactions through voice or text inputs. The NLP engine processes these
inputs, converting them into machine-readable commands. The knowledge base stores relevant
information, leveraging semantic data sources, user-generated content, and knowledge databases.
Backend services handle task execution, such as fetching data, managing schedules, or controlling
smart devices. This architecture ensures seamless and intelligent responses, enabling the AI assistant
to perform tasks efficiently and provide personalized assistance to users.
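The component flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the class names (`NLPEngine`, `KnowledgeBase`, `BackendServices`, `Assistant`), the keyword rules, and the sample knowledge entry are all hypothetical, and typed text stands in for the voice interface.

```python
from dataclasses import dataclass, field


@dataclass
class Command:
    """Machine-readable form of a user request."""
    intent: str
    argument: str = ""


class NLPEngine:
    """Turns raw text into a Command using simple keyword rules (assumed)."""

    def parse(self, text: str) -> Command:
        text = text.lower().strip()
        if text.startswith("what is "):
            return Command("lookup", text[len("what is "):].rstrip("?"))
        if text.startswith("remind me to "):
            return Command("remind", text[len("remind me to "):])
        return Command("unknown", text)


@dataclass
class KnowledgeBase:
    """Stores facts the assistant can answer from (sample data only)."""
    facts: dict = field(
        default_factory=lambda: {"python": "a programming language"}
    )

    def lookup(self, topic: str) -> str:
        return self.facts.get(topic, "unknown to me")


@dataclass
class BackendServices:
    """Executes tasks; here it only keeps reminders in memory."""
    reminders: list = field(default_factory=list)

    def add_reminder(self, task: str) -> str:
        self.reminders.append(task)
        return f"Reminder set: {task}"


class Assistant:
    """Wires the user-facing entry point to the three components."""

    def __init__(self):
        self.nlp = NLPEngine()
        self.kb = KnowledgeBase()
        self.backend = BackendServices()

    def handle(self, text: str) -> str:
        command = self.nlp.parse(text)
        if command.intent == "lookup":
            return f"{command.argument} is {self.kb.lookup(command.argument)}"
        if command.intent == "remind":
            return self.backend.add_reminder(command.argument)
        return "Sorry, I did not understand that."


assistant = Assistant()
print(assistant.handle("What is Python?"))
print(assistant.handle("Remind me to call the office"))
```

Keeping the parsing, knowledge, and execution steps in separate classes means that any one of them (for example, the rule-based parser) could later be swapped for a more capable component without touching the others.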
3.7. ALGORITHMS USED
3.7.1 SPEECH RECOGNITION MODULE
The class which we are using is called Recognizer, provided by the SpeechRecognition library.
It converts the captured audio into text; a separate text-to-speech module is used to give the output in speech.
The energy_threshold property represents the energy level threshold for sounds. Values below this
threshold are considered silence, and values above this threshold are considered speech.
recognizer_instance.adjust_for_ambient_noise(source, duration=1) adjusts the energy threshold
dynamically using audio from the source (an AudioSource instance) to account for ambient noise.
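To make the role of the energy threshold concrete, the following standard-library sketch classifies chunks of audio samples as speech or silence by comparing their root-mean-square (RMS) energy with a threshold, and calibrates that threshold from a stretch of background noise. It illustrates the idea only; it is not the SpeechRecognition library's internal code, and the function names, the 1.5 multiplier, and the sample values are assumptions.

```python
import math


def rms_energy(samples):
    """Root-mean-square energy of a chunk of integer audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_threshold(noise_chunks, multiplier=1.5):
    """Set the threshold a little above the average ambient-noise energy."""
    energies = [rms_energy(chunk) for chunk in noise_chunks]
    return multiplier * sum(energies) / len(energies)


def is_speech(samples, threshold):
    """Chunks above the threshold count as speech, the rest as silence."""
    return rms_energy(samples) > threshold


# Hypothetical data: quiet background noise, then a much louder chunk.
noise = [[10, -12, 8, -9], [11, -10, 9, -11]]
threshold = calibrate_threshold(noise)
print(is_speech([12, -11, 10, -9], threshold))       # background-level chunk
print(is_speech([900, -850, 920, -880], threshold))  # loud chunk
```

Calibrating from ambient noise, rather than fixing a constant, is what lets the same program work in both a quiet room and a noisy one.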
SPEECH TO TEXT & TEXT TO SPEECH CONVERSION
pyttsx3 is a text-to-speech conversion library in Python. Its voice, rate, and volume can be changed
through engine properties.
Python provides a library called SpeechRecognition that allows us to convert audio, including large or
long audio files, into text for further processing.
We have included the sapi5 and espeak TTS engines, either of which can perform the speech output.
PROCESS & EXECUTES THE REQUIRED COMMAND
The spoken command is converted into text by the speech recognition module and stored in a
temporary variable.
The assistant then analyses the text held in that variable, decides what the user needs based on the
input provided, and repeats this inside a while loop.
Finally, the matching command is executed.
3.8. SYSTEM DESIGN
Fig 3.8
In this project, there is only one user. The user issues commands to the system. The system
then interprets each command and fetches the answer. The response is sent back to the user.
3.8.1 Component diagram
Fig 3.8.1
The main component here is the Virtual Assistant. It provides two specific services:
executing a task or answering a question.
3.8.2 SEQUENCE DIAGRAM
Fig 3.8.2
The user sends a command to the virtual assistant in audio form. The command is passed to
the interpreter. It identifies what the user has asked and directs it to the task executor. If the task is
missing some info, the virtual assistant asks the user back about it. The received information is sent
back to the task and it is accomplished. After execution feedback is sent back to the user.
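The exchange described above, where the assistant asks the user for missing details before executing a task, can be sketched as follows. The task table, the detail names, and the `ask` callback are hypothetical stand-ins for the interpreter, the task executor, and the voice prompt; the sketch only shows the control flow.

```python
# Hypothetical table: each task lists the details it needs before it can run.
REQUIRED_INFO = {
    "send_email": ["recipient", "message"],
    "set_reminder": ["time", "note"],
}


def interpret(command_text):
    """Very small interpreter: map a command to a task name."""
    text = command_text.lower()
    if "email" in text:
        return "send_email"
    if "remind" in text:
        return "set_reminder"
    return None


def execute_task(command_text, known_info, ask):
    """Collect any missing details via ask(), then report the task as done."""
    task = interpret(command_text)
    if task is None:
        return "Sorry, I cannot do that yet."
    info = dict(known_info)
    for slot in REQUIRED_INFO[task]:
        if slot not in info:
            # The assistant asks the user back for the missing detail.
            info[slot] = ask(f"What is the {slot}?")
    details = ", ".join(f"{slot}={info[slot]}" for slot in REQUIRED_INFO[task])
    return f"{task} done ({details})"


# Usage with a canned answer standing in for the user's spoken reply.
answers = iter(["Hello from my assistant"])
result = execute_task(
    "send an email",
    {"recipient": "myself"},
    ask=lambda prompt: next(answers),
)
print(result)
```

Passing `ask` in as a function keeps the flow independent of how the question is posed, so the same logic could be driven by speech, typed input, or a test script.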
Fig 3.8.3 Sequence Diagram (Answering the user)
The sequence diagram illustrates the process of fetching an answer from the internet in response to a
user's audio query. Initially, the user asks a question using voice, which is captured by a microphone.
This audio query is then processed by a speech recognition system that interprets the spoken words and
converts them into text. The textual query is subsequently sent to a web scraper, a tool designed to
search the internet for relevant information. The web scraper scours various online sources, collects
the necessary data, and identifies the most appropriate answer to the user's query. Once the answer is
found, it is sent back to the system, where it may undergo further processing if needed. Finally, the
processed answer is relayed to a text-to-speech engine, which converts the textual response back into
spoken words. The speaker then delivers the answer audibly to the user, completing the information
retrieval cycle. This automated sequence ensures a seamless and efficient method for obtaining and
presenting information from the internet in response to user inquiries.
3.9 FEASIBILITY STUDY
A feasibility study helps determine whether to proceed with the
project. It is essential to evaluate the cost and
benefit of the proposed system. Five types of feasibility are taken into consideration.
1. Technical feasibility: It includes discovering technologies for the project, both
hardware and software. For virtual assistants, users must have a microphone to convey
their message and a speaker to listen when the system speaks. These are inexpensive nowadays
and widely available. Besides, the system needs an internet connection.
While using it, make sure you have a steady internet connection. This is rarely an
issue now that most homes and offices have Wi-Fi.
2. Operational feasibility: The proposed system's ease and simplicity of operation.
The system does not require any special skill set for users to operate it. It is
designed to be used by everyone. Kids who still do not know how to write can read
out problems for the system and get answers.
3. Economic feasibility: Here, we find the total cost and benefit of the proposed system over the current
system. For this project, the main cost is documentation cost. The user also would have to pay for a
microphone and speakers. Again, they are cheap and available. As far as maintenance is concerned, it
will not cost too much.
4. Organizational feasibility: This shows the management and organizational structure of the project.
This project is not built by a team. The management tasks are all to be carried out by a single person.
That will not create any management issues and will increase the feasibility of the project.
5. Cultural feasibility: It deals with the compatibility of the project with the cultural environment. A
virtual assistant is built in line with general cultural norms.
This project is technically feasible, with no hardware requirements beyond a microphone and a speaker.
It is also simple to operate and does not incur training or repair costs. The overall
feasibility study of the project reveals that the goals of the proposed system are achievable. The
decision is taken to proceed with the project.
3.10. TYPES OF OPERATION
If we ask for some information, it opens Wikipedia and asks us the topic on which we want the
information. It then clicks on the Wikipedia search box using its XPath, enters the topic in the search
box, clicks the search button using the XPath of the button, and reads a paragraph about that topic.
Keyword: information
Plays the video which we ask:
If we ask it to play a video, it opens YouTube and asks us the name of the video that we want to play.
After that, it clicks on the YouTube search box using its XPath, enters the name, clicks on the search
button using its XPath, and clicks the first result of the search using the XPath of the first video.
Keyword: Play and video or music
News of the day:
If we ask for the news, it reads out the Indian news of the day on which it is asked.
Keyword: news
Temperature and Weather:
If the user asks for the temperature, it gives the current temperature.
Keyword: temperature
Joke:
If the user asks for a joke, it tells a one-liner joke to the user.
Keyword: funny or joke
Fact:
If the user asks for some logical fact, it tells a fact to the user.
Keyword: fact
Game:
The assistant can play the number guessing game with the user. First, it asks for the lower and the
upper limit between which the number should be. Then it initializes a random number between that
upper and lower limit. After that, it uses a formula to calculate the number of turns within which the
user should guess the number.
Keyword: game
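A compact sketch of the game logic is given below. The report does not state the formula for the number of turns, so this sketch assumes a common choice, ceil(log2(upper - lower + 1)), which is roughly the number of halvings a binary search needs to narrow the range down to one number. The function names and the `guesses` argument, which replaces spoken input so that the example can run unattended, are illustrative.

```python
import math
import random


def allowed_turns(lower, upper):
    """Assumed formula: halvings needed to narrow the range to one number."""
    return max(1, math.ceil(math.log2(upper - lower + 1)))


def play(lower, upper, guesses, secret=None):
    """Play one round and return a message describing the outcome."""
    if secret is None:
        secret = random.randint(lower, upper)  # inclusive on both ends
    turns = allowed_turns(lower, upper)
    for attempt, guess in enumerate(guesses[:turns], start=1):
        if guess == secret:
            return f"Correct in {attempt} turn(s)!"
        hint = "higher" if guess < secret else "lower"
        print(f"Turn {attempt}: try {hint}.")
    return f"Out of turns. The number was {secret}."


print(allowed_turns(1, 100))
print(play(1, 100, [50, 75, 62], secret=62))
```

In the assistant, each entry of `guesses` would instead come from the speech recognition step, and the hints would be spoken rather than printed.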
Restart the system:
The assistant restarts the system if the user asks the assistant to restart the system.
Keyword: Restart the system or reboot the system
Open:
The assistant will open some of the folders and applications which the user asks the assistant to
open.
Keyword: Open
Date and Time:
If the user asks for the date or time, the assistant tells it.
Keyword: date or time or date and time
Calculate:
The assistant will calculate the equations that the user tells it to calculate using the WolframAlpha
API, which requires an API key.
Keyword: calculate (along with the equation)
Turn on the light:
This is an IoT feature where the assistant turns on the light if the user asks it to turn on the light.
Keyword: light on
Turn off the light:
This is an IoT feature where the assistant turns off the light if the user asks it to turn off the light.
Keyword: light off
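The keyword-to-operation mapping listed in this section can be sketched as a small dispatcher. The keywords are taken from the list above, but the handlers are placeholders that only return a description; in the actual assistant each would call the corresponding module. All function names are hypothetical.

```python
import datetime


def tell_date_time(query):
    """Answer date and time requests from the system clock."""
    now = datetime.datetime.now()
    return now.strftime("It is %H:%M on %d %B %Y.")


def placeholder(action):
    """Build a stand-in handler that just names the operation."""
    return lambda query: f"[{action}]"


# Order matters: the first rule whose keywords match the query wins.
RULES = [
    (("light on",), placeholder("turn on the light")),
    (("light off",), placeholder("turn off the light")),
    (("restart the system", "reboot the system"), placeholder("restart")),
    (("calculate",), placeholder("calculate with WolframAlpha")),
    (("information",), placeholder("read from Wikipedia")),
    (("video", "music"), placeholder("play on YouTube")),
    (("news",), placeholder("read the news")),
    (("temperature",), placeholder("report the temperature")),
    (("funny", "joke"), placeholder("tell a joke")),
    (("fact",), placeholder("tell a fact")),
    (("game",), placeholder("start the guessing game")),
    (("date", "time"), tell_date_time),
    (("open",), placeholder("open a folder or application")),
]


def dispatch(query):
    """Return the response of the first handler whose keyword is in the query."""
    text = query.lower()
    for keywords, handler in RULES:
        if any(keyword in text for keyword in keywords):
            return handler(text)
    return "Sorry, I do not know how to do that."


print(dispatch("Please tell me a joke"))
print(dispatch("Turn the light on"))
```

A table of rules keeps the keywords in one place, so adding an operation means adding one entry rather than another branch in a long if/elif chain. Because matching is by substring, more specific keywords should be listed before more general ones.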
CHAPTER 4
PERSONAL ASSISTANT SOFTWARE IN THE MARKET
4.1 GOALS OF PERSONAL ASSISTANT SOFTWARE
The primary objective of the personal assistant software is to function as a seamless interface between
users and the digital world. It achieves this by comprehending user requests or commands and
translating them into actionable tasks or insightful recommendations. Central to its operation is a
sophisticated knowledge base that models the agent's understanding of the world, encompassing
intricate relationships, connections, and rules between various concepts. While these agents are not
intended to replace human capabilities, they excel in automating mundane tasks that users might find
repetitive or less engaging. This efficiency is facilitated by the software's ability to process vast
amounts of real-time information sourced from the web and other data repositories. By handling routine
tasks and providing timely information, the software aims to enhance user productivity and streamline
daily activities. It operates with a continuous learning mechanism, adapting its responses based on user
interactions and feedback, thereby improving its relevance and performance over time. This approach
ensures that the software remains an invaluable tool for users, augmenting their capabilities through
intelligent automation and data-driven decision-making.
Personal assistant software aims to streamline and enhance the user's daily activities by leveraging
artificial intelligence, natural language processing (NLP), and automation. These systems are designed
to act as proactive and intelligent helpers, capable of understanding and fulfilling user requests
across various domains. The goals of personal assistant software can be summarized as follows:
Enhanced Efficiency and Productivity:
Personal assistants strive to simplify complex tasks and workflows, reducing the time and effort
required for routine activities. By automating repetitive tasks like scheduling appointments, managing
emails, or setting reminders, they allow users to focus on more critical and creative aspects of their
work and personal lives.
Seamless Integration and Accessibility:
Another key goal is to integrate seamlessly with diverse devices and applications, ensuring a unified
user experience across platforms. Whether on smartphones, computers, smart speakers, or IoT devices,
personal assistants should provide consistent functionality and access to information, making it easy
for users to interact and stay organized from anywhere.
Intelligent Information Retrieval:
Personal assistants excel in retrieving relevant information quickly and accurately in response to user
queries. They utilize advanced NLP techniques to understand natural language input, extract key
information, and fetch real-time data from sources like the web, databases, or specialized APIs. This
capability spans from retrieving weather updates and news headlines to finding answers to factual
questions or providing directions.
Personalization and Context Awareness:
Personal assistants aim to learn and adapt to individual user preferences and behaviours over time. By
analyzing past interactions and user data (with appropriate consent and privacy considerations), they
tailor responses and recommendations to suit specific needs. This personalization enhances user
satisfaction and efficiency by anticipating needs and providing proactive assistance.
Natural and Intuitive User Interface:
User interface design is crucial to the success of personal assistants. They should provide a natural and
intuitive interaction experience through voice commands, text input, or even gestures. Advances in
speech recognition and synthesis enable assistants to understand nuanced commands, maintain context
over multiple interactions, and respond with human-like speech patterns.
Continuous Improvement and Learning:
Personal assistants are designed to evolve and improve through machine learning algorithms and user
feedback. By analysing usage patterns and incorporating new data, they can expand their knowledge
base, improve accuracy, and adapt to changing user preferences and linguistic variations.
Security and Privacy:
Finally, ensuring robust security measures and respecting user privacy are paramount goals. Personal
assistant software must protect sensitive information, employ encryption where necessary, and provide
transparent control over data usage and storage. Upholding these standards builds trust and confidence
among users in the reliability and confidentiality of their interactions.
In summary, personal assistant software aims to enhance user productivity, streamline tasks, provide
intelligent information retrieval, personalize user experiences, offer intuitive interfaces, learn from
interactions, and prioritize security and privacy. As these technologies continue to advance, they hold
promise in transforming how individuals manage their daily activities and interact with digital
environments.
4.2 Different Types of Personal Assistant Software
Personal assistant software comes in several types, each tailored to different user needs and contexts.
Firstly, task-oriented assistants focus on managing specific tasks like scheduling appointments, setting
reminders, or organizing to-do lists. These assistants excel in improving time management and
enhancing productivity by automating routine activities. Secondly, informational assistants
prioritize retrieving and presenting information to users. These include virtual agents capable of
answering queries, providing updates on news or weather, and fetching data from the web or databases.
Thirdly, cognitive assistants, powered by AI and machine learning, offer advanced capabilities such as
natural language understanding, context-aware recommendations, and personalized interactions. They
adapt to user preferences over time, learning from interactions to deliver increasingly tailored
assistance. Lastly, integrated assistants combine features from multiple types, offering a
comprehensive suite of functionalities that span task management, information retrieval, and cognitive
decision-making. These integrated solutions aim to provide a holistic user experience by seamlessly
blending automation with intelligent assistance, catering to diverse user needs in both personal and
professional settings.
4.2.1 Voice Recognition as Input Entry Medium
Voice recognition has revolutionized user interaction with technology by serving as a sophisticated
input medium for personal assistant software and other applications. This technology enables users to
input commands, queries, or requests using spoken language, which the software then interprets and
processes. By leveraging advancements in machine learning, particularly with neural networks and
natural language processing algorithms, voice recognition systems have significantly improved in
accuracy and responsiveness. This capability not only enhances accessibility for users with varying
levels of typing proficiency or physical abilities but also facilitates hands-free operation, particularly
useful in scenarios like driving or multitasking. Voice recognition systems can understand and
distinguish between different accents, dialects, and languages, broadening their applicability across
global user bases. Moreover, continuous advancements in voice recognition technology are expanding
its capabilities beyond basic commands to more complex interactions, such as natural conversation and
context-aware responses. As a result, voice recognition continues to play a pivotal role in enhancing
user experience and productivity across a wide range of devices and applications.
4.2.2 Voice Recognition-Based Task Automation or Information Retrieval
Voice recognition technology has evolved to become a cornerstone of task automation and information
retrieval in modern digital environments. By enabling users to interact with devices and applications
through spoken commands, voice recognition systems streamline daily tasks and enhance user
productivity. Task automation capabilities allow users to delegate routine activities such as scheduling
appointments, setting reminders, or controlling smart home devices, all through voice commands. This
hands-free approach not only saves time but also offers convenience, particularly in situations where
manual input may be impractical or cumbersome. Furthermore, voice recognition facilitates efficient
information retrieval by enabling users to ask questions, request updates on news or weather, or search
for specific information from the web—all without needing to type queries manually. The integration
of artificial intelligence and natural language processing techniques enhances these systems' ability to
understand context, user preferences, and nuances in speech, providing more accurate and relevant
responses over time. As voice recognition technology continues to advance, its role in automating tasks
and retrieving information will expand, offering increasingly personalized and intuitive interactions
that integrate seamlessly into everyday life.
Task Automation:
Voice recognition allows for hands-free operation of tasks that traditionally require manual input. For
example, users can dictate emails, schedule appointments, control smart home devices, or even perform
complex calculations without touching a keyboard or screen. This automation not only improves
productivity but also enables multitasking and accessibility for users with physical limitations.
Modern voice assistants, powered by technologies like Google Speech Recognition, Amazon Alexa,
or Apple Siri, use sophisticated algorithms to convert spoken language into text accurately. They then
interpret these commands through NLP models that understand intent and context, enabling seamless
execution of tasks across various domains.
Information Retrieval:
Voice recognition-based information retrieval systems use voice queries to fetch relevant information
from vast data sources, including the web, databases, and APIs. Users can ask natural language
questions, such as inquiries about weather forecasts, stock prices, historical facts, or definitions, and
receive immediate spoken responses or displayed results.
These systems often integrate with knowledge databases like WolframAlpha or leverage web scraping
techniques to provide real-time information. Natural language understanding models parse user
queries, extract key information, and generate concise summaries or responses, mimicking human-like
interaction.
Future Directions:
Future advancements in voice recognition and NLP are expected to focus on improving accuracy,
contextual understanding, and user personalization. Enhanced deep learning models, such as
transformer-based neural networks, will enable voice assistants to handle more complex queries and
adapt dynamically to user preferences and behaviours.
Additionally, integrating voice recognition with emerging technologies like augmented reality (AR) or
virtual reality (VR) could expand the capabilities of voice assistants beyond traditional screen-based
interactions. This convergence could redefine how users interact with digital information and
immersive environments, paving the way for new applications in education, healthcare, and
entertainment.
4.2.3 Planning
In this category of personal assistant software, the emphasis is on task understanding, subtask
identification, and task planning to facilitate efficient task completion for users. Examples like Siri,
which can book restaurant reservations using web services such as OpenTable, demonstrate the
capabilities of such systems. Similarly, the agent designed as part of this project aims to operate
within this category by leveraging Python, web data sources, and inference technologies to manage
and plan tasks based on contextual user information. This approach not only enhances user productivity
but also highlights the integration of AI-driven capabilities in everyday task management scenarios.
4.3 History of Voice Assistants
A Modern History of Voice Assistants
In recent times, voice assistants became a major platform after Apple integrated its virtual
assistant, Siri, into its products. But the timeline of this evolution began at the 1962 Seattle
World's Fair, where IBM displayed a device called Shoebox. It was about the size of a shoebox, could
perform arithmetic functions, and could recognize 16 spoken words, including the digits 0 to 9.
During the 1970s, researchers at Carnegie Mellon University in Pittsburgh, Pennsylvania, with
considerable help from the U.S. Department of Defense and its Defense Advanced Research
Projects Agency (DARPA), developed Harpy. It could understand almost 1,000 words, which is roughly the
vocabulary of a three-year-old child.
In the 1990s, large organizations such as Apple and IBM started to build products that used voice
recognition. In 1993, Apple began to build speech recognition into its Macintosh computers with
PlainTalk.
In April 1997, Dragon NaturallySpeaking became the first continuous dictation product; it could
comprehend around 100 words per minute and transform them into readable text.
Having said that, it is worthwhile to build a simple voice-based desktop/laptop assistant that has
the capability to:
1. Open YouTube in the browser.
2. Open any website in the browser.
3. Send an email to your contacts.
4. Launch any system application.
5. Tell you the current weather and temperature of almost any city.
6. Tell you the current time.
7. Greet you.
8. Play a song on a VLC media player (you need to have a VLC media player installed on
your laptop/desktop).
9. Change the desktop wallpaper.
10. Tell you the latest news feeds.
11. Tell you about anything you ask.
In this project, we are going to build a voice-based application that can do all the above-
mentioned tasks.
4.4 What are Intelligent Personal Assistants or Automated Personal Assistants?
An Intelligent Personal Assistant (IPA) represents a sophisticated application designed to streamline
and enhance daily tasks through a natural language interface. These assistants excel in organizing and
managing information such as emails, calendar events, files, and to-do lists, acting as a virtual
concierge capable of performing tasks based on voice commands or inputs. They vary in capability,
from simple reflex agents that respond to basic commands to more advanced models like goal-based
or utility-based agents that prioritize tasks based on predefined objectives or user preferences. IPAs
leverage artificial intelligence, machine learning, and natural language understanding to interpret
complex queries, personalize responses, and automate tasks without constant user interaction. They
can schedule appointments, set reminders, automate research tasks, translate languages, and even
recommend products or services based on user preferences and historical data. Furthermore, IPAs
integrate seamlessly into various digital channels, ensuring continuity across different user interfaces
and enhancing overall user experience by adapting to evolving needs and preferences.
This comprehensive functionality not only enhances personal productivity but also extends to
enterprise applications, where IPAs can leverage industry-specific knowledge and data for marketing
or customer service purposes. Their ability to learn and adapt over time ensures they remain relevant
and effective in addressing diverse user needs, whether managing personal schedules or facilitating
business operations. By enabling natural conversation and intelligent decision-making, IPAs represent
a significant advancement in human-computer interaction, bridging the gap between user intent and
actionable outcomes through intuitive and responsive digital assistance.
4.5 How do Artificial Intelligence Assistants Interact with People?
Artificial Intelligence (AI) assistants interact with people through a variety of sophisticated
mechanisms that facilitate intuitive and seamless communication. Here is a detailed explanation of how
these interactions occur:
1. Natural Language Understanding (NLU):
AI assistants employ NLU to comprehend and interpret human language input. This capability allows
them to understand spoken commands, text queries, and even complex sentences, parsing the meaning
and intent behind the user's words.
2. Speech Recognition:
Using advanced speech recognition technologies, AI assistants convert spoken words into text. This
process enables hands-free interaction, where users can dictate commands or queries without the
need for manual input.
3. Contextual Awareness:
AI assistants maintain contextual awareness during interactions, remembering previous interactions,
user preferences, and ongoing tasks. This allows them to provide relevant responses and anticipate user
needs based on the current context.
4. Task Execution and Automation:
Based on user commands and preferences, AI assistants execute tasks such as scheduling
appointments, setting reminders, sending messages, or controlling smart home devices. They automate
routine activities, enhancing user productivity and convenience.
5. Information Retrieval and Recommendations:
AI assistants access vast databases and real-time information sources to retrieve answers to queries,
provide updates on weather or news, and offer personalized recommendations for products or services
based on user preferences.
6. Conversational Interfaces:
AI assistants employ conversational interfaces that mimic human-like interactions, using natural
language responses and interactive dialogues to engage users. They can handle follow-up questions,
clarify ambiguities, and maintain coherent conversations.
7. Learning and Adaptation:
Through machine learning algorithms, AI assistants learn from user interactions to improve their
responses and adapt to individual preferences over time. They refine their understanding of user
behaviours and preferences, enhancing the accuracy and relevance of their interactions.
8. Multi-channel Integration:
AI assistants integrate seamlessly across multiple platforms and devices, ensuring consistent
interaction experiences. They can operate via smartphones, smart speakers, chatbots, and other digital
interfaces, maintaining continuity regardless of the user’s preferred device.
9. Feedback Mechanisms:
To enhance user satisfaction and effectiveness, AI assistants incorporate feedback mechanisms. They
solicit user input, track performance metrics, and adjust responses based on user feedback to
continuously improve interaction quality.
10. Privacy and Security:
AI assistants prioritize user privacy and data security. They implement encryption protocols,
anonymize data where necessary, and adhere to privacy regulations to safeguard user information
during interactions.
As AI technology advances, the interactions between AI assistants and people are expected to become
even more nuanced and responsive. Future developments may include enhanced emotional
intelligence, proactive assistance anticipating user needs, and improved capabilities in understanding
diverse languages and accents. These advancements will further blur the line between human-like
interaction and digital assistance, making AI assistants indispensable tools for everyday tasks and
professional applications alike. This detailed explanation outlines how AI assistants leverage
sophisticated technologies to interact effectively with users, enhancing convenience, productivity, and
personalized assistance in various aspects of daily life.
CHAPTER 5
IMPLEMENTATION
5.1 Building a Personal Voice Assistant
If the available personal voice assistants do not perform all the tasks you want them to, it is possible
to build your own. For a text-based personal voice assistant, you do not even need to know how to
code. There are apps available to help people create assistants that can automate tasks or events.
Creating a voice-activated personal voice assistant is much more difficult. That is where companies
like Converse.AI come in. “We make it easier for non-developers to build and automate the services
that they need. No coding experience is required,” Lucas says.
A text-based personal voice assistant automates tasks and interacts with customers. It can also help
answer questions for clients, access databases, and help customers help themselves. For more
information about customer self-service portals, many of which use personal voice assistants, read
"Customer Service Portals: Help Your Users Help Themselves."
If you choose to create a personal voice assistant, make sure it is representative of your brand. Also,
make sure it works, since technology will not do your business any good if it does not help customers.
“The danger is that people will try it, it won’t work, and they won’t go back,” Mutchler warns. She
mentions Samsung’s Bixby, which debuted on the Galaxy S8 phone but was not fully functional when
it came out. Many customers tried it a few times, and then asked Samsung to develop a way to disable
it, which they did in a software update.
Here are some other elements to consider when building a personal voice assistant:
• Remember the end user.
• Choose useful features.
• Give it personality.
• Integrate it with various platforms.
Building a personal voice assistant takes time, so it is better not to rush it. Focus on doing a few things
extraordinarily well instead of trying to do many things (and, therefore, doing them unsuccessfully).
Also, remember to update the personal voice assistant, as necessary. It is not a “build it and leave it”
venture.
5.2 Dependencies and Requirements
System requirements: Python 3.7, PyCharm IDE, Windows 10
Install these third-party Python libraries:
pip install SpeechRecognition
pip install pyttsx3
pip install wikipedia
pip install wolframalpha
The webbrowser, smtplib, random, datetime, os, and sys modules are part of the Python standard
library and do not need to be installed.
5.3 Let Us Start Building Our Personal Voice Assistant Using Python
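import speech_recognition as sr
import pyttsx3
import wolframalpha
import webbrowser
import smtplib
import sys
from YT_auto import music
from selenium_web_driver import inforr
from News import *  # imported names are missing in the source; wildcard assumed
import randfacts
from pyjokes import *  # wildcard assumed, as above
from weather import *  # wildcard assumed, as above
import datetime
from search import sear
import random
import math
import warnings
import open  # as in the source; presumably a project-local module
import os
import serial
import time

# Serial link to the Arduino used for the IoT light control
arduino = serial.Serial(port='COM3', baudrate=115200, timeout=.1)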
For our voice assistant to perform all the above-discussed features, we must code the logic of each of
them in one method.

So, our first step is to create a method that will interpret the user voice response.
def myCommand():
    """Listen on the microphone and return the recognized command as text."""
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 1  # seconds of silence that end a phrase
        audio = r.listen(source)
    try:
        query = r.recognize_google(audio, language='en-in')
        print('User: ' + query + '\n')
    except sr.UnknownValueError:
        # Speech was unintelligible: fall back to typed input
        speak('Sorry sir! I didn\'t get that! Try typing the command!')
        query = str(input('Command: '))
    return query
Next, create a method that will convert text to speech.
def speak(audio):
    # engine is the pyttsx3 engine created with p.init() later in this section.
    print('Computer: ' + audio)
    engine.say(audio)
    engine.runAndWait()
Now create a loop that keeps executing commands. On each pass, the loop calls myCommand() to obtain the
user's command and matches it against the supported keywords.
client = wolframalpha.Client('YOUR_APP_ID')  # placeholder: use your own Wolfram|Alpha App ID

while True:
    query = myCommand()
    query = query.lower()
    # The query is lower-cased, so every keyword below must be lower case too.
    if 'open youtube' in query:
        speak('okay')
        webbrowser.open('https://www.youtube.com')
    elif 'open google' in query:
        speak('okay')
        webbrowser.open('https://www.google.co.in')
    elif 'open gmail' in query:
        speak('okay')
        webbrowser.open('https://www.gmail.com')
    elif "what's up" in query or 'how are you' in query:
        stMsgs = ['Just doing my thing!', 'I am fine!', 'Nice!', 'I am nice and full of energy']
        speak(random.choice(stMsgs))
    elif 'email' in query:
        speak('Who is the recipient? ')
        recipient = myCommand()
        if 'myself' in recipient:
            try:
                speak('What should I say? ')
                content = myCommand()
                # Replace the addresses and password with your own account details.
                server = smtplib.SMTP('smtp.gmail.com', 587)
                server.ehlo()
                server.starttls()
                server.login("jackie.61093@gmail.com", 'password')
                server.sendmail('jackie.61093@gmail.com', "vjajodiya6@gmail.com", content)
                server.close()
                speak('Email sent!')
            except Exception:
                speak('Sorry Sir! I am unable to send your message at this moment!')
    elif 'nothing' in query or 'abort' in query or 'stop' in query:
        speak('okay')
        speak('Bye Sir, have a good day.')
        sys.exit()
    elif 'hello' in query:
        speak('Hello Sir')
    elif 'bye' in query:
        speak('Bye Sir, have a good day.')
        sys.exit()
    elif 'play music' in query:
        # Placeholders: set the folder path and the track names (without '.mp3').
        music_folder = Your_music_folder_path
        music = [music1, music2, music3, music4, music5]
        random_music = music_folder + random.choice(music) + '.mp3'
        os.startfile(random_music)  # Windows only: opens the file in the default player
        speak('Okay, here is your music! Enjoy!')
    else:
        speak('Searching...')
        try:
            try:
                # First try Wolfram|Alpha, then fall back to Wikipedia.
                res = client.query(query)
                results = next(res.results).text
                speak('WOLFRAM-ALPHA says - ')
                speak('Got it.')
                speak(results)
            except Exception:
                results = wikipedia.summary(query, sentences=2)
                speak('Got it.')
                speak('WIKIPEDIA says - ')
                speak(results)
        except Exception:
            # Last resort: open Google in the browser.
            webbrowser.open('https://www.google.com')
    speak('Next Command! Sir!')
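As the number of commands grows, a long if/elif chain becomes hard to maintain, and keyword tests are easy to get wrong. In Python, a condition written as "play" and "video" in text tests only the last keyword, because the non-empty string "play" is always true. The following is a minimal sketch of a table-driven alternative, not the project's own code; the names matches, build_commands, and dispatch, and the placeholder replies, are assumptions made for illustration.

```python
def matches(text, *keywords):
    """Return True only if every keyword occurs in text."""
    return all(keyword in text for keyword in keywords)


def build_commands():
    # Each entry pairs the keywords that must all appear with an action.
    # The actions return placeholder replies instead of doing real work.
    return [
        (("open", "youtube"), lambda: "opening YouTube"),
        (("play", "video"), lambda: "asking which video to play"),
        (("stop",), lambda: "goodbye"),
    ]


def dispatch(query, commands):
    """Run the first command whose keywords all occur in the query."""
    text = query.lower()
    for keywords, action in commands:
        if matches(text, *keywords):
            return action()
    return "searching the web"  # fallback when no command matches


if __name__ == "__main__":
    print(dispatch("Open YouTube please", build_commands()))
    # The pitfall: this prints True although "play" is absent from the text.
    print("play" and "video" in "open a video")
    # The helper checks both keywords and prints False.
    print(matches("open a video", "play", "video"))
```

With this layout, adding a feature means adding one entry to the table rather than another branch in the loop.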
Our next step is to create multiple if statements corresponding to each of the features. So let us see how
to create these small modules inside an if statement for each command.
warnings.filterwarnings("ignore")

engine = p.init()  # p is pyttsx3 (import pyttsx3 as p)
rate = engine.getProperty('rate')
engine.setProperty('rate', 150)
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)

def speak(text):
    engine.say(text)
    engine.runAndWait()

def wishme():
    # Pick the greeting that matches the current hour.
    hour = int(datetime.datetime.now().hour)
    if hour > 0 and hour < 12:
        return ("Morning")
    elif hour >= 12 and hour < 16:
        return ("Afternoon")
    elif hour >= 16 and hour < 19:
        return ("evening")
    else:
        return ("night")

def quitApp():
    hour = int(datetime.datetime.now().hour)
    if hour >= 3 and hour < 18:
        print("have a good day sir")
        speak("have a good day sir")
    else:
        print("Goodnight, sir")
        speak("Goodnight, sir")
    print("Offline")
    exit(0)

def write_read(x):
    # Send a command to the Arduino and read back its reply.
    arduino.write(bytes(x, 'utf-8'))
    time.sleep(0.05)
    data = arduino.readline()
    return data
# flags
Light_status_flag = False
today_date = datetime.datetime.now()

r = sr.Recognizer()
speak("Tell the wake-up word")
wake = "hello nova"
with sr.Microphone() as source:
    r.energy_threshold = 10000
    r.adjust_for_ambient_noise(source, 1.2)
    # Keep listening until the wake-up word is heard.
    while True:
        print("Listening")
        audio = r.listen(source)
        try:
            wakeword = r.recognize_google(audio)
        except sr.UnknownValueError:
            continue
        print(wakeword)
        if wakeword.lower() == wake:
            break
    speak("hello sir, good " + wishme() + ", i'm here to assist you.")
    speak("How are you")
    print("Listening")
    audio = r.listen(source)
    text = r.recognize_google(audio)
    print(text)
    # Each keyword needs its own "in" test.
    if "what" in text and "about" in text and "you" in text:
        speak("I am also having a good day")
if __name__ == "__main__":
    while True:
        speak("What can i do for you??")
        with sr.Microphone() as source:
            r.energy_threshold = 10000
            r.adjust_for_ambient_noise(source, 1.2)
            print('Listening .... ')
            audio = r.listen(source)
            text2 = r.recognize_google(audio)
            if "information" in text2:
                speak("You need information related to which topic")
                print('Listening ..... ')
                audio = r.listen(source)
                infor = r.recognize_google(audio)
                speak("Searching {} in Wikipedia".format(infor))
                print("Searching {} in Wikipedia".format(infor))
                assist = inforr()
                assist.get_info(infor)
            elif "play" in text2 and "video" in text2:
                speak("Which video you want me to play??")
                audio = r.listen(source)  # listen for the video name before recognizing it
                vid = r.recognize_google(audio)
                speak("Playing {} on YouTube".format(vid))
                print("Playing {} on youtube".format(vid))
                assist = music()
                assist.play(vid)
            elif "news" in text2:
                speak("Sure sir, Now I will read news for you")
                arr = news()
                for headline in arr:
                    print(headline)
                    speak(headline)
            elif "temperature" in text2:
                report = ("Temperature in Chennai is " + str(temp()) + " degree celsius"
                          + " and with " + str(des()))
                speak(report)
                print(report)
            elif "funny" in text2:
                speak("Get ready for some chuckles")
                joke = pyjokes.get_joke()
                speak(joke)
                print(joke)
            elif "your name" in text2:
                speak("My name is Next genn Optimal Voice Assistant Nova")
            elif "fact" in text2:
                speak("Sure sir, ")
                x = randfacts.getFact()
                speak("Did you know that " + x)
                print(x)
            elif "search" in text2:
                speak("What should i search for sir")
                r.adjust_for_ambient_noise(source, 1.2)
                print('Listening ..... ')
                audio = r.listen(source)
                searc = r.recognize_google(audio)
                speak("Searching {} in Google".format(searc))
                print("Searching {} in Google".format(searc))
                asist = sear()
                asist.get_infoo(searc)
            elif "game" in text2:
                speak("enter your lower limit sir")
                r.adjust_for_ambient_noise(source, 1.2)
                print('Listening ..... ')
                audio = r.listen(source)
                lower = int(r.recognize_google(audio))
                speak("now, Enter your upper limit")
                audio = r.listen(source)  # listen again for the upper limit
                upper = int(r.recognize_google(audio))
                x = random2.randint(lower, upper)
                chances = round(math.log(upper - lower + 1, 2))
                speak("You've only " + str(chances) + " chances to guess the integer!")
                print("\n\tYou've only " + str(chances) + " chances to guess the integer! \n"
                      + str(upper), str(lower))
                count = 0
                while count < math.log(upper - lower + 1, 2):
                    count += 1
                    speak("start guessing")
                    speak("Guess a number")
                    audio = r.listen(source)  # listen for each new guess
                    guess = int(r.recognize_google(audio))
                    if x == guess:
                        print("Congratulations you did it in " + str(count) + " try")
                        speak("Congratulations you did it in " + str(count) + " try")
                        break
                    elif x > guess:
                        print("You guessed too small!")
                        speak("You guessed too small!")
                    elif x < guess:
                        print("You Guessed too high!")
                        speak("You Guessed too high!")
                    # Reveal the number once the last chance has been used.
                    if count >= math.log(upper - lower + 1, 2):
                        print("\nThe number is %d" % x)
                        speak("The number is %d" % x)
                        print("\tBetter Luck Next time!")
                        speak("Better Luck Next time!")
            elif "reboot the system" in text2:
                speak("Do you wish to restart your computer?")
                audio = r.listen(source)
                restart = r.recognize_google(audio)
                # Assumed completion (the original listing stops here): restart Windows on "yes".
                if "yes" in restart.lower():
                    os.system("shutdown /r /t 1")
            elif "light off" in text2:
                # if Light_status_flag == True:
                cmd = "OFF"
                Status = write_read(cmd)
                speak("Lights are turned off")
                # elif Light_status_flag == False:
            elif "stop" in text2 or "exit" in text2 or "end" in text2:
                speak("It's a pleasure helping you and I am always here to help you out!")
                quitApp()
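The game branch grants round(log2(upper - lower + 1)) guesses, which is roughly the number of guesses a halving strategy needs. The following is a minimal sketch of that arithmetic without any speech input; the function names allowed_chances, check_guess, and halving_attempts are assumptions made for illustration and do not appear in the project.

```python
import math


def allowed_chances(lower, upper):
    """Number of guesses granted, using the same formula as the listing."""
    return round(math.log(upper - lower + 1, 2))


def check_guess(secret, guess):
    """Return the hint the assistant would speak for one guess."""
    if guess == secret:
        return "correct"
    return "too small" if guess < secret else "too high"


def halving_attempts(secret, lower, upper):
    """Count the guesses needed when always guessing the middle of the range."""
    attempts = 0
    while lower <= upper:
        attempts += 1
        guess = (lower + upper) // 2
        hint = check_guess(secret, guess)
        if hint == "correct":
            return attempts
        if hint == "too small":
            lower = guess + 1
        else:
            upper = guess - 1
    raise ValueError("secret is outside the given range")


if __name__ == "__main__":
    print("Chances for 1..100:", allowed_chances(1, 100))
    worst = max(halving_attempts(s, 1, 100) for s in range(1, 101))
    print("Worst case with halving:", worst)
```

Because the formula rounds to the nearest integer, some ranges receive one guess fewer than halving needs in the worst case (a range of five values is one example). Using math.ceil(math.log2(upper - lower + 2)) instead would always grant enough guesses.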
5.3.1 SCREENSHOT
5.4 Flow-chart
5.5 DATA FLOW DIAGRAM
CHAPTER 6
RESULT & ANALYSIS

This report has explained the voice assistant project: how useful such an assistant is, how users can
rely on it for the tasks they need to complete, and how assistants of this kind are improving every day
and may become one of the major technologies of the current era. Development of the software is almost
complete on our side, and it works as expected; some further development has been discussed. Further
improvements may therefore follow shortly, making the assistant we developed even more useful than it
is now.
6.1 Working
It starts with a signal word. Users say the name of their voice assistant to get its attention. They
might say, “Hey Siri!” or simply, “Alexa!” Whatever the signal word is, it wakes up the device and
signals to the voice assistant that it should begin paying attention. After the voice assistant hears its
signal word, it starts to listen. The device waits for a pause to know that the user has finished the
request. The voice assistant then passes the request to its program logic, where it is compared with the
requests the assistant knows and split into separate commands that the assistant can understand. Once
it receives these commands, the voice assistant knows what to do next. If it understands the request, it
carries out the task. For example, when asked “Hey NOVA! What is the weather?”, NOVA reports back
within seconds. The more requests the devices receive, the better and faster they get at fulfilling them.
In this project, the user gives voice input through the microphone, and the assistant is triggered by the
wake-up word. It converts the speech to text with the STT (Speech to Text) module, interprets the
request, performs the task, and delivers the result as speech through the TTS (Text to Speech) module.
It repeats this cycle for each request. These are the key features of the voice assistant, but it can do
plenty of other things as well.
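The cycle described above (wake word, request, task, reply) can be sketched on plain text, leaving out the microphone and the speech engine. This is a minimal illustration under stated assumptions: the wake phrase "hey nova", the function names extract_command, respond, and handle, and the placeholder replies are all hypothetical and are not taken from the project code.

```python
WAKE_WORD = "hey nova"  # assumed wake phrase for this sketch


def extract_command(transcript, wake_word=WAKE_WORD):
    """Return the request that follows the wake word, or None if it is absent."""
    text = transcript.lower().strip()
    if not text.startswith(wake_word):
        return None
    # Drop the wake word and any punctuation around the request.
    return text[len(wake_word):].strip(" ,!.?")


def respond(command):
    """Map a recognized command to a reply; the replies are placeholders."""
    if "weather" in command:
        return "Fetching the weather report."
    if "news" in command:
        return "Reading today's headlines."
    return "Sorry, I cannot do that yet."


def handle(transcript):
    """Run one cycle: check the wake word, then answer the request."""
    command = extract_command(transcript)
    if command is None:
        return None  # wake word not heard, so the assistant stays idle
    return respond(command)


if __name__ == "__main__":
    print(handle("Hey NOVA! What is the weather?"))
```

In the real assistant, the transcript would come from the speech-to-text step and the reply would be passed to the text-to-speech step.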
List of features that can be done with the assistant:
- Playing a video that the user wants to see.
- Telling a random fact at the start of the day, so that the user begins work having learned something
new.
- Playing a game, a feature found in most assistants, so that the user can spend free time in a fun way.
- Turning off the system by voice command. Users might forget to turn off a system that contains
useful data; with a voice assistant, they can do so even after leaving the place where the system is.
As discussed, the mandatory features of a voice assistant are implemented in this work. A brief
explanation is given below.
API CALLS
We have used API keys to get news from News API and weather forecasts from OpenWeatherMap,
which fetch the information and return the results to the user.
SYSTEM CALLS
In this feature, we have used the OS and Web Browser modules to access the desktop, calculator, task
manager, command prompt, and user folder. This can also restart the PC and open the Chrome
application.
CONTENT EXTRACTION
This performs content extraction from YouTube, Wikipedia, and Chrome using the web driver module
from Selenium, which provides the web driver implementations for tasks such as searching for a
specific video to play or getting a specific piece of information from Google or Wikipedia.
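An API call of the kind described above can be sketched in two steps: building the request URL and reading the fields the assistant needs from the JSON reply. The sketch below makes no network request. The function names build_weather_url and parse_weather are assumptions, and SAMPLE_RESPONSE is a hand-written example in the shape of an OpenWeatherMap current-weather reply, not real data.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_url(city, api_key):
    """Build the request URL; units=metric asks for degrees Celsius."""
    query = urlencode({"q": city, "appid": api_key, "units": "metric"})
    return BASE_URL + "?" + query


def parse_weather(payload):
    """Extract the temperature and the text description from a JSON reply."""
    data = json.loads(payload)
    return data["main"]["temp"], data["weather"][0]["description"]


# Hand-written sample in the shape of a reply; the values are illustrative only.
SAMPLE_RESPONSE = '{"main": {"temp": 30.5}, "weather": [{"description": "scattered clouds"}]}'

if __name__ == "__main__":
    print(build_weather_url("Chennai", "YOUR_API_KEY"))
    temperature, description = parse_weather(SAMPLE_RESPONSE)
    print("Temperature is", temperature, "degree celsius with", description)
```

The two returned values correspond to what the temp() and des() helpers supply to the temperature command in Section 5.3.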

Fig Workflow model
1) Providing the user with any information they ask for: - The information a user needs is usually
available on the internet, but searching for it and reading it takes a lot of time. With a voice assistant,
the user can get that information sooner than by searching and reading. This is a small proof that a
voice assistant helps the user save time.
2) Telling the day's hot news in the user's location: - Watching a news channel just to learn the
important news in one’s location takes a lot of time. The user may have to sit through news that is
irrelevant to them, or news from a different location, before reaching the news they want, which needs
a lot of patience. A voice assistant removes that wait: it gives the news of the location the user wants to
know about, or the specific news they want to hear.
3) Telling a joke to lighten the moment: - Let us be honest, everyone has had at least one moment in
life when they were tense or had an argument with people close to them. A random joke can ease such
a moment at least a little, or even stop the fight. As the saying goes, "Laughter is the best medicine".
4) Opening the file/folder that the user wants: - In a busy world, everything has to be done quickly or
the schedule slips, and sometimes we need assistance from someone to finish a task on time. With a
voice assistant, we can complete the task right away in a hassle-free way. For example, suppose the
user is writing documentation and needs a file for reference. Searching for that file wastes a lot of time
and may cause a missed deadline, but with a voice assistant the user can simply command the assistant
to open the folder. So, this is one of the key features of a voice assistant.
5) Telling the temperature/weather at the user's location: - Why is it important to know the weather of
the day, or to monitor the weather every day? The answer is simple: it forewarns the user. An assistant
asked about the weather can say, "It might rain today, so carry an umbrella if you go out" or "It will be
a sunny day, so wear sunglasses". So, this is also a must-have feature.
6) Searching for what the user asks: - Today, in the 21st century, people often have doubts and need to
clear them as soon as possible, or one doubt multiplies into many. Searching for the question on the
internet gives an answer, and asking the assistant to do the search saves a lot of time. Besides clearing
doubts, we also search many questions and topics on the internet to keep up with trends, and we can do
this just by commanding the assistant to search for a specific topic or question.
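One simple way to carry out the search command is to turn the spoken query into a search URL and open it in the default browser. The sketch below is an assumption-based illustration and differs from the project's Selenium-based helper; the function names google_search_url and open_search are hypothetical.

```python
import webbrowser
from urllib.parse import quote_plus


def google_search_url(query):
    """Build a Google search URL; quote_plus escapes spaces and symbols."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def open_search(query):
    """Open the search results for the query in the default browser."""
    return webbrowser.open(google_search_url(query))
```

Calling open_search("voice assistant python") would open the results page directly, so the user does not have to type the query.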
6.2 PROS
AI assistants offer numerous advantages across various domains, significantly enhancing productivity
and user experience. Some of the key benefits include:
1. Efficiency and Speed: AI assistants can process and analyse large volumes of data quickly, providing
accurate responses and completing tasks in a fraction of the time it would take a human.
2. 24/7 Availability: Unlike human counterparts, AI assistants are available around the clock,
offering consistent support without the need for breaks, sleep, or time off.
3. Personalization: By learning from user interactions, AI assistants can offer personalized
recommendations and responses, improving the overall user experience.
4. Cost-Effectiveness: Deploying AI assistants can reduce operational costs, as they can handle
repetitive and mundane tasks, allowing human employees to focus on more complex and strategic
activities.
5. Scalability: AI assistants can easily scale to handle a growing number of tasks or users without a
significant increase in resources or costs.
6. Multitasking: They can manage multiple queries or tasks simultaneously, ensuring that various needs
are met efficiently and promptly.
7. Consistency and Accuracy: AI assistants provide consistent responses and minimize the risk of
human errors, ensuring reliable and accurate information is delivered.
8. Accessibility: They can assist users with disabilities by providing voice-activated commands and
responses, making technology more accessible to a broader audience.
9. Data Insights: AI assistants can analyze user interactions and gather valuable data insights, helping
businesses understand customer behaviour and preferences better.
10. Language Support: Many AI assistants can understand and respond in multiple languages, breaking
down language barriers and providing support to a diverse user base.
Overall, AI assistants enhance productivity, improve user satisfaction, and provide valuable support in
various personal and professional contexts.
6.3 CONS
While AI assistants offer numerous benefits, they also come with several drawbacks that need to be
considered:
1. Privacy Concerns: AI assistants often require access to personal data to function effectively, raising
concerns about data privacy and the potential misuse of sensitive information.
2. Security Risks: The data handled by AI assistants can be vulnerable to hacking and cyberattacks,
posing significant security risks.

3. Dependence on Technology: Over-reliance on AI assistants can lead to reduced human skill
development and dependency on technology for even simple tasks.
4. Lack of Emotional Intelligence: AI assistants, despite advancements, cannot fully replicate human
empathy, emotional understanding, or nuanced interpersonal communication.
5. Job Displacement: The automation of tasks by AI assistants can lead to job displacement, particularly
for roles involving repetitive or routine tasks, affecting employment in certain sectors.
6. Bias and Fairness Issues: AI systems can inherit biases present in their training data, leading to unfair
or discriminatory outcomes in their responses and decision-making processes.
7. Inaccuracy and Limitations: AI assistants may provide incorrect or misleading information if they
misinterpret queries or rely on outdated or inaccurate sources.
8. Complexity in Troubleshooting: Technical issues with AI systems can be complex to diagnose and
resolve, requiring specialized knowledge and resources.
9. Limited Understanding: Despite their advanced capabilities, AI assistants still struggle with
understanding context, sarcasm, idioms, and complex human emotions, which can lead to
misunderstandings.
10. High Initial Costs: Developing and implementing sophisticated AI systems can be expensive,
requiring significant investment in technology and infrastructure.
11. Ethical Concerns: The deployment of AI assistants raises ethical questions about the extent of their
control and decision-making abilities, as well as the transparency of their operations.
Overall, while AI assistants provide significant advantages, it is crucial to address these challenges to
ensure their responsible and beneficial integration into society.
6.4 Advantages of Artificial Intelligence in Personal Voice Assistant
Artificial Intelligence (AI) significantly enhances the capabilities of personal voice assistants, offering
a range of advantages that improve user experience and functionality:
1. Natural Language Processing (NLP): AI enables personal voice assistants to understand and
interpret natural language, allowing users to interact using conversational speech rather than predefined
commands. This makes interactions more intuitive and user-friendly.
2. Personalization: AI-driven voice assistants can learn from user interactions and preferences,
providing personalized responses and recommendations. They can remember user habits, preferences,
and schedules, tailoring their assistance to individual needs.
3. Contextual Understanding: Advanced AI allows voice assistants to understand context, which
improves their ability to handle complex queries and follow-up questions. This contextual awareness
leads to more accurate and relevant responses.
4. Automation and Task Management: AI voice assistants can automate routine tasks such as setting
reminders, sending messages, managing calendars, and controlling smart home devices. This
automation saves time and simplifies daily activities.
5. Accessibility: Voice assistants powered by AI provide valuable support for individuals with
disabilities, offering hands-free control of devices and enabling users with visual or motor impairments
to access information and services more easily.
6. Continuous Improvement: AI algorithms enable voice assistants to continuously learn and improve
from user interactions. This ongoing learning process ensures that the assistant becomes more efficient
and effective over time.
7. Multilingual Support: AI enables voice assistants to understand and respond in multiple languages,
catering to a diverse user base and breaking down language barriers.
8. Real-Time Information Retrieval: AI voice assistants can quickly retrieve information from the
internet, and provide real-time updates on news, weather, traffic, and other relevant data, enhancing
the user’s ability to stay informed.
9. Seamless Integration with Services and Devices: AI allows voice assistants to integrate seamlessly
with various third-party services and smart devices, creating a cohesive and interconnected user
experience across different platforms and technologies.
10. Enhanced Security Features: AI can improve the security of voice assistants through features like
voice recognition and biometric authentication, ensuring that only authorized users can access certain
functionalities.
Overall, AI enhances personal voice assistants by making them more intelligent, responsive, and
capable of providing valuable support in everyday tasks.
6.5 Disadvantages of Artificial Intelligence in Personal Voice Assistant
1. High Cost:
The creation of artificial intelligence requires huge costs, as these are very complex machines. Their
repair and maintenance are also costly.
Their software needs frequent upgrading to cater to a changing environment and to the need for the
machines to become smarter by the day.
In the case of severe breakdowns, the procedure to recover lost code and reinstate the system might
require a great deal of time and cost.
2. No Replicating Humans:
Intelligence is believed to be a gift of nature. An ethical argument continues over whether human
intelligence should be replicated or not.
Machines do not have emotions or moral values. They perform what is programmed and cannot judge
right from wrong. They cannot even make decisions when they encounter a situation unfamiliar to
them; they either perform incorrectly or break down in such situations.
3. No Improvement with Experience:
Unlike humans, an assistant built from fixed rules does not improve with experience on its own; it
changes only when it is reprogrammed or retrained. With time, its hardware is also subject to wear and
tear. It stores a lot of data, but the way that data is accessed and used is quite different from human
intelligence.
Such machines are unable to alter their responses to changing environments. We are constantly
bombarded by the question of whether it is exciting to replace humans with machines.
In the world of artificial intelligence, there is nothing like working wholeheartedly or passionately.
Care and concern are not present in the machine intelligence dictionary. There is no sense of belonging
or togetherness or a human touch. Machines fail to distinguish between a hardworking individual and
an inefficient individual.
4. No Original Creativity:
Do you want creativity or imagination?
These are not the forte of artificial intelligence. While machines can help you design and create, they
are no match for the power of thinking of the human brain or the originality of a creative mind.
Human beings are highly sensitive and emotional intellectuals. They see, hear, think, and feel. Their
thoughts are guided by feelings, which machines completely lack. The inherent intuitive abilities of
the human brain cannot be replicated.
5. Unemployment:
The replacement of humans with machines can lead to large-scale unemployment.
Unemployment is a socially undesirable phenomenon. People with nothing to do may put their creative
minds to destructive use.
Humans can become unnecessarily dependent on machines if the use of artificial intelligence becomes
rampant. They will lose their creative power and become lazy. Also, if humans start thinking
destructively, they can create havoc with these machines.
Artificial intelligence in the wrong hands is a serious threat to humankind in general. It may lead to
mass destruction. Also, there is a constant fear of machines taking over or superseding humans.
In keeping with these concerns, the Association for the Advancement of Artificial Intelligence has two
objectives: to develop and advance the science of artificial intelligence, and to promote and educate
about the responsible usage of artificial intelligence.
Identifying and studying the risks of artificial intelligence is an especially important task, and it can
help in resolving the issues described above. Programming errors and cyber-attacks need more
dedicated and careful research. Technology companies and the technology industry as a whole need to
pay more attention to the quality of their software. Everything that has been created in this world and
our societies is the continuous result of intelligence.
Artificial intelligence augments and empowers human intelligence. So long as we are successful in
keeping the technology beneficial, we will be able to help human civilization.
CHAPTER 7
CONCLUSION
Voice search has now become a definitive mobile experience. A lack of knowledge and experience
makes it especially tough for organizations to develop a strategy for voice search. There is a great deal
of opportunity for deeper and much more conversational experiences with users for AI in mobile
app development.
Many people are looking for a way to make multitasking more effective, which makes speech-to-text
the ideal feature. Voice input is likewise attractive to people who do not want to type. With an error
rate of just 8%, voice search will change how people search the internet.
Personal assistant software improves user productivity by managing routine tasks of the user and by
providing information from online sources to the user. As discussed earlier, technologies such as web
services, sharing of data, linked data, shared ontologies, knowledge databases, and mobile devices are
proving to be enablers for tools such as personal assistant software.
Building an agent that can replace a human assistant has been a holy grail for the software industry,
especially in the field of artificial intelligence. Difficulties associated with capturing human
intelligence in models that can be used to drive the agent have been one of the primary bottlenecks in
building such agents. The availability of data in semantic form, where the data itself carries the
meaning and data sources are interlinked with each other, provides an opportunity to first capture
human knowledge in this form and then apply reasoning engines that can interpret these models to
make inferences for simple tasks.
This project presents a comprehensive overview of the design and development of a voice-enabled
personal assistant using the Python programming language. In today's lifestyle, this voice-enabled
personal assistant saves more time than was possible in earlier days. It has been designed with ease of
use as the main feature. The assistant works properly to perform the tasks given by the user.
Furthermore, there are many things that this assistant can do, like turning our PC off, restarting it, or
reciting the latest news, with just one voice command.
CHAPTER 8
FUTURE ENHANCEMENTS
We are entering the era of implementing voice-activated technologies to remain relevant and
competitive. Voice-activation technology is vital not only for businesses to stay relevant with their
target customers, but also for internal operations. Technology may be utilized to automate human
operations, saving time for everyone. Routine operations, such as sending basic emails or scheduling
appointments, can be completed more quickly, with less effort, and without the use of a computer, just
by employing a simple voice command. People can multitask as a result, enhancing their productivity.
Furthermore, relieving employees from hours of tedious administrative tasks allows them to devote
more time to strategy meetings, brainstorming sessions, and other jobs that need creativity and human
interaction.
1) Sending Emails with a voice assistant:
Emails are crucial for communication because they can be used for any professional contact, and
Gmail is one of the most widely used services for sending and receiving them. Gmail is a free email
service created by Google. It can be accessed over the web or through third-party apps that use the
POP or IMAP protocols to synchronize email content.
To integrate Gmail with Voice Assistant we must utilize Gmail API. The Gmail API allows you to
access and control threads, messages, and labels in your Gmail mailbox.
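The Gmail API sends a message from a request body whose raw field holds the complete email, encoded as base64url text. The sketch below covers only that preparation step and leaves out authentication and the send call itself. The function name build_raw_message and the example addresses are assumptions made for illustration.

```python
import base64
from email.message import EmailMessage


def build_raw_message(sender, recipient, subject, body):
    """Return a request body of the form {"raw": <base64url-encoded email>}."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    encoded = base64.urlsafe_b64encode(message.as_bytes()).decode("ascii")
    return {"raw": encoded}


if __name__ == "__main__":
    # Example addresses only; the body would come from the dictated text.
    payload = build_raw_message("me@example.com", "you@example.com",
                                "Hello", "This message was dictated by voice.")
    print(payload["raw"][:40] + "...")
```

In the assistant, the subject and body would be filled from the text returned by the speech recognizer, and the resulting dictionary would be handed to an authorized Gmail API client.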
2) Scheduling appointments using a voice assistant:
The demands on our time increase as our company grows. A growing number of people want to meet
with us, and a growing number of people rely on us. We must check in on certain projects or
set aside time to chat with business leads. There will not be enough hours in the day if we keep doing
things the old way.
We need to get a better handle on our schedule and devise a strategy for arranging appointments that
does not interfere with our most critical work. By working with a virtual scheduler or, in other words,
a virtual assistant, we let someone else worry about organizing and prioritizing our schedule while we
focus on the work.
3) Improved Interface of a voice assistant (VUI):
Voice user interfaces (VUIs) allow users to interact with a system by speaking commands. VUIs
include virtual assistants like Amazon's Alexa and Apple's Siri. The real advantage of a VUI is that it
allows users to interact with a product without using their hands or their eyes while focusing on
something else.
-Other benefits of a Voice user interface (VUI):
Speed and Efficiency:
Hands-free interactions are possible with VUIs. This method of interaction eliminates the need to click
buttons or tap on the screen. The major means of human communication is speech. People have been
using speech to form relationships for ages. As a result, solutions that allow customers to do the same
are extremely valuable. Furthermore, even for experienced texters, dictating text messages has been
demonstrated to be faster than typing. Hands-free interactions, at least in some circumstances, save
time and boost efficiency.
Intuitiveness and convenience:
Intuitive user flow is required of high-quality VUIs, and technical advancements are expected to
continue to improve the intuitiveness of voice interfaces. Compared to graphical user interfaces (GUIs),
VUIs require less cognitive effort from the user. Furthermore, everyone, from a small child to your
grandmother, can communicate by speech. As a result, VUI designers are in a better position than GUI
designers, who run the danger of producing incomprehensible menus and exposing users to the agony
of poor interface design. VUI makers are unlikely to need to instruct customers on how to use the
technology. People can instead ask their voice assistant for assistance.
Another promising enhancement is the seamless integration with a broader range of smart devices and
services. As the Internet of Things (IoT) continues to expand, voice assistants will become central hubs
for controlling and interacting with various connected devices, from home automation systems to
wearable technology. Enhanced interoperability and the development of standardized communication
protocols will facilitate this integration.
Additionally, improvements in multi-modal interaction will enable voice assistants to combine voice,
text, and visual inputs for a richer and more interactive user experience. This capability will be
particularly beneficial in scenarios where visual information complements verbal commands, such as
navigation assistance or technical support.
In summary, the future enhancements of personal voice assistants will focus on improving natural
language understanding (NLU) and natural language generation (NLG) capabilities, personalization,
seamless integration with IoT devices, multi-modal interaction, and robust privacy and security
measures. Together, these advancements will make voice assistants more intelligent, responsive, and
user-friendly.
CHAPTER 9
BIBLIOGRAPHY
1. Agrawal, H., Singh, N., Kumar, G., Yagyasen, D., & Singh, S. V. (2021). Voice Assistant Using Python.
IJIRT, 8(2), 419–423.
2. Buhalis, D., & Moldavska, I. (2021). In-room Voice-Based AI Digital Assistants Transforming On-Site
Hotel Services and Guests’ Experiences. Information and Communication Technologies in Tourism
2021, 10(6), 30–44. https://doi.org/10.1007/978-3-030-65785-7_3
3. Dellaert, B. G. C., Shu, S. B., Arentze, T. A., Baker, T., Diehl, K., Donkers, B., Fast, N. J., Häubl, G.,
Johnson, H., Karmarkar, U. R., Oppewal, H., Schmitt, B. H., Schroeder, J., Spiller, S. A., & Steffel,
M. (2020). Consumer decisions with artificially intelligent voice assistants. Marketing Letters, 31.
https://doi.org/10.1007/s11002-020-09537-5
4. Geetha, Gomathy, Kottamasu, Manasa, & Nukala. (2021). The Voice Enabled Personal Assistant for Pc
using Python. International Journal of Engineering and Advanced Technology, 10(D2425.0410421),
162–165.
5. Krishnaraj, Faris, M., & Rajesh. (2021). Portable Voice Recognition with GUI Automation. IJIRT, 9(6),
20–23.
6. Paul, R., & Mukhopadhya, N. (2021). A Novel Python-based Voice Assistance System for reducing the
Hardware Dependency of Modern Age Physical Servers. IRJET, 8(5), 1425–1431.
7. Sayyed, A., Shaikh, A., Sancheti, A., Sangamnere, S., & H Bhangale, J. (2021). Desktop Assistant AI
Using Python. International Journal of Advanced Research in Science, Communication and
Technology, 6(2), 2581–2942.
8. Sprengholz, P., & Betsch, C. (2021). Ok Google: Using virtual assistants for data collection
in psychological and behavioral research. Behavior Research Methods, 54(3), 1554–3528.
https://doi.org/10.3758/s13428-021-01629-y
