
“PERSONAL VOICE ASSISTANT”

by

Pradeepa J Moolya

“VI” Semester

“Bachelor of Computer Applications”

Reg. No: BCA21034

Project Report submitted to the University of Mysore in


partial fulfillment of the requirements of “VI Semester”
“Bachelor of Computer Applications” degree
examinations “2024”

University of Mysore Manasagangothri,


Mysore– 570006
DECLARATION

I, PRADEEPA J MOOLYA, hereby declare that the project entitled “PERSONAL VOICE ASSISTANT”, submitted to the Directorate of Outreach and Online Programmes, University of Mysore, in partial fulfillment of the requirements for the award of the degree of BACHELOR OF COMPUTER APPLICATIONS, is my original work. It has not formed the basis for the award of any degree, fellowship, or other similar title to any candidate of any university.

Place: Udupi

Date: 07-08-2024

Signature of the Student:


ACKNOWLEDGMENT

“Task successful” makes everyone happy, but that happiness would be gold without glitter if we did not acknowledge the people who supported us in making it a success. Success is credited to those who made it a reality, but the people whose constant guidance and encouragement made it possible deserve to be credited first.

This acknowledgment goes beyond formality; I would like to express deep gratitude and respect to all the people behind the scenes who guided, inspired, and helped me complete this project work.

I consider myself lucky to have been given such a good project. This project will be an asset to my academic profile.
ABSTRACT

The adoption of social network sites and the use of smartphones with several sensors have digitized users’ activities in real time. Smartphone applications such as calendars, email, and notes contain a lot of user information and provide a view into the user’s activities. Sensors such as GPS, by contrast, can be used to gather information about the user passively. In addition to this user and device data, these devices have access to the Internet, which can be leveraged to build powerful applications.

Personal voice assistant software (a smart agent) can be used as an interface to the digital world to make the consumption of this information timely and efficient for the user’s specific tasks. The goal of this project is to design personal assistant software that understands the semantics of a task, can decompose it into multiple sub-tasks within the context of the user, and plan these tasks for the user. It will be designed using semantic web technologies and knowledge databases to understand the relations between tasks. The agent will be integrated with online web services to combine the data available online with the data available on the device and help the user manage his or her tasks.


TABLE OF CONTENTS

1. INTRODUCTION
1.1 Introduction
1.2 Problem Statement
1.3 Background
1.4 Objectives
2. LITERATURE SURVEY
2.1 Related Work
3. METHODOLOGY
3.1 Existing System
3.2 Proposed System
3.3 Objective of the Project
3.4 Software and Hardware Requirements
3.4.1 Software Requirement
3.4.2 Hardware Requirement
3.4.3 Libraries
3.5 Programming Languages
3.5.1 Python
3.5.2 Domain
3.6 System Architecture
3.6.1 System Architecture Figure
3.7 Algorithms Used
3.7.1 Speech Recognition Module
3.8 System Design
3.8.1 Component Diagram
3.8.2 Sequence Diagram
3.8.3 Sequence Diagram: Answering the User
3.9 Feasibility Study
3.10 Types of Operation
4. PERSONAL ASSISTANT SOFTWARE IN THE MARKET
4.1 Goals of Personal Assistant Software
4.2 Different Types of Personal Assistant Software
4.2.1 Voice Recognition as Input Entry Medium
4.2.2 Voice Recognition-Based Task Automation or Information Retrieval
4.2.3 Planning
4.3 History of Voice Assistants
4.4 What are Intelligent Personal Assistants or Automated Personal Assistants?
4.5 How do Artificial Intelligence Assistants Interact with People?
5. IMPLEMENTATION
5.1 Building a Personal Voice Assistant
5.2 Dependencies and Requirements
5.3 Let’s Start Building Our Voice Assistant Using Python
5.3.1 Screenshot
5.4 Flow-chart
5.5 Data Flow Diagram
6. RESULT AND ANALYSIS
6.1 Working Result
6.2 Pros
6.3 Cons
6.4 Advantages of Artificial Intelligence in Personal Voice Assistant
6.5 Disadvantages of Artificial Intelligence in Personal Voice Assistant
7. CONCLUSION
8. FUTURE ENHANCEMENTS
9. BIBLIOGRAPHY

CHAPTER 1

INTRODUCTION

1.1 INTRODUCTION

The first voice-activated product, Radio Rex, was released in 1922. This simple toy featured a dog that

would remain inside a doghouse until the user exclaimed its name, "Rex." At that point, the dog would

jump out. This was achieved through an electromagnet tuned to a frequency similar to the vowel sound in

“Rex”, predating modern computers by over 20 years.

In the 21st century, human interaction is increasingly being replaced by automation. Performance is a

key driver of this change, which represents a fundamental shift in technology rather than mere incremental advancement.

Today, we train machines to perform tasks autonomously or to think like humans using technologies

such as Machine Learning and Neural Networks. In our current era, virtual assistants allow us to

communicate with machines through voice commands.

Virtual assistants are software programs designed to ease daily tasks, such as showing weather reports,

providing daily news, and searching the internet. These assistants can take voice commands, activated

by an invoking or wake word, followed by the user's command. Examples of popular virtual assistants

include Apple’s Siri, Amazon’s Alexa, and Microsoft’s Cortana. The development and success of these

assistants inspired our project.


Our system is designed for efficient use on desktops. Voice assistants are programs on digital devices

that listen and respond to verbal commands. For example, if a user asks, “What’s the weather?”, the

voice assistant will provide the weather report for that day and location.

Voice assistants are artificial intelligence (AI) systems designed to facilitate user interaction with

digital devices. The basic idea behind this project is to create a simple, stand-alone application that

helps less tech-savvy individuals use computers without feeling ignorant or computer illiterate. Over

time, computers have become increasingly important and less expensive, making accessibility crucial.

Our application functions similarly to Siri or Google Assistant but is primarily designed to interact

with the computer itself. The user interface (UI) of the application is self-explanatory and minimal.

Currently, it takes text as input, as most people may not be comfortable with speaking commands.

Mobile technology has become renowned for its user experience, allowing easy access to applications

and services from any geo-location. Various famous and commonly used mobile operating systems

include Android, Apple, Windows, and Blackberry. These operating systems provide a plethora of

applications and services. For instance, contact applications store user contact details and facilitate

calls or SMS. Similar applications are available worldwide via the Apple Store and Play Store. These

features have led to the implementation of various sensors and functionalities in mobile devices.

The most famous application on the iPhone is “SIRI,” which allows end users to communicate with

their mobile devices using voice commands. Similarly, Google developed “Google Voice Search” for

Android phones. However, this application primarily requires an internet connection. Our proposed

system, named Personal Assistant with Voice Recognition Intelligence, can function with or without

internet connectivity. It accepts user input in the form of voice or text, processes it, and returns the

output in various forms, such as performing an action or dictating a search result to the end user.

One of the goals of artificial intelligence is the realization of natural dialogue between humans and

machines. In recent years, dialogue systems, also known as interactive conversational systems, have

become one of the fastest-growing areas in AI. Many companies have used dialogue system technology

to establish various kinds of Virtual Personal Assistants (VPAs), such as Microsoft’s Cortana, Apple’s

Siri, Amazon Alexa, Google Assistant, and Facebook’s assistant.

In this proposal, we have utilized a single-modal dialogue system that processes user input modes, such

as speech, to design the next generation of PVAs (Personal voice assistants). This new model aims to

increase interaction between humans and machines using technologies such as gesture recognition,

image/video recognition, speech recognition, and vast dialogue and conversational knowledge bases.

Additionally, the new PVA system can be applied in various areas, including education assistance, medical assistance, in-vehicle voice assistants, systems for people with disabilities, home automation, and security access control.

Digitization offers new possibilities to facilitate the activities of our daily lives through assistive technology, and voice is the new way to connect with technology. A voice assistant saves the user time: while it handles a request, the user can attend to other work. Voice assistants are a great innovation that can change people’s lives in many ways. They were first introduced on smartphones and, on the back of their popularity there, became widely accepted; they can be used conveniently by all age groups.

Speech recognition is the process of converting speech into text. It is typically used by voice assistants such as Alexa and Siri. Python provides a library called SpeechRecognition that can convert audio to text for further processing, including large or long audio files. Users can give commands in verbal or written form. The user can open an application (if installed on the system), search for queries on Google, Wikipedia, and YouTube, or solve mathematical questions, simply by giving a voice command. We used the Google Speech Recognition API and Google Text-to-Speech for voice input and output, respectively. In addition, the Wolfram Alpha API can be used to compute formulas.
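
As a minimal illustration of this speech-to-text step, assuming the SpeechRecognition and PyAudio packages are installed (the language code used below is only an example), a spoken command can be captured from the microphone and converted to text roughly as follows:

import speech_recognition as sr

recognizer = sr.Recognizer()

# Capture one phrase from the default microphone (PyAudio is required for Microphone).
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)  # compensate for background noise
    print("Listening...")
    audio = recognizer.listen(source)

try:
    # Send the recorded audio to Google's free web speech recognition service.
    command = recognizer.recognize_google(audio, language="en-IN")
    print("You said:", command)
except sr.UnknownValueError:
    print("Sorry, I could not understand that.")
except sr.RequestError as error:
    print("Speech service unavailable:", error)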

1.2 PROBLEM STATEMENT

Identifying the Need

In today's technology-driven world, the demand for more intuitive, efficient, and accessible ways to

interact with digital devices is ever-increasing. Personal voice assistants (PVAs) have emerged as a

pivotal solution to meet this demand, providing users with a hands-free, efficient, and natural way to

perform tasks and access information. Here are some key reasons why PVAs are needed:

1. Efficiency and Productivity: PVAs (Personal Voice Assistants) help users complete tasks quickly

without needing to navigate through multiple menus or type out commands. For instance, setting

reminders, sending messages, or making calls can be done swiftly through voice commands, saving

time and effort.

2. Accessibility: PVAs are crucial for people with disabilities, such as visual impairments or mobility

issues, as they provide an alternative method to interact with technology without relying on traditional

input devices like keyboards or touch screens.

3. Multitasking: In a fast-paced world, the ability to multitask is invaluable. PVAs allow users to

perform various tasks simultaneously, such as checking the weather while cooking or sending emails

while driving, thereby enhancing overall productivity.

4. Integration with Smart Devices: As smart home technology becomes more prevalent, PVAs act as

central hubs for controlling various smart devices. They can manage home automation systems, control

lighting, adjust thermostats, and even lock doors, contributing to a more connected and efficient living

environment.

5. Personalization: PVAs can learn user preferences and habits over time, providing personalized

responses and suggestions. This level of customization enhances user experience by making

interactions more relevant and tailored to individual needs.

Challenges

Despite their numerous benefits, several challenges arise when personal voice assistants are not widely adopted:


1. Limited Interaction: Without PVAs, users are confined to traditional methods of interaction like

typing and clicking, which can be slower and less intuitive. This limitation can hinder productivity and

the overall user experience.

2. Accessibility Barriers: For individuals with disabilities, the absence of PVAs can significantly

impact their ability to use technology effectively. Traditional interfaces may not be suitable for

everyone, creating barriers to access.

3. Increased Cognitive Load: Navigating through menus and interfaces to perform simple tasks can

increase cognitive load, leading to frustration and decreased efficiency. PVAs simplify this process by

understanding and executing voice commands.

4. Lack of Integration: Without PVAs, managing multiple smart devices individually can be

cumbersome. PVAs streamline this process by providing a unified interface to control various devices,

making smart home management more seamless.

5. Privacy Concerns: While PVAs raise privacy concerns, the absence of sophisticated voice

recognition systems can lead to security vulnerabilities in other forms of digital interactions. PVAs,

when designed with robust security measures, can enhance privacy by ensuring secure and

authenticated access to sensitive information.

Scope of the Project

This project aims to develop a comprehensive personal voice assistant that addresses the identified

needs and challenges. The scope of the project includes:


1. Development of Core Functionalities: The project will focus on building essential features such as

voice recognition, natural language processing, and task execution. This includes functionalities like

setting reminders, sending messages, controlling smart devices, and retrieving information.

2. Integration with Smart Devices: The PVA will be designed to seamlessly integrate with various

smart home devices, enabling users to control their environment through voice commands. This

includes compatibility with popular smart home ecosystems and devices.

3. User Interface Design: A user-friendly interface will be developed to facilitate easy interaction with

the PVA. This includes designing both voice and visual interfaces to ensure a seamless user experience.

4. Security and Privacy Measures: The project will incorporate robust security protocols to protect user

data and ensure privacy. This includes implementing encryption, authentication, and secure data

storage mechanisms.

5. Personalization and Learning Capabilities: The PVA will include machine learning algorithms to

learn from user interactions and preferences, providing personalized responses and suggestions over

time.

6. Accessibility Features: Special attention will be given to making the PVA accessible to users with

disabilities. This includes voice commands that cater to specific needs and interface adjustments for

better usability.


7. Performance and Scalability: The PVA will be designed to handle multiple users and high volumes

of interactions efficiently. Performance optimization and scalability will be key considerations during

development.

8. Testing and Evaluation: The project will involve rigorous testing to ensure functionality, reliability,

and user satisfaction. User feedback will be gathered and analysed to make iterative improvements.

By addressing these areas, the project aims to develop a personal voice assistant that enhances user

experience, improves accessibility, and integrates seamlessly with modern smart home ecosystems.

1.3 BACKGROUND

Historical Evolution

Voice recognition technology has roots dating back to the mid-20th century when researchers began

experimenting with electronic speech synthesis and recognition. Early systems, such as the "Audrey"

system developed in the 1950s, laid foundational principles for converting spoken language into

machine-readable format. These systems were rudimentary compared to modern standards, relying on

simple pattern matching and acoustic modelling.

The evolution accelerated in the 1970s and 1980s with the development of more sophisticated

techniques, including Hidden Markov Models (HMM) and Dynamic Time Warping (DTW). These

methods improved accuracy and enabled broader applications, albeit still limited by computational

power and data availability.


By the late 1990s and early 2000s, advancements in machine learning, particularly the advent of neural

networks, revolutionized voice recognition. Systems like Dragon NaturallySpeaking introduced neural

networks for speech-to-text conversion, significantly enhancing accuracy and usability.

Today, modern voice assistants like Siri, Google Assistant, and Amazon Alexa represent the

culmination of decades of research and development. They integrate advanced algorithms, vast

datasets, and cloud computing to provide seamless voice interaction across various devices and

applications.

Technological Advancements

Key technological milestones in voice recognition include:

Neural Networks: The adoption of deep learning techniques has dramatically improved accuracy by

enabling systems to recognize complex patterns in speech data.

Big Data and Cloud Computing: Access to large-scale datasets and cloud-based processing has enabled

more accurate and efficient voice recognition, transcending the limitations of local hardware.

Natural Language Processing (NLP): Integration with NLP allows voice assistants to understand

context, intent, and even emotions, making interactions more natural and intuitive.

Multimodal Integration: Combining voice with other input modalities (such as text and gestures)

enhances user experience and expands functionality.

Current Trends

Contemporary trends in AI and voice assistant technology include:



Personalization: Voice assistants are becoming more personalized, learning user preferences, and

adapting responses accordingly.

-Integration into IoT: Voice control is increasingly integrated with Internet of Things (IoT) devices,

enabling smart homes, and connected environments.

Privacy and Security: Heightened concerns over data privacy and security have driven advancements

in secure voice authentication and data encryption.

Multilingual Support: Efforts are underway to support multiple languages and dialects, making voice

technology accessible to a global audience.

Domain-specific Applications: Voice assistants are expanding into specialized domains like healthcare,

finance, and education, offering tailored solutions and improving efficiency in various sectors.

Siri. Siri is Apple Inc.’s cloud-based software that can answer users’ various questions and give recommendations thanks to its voice processing mechanisms. When in use, Siri studies the user’s preferences (much like contextual advertising) to provide each person with an individual approach. This software solution is also useful for developers; the presence of an API called SiriKit provides smooth integration with new applications developed for the iOS and watchOS platforms.

OK Google. OK Google is an Android-based voice recognition application, which is launched by users uttering the command of the same name. This software features very advanced functions, including web search, route optimization, and memo scheduling, that can collectively help users solve a wide array of daily tasks. Like Siri, the creators of OK Google offer the Google Voice Interaction API. This interface


can become a truly indispensable tool in the development of mobile applications for the Android

platform.

Cortana. A virtual intelligent assistant with the function of voice recognition and AI elements, Cortana

was developed for such platforms as Windows, iOS, Android, and Xbox One. It can predict users’

wants and needs based on their search requests, e-mails, etc. One of Cortana’s distinguishing features is her sense of humour: “she” can sing, make jokes, and speak to users.

Amazon Echo. Amazon Echo combines hardware and software that can search the web, help with

scheduling upcoming tasks, and play various sound files, all based on voice recognition. A small speaker equipped with sound sensors, the device is activated automatically when the user exclaims “Alexa.”

Nina. Nina is software with AI elements whose main goal is to narrow down the amount of physical effort spent on the solution of daily tasks (web search, scheduling, etc.). Due to its elaborate analytical mechanisms, Nina becomes “smarter” with every day of personal utilization.

Bixby. Samsung’s Bixby application is another successful implementation of the AI concept. It also

builds a unique user approach, based on interests and habits. Bixby features advanced voice recognition

mechanisms and uses the camera to identify images, based on markers and GPS.


1.4 OBJECTIVE

The primary goal of this project is to demonstrate the feasibility of developing personal voice assistant

software, referred to as a smart agent, using Python and leveraging various data sources available on

the web, user-generated content, knowledge databases, and inference technologies from Web 3.0.

Design and Functionality

1. Contextual Understanding:

The smart agent will gather contextual information about the user, such as location, current time,

calendar appointments, relationships between tasks, task decomposition, and past task history. It will

also consider user interests and preferences (e.g., likes and dislikes).

This contextual understanding will enable the agent to interpret tasks more accurately and decompose

them into actionable steps based on sequences stored in its knowledge base.

2. Task Management and Planning:

The agent's core functionality will involve managing and planning tasks. It will optimize task

management by grouping related tasks that can be completed simultaneously and in proximity, thus

enhancing resource efficiency.

By leveraging data gathered about the user and environment, the agent will improve productivity by

suggesting optimal sequences for completing tasks and allocating resources effectively.


3. Feedback Loop:

A feedback mechanism will be integrated to allow the user to provide input and validate decisions

made by the agent, especially in scenarios where multiple paths are possible or when the agent lacks

sufficient information.

This feedback loop will help refine the agent's decision-making process over time, enhancing its ability

to adapt to user preferences and evolving contexts.

4. Assumptions, Limitations, and Constraints:

The project will identify and discuss assumptions, limitations, and constraints inherent in the solution.

For instance, limitations may include data availability and accuracy from web sources, while

constraints might involve computational resources required for real-time decision-making.

5. Additional Infrastructure:

Any additional infrastructure necessary to complement the smart agent system, such as specific APIs

for data retrieval, integration with existing applications, or computational resources for intensive

processing tasks, will be identified and discussed.

Expected Outcomes

Demonstration of Feasibility: The Project will provide evidence and practical implementation of how

Python, web data sources, and inference technologies can be integrated to build a functional smart

agent.


Improvement in Productivity: Success will be measured by the agent's ability to optimize task

management, improve productivity, and provide valuable insights based on user interactions and

feedback.

Identification of Future Directions: The project will highlight potential future directions for enhancing

the smart agent's capabilities, such as integrating more advanced AI models or expanding into new

domains of application.

Customization and Learning: The assistant should allow for the customization of responses and

preferences. It should have the ability to learn from user interactions to improve its responses and

functionality over time.

Task Automation: The assistant should be able to handle a range of tasks, such as setting reminders,

scheduling appointments, sending emails, or fetching information from the web. It should integrate

with various APIs (e.g., Google Calendar, Email services) to perform these tasks.


CHAPTER 2

LITERATURE SURVEY

2.1 RELATED WORK

Nivedita Singh et al. (2021) proposed a voice assistant using a Python speech-to-text (STT) module together with API calls and system calls, leading to a Python voice assistant that allows the user to run any type of command through voice without using the keyboard. It can also run on hybrid platforms. However, the paper is lacking in some areas, such as system calls, which are not well supported (Agrawal et al., 2021).

Abeed Sayyed et al. (2021) presented a paper on a Desktop Assistant AI using Python with IoT (Internet of Things) features, combining Artificial Intelligence (AI) capabilities with an SQLite database. The project has a database connection and a query framework but lacks API-call and system-call features (Sayyed et al., 2021).

Krishnagar et al. (2021) presented a project on Portable Voice Recognition with GUI Automation. This system uses Google's online speech recognition system, along with Python, for converting speech input to text. The project has a GUI and a portable framework, but the accuracy of its text-to-speech (TTS) engine is comparatively low and it lacks IoT features (Krishna raj et al., 2021).


Rajdip Paul et al. (2021) presented a project named A Novel Python-based Voice Assistance System for Reducing the Hardware Dependency of Modern Age Physical Servers. The author proposed an assistant with Python as the backend, supporting system calls, API calls, and various other features. The project is quite responsive to API calls but needs improvement in understanding and reliability (Paul & Mukhopadhyay, 2021).

V. Geetha et al. (2021) presented a project named The Voice-Enabled Personal Assistant for PC using Python. The author proposed an assistant with Python as the backend, with features like turning the PC off, restarting it, or reciting the latest news, all just one voice command away. The project has a well-supported library; however, not every API can convert raw JSON data into text, and there is a delay in processing request calls (Geetha et al., 2021).

Dilawar Shah Zwakman et al. (2021) proposed a usability evaluation of artificial-intelligence-based voice assistants, which can give a proper response to the user's request. The system also has a feature where it can make an appointment with the person mentioned by the user through voice, but it lacks API calls (Zwakman et al., 2021).

Philipp Sprengholz et al. (2021) proposed OK Google: Using Virtual Assistants for Data Collection in Psychological and Behavioural Research, describing a survey tool they developed as an extension of the Google Assistant and used to check the reliability and validity of the data it collects. Answers and synonyms are defined for every type of question, so the tool can be used to analyse the behaviour of an individual as a psychological and behavioural research assistant (Sprengholz & Betsch, 2021).


Dimitrios Buhalis et al. (2021) proposed a paper on In-room Voice-Based AI Digital Assistants Transforming On-Site Hotel Services and Guests’ Experiences, where a voice assistant is used for hotel services. It is especially useful in the current COVID-19 era: human touch is considered a risk, so reducing physical contact through a voice assistant is not a disadvantage. It can also be used to control room temperature and lighting, but it needs complex integration and staff training (Buhalis & Moldavska, 2021).

Benedict D. C. et al. (2020) proposed Consumer Decisions with Artificially Intelligent Voice Assistants, observing that users have stronger psychological reactions when the system exhibits human-like behaviours. The assistant has Internet of Things features and can also order items that the user wants, but there are some drawbacks: the voice assistant relies on the speaker's ability to represent the decision alternatives within voice dialogues, and it lacks system calls (Dellaert et al., 2020).


CHAPTER 3

METHODOLOGY

3.1. EXISTING SYSTEM

Existing projects often rely heavily on speech recognition augmented by emotional networks. While

these systems achieve a certain level of accuracy, their practical application and suitability for real-

world use are limited. They primarily employ basic methods, among which context-aware computing

stands out. Context-aware computing encompasses programs capable of sensing their physical

environment and adjusting their responses accordingly. In the realm of speech recognition, this

capability allows systems to identify words spoken by individuals with varying accents, tones, or

speech patterns. Moreover, context-aware systems can also correct words that may have been

mispronounced, ensuring more accurate transcription, and understanding of spoken language. This

adaptive approach not only enhances the robustness of speech recognition systems but also improves

their usability across diverse user demographics and environments.

3.2. PROPOSED SYSTEM

The conceptual model that describes a system's structure, behaviour, and other aspects is called the system architecture. A formal description and representation of a system, organized in a way that supports analysis of its structures and behaviours, is called an architecture description. A system architecture can comprise designed subsystems and system components that cooperate to implement the entire system. This section gives a succinct summary of our findings after analysing and comparing our proposed work.

We have used Python, machine learning, and AI to implement this concept. Our primary goal is to enable users to do their jobs using voice commands. This is accomplished in two steps: first, with the aid of the voice recognition API, the user's audio input is converted into an English sentence; second, the sentence is interpreted and the corresponding action is carried out (a minimal sketch of this command-handling loop is given after the list below).

1) The system will continuously listen for commands, and the amount of time it spends listening can be adjusted according to user preferences.

2) If the system cannot extract information from the input, it will keep asking the user to repeat it, up to a desired number of times.

3) The user's preferences determine whether the system uses a male or female voice.

4) The current version supports features including playing music, sending emails and texts, searching Wikipedia, opening system-installed programs, and accessing any website.
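
The following is a minimal sketch of such a command-handling loop, assuming the pyttsx3, SpeechRecognition, PyAudio, and wikipedia packages are installed; the voice index, retry count, and example phrases used here are illustrative choices rather than fixed requirements of the system.

import webbrowser

import pyttsx3
import speech_recognition as sr
import wikipedia

engine = pyttsx3.init()
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)  # on sapi5, the voice index selects a male or female voice


def speak(text):
    engine.say(text)
    engine.runAndWait()


def take_command(retries=3):
    recognizer = sr.Recognizer()
    for _ in range(retries):
        with sr.Microphone() as source:
            recognizer.adjust_for_ambient_noise(source)
            audio = recognizer.listen(source)
        try:
            return recognizer.recognize_google(audio).lower()
        except sr.UnknownValueError:
            speak("Please repeat that.")
    return ""


while True:
    query = take_command()
    if "wikipedia" in query:
        topic = query.replace("wikipedia", "").strip()
        speak(wikipedia.summary(topic, sentences=2))
    elif "open youtube" in query:
        webbrowser.open("https://www.youtube.com")
    elif query in ("exit", "quit", "stop"):
        speak("Goodbye")
        break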


3.3. OBJECTIVE OF THE PROJECT

The main objective of developing personal assistant software, or virtual assistant, is to leverage

semantic data sources from the web, user-generated content, and knowledge databases to effectively

answer user queries. This intelligent virtual assistant serves various purposes, such as providing

customer support on business websites through chat interfaces or offering mobile-based services where

users interact via voice commands. By automating responses to user inquiries, virtual assistants

significantly reduce the time spent on manual online research and report preparation, thereby enhancing productivity and efficiency. The objective of this project is to show the feasibility of building personal voice assistant software (a smart agent) using Python, data sources available on the web, user-generated content, knowledge provided by knowledge databases, and inference technologies of Web 3.0.

To design a smart agent that has contextual information about the user and helps in managing and

planning tasks, using Python web technologies and open data available on the Internet. Contextual

information about the user can be location, current time, calendar appointments, relations between tasks, decomposition of tasks, history of tasks, user interests, likes, etc. The agent can use data gathered about the user, as well as environment data, to better understand what each task means, decompose the tasks based on a sequence of steps stored in its knowledge base, and then plan the individual tasks.

The planning part of the agent will strive to optimize resources and try to improve the productivity of

the user. It can be used as a time management application as well as a task management application.


By combining related tasks that can be completed at the same time and around the same location, the agent will optimize the user’s resources to complete these tasks.

A feedback loop from the user will help the agent to make decisions when there are multiple paths, and

the agent does not have sufficient information to make those decisions.

Assumptions, limitations, and constraints in the solution will be highlighted and any additional

infrastructure necessary as a complement to the system will be identified.

3.4. SOFTWARE AND HARDWARE REQUIREMENTS

REQUIRED FEATURES OF SYSTEM

Usability

The system is designed as a completely automated process, so little or no user intervention is required.

Reliability

The system is reliable because of the qualities inherited from the chosen platform, Python; code built on Python's mature, well-tested libraries is more reliable.

Performance

The system is developed in a high-level language and uses advanced front-end and back-end technologies, so it responds to the end user on the client system in very little time.

Supportability

The system is designed to be cross-platform supportable. It is supported on a wide range of hardware and on any software platform that has a Python interpreter available.

Implementation

The system is implemented as a desktop application using core Python, developed in the PyCharm IDE on the Windows platform.

3.4.1. REQUIRED SOFTWARE SPECIFICATION

• OS: Windows 7 to Windows 10

• Python 3.x

• PyCharm IDE 2019.1.3

3.4.2. REQUIRED HARDWARE SPECIFICATION

• Processor :- INTEL CORE I3 OR ABOVE

• RAM :- 4 GB

• Hard Disk :- 500GB

• Monitor :- Colour monitor

• Keyboard :- 104 keys

• Mouse :- Any pointing device


3.4.3. LIBRARIES

Pyttsx3 – This is a text-to-speech conversion library in Python that converts the text passed to it into speech. It is compatible with Python 2 and 3. An application invokes the pyttsx3.init() factory function to get a reference to a pyttsx3 engine instance. It is a very easy-to-use tool that converts the entered text into speech. The pyttsx3 module supports two voices, one female and one male, provided by “sapi5” on Windows.

Command to install: - pip install pyttsx3

It supports three TTS engines:

sapi5 – SAPI5 on Windows

nsss – NSSpeechSynthesizer on Mac OS X

espeak – eSpeak on every other platform
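
As a minimal usage sketch, assuming pyttsx3 is installed and a sapi5 voice is available on the machine:

import pyttsx3

engine = pyttsx3.init()                       # loads the default TTS driver (sapi5 on Windows)
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)     # pick the first installed voice
engine.setProperty("rate", 170)               # speaking rate in words per minute
engine.say("Hello, I am your personal voice assistant.")
engine.runAndWait()                           # block until speaking has finished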

Speech recognition – Speech recognition allows computers to understand human language; it is a machine's ability to listen to spoken words and identify them. We can then use speech recognition in Python to convert the spoken words into text, make a query, or give a reply. Python supports many speech recognition engines and APIs, including the Google Speech Engine and the Google Cloud Speech API.

Command to install: - pip install SpeechRecognition
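
A minimal sketch of converting a recorded audio file to text with this library (the file name command.wav is only an illustrative placeholder):

import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("command.wav") as source:    # hypothetical WAV file
    audio = recognizer.record(source)          # read the entire file into memory

try:
    print(recognizer.recognize_google(audio))  # free Google web speech API
except sr.UnknownValueError:
    print("Could not understand the audio.")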

WolframAlpha – Wolfram Alpha is an API that can compute expert-level answers using Wolfram's algorithms, knowledge base, and AI technology. It is made possible by the Wolfram Language. The Wolfram Alpha API is a web-based API allowing the computational and presentation capabilities of Wolfram Alpha to be integrated into web, mobile, and desktop applications.

Command to install: - pip install wolframalpha
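
A minimal query sketch; the App ID shown is a placeholder, and a real key must be obtained from the Wolfram Alpha developer portal:

import wolframalpha

client = wolframalpha.Client("YOUR_APP_ID")    # placeholder App ID
result = client.query("integrate x^2")
print(next(result.results).text)               # first plain-text answer pod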

Randfacts – Randfacts is a Python library that generates random facts. We can call randfacts.get_fact() to return a random fun fact.

Command to install: - pip install randfacts

Pyjokes – Pyjokes is a Python library that is used to create one-line jokes for users. Informally, it can also be described as a fun Python library that is simple to use.

Command to install: - pip install pyjokes
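
A small sketch combining both libraries, assuming they are installed:

import randfacts
import pyjokes

print(randfacts.get_fact())   # a random trivia sentence
print(pyjokes.get_joke())     # a random one-line programming joke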

Datetime – This module is used to get the date and time for the user. It is a built-in module, so there is no need to install it externally. Python's datetime module supplies classes to work with dates and times. Dates and datetimes are objects in Python, so when we manipulate them, we are manipulating objects rather than strings or timestamps.
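
For example, the assistant can report the current time with something like:

import datetime

now = datetime.datetime.now()
print("The time is", now.strftime("%H:%M:%S"))   # e.g. "The time is 14:05:32"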

Random2 – Python 2 has a module named "random". The random2 package provides a Python 3 ported version of Python 2.7's random module; it has also been backported to work in Python 2.6. In Python 3, the implementation of randrange() was changed, so even with the same seed you get different sequences in Python 2 and 3.


Math – This is a built-in module that is used to perform mathematical tasks. For example, math.cos() returns the cosine of a number, and math.log() returns the natural logarithm of a number or the logarithm of a number to a given base.

Warnings – The warnings module issues warning messages; the Warning class it uses is a subclass of Exception, a built-in class in Python. A warning in a program is distinct from an error and is not critical: it shows a message, but the program keeps running.

OS – The OS module is a built-in module that provides functions with which the user can interact with the operating system while the program is running. It provides a portable way of using operating-system-dependent functionality, including functions with which the program can open the files and applications mentioned in the program.
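
A small sketch of launching an installed program on Windows; the Notepad path is an assumed example:

import os

# Windows-specific: open an application by path (the path below is an assumed example).
os.startfile(r"C:\Windows\notepad.exe")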

Serial – This module encapsulates access to the serial port. It provides backends for Python running on Windows, OSX, Linux, BSD, and IronPython. The module named "serial" automatically selects the appropriate backend.

Command to install: - pip install pyserial

Wikipedia – wikipedia is a Python library that makes it easy to access and parse data from Wikipedia: search Wikipedia, get article summaries, get data such as links and images from a page, and more. Wikipedia itself is a multilingual online encyclopaedia.

Command to install: - pip install wikipedia
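
A minimal sketch of fetching a short article summary (the topic string is just an example):

import wikipedia

summary = wikipedia.summary("Python (programming language)", sentences=2)
print(summary)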


Selenium WebDriver – The Selenium module is used to automate web browser interaction from Python. Several browsers/drivers are supported (Firefox, Chrome, Internet Explorer), as well as the Remote protocol. The supported Python versions are 3.5 and above.

Command to install: - pip install selenium

Requests – The requests module allows you to send HTTP requests using Python. An HTTP request returns a Response object with all the response data. With it, we can add content like headers, form data, multipart files, and parameters via simple Python calls, and access the response data in the same straightforward way.

Command to install: - pip install requests
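
A minimal sketch of an HTTP GET request; the URL is an illustrative public endpoint:

import requests

response = requests.get("https://api.github.com")
print(response.status_code)                  # 200 on success
print(response.headers["content-type"])      # e.g. application/json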

Webbrowser – The webbrowser module is a convenient web browser controller. It provides a high-level interface that allows displaying web-based documents to users. webbrowser can also be used as a CLI tool: it accepts a URL as the argument, with the optional parameters -n to open the URL in a new browser window, if possible, and -t to open the URL in a new browser tab. This is a built-in module, so installation is not required.
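
For example, the assistant can open a website in the default browser like this:

import webbrowser

webbrowser.open("https://www.google.com")   # opens the URL in the default browser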

3.5. PROGRAMMING LANGUAGES

Programming languages are fundamental tools used by developers to write, edit, and execute software

programs and applications. These languages serve as structured methods for communicating

instructions to computers, enabling them to perform specific tasks or operations. Each programming

language has its syntax, rules, and capabilities tailored to diverse types of applications and


environments. For instance, high-level languages like Python and JavaScript prioritize readability and

ease of use, making them ideal for rapid development and web applications. In contrast, lower-level

languages such as C and C++ offer greater control over hardware and system resources, crucial for

developing performance-critical software like operating systems and embedded systems. Additionally,

domain-specific languages like SQL facilitate efficient database management, while functional

languages like Haskell emphasize mathematical functions and immutable data. The choice of

programming language depends on factors such as project requirements, performance goals, and

developer preferences, each offering unique strengths in software development landscapes.

3.5.1 PYTHON

What is Python?

Python is a popular programming language. It was created by Guido van Rossum and released in

1991.

It is used for:

web development (server-side),

software development,

mathematics,

system scripting.


What can Python do?

Python can be used on a server to create web applications.

Python can be used alongside software to create workflows.

Python can connect to database systems. It can also read and modify files.

Python can be used to handle big data and perform complex mathematics.

Python can be used for rapid prototyping or production-ready software development.

Why Python?

Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc).

Python has a simple syntax like the English language.

Python has a syntax that allows developers to write programs with fewer lines than some other

programming languages.

Python runs on an interpreter system, meaning that code can be executed as soon as it is written.

This means that prototyping can be very quick.

Python can be treated procedurally, in an object-orientated way, or in a functional way.


Good to know

The most recent major version of Python is Python 3, which we shall be using in this tutorial.

However, Python 2, although not being updated with anything other than security updates, is still

quite popular.

In this tutorial, Python will be written in a text editor. It is possible to write Python in an Integrated

Development Environment, such as Thonny, PyCharm, NetBeans, or Eclipse which are

particularly useful when managing larger collections of Python files.

Python Syntax compared to other programming languages

Python was designed for readability and has some similarities to the English language with

influence from mathematics.

Python uses new lines to complete a command, as opposed to other programming languages which

often use semicolons or parentheses.

Python relies on indentation, using whitespace, to define scope, such as the scope of loops,

functions, and classes. Other programming languages often use curly brackets for this purpose.

Python is an interpreted, high-level, general-purpose programming language. Created by Guido

van Rossum and first released in 1991, Python's design philosophy emphasizes code readability


with its notable use of significant whitespace. Its language constructs and object-oriented

approach aim to help programmers write clear, logical code for small and large-scale projects.

Python is dynamically typed, and garbage collected. It supports multiple programming paradigms,

including procedural, object-oriented, and functional programming. Python is often described as

a "batteries included" language due to its comprehensive standard library.

Python was conceived in the late 1980s as a successor to the ABC language. Python 2.0, released in

2000, introduced features like list comprehensions and a garbage collection system capable of

collecting reference cycles. Python 3.0, released in 2008, was a major revision of the language that

is not completely backward compatible, and much Python 2 code does not run unmodified on

Python 3. Due to concerns about the amount of code written for Python 2, support for Python 2.7

(the last release in the 2.x series) was extended to 2020. Language developer Guido van Rossum

shouldered sole responsibility for the project until July 2018 but now shares his leadership as a

member of a five-person steering council.

Python interpreters are available for many operating systems. A global community of

programmers develops and maintains Python, an open-source reference implementation. A non-

profit organization, the Python Software Foundation, manages and directs resources for Python

and Python development.

Python is an easy-to-learn, powerful programming language. It has efficient high-level data

structures and a simple but effective approach to object-oriented programming. Python’s elegant


syntax and dynamic typing, together with its interpreted nature, make it an ideal language for

scripting and rapid application development in many areas on most platforms.

The Python interpreter and the extensive standard library are freely available in source or binary

form for all major platforms from the Python Web site, https://www.python.org/, and may be

freely distributed. The same site also contains distributions of and pointers to many free third-

party Python modules, programs and tools, and additional documentation.

The Python interpreter is easily extended with new functions and data types implemented in C or

C++ (or other languages callable from C). Python is also suitable as an extension language for

customizable applications.

This tutorial introduces the reader informally to the basic concepts and features of the Python

language and system. It helps to have a Python interpreter handy for hands-on experience, but all

examples are self-contained, so the tutorial can be read offline as well.

For a description of standard objects and modules, see The Python Standard Library. The Python

Language Reference gives a more formal definition of the language. To write extensions in C or

C++, read Extending and Embedding the Python Interpreter and Python/C API Reference Manual.

There are also several books covering Python in depth.

This tutorial does not attempt to be comprehensive and cover every single feature, or even every

commonly used feature. Instead, it introduces many of Python’s most noteworthy features and

will give you an innovative idea of the language’s Flavors and style. After reading it, you will be

able to

read and write Python modules and programs, and you will be ready to learn more about the

various Python library modules described in The Python Standard Library.

History

Guido van Rossum at OSCON 2006.

Main article: History of Python

Python was conceived in the late 1980s by Guido van Rossum at Centrum Wiskunde &

Informatica (CWI) in the Netherlands as a successor to the ABC language (itself inspired by

SETL), capable of exception handling and interfacing with the Amoeba operating system. Its

implementation began in December 1989. Van Rossum continued as Python's lead developer until

July 12, 2018, when he announced his "permanent vacation" from his responsibilities as Python's

Benevolent Dictator For Life, a title the Python community bestowed upon him to reflect his long-

term commitment as the project's chief decision-maker. In January 2019, active Python core

developers elected Brett Cannon, Nick Coghlan, Barry Warsaw, Carol Willing, and Van Rossum

to a five-member "Steering Council" to lead the project.

Python 2.0 was released on 16 October 2000 with many major new features, including a cycle-

detecting garbage collector and support for Unicode.

Python 3.0 was released on 3 December 2008. It was a major revision of the language that is not

completely backward compatible. Many of its major features were backported to the Python 2.6.x


and 2.7.x version series. Releases of Python 3 include the 2to3 utility, which automates (at least

partially) the translation of Python 2 code to Python 3.

Python 2.7's end-of-life date was initially set at 2015 then postponed to 2020 out of concern that

a large body of existing code could not easily be forward-ported to Python 3. In January 2017,

Google announced work on a Python 2.7 to Go transcompiler to improve performance under

concurrent workloads.

Features and philosophy

Python is a multi-paradigm programming language. Object-oriented programming and structured

programming are fully supported, and many of its features support functional programming and

aspect-oriented programming (including by metaprogramming and metaobjects (magic methods)). Many other paradigms are supported via extensions, including design by contract and logic programming.

Python uses dynamic typing, and a combination of reference counting and a cycle-detecting

garbage collector for memory management. It also features dynamic name resolution (late

binding), which binds method and variable names during program execution.

Python's design offers some support for functional programming in the Lisp tradition. It has filter,

map, and reduce functions; list comprehensions, dictionaries, sets, and generator expressions. The standard library has two modules (itertools and functools) that implement functional tools borrowed from Haskell and Standard ML.


The language's core philosophy is summarized in the document The Zen of Python (PEP 20),

which includes aphorisms such as:

Beautiful is better than ugly

Explicit is better than implicit

Simple is better than complex

Complex is better than complicated

Readability counts

Rather than having all its functionality built into its core, Python was designed to be highly

extensible. This compact modularity has made it particularly popular as a means of adding

programmable interfaces to existing applications. Van Rossum's vision of a small core language

with a large standard library and easily extensible interpreter stemmed from his frustrations with

ABC, which espoused the opposite approach. Python strives for a simpler, less cluttered syntax

and grammar while giving developers a choice in their coding methodology. In contrast to Perl's

"there is more than one way to do it" motto, Python embraces a "there should be one—and

preferably only one—obvious way to do it" design philosophy. Alex Martelli, a Fellow at the

Python Software Foundation and Python book author, writes that "To describe something as

'clever' is not considered a compliment in the Python culture."

Python's developers strive to avoid premature optimization and reject patches to non-critical parts

of the Python reference implementation that would offer marginal increases in speed at the cost


of clarity. When speed is important, a Python programmer can move time-critical functions to

extension modules written in languages such as C, or use PyPy, a just-in-time compiler. Cython is also available, which translates a Python script into C and makes direct C-level API calls into the Python interpreter.

An important goal of Python's developers is to keep it fun to use. This is reflected in the language's

name—a tribute to the British comedy group Monty Python and in occasionally playful

approaches to tutorials and reference materials, such as examples that refer to spam and eggs

(from a famous Monty Python sketch) instead of the standard foo and bar.

A common neologism in the Python community is pythonic, which can have a wide range of

meanings related to program style. To say that code is pythonic is to say that it uses Python idioms

well, that it is natural or shows fluency in the language, and that it conforms with Python's

minimalist philosophy and emphasis on readability. In contrast, code that is difficult to understand

or reads like a rough transcription from another programming language is called unpythonic.

Users and admirers of Python, especially those considered knowledgeable or experienced, are

often referred to as Pythonists, Pythonistas, and Pythoneers.


3.5.3 PYCHARM IDE 2019.1.3

PyCharm is a dedicated Python and Django IDE providing a wide range of essential tools for

Python developers, tightly integrated together to create a convenient environment for productive

Python development and Web development.

PyCharm is available in three editions: Professional, Community, and Educational (Edu). The Community and Edu editions are open-source projects and are free, but they have fewer features. PyCharm Edu provides courses and helps you learn programming with Python. The Professional edition is commercial and provides an outstanding set of tools and features. For details, see the editions comparison matrix.

PYCHARM FEATURES

Intelligent Coding Assistance

PyCharm provides smart code completion, code inspections, on-the-fly error highlighting and

quick fixes, along with automated code refactorings and rich navigation capabilities.

Intelligent Code Editor

PyCharm’s smart code editor provides first-class support for Python, JavaScript, CoffeeScript,

TypeScript, CSS, popular template languages and more. Take advantage of language-aware code

completion, error detection, and on-the-fly code fixes!

Smart Code Navigation


Use smart search to jump to any class, file, or symbol, or even any IDE action or tool window. It

only takes one click to switch to the declaration, super method, test, usages, implementation, and

more.

Fast and Safe Refactorings

Refactor your code the intelligent way, with safe Rename and Delete, Extract Method, Introduce

Variable, Inline Variable or Method, and other refactorings. Language and framework-specific

refactorings help you perform project-wide changes.

Built-in Developer Tools

PyCharm’s massive collection of tools out of the box includes an integrated debugger and test

runner; Python profiler; a built-in terminal; integration with major VCS and built-in database

tools; remote development capabilities with remote interpreters; an integrated ssh terminal; and

integration with Docker and Vagrant.

Debugging, Testing, and Profiling

Use the powerful debugger with a graphical UI for Python and JavaScript. Create and run your

tests with coding assistance and a GUI-based test runner. Take full control of your code with

Python Profiler integration.

VCS, Deployment, and Remote Development


Save time with a unified UI for working with Git, SVN, Mercurial, or other version control

systems. Run and debug your application on remote machines. Easily configure automatic

deployment to a remote host or VM and manage your infrastructure with Vagrant and Docker.

Database tools

Access Oracle, SQL Server, PostgreSQL, MySQL, and other databases right from the IDE. Rely

on PyCharm’s help when editing SQL code, running queries, browsing data, and altering schemas.

Web Development

In addition to Python, PyCharm provides first-class support for various Python web development

frameworks, specific template languages, JavaScript, CoffeeScript, TypeScript, HTML/CSS,

AngularJS, Node.js, and more.

Python Web frameworks

PyCharm offers great framework-specific support for modern web development frameworks such

as Django, Flask, Google App Engine, Pyramid, and web2py, including Django templates

debugger, manage.py and appcfg.py tools, special autocompletion and navigation, just to name a

few.

JavaScript & HTML

PyCharm provides first-class support for JavaScript, Coffee Script, TypeScript, HTML, and CSS,

as well as their modern successors. The JavaScript debugger is included in PyCharm and is

integrated with the Django server run configuration.


Live Edit

Live Editing Preview lets you open a page in the editor and the browser and see the changes being

made in code instantly in the browser. PyCharm auto-saves your changes, and the browser smartly

updates the page on the fly, showing your edits.

Scientific Tools

PyCharm integrates with IPython Notebook, has an interactive Python console, and supports

Anaconda as well as multiple scientific packages including Matplotlib and NumPy.

Interactive Python console

You can run a REPL Python console in PyCharm which offers many advantages over the standard

one: on-the-fly syntax check with inspections, braces, and quotes matching, and of course code

completion.

Scientific Stack Support

PyCharm has built-in support for scientific libraries. It supports Pandas, NumPy, Matplotlib, and

other scientific libraries, offering you best-in-class code intelligence, graphs, array viewers, and

much more.

Conda Integration

Keep your dependencies isolated by having separate Conda environments per project; PyCharm makes it easy for you to create and select the right environment.


Debug Your Python Code with PyCharm

Visual Debugging

Some coders still debug using print statements because the concept is hard and pdb is intimidating.

PyCharm’s Python debugging GUI makes it easy to use a debugger by putting a visual face on

the process. Getting started is simple and moving on to the major debugging features is easy.

Debug Everywhere

Of course, PyCharm can debug code that you are running on your local computer, whether it is

your system Python, a virtualenv, Anaconda, or a Conda env. In PyCharm Professional Edition

you can also debug code you are running inside a Docker container, within a VM, or on a remote

host through SSH.

Debug Inside Templates PRO ONLY

When you are working with templates, sometimes a bug sneaks into them. These can be extremely

hard to resolve if you cannot see what is going on inside them. PyCharm’s debugger enables you

to put a breakpoint in Django and Jinja2 templates to make these problems easy to fix.

Note: to debug templates, first configure the template language.

JavaScript PRO ONLY

Any modern web project involves JavaScript; therefore, any modern Python IDE needs to be able

to debug JavaScript as well. PyCharm Professional Edition comes with the highly capable


JavaScript debugger from WebStorm. Both in-browser JavaScript and Node.js are supported by the JavaScript debugger.

Debugging During TDD

Test-driven development, or TDD, involves exploration while writing tests. Use the debugger to

help explore by setting breakpoints in the context you are investigating:

This investigation can be in your test code or in the code being tested, which is extremely helpful

for Django integration tests (Django support is available only in PyCharm Professional Edition).

Use a breakpoint to find out what is coming from a query in a test case:

No Code Modification Necessary

PDB is a great tool, but it requires you to modify your code, which can lead to accidentally checking in `pdb.set_trace()` calls into your Git repo.

See What Your Code Does

Breakpoints

All debuggers have breakpoints, but only some debuggers have highly versatile breakpoints. Have

you ever clicked ‘continue’ many times until you finally get to the loop iteration where your bug

occurs? No need for that with PyCharm’s conditional breakpoints.


Sometimes all you want to do is see what a certain variable’s value is throughout code execution.

You can configure PyCharm’s breakpoints to not suspend your code, but only log a message for

you.

Exceptions can ruin your day, that’s why PyCharm’s debugger can break on exceptions, even if

you are not entirely sure where they are coming from.

To help you stay in control of your debugging experience, PyCharm has an overview window

where you can see all your breakpoints, as well as disable some by checkbox. You can also

temporarily mute all your breakpoints until you need them.

See Variable Values briefly

As soon as PyCharm hits a breakpoint, you will see all your variable values inline in your code.

To make it easy to see what values have changed since the last time you hit the breakpoint,

changed values are highlighted.

Watches

Customize your variable view by adding watches. Whether they are simple or complex, you will

be able to see exactly what you want to see.

Control Your Code

Visually Step Through Your Code

If you want to know where your code goes, you do not need to put breakpoints everywhere. You

can step through your code and keep track of exactly what happens.

Run Custom Code

In some cases, the easiest way to reproduce something is to force a variable to a certain value.

PyCharm offers an Evaluate Expression action to quickly change something, and a console if you would like more control. The console can even use the IPython shell if it is installed.

Speed

Faster Than PDB

For Python 3.6 debugging, PyCharm’s debugger is the fastest debugger on the market. Even faster

than PDB. What this means is that you can simply always run your code under the debugger while

developing, and easily add breakpoints when you need them. Just make sure to click ‘install’ when

PyCharm asks whether to install the Python speedups.

3.5.2. DOMAIN

The domain of personal voice assistants encompasses the development and deployment of intelligent

software agents capable of understanding and responding to voice commands. These assistants,

powered by advancements in natural language processing (NLP) and artificial intelligence (AI),

perform a variety of tasks ranging from managing schedules and setting reminders to retrieving

information and controlling smart home devices. They operate across multiple platforms, including

smartphones, smart speakers, and computers, providing seamless and intuitive user interactions.


Personal voice assistants leverage vast databases, user-generated content, and contextual information

to deliver personalized and context-aware responses. By automating routine tasks and offering hands-free operation, they enhance user productivity, convenience, and accessibility, becoming indispensable

tools in both personal and professional settings.

The domain of a personal voice assistant encompasses the specific area or field in which the assistant

operates, providing tailored functionalities and services to users. In the context of a personal voice

assistant, the domain typically involves:

1. Task Management: Managing and organizing tasks such as scheduling appointments, setting

reminders, creating to-do lists, and managing daily routines.

2. Information Retrieval: Accessing and retrieving information from various sources such as the web,

knowledge databases, and user-specific data to answer questions and provide updates.

3. Automation: Automating routine tasks and processes to improve efficiency and productivity, such

as sending emails, controlling smart home devices, and performing online transactions.

4. Personalization: Adapting responses and actions based on user preferences, historical interactions,

and contextual information such as location and time.

5. Communication: Serving as an interface for users to interact with digital systems and services

through natural language input, voice commands, or text-based interfaces.

6. Integration: Integrating with other applications, platforms, and services to enhance functionality and

provide seamless user experiences across different devices and environments.


7. Support and Assistance: Providing support and assistance to users by offering guidance,

recommendations, and insights based on data analysis and user interactions.

The domain of a personal voice assistant is dynamic and evolving, incorporating advancements in

artificial intelligence, natural language processing, and machine learning to continuously enhance its

capabilities and utility for users in both personal and professional contexts.


3.6. SYSTEM ARCHITECTURE

Fig 3.1 System architecture figure


The system architecture of a personal AI assistant typically comprises several key components: the

user interface, natural language processing (NLP) engine, knowledge base, and backend services. The

user interface facilitates interactions through voice or text inputs. The NLP engine processes these

inputs, converting them into machine-readable commands. The knowledge base stores relevant information, leveraging semantic data sources, user-generated content, and knowledge databases.

Backend services handle task execution, such as fetching data, managing schedules, or controlling

smart devices. This architecture ensures seamless and intelligent responses, enabling the AI assistant

to perform tasks efficiently and provide personalized assistance to users.

3.7. ALGORITHMS USED

3.7.1 SPEECH RECOGNITION MODULE

The class we use from the SpeechRecognition library is called Recognizer. It converts captured audio into text, and a separate text-to-speech module is used to deliver the output as speech.

The energy threshold represents the energy level threshold for sounds: values below this threshold are treated as silence, and values above it are treated as speech.

Calling recognizer_instance.adjust_for_ambient_noise(source, duration=1) adjusts the energy threshold dynamically using audio from the source (an AudioSource instance) to account for ambient noise.
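As an illustration, here is a minimal sketch of how the Recognizer class is typically wired up; the one-second calibration window and the 'en-in' language code are illustrative choices, not fixed requirements.

import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
    # Sample ambient noise for one second and raise the energy
    # threshold so that background sound is treated as silence.
    r.adjust_for_ambient_noise(source, duration=1)
    print("Listening...")
    audio = r.listen(source)

try:
    # Send the captured audio to Google's free web recognizer.
    text = r.recognize_google(audio, language='en-in')
    print("You said:", text)
except sr.UnknownValueError:
    print("Sorry, I could not understand that.")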


SPEECH TO TEXT & TEXT TO SPEECH CONVERSION

pyttsx3 is a text-to-speech conversion library in Python; the voice, speaking rate, and volume can be changed by specific commands.

Python also provides a library called SpeechRecognition that allows us to convert audio into text for further processing, including large or long audio files. We have included the sapi5 and espeak TTS engines, which can process the same.
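For illustration, a minimal pyttsx3 sketch is shown below; the sapi5 driver name applies to Windows, and the rate and volume values are example settings only.

import pyttsx3

engine = pyttsx3.init('sapi5')              # use the Windows SAPI5 driver
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)   # pick the first installed voice
engine.setProperty('rate', 150)             # speaking rate (words per minute)
engine.setProperty('volume', 1.0)           # volume from 0.0 to 1.0

def speak(text):
    engine.say(text)
    engine.runAndWait()

speak("Hello, I am your personal voice assistant.")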

PROCESSES & EXECUTES THE REQUIRED COMMAND

The spoken command is converted into text by the speech recognition module and stored in a temporary variable. The assistant then analyses this text to decide what the user needs based on the input provided, and the main while loop dispatches to the matching branch.

Then, the command is executed.
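A stripped-down sketch of this listen, analyse, and execute loop is given below. The helper names listen_stub and run_assistant are hypothetical placeholders, and keyboard input stands in for the recognition step so the loop is runnable on its own.

import datetime

def listen_stub():
    # Placeholder for the speech-recognition step sketched earlier;
    # reading from the keyboard keeps this sketch self-contained.
    return input("Command: ")

def speak(text):
    print("Assistant:", text)   # replace with the pyttsx3 speak() sketched above

def run_assistant():
    while True:
        command = listen_stub().lower()
        if 'time' in command:
            speak(datetime.datetime.now().strftime("It is %H:%M"))
        elif 'stop' in command or 'exit' in command:
            speak("Goodbye!")
            break
        else:
            speak("Sorry, I do not know that command yet.")

run_assistant()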


3.8. SYSTEM DESIGN

Fig 3.8

In this project, there is only one user. The user issues commands to the system. The system then interprets them and fetches the answer, and the response is sent back to the user.


3.8.1 Component diagram

Fig 3.8.1

The main component here is the Virtual Assistant. It provides two specific services: executing a task or answering the user's question.


3.8.2 SEQUENCE DIAGRAM

Fig 3.8.2


The user sends a command to the virtual assistant in audio form. The command is passed to the interpreter, which identifies what the user has asked and directs it to the task executor. If the task is missing some information, the virtual assistant asks the user for it. The received information is passed back to the task executor and the task is accomplished. After execution, feedback is sent back to the user.

Fig 3.8.3 Sequence Diagram (Answering the user)


The sequence diagram illustrates the process of fetching an answer from the internet in response to a

user's audio query. Initially, the user asks a question using voice, which is captured by a microphone.

This audio query is then processed by a speech recognition system that interprets the spoken words and

converts them into text. The textual query is subsequently sent to a web scraper, a tool designed to

search the internet for relevant information. The web scraper scours various online sources, collects

the necessary data, and identifies the most appropriate answer to the user's query. Once the answer is

found, it is sent back to the system, where it may undergo further processing if needed. Finally, the

processed answer is relayed to a text-to-speech engine, which converts the textual response back into

spoken words. The speaker then delivers the answer audibly to the user, completing the information

retrieval cycle. This automated sequence ensures a seamless and efficient method for obtaining and

presenting information from the internet in response to user inquiries.

3.9 FEASIBILITY STUDY

A feasibility study can help you determine whether you should proceed with your project. It is essential to evaluate the cost and benefit of the proposed system. Five types of feasibility studies are taken into consideration.

1. Technical feasibility: It includes discovering technologies for the project, both

hardware and software. For virtual assistants, users must have a microphone to convey


their message and a speaker to listen when the system speaks. These are unbelievably cheap nowadays

and everyone possesses them. Besides, the system needs an internet connection.

While using, make sure you have a steady internet connection. It is also not an

issue in this era where every home or office has Wi-Fi.

2. Operational feasibility: The proposed system's ease and simplicity of operation.

The system does not require any special skill set for users to operate it. It is

designed to be used by everyone. Kids who still do not know how to write can read

out problems for the system and get answers.

3. Economic feasibility: Here, we find the total cost and benefit of the proposed system over the current

system. For this project, the main cost is documentation cost. The user also would have to pay for a

microphone and speakers. Again, they are cheap and available. As far as maintenance is concerned, it

will not cost too much.

4. Organizational feasibility: This shows the management and organizational structure of the project.

This project is not built by a team. The management tasks are all to be carried out by a single person.

That will not create any management issues and will increase the feasibility of the project.

5. Cultural feasibility: It deals with the compatibility of the project with the cultural environment. A virtual assistant fits within the general cultural environment. This project is technically feasible with no external hardware requirements. Also, it is simple in operation and does not incur training or repair costs. The overall feasibility study of the project reveals that the goals of the proposed system are achievable, so the decision is taken to proceed with the project.

3.10. TYPES OF OPERATION

Information:

If we ask for some information, it opens Wikipedia and asks us the topic on which we want the information. It then clicks on the Wikipedia search box using its XPath, types the topic into the search box, clicks the search button using the XPath of the button, and reads a paragraph about that topic.

Keyword: information
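A hedged sketch of this kind of browser automation with Selenium is shown below; the element selectors are common Wikipedia page attributes assumed for illustration, not the project's exact XPaths, and chromedriver is assumed to be available on the PATH.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

def wikipedia_lookup(topic):
    driver = webdriver.Chrome()                    # assumes chromedriver is installed
    driver.get("https://www.wikipedia.org")
    box = driver.find_element(By.NAME, "search")   # selector assumed for illustration
    box.send_keys(topic)
    box.send_keys(Keys.RETURN)
    # Read the first non-empty paragraph of the article body.
    paras = driver.find_elements(By.CSS_SELECTOR, "div.mw-parser-output > p")
    text = next((p.text for p in paras if p.text.strip()), "")
    driver.quit()
    return text

print(wikipedia_lookup("Python (programming language)"))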

Plays the video which we ask:

If we ask it to play a video, it opens YouTube and asks us the name of the video that we want it to play. After that, it clicks on the YouTube search box using its XPath, enters the name, clicks the search button, and then clicks the first result of the search using the XPath of the first video.

Keyword: Play and video or music

News of the day:

If we ask for the news, it reads out the Indian news of the day on which it is asked.

Keyword: news
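The news feature relies on the News API, as noted in the results chapter. A minimal sketch of such a call is shown below; the endpoint and parameters follow the public newsapi.org documentation, and the API key is a placeholder.

import requests

def todays_headlines(api_key, country="in", limit=5):
    # Top-headlines endpoint of newsapi.org; the key passed in is a placeholder.
    url = "https://newsapi.org/v2/top-headlines"
    resp = requests.get(url, params={"country": country, "apiKey": api_key})
    articles = resp.json().get("articles", [])[:limit]
    return [a["title"] for a in articles]

for headline in todays_headlines("YOUR_NEWSAPI_KEY"):
    print(headline)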

Temperature and Weather:

If the user asks for the temperature, it gives the current temperature.

Keyword: temperature
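A minimal sketch of the kind of OpenWeatherMap request this feature can use is shown below; the API key is a placeholder and the city is hard-coded only for illustration.

import requests

def current_temperature(city, api_key):
    # Current-weather endpoint of OpenWeatherMap; units=metric returns Celsius.
    url = "https://api.openweathermap.org/data/2.5/weather"
    resp = requests.get(url, params={"q": city, "appid": api_key, "units": "metric"})
    data = resp.json()
    return data["main"]["temp"], data["weather"][0]["description"]

temp_c, description = current_temperature("Chennai", "YOUR_OWM_KEY")
print("Temperature in Chennai is", temp_c, "degree Celsius with", description)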

Joke:

If the user asks for a joke, it tells a one-liner joke to the user.

Keyword: funny or joke

Fact:

If the user asks for some logical fact, it tells a fact to the user.

Keyword: fact

Game:

The assistant can play a number guessing game with the user. First, it asks for the lower and upper limits between which the number should lie. Then it initializes a random number between those limits. After that, it uses the binary-search bound log2(upper - lower + 1) as the formula to calculate the number of turns within which the user should guess the number.

Keyword: game
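The number of allowed turns follows the binary-search bound, as in the short sketch below; the limits are example values, and the formula matches the implementation listed later in this report.

import math
import random

lower, upper = 1, 100                  # example limits spoken by the user
secret = random.randint(lower, upper)
# A player using binary search needs at most log2(range size) guesses,
# so the assistant allows round(log2(upper - lower + 1)) turns.
chances = round(math.log(upper - lower + 1, 2))
print("You have", chances, "chances to guess the integer between", lower, "and", upper)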

Restart the system:

The assistant restarts the system if the user asks the assistant to restart the system.

Keyword: Restart the system or reboot the system

Open:

The assistant will open some of the folders and applications which the user asks the assistant to open.

Keyword: Open


Date and Time:

If the user asks for the date or time, the assistant tells it.

Keyword: date or time or date and time

Calculate:

The assistant will calculate the equations that the user tells it to calculate, using the WolframAlpha API.

Keyword: calculate (along with the equation)
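A minimal sketch of a WolframAlpha query is shown below; the app id is a placeholder that must be replaced with a real key from the WolframAlpha developer portal.

import wolframalpha

client = wolframalpha.Client("YOUR_WOLFRAMALPHA_APP_ID")   # placeholder app id

def calculate(expression):
    res = client.query(expression)
    return next(res.results).text   # first plaintext result pod

print(calculate("integrate x^2 from 0 to 3"))   # expected result: 9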

Turn on the light:

This is an IOT feature where the assistant turns on the light if the user asks it to turn on the light.

Keyword: light on

Turn off the light:

This is an IOT feature where the assistant turns off the light if the user asks it to turn off the light.

Keyword: light off
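Both light commands can be served by a small pyserial helper like the sketch below; the COM port, baud rate, and the "ON"/"OFF" protocol are assumptions that must match the Arduino firmware.

import time
import serial

# Port name and baud rate are assumptions; they must match the Arduino sketch.
arduino = serial.Serial(port='COM3', baudrate=115200, timeout=0.1)

def set_light(state):
    # Send "ON" or "OFF"; the Arduino firmware is expected to toggle the relay/LED.
    arduino.write(bytes(state, 'utf-8'))
    time.sleep(0.05)
    return arduino.readline()       # optional acknowledgement from the board

set_light("ON")
set_light("OFF")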


CHAPTER 4

PERSONAL ASSISTANT SOFTWARE IN THE MARKET

4.1 GOALS OF PERSONAL ASSISTANT SOFTWARE

The primary objective of the personal assistant software is to function as a seamless interface between

users and the digital world. It achieves this by comprehending user requests or commands and

translating them into actionable tasks or insightful recommendations. Central to its operation is a

sophisticated knowledge base that models the agent's understanding of the world, encompassing

intricate relationships, connections, and rules between various concepts. While these agents are not

intended to replace human capabilities, they excel in automating mundane tasks that users might find

repetitive or less engaging. This efficiency is facilitated by the software's ability to process vast

amounts of real-time information sourced from the web and other data repositories. By handling routine

tasks and providing timely information, the software aims to enhance user productivity and streamline

daily activities. It operates with a continuous learning mechanism, adapting its responses based on user

interactions and feedback, thereby improving its relevance and performance over time. This approach

ensures that the software remains an invaluable tool for users, augmenting their capabilities through

intelligent automation and data-driven decision-making. Personal assistant software aims to streamline

and enhance the user's daily activities by leveraging artificial intelligence, natural language processing

(NLP), and automation. These systems are designed to act as proactive and intelligent helpers, capable


of understanding and fulfilling user requests across various domains. The goals of personal assistant

software can be summarized as follows:

Enhanced Efficiency and Productivity:

Personal assistants strive to simplify complex tasks and workflows, reducing the time and effort

required for routine activities. By automating repetitive tasks like scheduling appointments, managing

emails, or setting reminders, they allow users to focus on more critical and creative aspects of their

work and personal lives.

Seamless Integration and Accessibility:

Another key goal is to integrate seamlessly with diverse devices and applications, ensuring a unified

user experience across platforms. Whether on smartphones, computers, smart speakers, or IoT devices,

personal assistants should provide consistent functionality and access to information, making it easy

for users to interact and stay organized from anywhere.

Intelligent Information Retrieval:

Personal assistants excel in retrieving relevant information quickly and accurately in response to user

queries. They utilize advanced NLP techniques to understand natural language input, extract key

information, and fetch real-time data from sources like the web, databases, or specialized APIs. This

capability spans from retrieving weather updates and news headlines to finding answers to factual

questions or providing directions.


Personalization and Context Awareness:

Personal assistants aim to learn and adapt to individual user preferences and behaviours over time. By

analyzing past interactions and user data (with appropriate consent and privacy considerations), they

tailor responses and recommendations to suit specific needs. This personalization enhances user

satisfaction and efficiency by anticipating needs and providing proactive assistance.

Natural and Intuitive User Interface:

User interface design is crucial to the success of personal assistants. They should provide a natural and

intuitive interaction experience through voice commands, text input, or even gestures. Advances in

speech recognition and synthesis enable assistants to understand nuanced commands, maintain context

over multiple interactions, and respond with human-like speech patterns.

Continuous Improvement and Learning:

Personal assistants are designed to evolve and improve through machine learning algorithms and user

feedback. By analysing usage patterns and incorporating new data, they can expand their knowledge

base, improve accuracy, and adapt to changing user preferences and linguistic variations.

Security and Privacy:

Finally, ensuring robust security measures and respecting user privacy are paramount goals. Personal

assistant software must protect sensitive information, employ encryption where necessary, and provide

transparent control over data usage and storage. Upholding these standards builds trust and confidence

among users in the reliability and confidentiality of their interactions.


In summary, personal assistant software aims to enhance user productivity, streamline tasks, provide

intelligent information retrieval, personalize user experiences, offer intuitive interfaces, learn from

interactions, and prioritize security and privacy. As these technologies continue to advance, they hold

promise in transforming how individuals manage their daily activities and interact with digital

environments.

4.2 Different Types of Personal Assistant Software

Personal assistant software comes in several types, each tailored to different user needs and contexts.

Firstly, task-oriented assistants focus on managing specific tasks like scheduling appointments, setting

reminders, or organizing to-do lists. These assistants excel in improving time management and

enhancing productivity by automating routine activities. Secondly, some informational assistants

prioritize retrieving and presenting information to users. These include virtual agents capable of

answering queries, providing updates on news or weather, and fetching data from the web or databases.

Thirdly, cognitive assistants, powered by AI and machine learning, offer advanced capabilities such as

natural language understanding, context-aware recommendations, and personalized interactions. They

adapt to user preferences over time, learning from interactions to deliver increasingly tailored

assistance. Lastly, integrated assistants combine features from multiple types, offering a

comprehensive suite of functionalities that span task management, information retrieval, and cognitive

decision-making. These integrated solutions aim to provide a holistic user experience by seamlessly


blending automation with intelligent assistance, catering to diverse user needs in both personal and

professional settings.

4.2. 1. Voice Recognition as Input Entry Medium

Voice recognition has revolutionized user interaction with technology by serving as a sophisticated

input medium for personal assistant software and other applications. This technology enables users to

input commands, queries, or requests using spoken language, which the software then interprets and

processes. By leveraging advancements in machine learning, particularly with neural networks and

natural language processing algorithms, voice recognition systems have significantly improved in

accuracy and responsiveness. This capability not only enhances accessibility for users with varying

levels of typing proficiency or physical abilities but also facilitates hands-free operation, particularly

useful in scenarios like driving or multitasking. Voice recognition systems can understand and

distinguish between different accents, dialects, and languages, broadening their applicability across

global user bases. Moreover, continuous advancements in voice recognition technology are expanding

its capabilities beyond basic commands to more complex interactions, such as natural conversation and

context-aware responses. As a result, voice recognition continues to play a pivotal role in enhancing

user experience and productivity across a wide range of devices and applications.


4.2. 2. Voice Recognition-Based Task Automation or Information Retrieval

Voice recognition technology has evolved to become a cornerstone of task automation and information

retrieval in modern digital environments. By enabling users to interact with devices and applications

through spoken commands, voice recognition systems streamline daily tasks and enhance user

productivity. Task automation capabilities allow users to delegate routine activities such as scheduling

appointments, setting reminders, or controlling smart home devices, all through voice commands. This

hands-free approach not only saves time but also offers convenience, particularly in situations where

manual input may be impractical or cumbersome. Furthermore, voice recognition facilitates efficient

information retrieval by enabling users to ask questions, request updates on news or weather, or search

for specific information from the web—all without needing to type queries manually. The integration

of artificial intelligence and natural language processing techniques enhances these systems' ability to

understand context, user preferences, and nuances in speech, providing more accurate and relevant

responses over time. As voice recognition technology continues to advance, its role in automating tasks

and retrieving information will expand, offering increasingly personalized and intuitive interactions

that integrate seamlessly into everyday life.

Task Automation:

Voice recognition allows for hands-free operation of tasks that traditionally require manual input. For

example, users can dictate emails, schedule appointments, control smart home devices, or even perform

complex calculations without touching a keyboard or screen. This automation not only improves

productivity but also enables multitasking and accessibility for users with physical limitations.


Modern voice assistants, powered by technologies like Google Speech Recognition, Amazon Alexa,

or Apple Siri, use sophisticated algorithms to convert spoken language into text accurately. They then

interpret these commands through NLP models that understand intent and context, enabling seamless

execution of tasks across various domains.

Information Retrieval:

Voice recognition-based information retrieval systems use voice queries to fetch relevant information

from vast data sources, including the web, databases, and APIs. Users can ask natural language

questions, such as inquiries about weather forecasts, stock prices, historical facts, or definitions, and

receive immediate spoken responses or displayed results.

These systems often integrate with knowledge databases like WolframAlpha or leverage web scraping

techniques to provide real-time information. Natural language understanding models parse user

queries, extract key information, and generate concise summaries or responses, mimicking human-like

interaction.

Future Directions:

Future advancements in voice recognition and NLP are expected to focus on improving accuracy,

contextual understanding, and user personalization. Enhanced deep learning models, such as

transformers and neural networks, will enable voice assistants to handle more complex queries and

adapt dynamically to user preferences and behaviours.


Additionally, integrating voice recognition with emerging technologies like augmented reality (AR) or

virtual reality (VR) could expand the capabilities of voice assistants beyond traditional screen-based

interactions. This convergence could redefine how users interact with digital information and

immersive environments, paving the way for new applications in education, healthcare, and

entertainment.

4.2.3 Planning

In this category of personal assistant software, the emphasis is on task understanding, subtask

identification, and task planning to facilitate efficient task completion for users. Examples like Siri,

which can book restaurant reservations using web services such as OpenTable, demonstrate the

capabilities of such systems. Similarly, the agent being designed as part of your thesis aims to operate

within this category by leveraging Python, web data sources, and inference technologies to manage

and plan tasks based on contextual user information. This approach not only enhances user productivity

but also highlights the integration of AI-driven capabilities in everyday task management scenarios.


4.3 History of Voice Assistants

A Modern History of Voice Assistants

In recent times, voice assistants became a major platform after Apple integrated Siri, its virtual assistant, into its devices. But the timeline of this evolution began in 1962 at the Seattle World's Fair, where IBM displayed a unique apparatus called Shoebox. It was about the size of a shoebox and could perform arithmetic functions, recognize 16 spoken words and the digits 0 to 9, and respond in a recognizable human voice.

During the 1970s, researchers at Carnegie Mellon University in Pittsburgh, Pennsylvania, with considerable help from the U.S. Department of Defense and its Defense Advanced Research Projects Agency (DARPA), made Harpy. It could understand almost 1,000 words, roughly the vocabulary of a three-year-old child.


In the 1990s, big organizations like Apple and IBM started to make products that utilized voice recognition. In 1993, Apple began to build speech recognition into its Macintosh PCs with PlainTalk.

In April 1997, Dragon NaturallySpeaking was the first continuous dictation product; it could comprehend around 100 words per minute and transform them into readable content.

Having said that, how cool it would be to build a simple voice-based desktop/laptop assistant that has

the capability to:

1. Open YouTube in the browser.

2. Open any website in the browser.

3. Send an email to your contacts.

4. Launch any system application.

5. Tells you the current weather and temperature of almost any city

6. Tells you the current time.


7. Greetings

8. Play a song on a VLC media player (of course you need to have a VLC media player installed on

your laptop/desktop)

9. Change desktop wallpaper.

10. Tells you the latest news feeds.

11. Tells you about anything you ask.

So here in this article, we are going to build a voice-based application that can do all the above-

mentioned tasks.

4.4 What are Intelligent Personal Assistants or Automated Personal Assistants?

An Intelligent Personal Assistant (IPA) represents a sophisticated application designed to streamline

and enhance daily tasks through a natural language interface. These assistants excel in organizing and

managing information such as emails, calendar events, files, and to-do lists, acting as a virtual

concierge capable of performing tasks based on voice commands or inputs. They vary in capability,

from simple reflex agents that respond to basic commands to more advanced models like goal-based

or utility-based agents that prioritize tasks based on predefined objectives or user preferences. IPAs

leverage artificial intelligence, machine learning, and natural language understanding to interpret

complex queries, personalize responses, and automate tasks without constant user interaction. They


can schedule appointments, set reminders, automate research tasks, translate languages, and even

recommend products or services based on user preferences and historical data. Furthermore, IPAs

integrate seamlessly into various digital channels, ensuring continuity across different user interfaces

and enhancing overall user experience by adapting to evolving needs and preferences.

This comprehensive functionality not only enhances personal productivity but also extends to

enterprise applications, where IPAs can leverage industry-specific knowledge and data for marketing

or customer service purposes. Their ability to learn and adapt over time ensures they remain relevant

and effective in addressing diverse user needs, whether managing personal schedules or facilitating

business operations. By enabling natural conversation and intelligent decision-making, IPAs represent

a significant advancement in human-computer interaction, bridging the gap between user intent and

actionable outcomes through intuitive and responsive digital assistance.

4.5 How do Artificial Intelligence Assistants Interact with People?

Artificial Intelligence (AI) assistants interact with people through a variety of sophisticated

mechanisms that facilitate intuitive and seamless communication. Here is a detailed explanation of how

these interactions occur:

1. Natural Language Understanding (NLU):

AI assistants employ NLU to comprehend and interpret human language input. This capability allows

them to understand spoken commands, text queries, and even complex sentences, parsing the meaning

and intent behind the user's words.


2. Speech Recognition:

Using advanced speech recognition technologies, AI assistants convert spoken words into text. This

process enables hands-free interaction, where users can dictate commands or queries without the

need for manual input.

3. Contextual Awareness:

AI assistants maintain contextual awareness during interactions, remembering previous interactions,

user preferences, and ongoing tasks. This allows them to provide relevant responses and anticipate user

needs based on the current context.

4. Task Execution and Automation:

Based on user commands and preferences, AI assistants execute tasks such as scheduling

appointments, setting reminders, sending messages, or controlling smart home devices. They automate

routine activities, enhancing user productivity and convenience.

5. Information Retrieval and Recommendations:

AI assistants access vast databases and real-time information sources to retrieve answers to queries,

provide updates on weather or news, and offer personalized recommendations for products or services

based on user preferences.


6. Conversational Interfaces:

AI assistants employ conversational interfaces that mimic human-like interactions, using natural

language responses and interactive dialogues to engage users. They can handle follow-up questions,

clarify ambiguities, and maintain coherent conversations.

7. Learning and Adaptation:

Through machine learning algorithms, AI assistants learn from user interactions to improve their

responses and adapt to individual preferences over time. They refine their understanding of user

behaviours and preferences, enhancing the accuracy and relevance of their interactions.

8. Multi-channel Integration:

AI assistants integrate seamlessly across multiple platforms and devices, ensuring consistent

interaction experiences. They can operate via smartphones, smart speakers, chatbots, and other digital

interfaces, maintaining continuity regardless of the user’s preferred device.

9. Feedback Mechanisms:

To enhance user satisfaction and effectiveness, AI assistants incorporate feedback mechanisms. They

solicit user input, track performance metrics, and adjust responses based on user feedback to

continuously improve interaction quality.


10. Privacy and Security:

AI assistants prioritize user privacy and data security. They implement encryption protocols,

anonymize data where necessary, and adhere to privacy regulations to safeguard user information

during interactions.

As AI technology advances, the interactions between AI assistants and people are expected to become

even more nuanced and responsive. Future developments may include enhanced emotional

intelligence, proactive assistance anticipating user needs, and improved capabilities in understanding

diverse languages and accents. These advancements will further blur the line between human-like

interaction and digital assistance, making AI assistants indispensable tools for everyday tasks and

professional applications alike. This detailed explanation outlines how AI assistants leverage

sophisticated technologies to interact effectively with users, enhancing convenience, productivity, and

personalized assistance in various aspects of daily life.


CHAPTER 5

IMPLEMENTATION

5.1 Building a Personal Voice Assistant

If the available personal voice assistants do not perform all the tasks you want them to, it is possible to build your own. For a text-based personal voice assistant, you do not even need to know how to code. There are apps available to help people create assistants that can automate tasks or events.

Creating a voice-activated personal voice assistant is much more difficult. That is where companies like Converse.AI come in. “We make it easier for non-developers to build and automate the services that they need. No coding experience is required,” Lucas says.

A text-based personal voice assistant automates tasks and interacts with customers. It can also help

answer questions for clients, access databases, and help customers help themselves. For more

information about customer self-service portals, many of which use personal voice assistants, read

"Customer Service Portals: Help Your Users Help Themselves."

If you choose to create a personal voice assistant, make sure it is representative of your brand. Also,

make sure it works, since technology will not do your business any good if it does not help customers.

“The danger is that people will try it, it won’t work, and they won’t go back,” Mutchler warns. She

mentions Samsung’s Bixby, which debuted on the Galaxy S8 phone but was not fully functional when


it came out. Many customers tried it a few times, and then asked Samsung to develop a way to disable

it, which they did in a software update.

Here are some other elements to consider when building a personal voice assistant:

• Remember the end user.

• Choose useful features.

• Give it personality.

• Integrate it with various platforms.

Building a personal voice assistant takes time, so it is better not to rush it. Focus on doing a few things

extraordinarily well instead of trying to do many things (and, therefore, doing them unsuccessfully).

Also, remember to update the personal voice assistant, as necessary. It is not a “build it and leave it”

venture.


5.2 Dependencies and Requirements

System requirements: Python 3.7, PyCharm IDE, Windows 10

Install these Python libraries (webbrowser, smtplib, random, datetime, os, and sys are part of the Python standard library and do not need a separate install):

pip install SpeechRecognition
pip install pyttsx3
pip install PyAudio
pip install wikipedia
pip install wolframalpha


5.3 Let Us Start Building Our Personal Voice Assistant Using Python

import speech_recognition as sr
import wolframalpha
import pyttsx3
import wikipedia
import webbrowser
import smtplib
import random
import random2
import math
import warnings
import os
import sys
import serial
import time
import datetime
import randfacts
import pyjokes

from YT_auto import music                    # project module for YouTube automation
from selenium_web_driver import inforr       # project module for Wikipedia lookups
from News import news                        # project module that returns headlines
from weather import temp, des                # project module for temperature and description
from search import sear                      # project module for Google search

engine = pyttsx3.init('sapi5')                             # text-to-speech engine used by speak()
client = wolframalpha.Client('YOUR_WOLFRAMALPHA_APP_ID')   # placeholder app id
arduino = serial.Serial(port='COM3', baudrate=115200, timeout=.1)

For our voice assistant to perform all the above-discussed features, we must code the logic of each of

them in one method.



So, our first step is to create a method that will interpret the user voice response.

def myCommand():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 1
        audio = r.listen(source)
    try:
        query = r.recognize_google(audio, language='en-in')
        print('User: ' + query + '\n')
    except sr.UnknownValueError:
        speak('Sorry sir! I didn\'t get that! Try typing the command!')
        query = str(input('Command: '))
    return query

Next, create a method that will convert text to speech.


def speak(audio):
    print('Computer: ' + audio)
    engine.say(audio)
    engine.runAndWait()

Now create a loop to keep executing commands. Inside the loop, the command returned by myCommand() is matched against the supported keywords and the corresponding branch is executed.

while True:
    query = myCommand()
    query = query.lower()

    if 'open youtube' in query:        # keywords are compared in lower case, since query is lowered above
        speak('okay')
        webbrowser.open('www.youtube.com')

    elif 'open google' in query:
        speak('okay')
        webbrowser.open('www.google.co.in')

    elif 'open gmail' in query:
        speak('okay')
        webbrowser.open('www.gmail.com')

    elif "what's up" in query or 'how are you' in query:
        stMsgs = ['Just doing my thing!', 'I am fine!', 'Nice!', 'I am nice and full of energy']
        speak(random.choice(stMsgs))

    elif 'email' in query:
        speak('Who is the recipient? ')
        recipient = myCommand()
        if 'myself' in recipient:
            try:
                speak('What should I say? ')
                content = myCommand()
                server = smtplib.SMTP('smtp.gmail.com', 587)
                server.ehlo()
                server.starttls()
                server.login("jackie.61093@gmail.com", 'password')
                server.sendmail('jackie.61093@gmail.com', "vjajodiya6@gmail.com", content)
                server.close()
                speak('Email sent!')
            except:
                speak('Sorry Sir! I am unable to send your message at this moment!')

    elif 'nothing' in query or 'abort' in query or 'stop' in query:
        speak('okay')
        speak('Bye Sir, have a good day.')
        sys.exit()

    elif 'hello' in query:
        speak('Hello Sir')

    elif 'bye' in query:
        speak('Bye Sir, have a good day.')
        sys.exit()

    elif 'play music' in query:
        music_folder = 'Your_music_folder_path'      # replace with the path to your music folder
        music = ['music1', 'music2', 'music3', 'music4', 'music5']
        random_music = music_folder + random.choice(music) + '.mp3'
        os.system(random_music)
        speak('Okay, here is your music! Enjoy!')

    else:
        speak('Searching...')
        try:
            try:
                res = client.query(query)
                results = next(res.results).text
                speak('WOLFRAM-ALPHA says - ')
                speak('Got it.')
                speak(results)
            except:
                results = wikipedia.summary(query, sentences=2)
                speak('Got it.')
                speak('WIKIPEDIA says - ')
                speak(results)
        except:
            webbrowser.open('www.google.com')

    speak('Next Command! Sir!')

Our next step is to create multiple if statements corresponding to each of the features. The following listing shows how these small modules are built inside an if statement for each command.

warnings.filterwarnings("ignore")

engine = pyttsx3.init()
rate = engine.getProperty('rate')
engine.setProperty('rate', 150)
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)


def speak(text):
    engine.say(text)
    engine.runAndWait()


def wishme():
    hour = int(datetime.datetime.now().hour)
    if hour > 0 and hour < 12:
        return "Morning"
    elif hour >= 12 and hour < 16:
        return "Afternoon"
    elif hour >= 16 and hour < 19:
        return "evening"
    else:
        return "night"

def quitApp():
    hour = int(datetime.datetime.now().hour)
    if hour >= 3 and hour < 18:
        print("have a good day sir")
        speak("have a good day sir")
    else:
        print("Goodnight, sir")
        speak("Goodnight, sir")
    print("Offline")
    exit(0)


def write_read(x):
    arduino.write(bytes(x, 'utf-8'))
    time.sleep(0.05)
    data = arduino.readline()
    return data


# flags
Light_status_flag = False
today_date = datetime.datetime.now()

r = sr.Recognizer()
speak("Tell the wake-up word")
wake = "hello Nova"
with sr.Microphone() as source:
    r.energy_threshold = 10000
    r.adjust_for_ambient_noise(source, 1.2)
    print("Listening")
    audio = r.listen(source)
wakeword = r.recognize_google(audio)
print(wakeword)

if wake == wakeword:
    # Greet the user once the wake word is heard.
    speak("hello sir, good " + wishme() + ", i'm here to assist you.")
    speak("How are you")
    with sr.Microphone() as source:
        r.energy_threshold = 10000
        r.adjust_for_ambient_noise(source, 1.2)
        print("Listening")
        audio = r.listen(source)
    text = r.recognize_google(audio)
    print(text)
    # Each word must be checked against the recognized text individually.
    if "what" in text and "about" in text and "you" in text:
        speak("I am also having a good day")

if __name__ == "__main__":
    while True:
        speak("What can i do for you??")
        with sr.Microphone() as source:
            r.energy_threshold = 10000
            r.adjust_for_ambient_noise(source, 1.2)
            print('Listening .... ')
            audio = r.listen(source)
        text2 = r.recognize_google(audio)

        if "information" in text2:
            speak("You need information related to which topic")
            with sr.Microphone() as source:
                r.energy_threshold = 10000
                r.adjust_for_ambient_noise(source, 1.2)
                print('Listening ..... ')
                audio = r.listen(source)
            infor = r.recognize_google(audio)
            speak("Searching {} in Wikipedia".format(infor))
            print("Searching {} in Wikipedia".format(infor))
            assist = inforr()
            assist.get_info(infor)

elif "play" and "video" in text2: speak ("Which video you want me to play??")

with sr.Microphone() as source:

r.energy_threshold = 10000 r.adjust_for_ambient_noise(source, 1.2)

print('Listening ..... ') audio = r.listen(source)

vid = r.recognize_google(audio) speak ("Playing {} on YouTube".format(vid))

print ("Playing {} on youtube". format(vid)) assist = music ()

assist. Play(vid)

elif "news" in text2: speak ("Sure sir, Now I will read news for you")

arr = news () for i in range(len(arr)):

print(arr[i]) speak((arr[i]))

elif "temperature" in text2:

speak ("Temperature in Chennai is" + str (temp ()) + " degree celcius" + " and with " + str (des ()))

print ("Temperature in Chennai is" + str (temp ()) + " degree celcius" + " and with " + str (des ()))

85 | P a g e
Personal Voice Assistant

elif "funny" in text2:

speak ("Get ready for some chuckles") joke = pyjokes.get_joke()

speak(joke) print(joke)

elif "your name" in text2:

speak ("My name is Next genn Optimal Voice Assistant Nova")

elif "fact" in text2:

speak ("Sure sir, ")

x = randfacts.getFact() speak ("Did you know that" + x)

print(x)

elif "search" in text2: speak ("What should i search for sir")

with sr.Microphone() as source: r.energy_threshold = 10000

r.adjust_for_ambient_noise(source, 1.2) print('Listening ..... ')

audio = r.listen(source) searc = r.recognize_google(audio)

speak ("Searching {} in Google".format(searc)) print ("Searching {} in Google".format(searc))

asist = sear () asist.get_infoo(searc)

elif "game" in text2: speak ("enter your lower limit sir")

with sr.Microphone() as source: r.energy_threshold = 10000

86 | P a g e
Personal Voice Assistant

r.adjust_for_ambient_noise(source, 1.2) print('Listening ..... ')

audio = r.listen(source) lower = int(r.recognize_google(audio))

speak ("now, Enter your upper limit") with sr.Microphone() as source:

r.energy_threshold = 10000 r.adjust_for_ambient_noise(source, 1.2)

print('Listening ..... ') audio = r.listen(source)

upper = int(r.recognize_google(audio)) x = random2.randint(lower, upper)

speak ("\n\tYou've only " + str (round (math.log (upper - lower + 1, 2))) + " chances to guess the

integer! \n") print ("\n\tYou've only " + str (round (math.log (upper - lower + 1, 2))) + " chances to

guess the integer! \n" + str(upper), str(lower))

count = 0 while count < math.log (upper - lower + 1, 2):

count += 1 speak ("start guessing")

speak ("Guess a number")

with sr.Microphone() as source:

r.energy_threshold = 10000 r.adjust_for_ambient_noise(source, 1.2)

print('Listening ..... ') audio = r.listen(source)

guess = int(r.recognize_google(audio)) if x == guess:

print ("Congratulations you did it in " + str(count) + " try") speak ("Congratulations you did it in " +

str(count) + " try")


87 | P a g e
Personal Voice Assistant

break elif x > guess:

print ("You guessed too small!") speak ("You guessed too small!")

elif x < guess: print ("You Guessed too high!")

speak ("You Guessed too high!") if count >= math.log (upper - lower + 1, 2):

print ("\nThe number is %d" % x) speak ("\nThe number is %d" % x)

print ("\tBetter Luck Next time!") speak ("\tBetter Luck Next time!")

elif "reboot the system" in text2:

speak ("Do you wish to restart your computer?") with sr.Microphone() as source:

r.energy_threshold = 10000 r.adjust_for_ambient_noise(source, 1.2)

print('Listening ..... ') audio = r.listen(source)

restart = r.recognize_google(audio) elif "light off" in text2:

#If Light_status_flag == True: cmd = "OFF"

Status = write_read(cmd) speak ("Lights are turned off")

#elif Light_status_flag == False: elif "stop" or "exit" or "end" in text2:

speak ("It's a pleasure helping you and I am always here to help you out!") quitApp()


5.3.1 SCREENSHOT




5.4 Flow-chart


5.5 DATA FLOW DIAGRAM


CHAPTER 6

RESULT & ANALYSIS


The project work of the voice assistant has been clearly explained in this report: how useful it is, how we can rely on a voice assistant to perform any task the user needs to complete, and how such assistants are developing every day, so we can hope this will become one of the biggest technologies in the current technological world. Development of the software is almost complete from our side, and it is working as expected; some extra development was also discussed, so further advancements may come shortly that will make the assistant we developed even more useful than it is now.

6.1 Working

It starts with a signal word. Users say the names of their voice assistants for the same reason. They

might say, “Hey Siri!” or simply, “Alexa!” Whatever the signal word is, it wakes up the device. It

signals to the voice assistant that it should begin paying attention. After the voice assistant hears its

signal word, it starts to listen. The device waits for a pause to know you have finished your request.

The voice assistant then sends our request over to its source code. Once in the source code, our request

is compared to other requests. It is split into separate commands that our voice assistant can understand.

The source code then sends these commands back to the voice assistant. Once it receives the

commands, the voice assistant knows what to do next. If it understands, the voice assistant will carry

out the task we asked for. For example, “Hey NOVA! What is the weather?” NOVA reports back to

us


in seconds. The more directions the devices receive, the better and faster they get at fulfilling our

requests. The user gives voice input through the microphone; the assistant, triggered by the wake-up word, performs STT (Speech to Text) to convert the input into text, understands the voice input, performs the task requested by the user, and delivers the result via the TTS (Text to Speech) module using an AI voice. These are the key features of the voice assistant, but beyond this, we can do plenty of things with the assistant.

List of features that can be done with the assistant:
- Playing a video which the user wants to see.
- Telling some random fact at the start of the day, so that the user can do their work in an informative way and also learn something new.
- Playing a game, a feature present in almost every assistant, so that the user can spend their free time in a fun way.
- Turning the system off by voice: users might forget to turn off a system that contains useful data, but with a voice assistant they can do that even after leaving the place where the system is, just by commanding the assistant to turn the system off.

As discussed, the mandatory features to be listed in a voice assistant are implemented in this work; a brief explanation is given below.

API CALLS: We have used API keys for getting news information from News API and weather forecasts from OpenWeatherMap, which can accurately fetch information and give results to the user.

SYSTEM CALLS: In this feature, we have used the OS and webbrowser modules to access the desktop, calculator, task manager, command prompt, and user folder. This can also restart the PC and open the Chrome application.

CONTENT EXTRACTION: This can perform content extraction from YouTube, Wikipedia, and Chrome using the webdriver module from Selenium, which provides all the implementations for the web driver, such as searching for a specific video to play or getting a piece of specific information from Google or Wikipedia.
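For illustration, the system calls described above boil down to standard-library calls of the following kind; the paths shown are Windows examples, not the project's exact configuration.

import os
import webbrowser

webbrowser.open("https://www.google.com")         # open the default browser
os.startfile("C:\\Windows\\System32\\calc.exe")   # launch the calculator (Windows only)
os.system("start cmd")                            # open a command prompt
# os.system("shutdown /r /t 5")                   # restart the PC after 5 seconds (kept commented out)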



Fig Workflow model

1) Must provide the user with any information they ask for: The user might need information that is available on the internet, but searching for it and reading through it takes a lot of time. With the help of a voice assistant, we can get that information much faster than by searching and reading ourselves. This is a small proof that a voice assistant helps the user save time.

2) Telling the day's hot news in the user's location: Ordinarily, watching a news channel just to learn the important news in one's location takes a lot of time, and the user may have to sit through unnecessary news, or news from other locations, before getting to what they actually want, which demands a lot of patience. A voice assistant removes all of that: it gives the news of the location the user wants, or the specific news they want to know.

3) Telling a joke to lighten the moment: Let us be honest, everyone has had at least one moment in their life when they were tense or had an argument with people close to them. Such moments can be defused, at least a little, by a random joke that might cool things down or stop the fight. As the saying goes, "Laughter is the best medicine."

4) Opening the file/folder that the user wants: In this busy world, everything should be done quickly, otherwise our schedule gets disturbed, and sometimes we need assistance from someone to complete a task on time. With a voice assistant, that task can be completed right away in a hassle-free way. For example, say the user is working on some documentation and, after a while, needs a file for reference; going off to search for that file wastes a lot of time, and the user may end up missing the deadline. With a voice assistant, the searching part is done quickly by commanding the assistant to open the folder. This makes it one of the key features of a voice assistant.
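
The sketch below illustrates how such file, folder, and system commands can be handled. It is only an illustration, assuming a Windows machine (os.startfile and the shutdown command are Windows-specific); the folder path is a placeholder.

import os
import webbrowser

def open_folder(path=r"C:\Users\Public\Documents"):     # placeholder path
    # Open a file or folder in its default application / an Explorer window.
    os.startfile(path)

def open_chrome(url="https://www.google.com"):
    # Open a URL in the default web browser.
    webbrowser.open(url)

def shutdown_pc(delay_seconds=10):
    # Schedule a shutdown so the system can be turned off even after leaving the desk.
    os.system("shutdown /s /t " + str(delay_seconds))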

5) Telling the temperature/weather at the user's location: Let us start with a question: why is it important to know the day's weather, or to monitor the weather every day? The answer is simple: it forewarns the user, for example, "It might rain today, so carry an umbrella if you go out" or "It will be a sunny day, so wear sunglasses." So this, too, is a must-have feature.
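
The sketch below illustrates this lookup against OpenWeatherMap's current-weather endpoint; it assumes a valid API key, and the city name is only an example.

import requests

def current_weather(api_key, city="Udupi"):
    # Query OpenWeatherMap for the current weather and build a spoken-style summary.
    url = "https://api.openweathermap.org/data/2.5/weather"
    params = {"q": city, "appid": api_key, "units": "metric"}
    data = requests.get(url, params=params, timeout=10).json()
    temperature = data["main"]["temp"]
    description = data["weather"][0]["description"]
    return f"It is {temperature} degrees Celsius in {city} with {description}."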

6) Searching for what the user asks: Today, in the 21st century, we often get doubts, and we need to clear them as soon as possible, otherwise that one doubt multiplies into many. Searching the question on the internet gives us an answer and clears the doubt, and asking the assistant to do the search saves a lot of time. Beyond clearing doubts, we also need to look up many topics and questions on the internet to keep up with current trends, and we can do this simply by commanding the assistant to search for a specific topic or question.
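
A short sketch of this search feature is shown below; it assumes the built-in webbrowser module for Google searches and the third-party wikipedia package for quick spoken answers.

import webbrowser
import wikipedia

def google_search(query):
    # Open the Google results page for the query in the default browser.
    webbrowser.open("https://www.google.com/search?q=" + query.replace(" ", "+"))

def quick_answer(query, sentences=2):
    # Return a short Wikipedia summary that the assistant can read aloud.
    try:
        return wikipedia.summary(query, sentences=sentences)
    except wikipedia.exceptions.WikipediaException:
        return "Sorry, I could not find a quick answer for that."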

6.2 PROS

AI assistants offer numerous advantages across various domains, significantly enhancing productivity

and user experience. Some of the key benefits include:

1. Efficiency and Speed: AI assistants can process and analyse large volumes of data quickly, providing

accurate responses and completing tasks in a fraction of the time it would take a human.

2. 24/7 Availability: Unlike human counterparts, AI assistants are available around the clock, offering consistent support without the need for breaks, sleep, or time off.

3. Personalization: By learning from user interactions, AI assistants can offer personalized

recommendations and responses, improving the overall user experience.

4. Cost-Effectiveness: Deploying AI assistants can reduce operational costs, as they can handle

repetitive and mundane tasks, allowing human employees to focus on more complex and strategic

activities.

5. Scalability: AI assistants can easily scale to handle a growing number of tasks or users without a

significant increase in resources or costs.


6. Multitasking: They can manage multiple queries or tasks simultaneously, ensuring that various needs

are met efficiently and promptly.

7. Consistency and Accuracy: AI assistants provide consistent responses and minimize the risk of

human errors, ensuring reliable and accurate information is delivered.

8. Accessibility: They can assist users with disabilities by providing voice-activated commands and

responses, making technology more accessible to a broader audience.

9. Data Insights: AI assistants can analyze user interactions and gather valuable data insights, helping

businesses understand customer behaviour and preferences better.

10. Language Support: Many AI assistants can understand and respond in multiple languages, breaking

down language barriers and providing support to a diverse user base.

Overall, AI assistants enhance productivity, improve user satisfaction, and provide valuable support in

various personal and professional contexts.

6.3 CONS

While AI assistants offer numerous benefits, they also come with several drawbacks that need to be

considered:

1. Privacy Concerns: AI assistants often require access to personal data to function effectively, raising

concerns about data privacy and the potential misuse of sensitive information.

2. Security Risks: The data handled by AI assistants can be vulnerable to hacking and cyberattacks,

posing significant security risks.



3. Dependence on Technology: Over-reliance on AI assistants can lead to reduced human skill

development and dependency on technology for even simple tasks.

4. Lack of Emotional Intelligence: AI assistants, despite advancements, cannot fully replicate human

empathy, emotional understanding, or nuanced interpersonal communication.

5. Job Displacement: The automation of tasks by AI assistants can lead to job displacement, particularly

for roles involving repetitive or routine tasks, affecting employment in certain sectors.

6. Bias and Fairness Issues: AI systems can inherit biases present in their training data, leading to unfair

or discriminatory outcomes in their responses and decision-making processes.

7. Inaccuracy and Limitations: AI assistants may provide incorrect or misleading information if they

misinterpret queries or rely on outdated or inaccurate sources.

8. Complexity in Troubleshooting: Technical issues with AI systems can be complex to diagnose and

resolve, requiring specialized knowledge and resources.

9. Limited Understanding: Despite their advanced capabilities, AI assistants still struggle with

understanding context, sarcasm, idioms, and complex human emotions, which can lead to

misunderstandings.

10. High Initial Costs: Developing and implementing sophisticated AI systems can be expensive,

requiring significant investment in technology and infrastructure.

11. Ethical Concerns: The deployment of AI assistants raises ethical questions about the extent of their

control and decision-making abilities, as well as the transparency of their operations.


Overall, while AI assistants provide significant advantages, it is crucial to address these challenges to

ensure their responsible and beneficial integration into society.

6.4 Advantages of Artificial Intelligence in Personal Voice Assistant

Artificial Intelligence (AI) significantly enhances the capabilities of personal voice assistants, offering

a range of advantages that improve user experience and functionality:

1. Natural Language Processing (NLP): AI enables personal voice assistants to understand and

interpret natural language, allowing users to interact using conversational speech rather than predefined

commands. This makes interactions more intuitive and user-friendly.

2. Personalization: AI-driven voice assistants can learn from user interactions and preferences,

providing personalized responses and recommendations. They can remember user habits, preferences,

and schedules, tailoring their assistance to individual needs.

3. Contextual Understanding: Advanced AI allows voice assistants to understand context, which

improves their ability to handle complex queries and follow-up questions. This contextual awareness

leads to more accurate and relevant responses.

4. Automation and Task Management: AI voice assistants can automate routine tasks such as setting

reminders, sending messages, managing calendars, and controlling smart home devices. This

automation saves time and simplifies daily activities.


5. Accessibility: Voice assistants powered by AI provide valuable support for individuals with

disabilities, offering hands-free control of devices and enabling users with visual or motor impairments

to access information and services more easily.

6. Continuous Improvement: AI algorithms enable voice assistants to continuously learn and improve

from user interactions. This ongoing learning process ensures that the assistant becomes more efficient

and effective over time.

7. Multilingual Support: AI enables voice assistants to understand and respond in multiple languages,

catering to a diverse user base and breaking down language barriers.

8. Real-Time Information Retrieval: AI voice assistants can quickly retrieve information from the

internet, and provide real-time updates on news, weather, traffic, and other relevant data, enhancing

the user’s ability to stay informed.

9. Seamless Integration with Services and Devices: AI allows voice assistants to integrate seamlessly

with various third-party services and smart devices, creating a cohesive and interconnected user

experience across different platforms and technologies.

10. Enhanced Security Features: AI can improve the security of voice assistants through features like

voice recognition and biometric authentication, ensuring that only authorized users can access certain

functionalities.

Overall, AI enhances personal voice assistants by making them more intelligent, responsive, and

capable of providing valuable support in everyday tasks.


6.5 Disadvantages of Artificial Intelligence in Personal Voice Assistant

1. High Cost:

The creation of artificial intelligence involves huge costs, as these are highly complex systems, and their repair and maintenance are also expensive.

Their software programs need frequent upgrades to cater to the needs of a changing environment and to make the machines smarter by the day.

In case of severe breakdowns, the procedure to recover lost code and reinstate the system can require a great deal of time and cost.

2. No Replicating Humans:

Intelligence is believed to be a gift of nature. An ethical argument continues, whether human

intelligence is to be replicated or not.

Machines do not have any emotions or moral values. They perform what they are programmed to do and cannot judge right from wrong. They cannot even make decisions when they encounter an unfamiliar situation; they either perform incorrectly or break down in such situations.

3. No Improvement with Experience:

Unlike humans, artificial intelligence cannot be improved with experience. With time, it can lead to

wear and tear. It stores a lot of data but the way it can be accessed and used is quite different from

human intelligence.


Machines are unable to alter their responses to changing environments. We are constantly bombarded

by the question of whether it is exciting to replace humans with machines.

In the world of artificial intelligence, there is nothing like working with a whole heart or passionately.

Care or concerns are not present in the machine intelligence dictionary. There is no sense of belonging

or togetherness or a human touch. They fail to distinguish between a hardworking individual and an

inefficient individual.

4. No Original Creativity:

Do you want creativity or imagination?

These are not the forte of artificial intelligence. While it can help you design and create, it is no match for the thinking power of the human brain or the originality of a creative mind.

Human beings are highly sensitive and emotional intellectuals. They see, hear, think, and feel, and their thoughts are guided by feelings, something machines completely lack. The inherent intuitive abilities of the human brain cannot be replicated.

5. Unemployment:

The replacement of humans with machines can lead to large-scale unemployment.

Unemployment is a socially undesirable phenomenon. People with nothing to do can lead to the

destructive use of their creative minds.


Humans can become unnecessarily dependent on machines if the use of artificial intelligence becomes rampant; they will lose their creative power and become lazy. Also, if humans start thinking destructively, they can create havoc with these machines.

Artificial intelligence in the wrong hands is a serious threat to humankind in general. It may lead to

mass destruction. Also, there is a constant fear of machines taking over or superseding humans.

Based on the above discussion, the Association for the Advancement of Artificial Intelligence has two

objectives – to develop and advance the science of artificial intelligence and to promote and educate

about the responsible usage of artificial intelligence.

Identifying and studying the risk of artificial intelligence is an especially important task at hand. This

can help in resolving the issues at hand. Programming errors or cyber-attacks need more dedicated and

careful research. Technology companies and the technology industry as a whole need to pay more

attention to the quality of the software. Everything that has been created in this world and our societies

is the continuous result of intelligence.

Artificial intelligence augments and empowers human intelligence. So as long we are successful in

keeping the technology beneficial, we will be able to help this human civilization.


CHAPTER 7

CONCLUSION
Voice Search has now become a definitive mobile experience. An absence of knowledge and learning

makes it especially tough for organizations to develop a strategy for voice search. There is a ton of

chance for a lot further and significantly more conversational experiences with users for AI in mobile

app development.

A great many people are looking for a way to make multitasking more effective, which makes speech-to-text an ideal feature. Voicing content instead of typing it is likewise attractive to people who prefer not to type. With an error rate of only about 8%, voice search will change how people search over the internet.

Personal assistant software improves user productivity by managing routine tasks of the user and by

providing information from online sources to the user. As discussed earlier, technologies such as web

services, sharing of data, linked data, shared ontologies, knowledge databases, and mobile devices are

proving to be enablers for tools such as personal assistant software.

Building an agent that can replace a human assistant has been a holy grail for the software industry,

especially in the field of artificial intelligence. Difficulties associated with capturing human

intelligence in models that can be used to drive the agent have been one of the primary bottlenecks in

building such agents. The availability of data in semantic form, where the data carries itself the meaning

and data sources are interlinked with each other, provides an opportunity to first capture human


knowledge in this form and then apply reasoning engines that can interpret these models to make

inferences for simple tasks.

This project presents a comprehensive overview of the design and development of a voice-enabled personal assistant using the Python programming language. In today's lifestyle, such a voice-enabled personal assistant is far more effective at saving time than the approaches of earlier days. This personal assistant has been designed with ease of use as its main feature, and it performs the tasks given by the user properly. Furthermore, there are many things this assistant can do, such as turning our PC off, restarting it, or reciting the latest news, with just one voice command.


CHAPTER 8

FUTURE ENHANCEMENTS
We are entering the era of implementing voice-activated technologies to remain relevant and

competitive. Voice-activation technology is vital not only for businesses to stay relevant with their

target customers, but also for internal operations. Technology may be utilized to automate human

operations, saving time for everyone. Routine operations, such as sending basic emails or scheduling

appointments, can be completed more quickly, with less effort, and without the use of a computer, just

by employing a simple voice command. People can multitask as a result, enhancing their productivity.

Furthermore, relieving employees from hours of tedious administrative tasks allows them to devote

more time to strategy meetings, brainstorming sessions, and other jobs that need creativity and human

interaction.

1) Sending Emails with a voice assistant:

Emails, as we all know, are very crucial for communication because they can be used for any

professional contact, and the finest service for sending and receiving emails is, as we all know, GMAIL.

Gmail is a Google-created free email service. Gmail can be accessed over the web or using third-party

apps that use the POP or IMAP protocols to synchronize email content.

To integrate Gmail with the voice assistant, we must utilize the Gmail API, which allows us to access and control the threads, messages, and labels in a Gmail mailbox.
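
The enhancement above proposes the Gmail API; as a simpler illustrative alternative, the sketch below sends a mail through Gmail's SMTP server using Python's built-in smtplib, assuming an app password generated for the account (the addresses shown are placeholders).

import smtplib
from email.message import EmailMessage

def send_email(sender, app_password, recipient, subject, body):
    # Compose and send a plain-text email through Gmail's SMTP server.
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(sender, app_password)
        server.send_message(message)

# Example usage:
# send_email("me@gmail.com", "APP_PASSWORD", "friend@example.com",
#            "Meeting", "Reminder: project review at 4 pm.")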


2) Scheduling appointments using a voice assistant:

The demands on our time increase as our company grows. A growing number of people want to meet

with us. We have a growing number of people who rely on us. We must check in on certain projects or

set aside time to chat with business leads. There will not be enough hours in the day if we keep doing

things the old way.

We need to get a better handle on our full-time schedule and devise a strategy for arranging

appointments that does not interfere with our most critical work. By working with a virtual scheduler or, in other words, a virtual assistant, we let someone else worry about organizing and prioritizing our schedule while we focus on the work.

3) Improved Interface of a voice assistant (VUI):

Voice user interfaces (VUIs) allow users to interact with a system by speaking commands. VUIs

include virtual assistants like Amazon's Alexa and Apple's Siri. The real advantage of a VUI is that it

allows users to interact with a product without using their hands or their eyes while focusing on

anything else.

-Other benefits of a Voice user interface (VUI):

Speed and Efficiency:

Hands-free interactions are possible with VUIs. This method of interaction eliminates the need to click

buttons or tap on the screen. The major means of human communication is speech. People have been

using speech to form relationships for ages. As a result, solutions that allow customers to do the same


are extremely valuable. Furthermore, even for experienced texters, dictating text messages has been

demonstrated to be faster than typing. Hands-free interactions, at least in some circumstances, save

time and boost efficiency.

Intuitiveness and convenience:

Intuitive user flow is required of high-quality VUIs, and technical advancements are expected to

continue to improve the intuitiveness of voice interfaces. Compared to graphical UIs (User Interface),

VUIs require less cognitive effort from the user. Furthermore, everyone – from a small child to your

grandmother – can communicate. As a result, VUI designers are in a better position than GUI designers,

who run the danger of producing incomprehensible menus and exposing users to the agony of poor

interface design. Customers are unlikely to need instruction from VUI makers on how to use the technology; people can instead simply ask their voice assistant for help.

Another promising enhancement is the seamless integration with a broader range of smart devices and

services. As the Internet of Things (IoT) continues to expand, voice assistants will become central hubs

for controlling and interacting with various connected devices, from home automation systems to

wearable technology. Enhanced interoperability and the development of standardized communication

protocols will facilitate this integration.

Additionally, improvements in multi-modal interaction will enable voice assistants to combine voice,

text, and visual inputs for a richer and more interactive user experience. This capability will be

particularly beneficial in scenarios where visual information complements verbal commands, such as

navigation assistance or technical support.


In summary, the future enhancements of personal voice assistants will focus on improving NLU

(Natural Language Understanding) and NLG capabilities, personalization, seamless integration with

IoT devices, multi-modal interaction, and robust privacy and security measures. These advancements

will collectively contribute to creating more intelligent, responsive, and user-friendly voice assistants.


CHAPTER 9

BIBLIOGRAPHY

1. Agrawal, H., Singh, N., Kumar, G., Yagyasen, D., & Singh, S. V. (2021). Voice Assistant Using Python.

IJIRT, 8(2), 419–423.

2. Buhalis, D., & Moldavska, I. (2021). In-room Voice-Based AI Digital Assistants Transforming On-Site

Hotel Services and Guests’ Experiences. Information and Communication Technologies in Tourism

2021, 10(6), 30–44. https://doi.org/10.1007/978-3-030-65785-7_3

3. Dellaert, B. G. C., Shu, S. B., Arentze, T. A., Baker, T., Diehl, K., Donkers, B., Fast, N. J., Häubl, G.,

Johnson, H., Karmarkar, U. R., Oppewal, H., Schmitt, B. H., Schroeder, J., Spiller, S. A., & Steffel,

M. (2020). Consumer decisions with artificially intelligent voice assistants. Marketing Letters, 31.

https://doi.org/10.1007/s11002-020-09537-5

4. Geetha, Gomathy, Kottamasu, Manasa, & Nukala. (2021). The Voice Enabled Personal Assistant for Pc

using Python. International Journal of Engineering and Advanced Technology, 10(D2425.0410421),

162–165.

5. Krishnaraj, Faris, M., & Rajesh. (2021). Portable Voice Recognition with GUI Automation. IJIRT, 9(6),

20–23.

6. Paul, R., & Mukhopadhya, N. (2021). A Novel Python-based Voice Assistance System for reducing the

Hardware Dependency of Modern Age Physical Servers. IRJET, 8(5), 1425–1431.

7. Sayyed, A., Shaikh, A., Sancheti, A., Sangamnere, S., & H Bhangale, J. (2021). Desktop Assistant AI

Using Python. International Journal of Advanced Research in Science, Communication and

Technology, 6(2), 2581–2942.

8. Sprengholz, P., & Betsch, C. (2021). Ok Google: Using virtual assistants for data collection in psychological and behavioral research. Behavior Research Methods, 54(3), 1554–3528. https://doi.org/10.3758/s13428-021-01629-y

