ARTIFICIAL INTELLIGENCE

NEP- SEC
FOURTH SEM BBA

Prepared By

Dr. Subbulakshmi. S

Assistant Professor

Department of Commerce and Management

Dayananda Sagar College of Arts, Science and Commerce

Kumaraswamy Layout, Bengaluru-560111

UNIT - 1
AZURE AI FUNDAMENTALS

Artificial Intelligence
Artificial intelligence (AI) refers to the simulation of human intelligence in machines that are
programmed to think like humans and mimic their actions. The term may also be applied to
any machine that exhibits traits associated with a human mind such as learning and problem-
solving. The ideal characteristic of artificial intelligence is its ability to rationalize and take
actions that have the best chance of achieving a specific goal.

A subset of artificial intelligence is machine learning (ML), which refers to the concept that
computer programs can automatically learn from and adapt to new data without being
assisted by humans.

The goals of artificial intelligence include computer-enhanced learning, reasoning, and
perception. AI is being used today across different industries, from finance to healthcare.
Weak AI tends to be simple and single-task oriented, while strong AI performs tasks that are
more complex and human-like.

Types of AI
Weak artificial intelligence embodies a system designed to carry out one particular job. Weak
AI systems include video games and personal assistants such as Amazon's Alexa and Apple's
Siri.

Strong artificial intelligence systems carry out tasks considered to be human-like. These tend
to be more complex systems. They are programmed to handle situations in which they may be
required to solve problems without having a person intervene. These kinds of systems can be
found in applications like self-driving cars or in hospital operating rooms.

AI in Business Management
• Spam filters
• Smart email categorisation
• Voice-to-text features
• Smart personal assistants, such as Siri and Google Now
• Automated responders and online customer support
• Process automation
• Sales and business forecasting
• Security surveillance
• Automated insights, especially for data-driven industries (e.g. financial services or e-commerce)
• Smart searches and relevance features
• Personalisation as a service
• Product recommendations and purchase predictions
• Fraud detection and prevention for online transactions
• Dynamic price optimisation

AI in Marketing
• Recommendations and content curation
• Personalisation of news feeds
• Pattern and image recognition
• Language recognition - to digest unstructured data from customers and sales prospects
• Ad targeting and optimised, real-time bidding
• Customer segmentation
• Social semantics and sentiment analysis
• Automated web design
• Predictive customer service

What is AI?
AI is the creation of software that imitates human behaviours and capabilities. Key workloads
include:
• Machine learning - This is often the foundation for an AI system, and is the way we
"teach" a computer model to make predictions and draw conclusions from data.
• Anomaly detection - The capability to automatically detect errors or unusual activity
in a system.
• Computer vision - The capability of software to interpret the world visually through
cameras, video, and images.
• Natural language processing - The capability for a computer to interpret written or
spoken language, and respond in kind.
• Knowledge mining - The capability to extract information from large volumes of
often unstructured data to create a searchable knowledge store.

Machine Learning
How does ML work?
The answer is: from data. In today's world, we create huge volumes of data as we go
about our everyday lives. From the text messages, emails, and social media posts we send
to the photographs and videos we take on our phones, we generate massive amounts of
information. More data still is created by millions of sensors in our homes, cars, cities,
public transport infrastructure, and factories.
Data scientists can use all of that data to train machine learning models that can make
predictions and inferences based on the relationships they find in the data.

For example, suppose an environmental conservation organization wants volunteers to
identify and catalogue different species of wildflower using a phone app. The following
steps show how machine learning can be used to enable this scenario.

Steps in the ML wildflower identification model


• A team of botanists and scientists collect data on wildflower samples.
• The team labels the samples with the correct species.
• The labelled data is processed using an algorithm that finds relationships between the
features of the samples and the labelled species.
• The results of the algorithm are encapsulated in a model.
• When new samples are found by volunteers, the model can identify the correct species
label.
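The following is a minimal sketch of this supervised training pattern using scikit-learn. The flower measurements and species names are illustrative placeholders, not the actual dataset used by the app.

```python
from sklearn.ensemble import RandomForestClassifier

# Labelled samples collected by the botanists:
# each row is [petal_length_cm, petal_width_cm]; labels are species names.
features = [[4.7, 1.4], [4.5, 1.5], [1.4, 0.2], [1.3, 0.2], [5.9, 2.1], [5.6, 1.8]]
labels = ["versicolor", "versicolor", "setosa", "setosa", "virginica", "virginica"]

# The algorithm finds relationships between the features and the labelled species;
# the result is encapsulated in a trained model object.
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(features, labels)

# A new sample found by a volunteer can now be identified by the model.
new_sample = [[1.5, 0.3]]
print(model.predict(new_sample))  # expected: ['setosa']
```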

Machine Learning in MS Azure

Microsoft Azure provides the Azure Machine Learning service - a cloud-based platform for
creating, managing, and publishing machine learning models. Azure Machine Learning
provides the following features and capabilities:

• Automated machine learning: Enables non-experts to quickly create an effective machine
learning model from data.
• Azure Machine Learning designer: A graphical interface enabling no-code development of
machine learning solutions.
• Data and compute management: Cloud-based data storage and compute resources that
professional data scientists can use to run data experiment code at scale.
• Pipelines: Data scientists, software engineers, and IT operations professionals can define
pipelines to orchestrate model training, deployment, and management tasks.

Anomaly Detection
• Imagine you're creating a software system to monitor credit card transactions and
detect unusual usage patterns that might indicate fraud.

• Or an application that tracks activity in an automated production line and identifies
failures.
• Or a racing car telemetry system that uses sensors to proactively warn engineers about
potential mechanical failures before they happen.
• These kinds of scenarios can be addressed by using anomaly detection - a machine
learning-based technique that analyzes data over time and identifies unusual changes.
• Let's explore how anomaly detection might help in the racing car scenario.

Steps in anomaly detection


• Sensors in the car collect telemetry, such as engine revolutions, brake temperature, and so
on.
• An anomaly detection model is trained to understand expected fluctuations in the
telemetry measurements over time.
• If a measurement occurs outside of the normal expected range, the model reports an
anomaly that can be used to alert the race engineer to call the driver in for a pit stop to fix
the issue before it forces retirement from the race.
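A minimal sketch of the underlying idea, using a simple statistical threshold in place of a trained anomaly detection model; the telemetry values are illustrative assumptions.

```python
import numpy as np

# Historical brake-temperature telemetry (illustrative values, degrees C).
history = np.array([412, 418, 409, 421, 415, 417, 411, 420, 414, 416])

# "Train" on the expected fluctuations: mean and standard deviation.
mean, std = history.mean(), history.std()
lower, upper = mean - 3 * std, mean + 3 * std

# A new measurement outside the expected range is flagged as an anomaly.
new_reading = 468
if not (lower <= new_reading <= upper):
    print(f"Anomaly: {new_reading} outside expected range "
          f"({lower:.1f} to {upper:.1f}); alert the race engineer.")
```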

Computer Vision
 Computer Vision is an area of AI that deals with visual processing. Let's explore some of
the possibilities that computer vision brings.
 The Seeing AI app is a great example of the power of computer vision. Designed for the
blind and low vision community, the Seeing AI app harnesses the power of AI to open
up the visual world and describe nearby people, text and objects.
 Most computer vision solutions are based on machine learning models that can be
applied to visual input from cameras, videos, or images.

Computer Vision tasks

Common computer vision tasks covered later in these notes include image description and
tagging, image classification, object detection, face detection and analysis, and optical
character recognition (OCR).
Computer Vision Services in Microsoft Azure

• Computer Vision: Use this service to analyze images and video, and extract descriptions,
tags, objects, and text.
• Custom Vision: Use this service to train custom image classification and object detection
models using your own images.
• Face: The Face service enables you to build face detection and facial recognition solutions.
• Form Recognizer: Use this service to extract information from scanned forms and invoices.

Natural Language Processing


Natural language processing (NLP) is the area of AI that deals with creating software that
understands written and spoken language.
NLP enables you to create software that can:
• Analyse and interpret text in documents, email messages, and other sources.
• Interpret spoken language, and synthesize speech responses.
• Automatically translate spoken or written phrases between languages.
• Interpret commands and determine appropriate actions.


In Microsoft Azure, you can use the following cognitive services to build natural language
processing solutions.

• Language: Use this service to access features for understanding and analyzing text, training
language models that can understand spoken or text-based commands, and building
intelligent applications.
• Translator: Use this service to translate text between more than 60 languages.
• Speech: Use this service to recognize and synthesize speech, and to translate spoken
languages.
• Azure Bot Service: Provides a platform for conversational AI - the capability of a software
"agent" to participate in a conversation. Developers can use the Bot Framework to create a
bot and manage it with Azure Bot Service, integrating back-end services like Language, and
connecting to channels for web chat, email, Microsoft Teams, and others.
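A minimal sketch of calling the Language service's text analytics features from Python, assuming the azure-ai-textanalytics package and a resource endpoint and key you have created; the endpoint, key, and example documents are placeholders.

```python
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

# Placeholder endpoint and key from your Language resource in the Azure portal.
endpoint = "https://<your-resource>.cognitiveservices.azure.com/"
key = "<your-key>"

client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))

# Analyze sentiment for a batch of customer messages.
documents = ["The chatbot resolved my issue quickly.", "I waited an hour and got no reply."]
for result in client.analyze_sentiment(documents):
    if not result.is_error:
        print(result.sentiment, result.confidence_scores)
```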

Knowledge Mining
• Knowledge mining is the term used to describe solutions that involve extracting
information from large volumes of often unstructured data to create a searchable
knowledge store.
• Azure Cognitive Search can utilize the built-in AI capabilities of Azure Cognitive
Services such as image processing, content extraction, and natural language processing to
perform knowledge mining of documents.

Challenges and Risks in AI

• Bias can affect results: A loan-approval model discriminates by gender due to bias in the
data with which it was trained.
• Errors may cause harm: An autonomous vehicle experiences a system failure and causes a
collision.
• Data could be exposed: A medical diagnostic bot is trained using sensitive patient data,
which is stored insecurely.
• Solutions may not work for everyone: A home automation assistant provides no audio
output for visually impaired users.
• Users must trust a complex system: An AI-based financial tool makes investment
recommendations - what are they based on?
• Who's liable for AI-driven decisions?: An innocent person is convicted of a crime based on
evidence from facial recognition - who's responsible?

Responsible AI - Six Principles


1. Fairness
AI systems should treat all people fairly. For example, suppose you create a machine learning
model to support a loan approval application for a bank. The model should predict whether
the loan should be approved or denied without bias. This bias could be based on gender,
ethnicity, or other factors that result in an unfair advantage or disadvantage to specific groups
of applicants. Azure Machine Learning includes the capability to interpret models and
quantify the extent to which each feature of the data influences the model's prediction. This
capability helps data scientists and developers identify and mitigate bias in the model.
Another example is Microsoft's implementation of Responsible AI with the Face service,
which retires facial recognition capabilities that can be used to try to infer emotional states
and identity attributes. These capabilities, if misused, can subject people to stereotyping,
discrimination or unfair denial of services.

2. Reliability and safety


AI systems should perform reliably and safely. For example, consider an AI-based software
system for an autonomous vehicle; or a machine learning model that diagnoses patient
symptoms and recommends prescriptions. Unreliability in these kinds of systems can result in
substantial risk to human life. AI-based software applications must be subjected to rigorous
testing and deployment management processes to ensure that they work as expected before
release.

3. Privacy and security


AI systems should be secure and respect privacy. The machine learning models on which AI
systems are based rely on large volumes of data, which may contain personal details that
must be kept private. Even after the models are trained and the system is in production,
privacy and security need to be considered. As the system uses new data to make predictions
or take action, both the data and decisions made from the data may be subject to privacy or
security concerns.

4. Inclusiveness
AI systems should empower everyone and engage people. AI should bring benefits to all
parts of society, regardless of physical ability, gender, sexual orientation, ethnicity, or other
factors.

5. Transparency
AI systems should be understandable. Users should be made fully aware of the purpose of the
system, how it works, and what limitations may be expected.

6. Accountability
People should be accountable for AI systems. Designers and developers of AI-based
solutions should work within a framework of governance and organizational principles that
ensure the solution meets ethical and legal standards that are clearly defined.

1. Machine Learning in Microsoft Azure


• Machine learning is a technique that uses mathematics and statistics to create a model
that can predict unknown values.

• Mathematically, you can think of machine learning as a way of defining a function


(let's call it f) that operates on one or more features of something (which we'll call x)
to calculate a predicted label (y) - like this:
• f(x) = y
• In this bicycle rental example, the details about a given day (day of the week, weather,
and so on) are the features (x), the number of rentals for that day is the label (y), and
the function (f) that calculates the number of rentals based on the information about
the day is encapsulated in a machine learning model.

• The specific operation that the f function performs on x to calculate y depends on a
number of factors, including the type of model you're trying to create and the specific
algorithm used to train the model.
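A minimal sketch of this idea in Python, using a made-up linear relationship purely to illustrate how f maps features x to a predicted label y; in reality the function is learned from data, not hard-coded.

```python
# Illustrative only: a hand-written stand-in for a trained model's function f.
# Features x: (is_weekend, temperature_c); label y: predicted bicycle rentals.
def f(is_weekend: bool, temperature_c: float) -> float:
    base = 150.0                                     # typical weekday demand
    weekend_boost = 80.0 if is_weekend else 0.0      # more rentals on weekends
    weather_effect = 4.0 * (temperature_c - 10.0)    # warmer days, more rentals
    return base + weekend_boost + weather_effect

print(f(True, 22.0))   # predicted rentals for a warm weekend day -> 278.0
```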

Types of Machine Learning

• Supervised ML: Regression, Classification
• Unsupervised ML: Clustering

Supervised Machine Learning


• The supervised machine learning approach requires you to start with a
dataset with known label values. Two types of supervised machine learning tasks
include regression and classification.
• Regression: used to predict a continuous value; like a price, a sales total, or
some other measure.
• Classification: used to determine a binary class label; like whether a patient
has diabetes or not.

Unsupervised Machine Learning


• The unsupervised machine learning approach starts with a dataset without known
label values. One type of unsupervised machine learning task is clustering.
• Clustering: used to determine labels by grouping similar information into
label groups; like grouping measurements from birds into species.
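A minimal sketch of an unsupervised clustering task with scikit-learn; the two-feature measurements are illustrative stand-ins for the bird measurements mentioned above.

```python
from sklearn.cluster import KMeans

# Unlabelled measurements, e.g. [wing_span_cm, body_mass_g] for observed birds.
measurements = [[20, 150], [21, 160], [22, 155],
                [60, 900], [62, 950], [59, 920]]

# Group similar observations into clusters without any known labels.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(measurements)
print(kmeans.labels_)           # cluster assignment for each observation
print(kmeans.cluster_centers_)  # the "typical" member of each cluster
```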

Azure Machine Learning Studio
Azure Machine Learning is a cloud-based service that helps simplify some of the tasks it
takes to prepare data, train a model, and deploy a predictive service.

• Prepare data
• Train a model
• Review Performance
• Deploy a predictive service

Machine Learning workspace


• Resource group - A resource group is a container that holds related resources
for an Azure solution. The resource group can include all the resources
for the solution, or only those resources that you want to manage as a
group.
• Workspace - A workspace is used to manage data, compute resources, code,
models, and other artifacts related to your machine learning workloads.

What is Azure Machine Learning studio?


Training and deploying an effective machine learning model involves a lot of work, much of
it time-consuming and resource-intensive. Azure Machine Learning is a cloud-based service
that helps simplify some of the tasks it takes to prepare data, train a model, and deploy a
predictive service. Most importantly, Azure Machine Learning helps data scientists increase
their efficiency by automating many of the time-consuming tasks associated with training
models; and it enables them to use cloud-based compute resources that scale effectively to
handle large volumes of data while incurring costs only when actually used.

Azure Machine Learning workspace


To use Azure Machine Learning, you first create a workspace resource in your Azure
subscription. You can then use this workspace to manage data, compute resources, code,
models, and other artifacts related to your machine learning workloads. After you have
created an Azure Machine Learning workspace, you can develop solutions with the Azure
machine learning service either with developer tools or the Azure Machine Learning studio
web portal.

Azure Machine Learning studio

Azure Machine Learning studio is a web portal for machine learning solutions in Azure. It
includes a wide range of features and capabilities that help data scientists prepare data, train
models, publish predictive services, and monitor their usage. To begin using the web portal,
you need to assign the workspace you created in the Azure portal to Azure Machine Learning
studio.

Azure Machine Learning compute


At its core, Azure Machine Learning is a service for training and managing machine learning
models, for which you need compute on which to run the training process.

Compute targets are cloud-based resources on which you can run model training and data
exploration processes.

In Azure Machine Learning studio, you can manage the compute targets for your data science
activities. There are four kinds of compute resource you can create:

 Compute Instances: Development workstations that data scientists can use to work
with data and models.
 Compute Clusters: Scalable clusters of virtual machines for on-demand processing of
experiment code.
 Inference Clusters: Deployment targets for predictive services that use your trained
models.
 Attached Compute: Links to existing Azure compute resources, such as Virtual
Machines or Azure Databricks clusters.
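A minimal sketch of creating a compute cluster with the Azure ML Python SDK (v1-style azureml-core API); the workspace config file, cluster name, and VM size are illustrative assumptions, and the exact API may differ in newer SDK versions.

```python
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget

# Assumes a config.json downloaded from your workspace in the Azure portal.
ws = Workspace.from_config()

# Provision a scalable compute cluster for on-demand processing of experiment code.
config = AmlCompute.provisioning_configuration(
    vm_size="STANDARD_DS11_V2",  # illustrative VM size
    min_nodes=0,                 # scale to zero when idle to save cost
    max_nodes=2,
)
cluster = ComputeTarget.create(ws, name="cpu-cluster", provisioning_configuration=config)
cluster.wait_for_completion(show_output=True)
```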

What is Azure Automated Machine Learning?


Azure Machine Learning includes an automated machine learning capability that
automatically tries multiple pre-processing techniques and model-training algorithms in
parallel. These automated capabilities use the power of cloud compute to find the best
performing supervised machine learning model for your data.

Automated machine learning allows you to train models without extensive data science or
programming knowledge. For people with a data science and programming background, it
provides a way to save time and resources by automating algorithm selection and
hyperparameter tuning.

You can create an automated machine learning job in Azure Machine Learning studio.

In Azure Machine Learning, operations that you run are called jobs. You can configure
multiple settings for your job before starting an automated machine learning run. The run
configuration provides the information needed to specify your training script, compute target,
and Azure ML environment, and to run a training job.

Understand the AutoML process


You can think of the steps in a machine learning process as:

1. Prepare data: Identify the features and label in a dataset. Pre-process, or clean and
transform, the data as needed.

2. Train model: Split the data into two groups, a training and a validation set. Train a
machine learning model using the training data set. Test the machine learning model
for performance using the validation data set.
3. Evaluate performance: Compare how close the model's predictions are to the known
labels.
4. Deploy a predictive service: After you train a machine learning model, you can deploy
the model as an application on a server or device so that others can use it.

These are the same steps in the automated machine learning process with Azure Machine
Learning.
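A minimal sketch of the train/validation split from step 2, using scikit-learn with illustrative data.

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Illustrative features (e.g. [temperature_c, is_weekend]) and labels (rentals).
X = [[12, 0], [18, 0], [22, 1], [8, 0], [25, 1], [15, 0], [20, 1], [10, 0]]
y = [160, 190, 270, 140, 290, 175, 260, 150]

# Split the data into a training set and a validation (test) set.
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

model = LinearRegression().fit(X_train, y_train)   # train on the training set
predictions = model.predict(X_val)                 # test on the validation set
print(mean_absolute_error(y_val, predictions))     # evaluate performance
```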

Prepare data

Machine learning models must be trained with existing data. Data scientists expend a lot of
effort exploring and pre-processing data, and trying various types of model-training
algorithms to produce accurate models, which is time consuming, and often makes inefficient
use of expensive compute hardware.

In Azure Machine Learning, data for model training and other operations is usually
encapsulated in an object called a dataset. You can create your own dataset in Azure Machine
Learning studio.

Train model
The automated machine learning capability in Azure Machine Learning
supports supervised machine learning models - in other words, models for which the training
data includes known label values. You can use automated machine learning to train models
for:

 Classification (predicting categories or classes)

 Regression (predicting numeric values)
 Time series forecasting (predicting numeric values at a future point in time)

In Automated Machine Learning, you can select configurations for the primary metric, type
of model used for training, exit criteria, and concurrency limits.

Importantly, AutoML will split data into a training set and a validation set. You can configure
the details in the settings before you run the job.
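A minimal sketch of configuring such a run with the azureml-train-automl SDK; the dataset name, label column, compute name, and settings are illustrative assumptions, and configuring the same job in the studio UI is equally valid.

```python
from azureml.core import Workspace, Experiment, Dataset
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()
training_data = Dataset.get_by_name(ws, name="bike-rentals")  # assumed dataset name

automl_config = AutoMLConfig(
    task="regression",                                     # type of model to train
    primary_metric="normalized_root_mean_squared_error",   # metric used to pick the best model
    training_data=training_data,
    label_column_name="rentals",                           # assumed label column
    compute_target="cpu-cluster",                          # assumed compute cluster name
    experiment_timeout_minutes=30,                         # exit criteria
    max_concurrent_iterations=2,                           # concurrency limit
)

run = Experiment(ws, "automl-bike-rentals").submit(automl_config)
run.wait_for_completion(show_output=True)
```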

Evaluate performance
After the job has finished you can review the best performing model. In this case, you used
exit criteria to stop the job. Thus the "best" model the job generated might not be the best
possible model, just the best one found within the time allowed for this exercise.

The best model is identified based on the evaluation metric you specified, Normalized root
mean squared error.

A technique called cross-validation is used to calculate the evaluation metric. After the model
is trained using a portion of the data, the remaining portion is used to iteratively test, or cross-
validate, the trained model. The metric is calculated by comparing the predicted value from
the test with the actual known value, or label.

The difference between the predicted and actual value, known as the residuals, indicates the
amount of error in the model. The performance metric root mean squared error (RMSE), is
calculated by squaring the errors across all of the test cases, finding the mean of these
squares, and then taking the square root. What all of this means is that the smaller this value
is, the more accurate the model's predictions are. The normalized root mean squared
error (NRMSE) standardizes the RMSE metric so it can be used for comparison between
models which have variables on different scales.
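A minimal sketch of computing RMSE and a normalized variant with NumPy; the normalization convention shown (dividing by the range of the true values) is an assumption for illustration, and the prediction values are made up.

```python
import numpy as np

y_true = np.array([160, 190, 270, 140, 290])   # actual labels
y_pred = np.array([172, 185, 255, 150, 300])   # model predictions

residuals = y_pred - y_true                    # errors
rmse = np.sqrt(np.mean(residuals ** 2))        # root mean squared error
nrmse = rmse / (y_true.max() - y_true.min())   # normalized by the label range (assumed convention)

print(f"RMSE:  {rmse:.2f}")
print(f"NRMSE: {nrmse:.4f}")
```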

The Residual Histogram shows the frequency of residual value ranges. Residuals represent
variance between predicted and true values that can't be explained by the model, in other
words, errors. You should hope to see the most frequently occurring residual values clustered
around zero. You want small errors with fewer errors at the extreme ends of the scale.

The Predicted vs. True chart should show a diagonal trend in which the predicted value
correlates closely to the true value. The dotted line shows how a perfect model should
perform. The closer the line of your model's average predicted value is to the dotted line, the
better its performance. A histogram below the line chart shows the distribution of true values.

After you've used automated machine learning to train some models, you can deploy the best
performing model as a service for client applications to use.
Deploy a predictive service
In Azure Machine Learning, you can deploy a service as an Azure Container Instances (ACI)
or to an Azure Kubernetes Service (AKS) cluster. For production scenarios, an AKS
deployment is recommended, for which you must create an inference cluster compute target.
In this exercise, you'll use an ACI service, which is a suitable deployment target for testing,
and does not require you to create an inference cluster.

What is Azure Machine Learning designer?


In Azure Machine Learning studio, there are several ways to author regression machine
learning models. One way is to use a visual interface called designer that you can use to train,
test, and deploy machine learning models. The drag-and-drop interface makes use of clearly
defined inputs and outputs that can be shared, reused, and version controlled.

Each designer project, known as a pipeline, has a left panel for navigation and a canvas on the
right-hand side. To use designer, identify the building blocks, or components, needed for
your model, place and connect them on the canvas, and run a machine learning job.

Pipelines
Pipelines let you organize, manage, and reuse complex machine learning workflows across
projects and users. A pipeline starts with the dataset from which you want to train the model.
Each time you run a pipeline, the configuration of the pipeline and its results are stored in
your workspace as a pipeline job.

Components
An Azure Machine Learning component encapsulates one step in a machine learning
pipeline. You can think of a component as a programming function and as a building block
for Azure Machine Learning pipelines. In a pipeline project, you can access data assets and
components from the left panel's Asset Library tab.

Datasets
You can create data assets on the Data page from local files, a datastore, web files, and Open
Datasets. These data assets will appear along with standard sample datasets
in designer's Asset Library.

Azure Machine Learning Jobs
An Azure Machine Learning (ML) job executes a task against a specified compute target.
Jobs enable systematic tracking for your machine learning experimentation and workflows.
Once a job is created, Azure ML maintains a run record for the job. All of your jobs' run
records can be viewed in Azure ML studio.

In your designer project, you can access the status of a pipeline job using the Submitted
jobs tab on the left pane.

You can find all the jobs you have run in a workspace on the Jobs page.

Understand steps for regression


You can think of the steps to train and evaluate a regression machine learning model as:

1. Prepare data: Identify the features and label in a dataset. Pre-process, or clean and
transform, the data as needed.
2. Train model: Split the data into two groups, a training and a validation set. Train a
machine learning model using the training data set. Test the machine learning model
for performance using the validation data set.
3. Evaluate performance: Compare how close the model's predictions are to the known
labels.
4. Deploy a predictive service: After you train a machine learning model, you need to
convert the training pipeline into a real-time inference pipeline. Then you can deploy
the model as an application on a server or device so that others can use it.

Let's follow these four steps as they appear in Azure designer.

Prepare data
Azure machine learning designer has several pre-built components that can be used to
prepare data for training. These components enable you to clean data, normalize features, join
tables, and more.

Train model
To train a regression model, you need a dataset that includes historical features,
characteristics of the entity for which you want to make a prediction, and known label values.
The label is the quantity you want to train a model to predict.

It's common practice to train the model using a subset of the data, while holding back some
data with which to test the trained model. This enables you to compare the labels that the
model predicts with the actual known labels in the original dataset.

You will use designer's Score Model component to generate the predicted class label value.
Once you connect all the components, you will want to run an experiment, which will use the
data asset on the canvas to train and score a model.

Evaluate performance
After training a model, it is important to evaluate its performance. There are many
performance metrics and methodologies for evaluating how well a model makes predictions.
You can review evaluation metrics on the completed job page by right-clicking on
the Evaluate model component.

 Mean Absolute Error (MAE): The average difference between predicted values and
true values. This value is based on the same units as the label, in this case dollars. The
lower this value is, the better the model is predicting.
 Root Mean Squared Error (RMSE): The square root of the mean squared difference
between predicted and true values. The result is a metric based on the same unit as
the label (dollars). When compared to the MAE (above), a larger difference indicates
greater variance in the individual errors (for example, with some errors being very
small, while others are large).
 Relative Squared Error (RSE): A relative metric between 0 and 1 based on the square
of the differences between predicted and true values. The closer to 0 this metric is,
the better the model is performing. Because this metric is relative, it can be used to
compare models where the labels are in different units.
 Relative Absolute Error (RAE): A relative metric between 0 and 1 based on the
absolute differences between predicted and true values. The closer to 0 this metric is,
the better the model is performing. Like RSE, this metric can be used to compare
models where the labels are in different units.
 Coefficient of Determination (R2): This metric is more commonly referred to as R-
Squared, and summarizes how much of the variance between predicted and true
values is explained by the model. The closer to 1 this value is, the better the model is
performing.
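A minimal sketch of computing these metrics with NumPy for made-up prediction results; designer computes these for you automatically, so this is purely to show how each formula behaves.

```python
import numpy as np

y_true = np.array([310.0, 445.0, 120.0, 260.0, 380.0])  # actual label values (dollars)
y_pred = np.array([295.0, 460.0, 150.0, 240.0, 370.0])  # model predictions

errors = y_pred - y_true
mae = np.mean(np.abs(errors))                                           # Mean Absolute Error
rmse = np.sqrt(np.mean(errors ** 2))                                    # Root Mean Squared Error
rse = np.sum(errors ** 2) / np.sum((y_true - y_true.mean()) ** 2)       # Relative Squared Error
rae = np.sum(np.abs(errors)) / np.sum(np.abs(y_true - y_true.mean()))   # Relative Absolute Error
r2 = 1 - rse                                                            # Coefficient of Determination

print(f"MAE={mae:.2f} RMSE={rmse:.2f} RSE={rse:.3f} RAE={rae:.3f} R2={r2:.3f}")
```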

Deploy a predictive service


You have the ability to deploy a service that can be used in real-time. In order to automate
your model into a service that makes continuous predictions, you need to create and deploy
an inference pipeline.

Inference pipeline

To deploy your pipeline, you must first convert the training pipeline into a real-time inference
pipeline. This process removes training components and adds web service inputs and outputs
to handle requests.

The inference pipeline performs the same data transformations as the first pipeline
for new data. Then it uses the trained model to infer, or predict, label values based on its
features. This model will form the basis for a predictive service that you can publish for
applications to use.

You can create an inference pipeline by selecting the menu above a completed job.

Deployment

After creating the inference pipeline, you can deploy it as an endpoint. In the endpoints page,
you can view deployment details, test your pipeline service with sample data, and find
credentials to connect your pipeline service to a client application.

It will take a while for your endpoint to be deployed. The Deployment state on the Details tab
will indicate Healthy when deployment is successful.

On the Test tab, you can test your deployed service with sample data in a JSON format. The
test tab is a tool you can use to quickly check to see if your model is behaving as expected.
Typically it is helpful to test the service before connecting it to an application.

You can find credentials for your service on the Consume tab. These credentials are used to
connect your trained machine learning model as a service to a client application.
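A minimal sketch of how a client application might call the deployed service using the endpoint URL and key from the Consume tab; the endpoint, key, and input field names are placeholders that depend on your pipeline's web service input schema.

```python
import json
import requests

endpoint = "https://<your-endpoint>/score"   # REST endpoint from the Consume tab
key = "<your-key>"                           # primary key from the Consume tab

# Placeholder input; the actual field names must match your web service input schema.
payload = {"Inputs": {"WebServiceInput0": [{"temperature": 22, "is_weekend": 1}]}}

headers = {"Content-Type": "application/json", "Authorization": f"Bearer {key}"}
response = requests.post(endpoint, data=json.dumps(payload), headers=headers)
print(response.json())
```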

2. Computer Vision
Computer vision is one of the core areas of artificial intelligence (AI), and focuses on creating
solutions that enable AI applications to "see" the world and make sense of it.

Of course, computers don't have biological eyes that work the way ours do, but they are capable of
processing images; either from a live camera feed or from digital photographs or videos. This ability
to process images is the key to creating software that can emulate human visual perception.

Some potential uses for computer vision include:

 Content Organization: Identify people or objects in photos and organize them based on that
identification. Photo recognition applications like this are commonly used in photo storage
and social media applications.
 Text Extraction: Analyze images and PDF documents that contain text and extract the text into
a structured format.
 Spatial Analysis: Identify people or objects, such as cars, in a space and map their movement
within that space.

To an AI application, an image is just an array of pixel values. These numeric values can be used
as features to train machine learning models that make predictions about the image and its
contents.
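A minimal sketch of that idea: a tiny grayscale "image" represented as a NumPy array, flattened into a feature vector a model could consume; the pixel values are illustrative.

```python
import numpy as np

# A 4x4 grayscale image: each pixel is an intensity from 0 (black) to 255 (white).
image = np.array([
    [  0,  50,  50,   0],
    [ 50, 255, 255,  50],
    [ 50, 255, 255,  50],
    [  0,  50,  50,   0],
])

features = image.flatten()   # 16 numeric features a machine learning model can be trained on
print(features.shape)        # (16,)
```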

Azure resources for Computer Vision

To use the Computer Vision service, you need to create a resource for it in your Azure subscription.
You can use either of the following resource types:

 Computer Vision: A specific resource for the Computer Vision service. Use this resource type if
you don't intend to use any other cognitive services, or if you want to track utilization and
costs for your Computer Vision resource separately.
 Cognitive Services: A general cognitive services resource that includes Computer Vision along
with many other cognitive services; such as Text Analytics, Translator Text, and others. Use
this resource type if you plan to use multiple cognitive services and want to simplify
administration and development.

Whichever type of resource you choose to create, it will provide two pieces of information that you
will need to use it:

 A key that is used to authenticate client applications.


 An endpoint that provides the HTTP address at which your resource can be accessed.

Analyzing images with the Computer Vision service

After you've created a suitable resource in your subscription, you can submit images to the Computer
Vision service to perform a wide range of analytical tasks.

Describing an image

Computer Vision has the ability to analyze an image, evaluate the objects that are detected, and
generate a human-readable phrase or sentence that can describe what was detected in the image.
Depending on the image contents, the service may return multiple results, or phrases. Each returned
phrase will have an associated confidence score, indicating how confident the algorithm is in the
supplied description. The highest confidence phrases will be listed first.

To help you understand this concept, consider the following image of the Empire State building in
New York. The returned phrases are listed below the image in the order of confidence.

 A black and white photo of a city


 A black and white photo of a large city
 A large white building in a city
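A minimal sketch of requesting an image description over the REST interface; the endpoint, key, image URL, and API version path are placeholders/assumptions, and the response shape shown reflects the typical Describe output (the Python SDK's describe_image method offers the same capability).

```python
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"  # from your Computer Vision resource
key = "<your-key>"

describe_url = f"{endpoint}/vision/v3.2/describe"   # assumed API version path
headers = {"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"}
body = {"url": "https://example.com/empire-state.jpg"}  # placeholder image URL

response = requests.post(describe_url, headers=headers, params={"maxCandidates": 3}, json=body)
response.raise_for_status()

# Print each candidate phrase with its confidence score, highest confidence first.
for caption in response.json()["description"]["captions"]:
    print(f'{caption["text"]} (confidence: {caption["confidence"]:.2f})')
```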

Tagging visual features

The image descriptions generated by Computer Vision are based on a set of thousands of recognizable
objects, which can be used to suggest tags for the image. These tags can be associated with the image
as metadata that summarizes attributes of the image; and can be particularly useful if you want to
index an image along with a set of key terms that might be used to search for images with specific
attributes or contents.

For example, the tags returned for the Empire State building image include:

 skyscraper
 tower
 building

Detecting objects

The object detection capability is similar to tagging, in that the service can identify common objects;
but rather than tagging, or providing tags for the recognized objects only, this service can also return
what is known as bounding box coordinates. Not only will you get the type of object, but you will
also receive a set of coordinates that indicate the top, left, width, and height of the object detected,
which you can use to identify the location of the object in the image, like this:

Detecting brands

This feature provides the ability to identify commercial brands. The service has an existing database
of thousands of globally recognized logos from commercial brands of products.

When you call the service and pass it an image, it performs a detection task and determines if any of
the identified objects in the image are recognized brands. The service compares the brands against its
database of popular brands spanning clothing, consumer electronics, and many more categories. If a
known brand is detected, the service returns a response that contains the brand name, a confidence
score (from 0 to 1 indicating how positive the identification is), and a bounding box (coordinates) for
where in the image the detected brand was found.

For example, in the following image, a laptop has a Microsoft logo on its lid, which is identified and
located by the Computer Vision service.

Detecting faces

The Computer Vision service can detect and analyze human faces in an image, including the ability to
determine age and a bounding box rectangle for the location of the face(s). The facial analysis
capabilities of the Computer Vision service are a subset of those provided by the dedicated Face
Service. If you need basic face detection and analysis, combined with general image analysis
capabilities, you can use the Computer Vision service; but for more comprehensive facial analysis and
facial recognition functionality, use the Face service.

The following example shows an image of a person with their face detected and approximate age
estimated.

Categorizing an image

Computer Vision can categorize images based on their contents. The service uses a parent/child
hierarchy with a "current" limited set of categories. When analyzing an image, detected objects are
compared to the existing categories to determine the best way to provide the categorization. As an
example, one of the parent categories is people_. This image of a person on a roof is assigned a
category of people_.

A slightly different categorization is returned for the following image, which is assigned to the
category people_group because there are multiple people in the image:

Detecting domain-specific content

When categorizing an image, the Computer Vision service supports two specialized domain models:

 Celebrities - The service includes a model that has been trained to identify thousands of well-
known celebrities from the worlds of sports, entertainment, and business.
 Landmarks - The service can identify famous landmarks, such as the Taj Mahal and the Statue
of Liberty.

For example, when analyzing the following image for landmarks, the Computer Vision service
identifies the Eiffel Tower, with a confidence of 99.41%.

Optical character recognition

The Computer Vision service can use optical character recognition (OCR) capabilities to detect
printed and handwritten text in images. This capability is explored in the Read text with the Computer
Vision service module on Microsoft Learn.

Additional capabilities

In addition to these capabilities, the Computer Vision service can:

 Detect image types - for example, identifying clip art images or line drawings.
 Detect image color schemes - specifically, identifying the dominant foreground, background,
and overall colors in an image.
 Generate thumbnails - creating small versions of images.
 Moderate content - detecting images that contain adult content or depict violent, gory scenes.

Classify Images with Custom vision Service

Uses of image classification

Some potential uses for image classification include:

 Product identification: performing visual searches for specific products in online searches or
even, in-store using a mobile device.
 Disaster investigation: identifying key infrastructure for major disaster preparation efforts.
For example, identifying bridges and roads in aerial images can help disaster relief teams plan
ahead in regions that are not well mapped.
 Medical diagnosis: evaluating images from X-ray or MRI devices could quickly classify specific
issues found as cancerous tumors, or many other medical conditions related to medical
imaging diagnosis.

Understand image classification

Image classification is a machine learning technique in which the object being classified is an image,
such as a photograph.

To create an image classification model, you need data that consists of features and their labels. The
existing data is a set of categorized images. Digital images are made up of an array of pixel values,
and these are used as features to train the model based on the known image classes.

The model is trained to match the patterns in the pixel values to a set of class labels. After the model
has been trained, you can use it with new sets of features to predict unknown label values.

Azure's Custom Vision service

Most modern image classification solutions are based on deep learning techniques that make use
of convolutional neural networks (CNNs) to uncover patterns in the pixels that correspond to
particular classes. Training an effective CNN is a complex task that requires considerable expertise in
data science and machine learning.

Common techniques used to train image classification models have been encapsulated into
the Custom Vision cognitive service in Microsoft Azure; making it easy to train a model and publish it
as a software service with minimal knowledge of deep learning techniques. You can use the Custom
Vision cognitive service to train image classification models and deploy them as services for
applications to use.

Get started with image classification on Azure

You can perform image classification using the Custom Vision service, available as part of the Azure
Cognitive Services offerings. This is generally easier and quicker than writing your own model
training code, and enables people with little or no machine learning expertise to create an effective
image classification solution.

Azure resources for Custom Vision

Creating an image classification solution with Custom Vision consists of two main tasks. First you
must use existing images to train the model, and then you must publish the model so that client
applications can use it to generate predictions.

For each of these tasks, you need a resource in your Azure subscription. You can use the following
types of resource:

 Custom Vision: A dedicated resource for the Custom Vision service, which can be a training
resource, a prediction resource, or both.
 Cognitive Services: A general cognitive services resource that includes Custom Vision along
with many other cognitive services. You can use this type of resource for training, prediction,
or both.

The separation of training and prediction resources is useful when you want to track resource
utilization for model training separately from client applications using the model to predict image
classes. However, it can make development of an image classification solution a little confusing.

The simplest approach is to use a general Cognitive Services resource for both training and prediction.
This means you only need to concern yourself with one endpoint (the HTTP address at which your
service is hosted) and key (a secret value used by client applications to authenticate themselves).

If you choose to create a Custom Vision resource, you will be prompted to choose training,
prediction, or both - and it's important to note that if you choose "both", then two resources are
created - one for training and one for prediction.

It's also possible to take a mix-and-match approach in which you use a dedicated Custom Vision
resource for training, but deploy your model to a Cognitive Services resource for prediction. For this
to work, the training and prediction resources must be created in the same region.

Model training

To train a classification model, you must upload images to your training resource and label them with
the appropriate class labels. Then, you must train the model and evaluate the training results.

You can perform these tasks in the Custom Vision portal, or if you have the necessary coding
experience you can use one of the Custom Vision service programming language-specific software
development kits (SDKs).

One of the key considerations when using images for classification is to ensure that you have
sufficient images of the objects in question, and that those images show the object from many
different angles.

Model evaluation

Model training is an iterative process in which the Custom Vision service repeatedly trains the
model using some of the data, but holds some back to evaluate the model. At the end of the training
process, the performance of the trained model is indicated by the following evaluation metrics:

 Precision: What percentage of the class predictions made by the model were correct? For
example, if the model predicted that 10 images are oranges, of which eight were actually
oranges, then the precision is 0.8 (80%).

 Recall: What percentage of class predictions did the model correctly identify? For example, if
there are 10 images of apples, and the model found 7 of them, then the recall is 0.7 (70%).
 Average Precision (AP): An overall metric that takes into account both precision and recall.
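A minimal sketch of how the first two metrics are computed, using the counts from the examples above.

```python
def precision(true_positives: int, false_positives: int) -> float:
    # Of everything the model predicted as the class, how much was correct?
    return true_positives / (true_positives + false_positives)

def recall(true_positives: int, false_negatives: int) -> float:
    # Of everything that truly belongs to the class, how much did the model find?
    return true_positives / (true_positives + false_negatives)

# 10 images predicted as oranges, 8 actually oranges -> precision 0.8
print(precision(true_positives=8, false_positives=2))   # 0.8
# 10 actual apples, 7 found by the model -> recall 0.7
print(recall(true_positives=7, false_negatives=3))      # 0.7
```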

Using the model for prediction

After you've trained the model, and you're satisfied with its evaluated performance, you can publish
the model to your prediction resource. When you publish the model, you can assign it a name (the
default is "IterationX", where X is the number of times you have trained the model).

To use your model, client application developers need the following information:

 Project ID: The unique ID of the Custom Vision project you created to train the model.
 Model name: The name you assigned to the model during publishing.
 Prediction endpoint: The HTTP address of the endpoints for the prediction resource to which
you published the model (not the training resource).
 Prediction key: The authentication key for the prediction resource to which you published the
model (not the training resource).
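A minimal sketch of how a client application might use those four pieces of information with the Custom Vision Python SDK; all identifiers, names, and the image URL are placeholders, and the exact SDK method names may differ between package versions.

```python
from msrest.authentication import ApiKeyCredentials
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient

prediction_endpoint = "https://<your-prediction-resource>.cognitiveservices.azure.com/"
prediction_key = "<your-prediction-key>"
project_id = "<your-project-id>"
model_name = "<your-published-model-name>"   # e.g. "Iteration1"

credentials = ApiKeyCredentials(in_headers={"Prediction-key": prediction_key})
predictor = CustomVisionPredictionClient(endpoint=prediction_endpoint, credentials=credentials)

# Classify an image by URL and print the predicted tags with their probabilities.
results = predictor.classify_image_url(project_id, model_name, url="https://example.com/fruit.jpg")
for prediction in results.predictions:
    print(f"{prediction.tag_name}: {prediction.probability:.2%}")
```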

Object Detection
What is object detection?

Let's look at an example of object detection. Consider the following image:

An object detection model might be used to identify the individual objects in this image and return the
following information:

Notice that an object detection model returns the following information:

 The class of each object identified in the image.


 The probability score of the object classification (which you can interpret as the confidence of
the predicted class being correct)
 The coordinates of a bounding box for each object.

You can create an object detection machine learning model by using advanced deep learning
techniques. However, this approach requires significant expertise and a large volume of training data.
The Custom Vision cognitive service in Azure enables you to create object detection models that
meet the needs of many computer vision scenarios with minimal deep learning expertise and fewer
training images.

Azure resources for Custom Vision

Creating an object detection solution with Custom Vision consists of three main tasks. First you must
upload and tag images, then you can train the model, and finally you must publish the model so
that client applications can use it to generate predictions.

For each of these tasks, you need a resource in your Azure subscription. You can use the following
types of resource:

 Custom Vision: A dedicated resource for the Custom Vision service, which can be either a
training resource, a prediction resource, or both.
 Cognitive Services: A general cognitive services resource that includes Custom Vision along
with many other cognitive services. You can use this type of resource for training, prediction,
or both.

The separation of training and prediction resources is useful when you want to track resource
utilization for model training separately from client applications using the model to predict image
classes. However, it can make development of an object detection solution a little confusing.

The simplest approach is to use a general Cognitive Services resource for both training and prediction.
This means you only need to concern yourself with one endpoint (the HTTP address at which your
service is hosted) and key (a secret value used by client applications to authenticate themselves).

If you choose to create a Custom Vision resource, you will be prompted to choose training,
prediction, or both - and it's important to note that if you choose "both", then two resources are
created - one for training and one for prediction.

It's also possible to take a mix-and-match approach in which you use a dedicated Custom Vision
resource for training, but deploy your model to a Cognitive Services resource for prediction. For this
to work, the training and prediction resources must be created in the same region.

Image tagging

Before you can train an object detection model, you must tag the classes and bounding box
coordinates in a set of training images. This process can be time-consuming, but the Custom Vision
portal provides a graphical interface that makes it straightforward. The interface will automatically
suggest areas of the image where discrete objects are detected, and you can apply a class label to these
suggested bounding boxes or drag to adjust the bounding box area. Additionally, after tagging and
training with an initial dataset, the Computer Vision service can use smart tagging to suggest classes
and bounding boxes for images you add to the training dataset.

Key considerations when tagging training images for object detection are ensuring that you have
sufficient images of the objects in question, preferably from multiple angles; and making sure that the
bounding boxes are defined tightly around each object.

Model training and evaluation

To train the model, you can use the Custom Vision portal, or if you have the necessary coding
experience you can use one of the Custom Vision service programming language-specific software
development kits (SDKs). Training an object detection model can take some time, depending on the
number of training images, classes, and objects within each image.

Model training is an iterative process in which the Custom Vision service repeatedly trains the
model using some of the data, but holds some back to evaluate the model. At the end of the training
process, the performance of the trained model is indicated by the following evaluation metrics:

 Precision: What percentage of the class predictions made by the model were correct? For example,
if the model predicted that 10 images are oranges, of which eight were actually oranges, then
the precision is 0.8 (80%).
 Recall: What percentage of the actual class instances did the model correctly identify? For
example, if there are 10 images of apples, and the model found 7 of them, then the recall is
0.7 (70%).
 Mean Average Precision (mAP): An overall metric that takes into account both precision and
recall across all classes.

Using the model for prediction

After you've trained the model, and you're satisfied with its evaluated performance, you can publish
the model to your prediction resource. When you publish the model, you can assign it a name (the
default is "IterationX", where X is the number of times you have trained the model).

To use your model, client application developers need the following information:

 Project ID: The unique ID of the Custom Vision project you created to train the model.
 Model name: The name you assigned to the model during publishing.
 Prediction endpoint: The HTTP address of the endpoints for the prediction resource to which
you published the model (not the training resource).
 Prediction key: The authentication key for the prediction resource to which you published the
model (not the training resource).

Face Detection

Introduction

Face detection and analysis is an area of artificial intelligence (AI) in which we use algorithms to
locate and analyze human faces in images or video content.

Face detection

Face detection involves identifying regions of an image that contain a human face, typically by
returning bounding box coordinates that form a rectangle around the face, like this:

Facial analysis

Moving beyond simple face detection, some algorithms can also return other information, such as
facial landmarks (nose, eyes, eyebrows, lips, and others).

These facial landmarks can be used as features with which to train a machine learning model.

Facial recognition

A further application of facial analysis is to train a machine learning model to identify known
individuals from their facial features. This usage is more generally known as facial recognition, and
involves using multiple images of each person you want to recognize to train a model so that it can
detect those individuals in new images on which it wasn't trained.

Uses of face detection and analysis

There are many applications for face detection, analysis, and recognition. For example,

 Security - facial recognition can be used in building security applications, and increasingly it is
used in smartphone operating systems for unlocking devices.
 Social media - facial recognition can be used to automatically tag known friends in
photographs.
 Intelligent monitoring - for example, an automobile might include a system that monitors the
driver's face to determine if the driver is looking at the road, looking at a mobile device, or
shows signs of tiredness.
 Advertising - analyzing faces in an image can help direct advertisements to an appropriate
demographic audience.
 Missing persons - using public camera systems, facial recognition can be used to identify if a
missing person is in the image frame.
 Identity validation - useful at ports of entry kiosks where a person holds a special entry permit.

When used responsibly, facial recognition is an important and useful technology that can improve
efficiency, security, and customer experiences. Face is a building block for creating a facial
recognition system.

Microsoft Azure provides multiple cognitive services that you can use to detect and analyze faces,
including:

 Computer Vision, which offers face detection and some basic face analysis, such as returning
the bounding box coordinates for detected faces.
 Video Indexer, which you can use to detect and identify faces in a video.
 Face, which offers pre-built algorithms that can detect, recognize, and analyze faces.

Of these, Face offers the widest range of facial analysis capabilities.

Face

Face can return the rectangle coordinates for any human faces that are found in an image, as well as a
series of attributes related to those faces such as:

 Blur: how blurred the face is (which can be an indication of how likely the face is to be the
main focus of the image)
 Exposure: aspects such as underexposed or overexposed; this applies to the face in the image
and not the overall image exposure
 Glasses: if the person is wearing glasses
 Head pose: the face's orientation in a 3D space
 Noise: refers to visual noise in the image. If you have taken a photo with a high ISO setting for
darker settings, you would notice this noise in the image. The image looks grainy or full of tiny
dots that make the image less clear

 Occlusion: determines if there may be objects blocking the face in the image
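A minimal sketch of calling the Face REST API to detect faces and request some of these attributes; the endpoint, key, image URL, API version path, and attribute list are assumptions for illustration, and newer versions of the service restrict which attributes are available.

```python
import requests

endpoint = "https://<your-face-resource>.cognitiveservices.azure.com"
key = "<your-key>"

detect_url = f"{endpoint}/face/v1.0/detect"   # assumed API version path
headers = {"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"}
params = {
    # Assumed attribute names; availability depends on your resource and API version.
    "returnFaceAttributes": "blur,exposure,glasses,headPose,noise,occlusion",
}
body = {"url": "https://example.com/people.jpg"}  # placeholder image URL

response = requests.post(detect_url, headers=headers, params=params, json=body)
response.raise_for_status()

# Each detected face includes a bounding rectangle and the requested attributes.
for face in response.json():
    print(face["faceRectangle"], face.get("faceAttributes"))
```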

Azure resources for Face

To use Face, you must create one of the following types of resource in your Azure subscription:

 Face: Use this specific resource type if you don't intend to use any other cognitive services, or
if you want to track utilization and costs for Face separately.
 Cognitive Services: A general cognitive services resource that includes Computer Vision along
with many other cognitive services; such as Custom Vision, Form Recognizer, Language, and
others. Use this resource type if you plan to use multiple cognitive services and want to
simplify administration and development.

Whichever type of resource you choose to create, it will provide two pieces of information that you
will need to use it:

 A key that is used to authenticate client applications.


 An endpoint that provides the HTTP address at which your resource can be accessed.
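
For example, a client application can call the Face detect operation over REST using this key and endpoint. The following Python sketch is illustrative only; the placeholder key, endpoint, and image URL are assumptions you would replace with your own values, and the exact attribute list depends on the API version you use.

import requests

# Assumed placeholders - substitute the key and endpoint from your own Face (or Cognitive Services) resource
FACE_KEY = "<your-face-resource-key>"
FACE_ENDPOINT = "https://<your-resource-name>.cognitiveservices.azure.com"

def detect_faces(image_url):
    # Call the Face detect operation, asking for a selection of face attributes
    response = requests.post(
        f"{FACE_ENDPOINT}/face/v1.0/detect",
        headers={"Ocp-Apim-Subscription-Key": FACE_KEY},
        params={"returnFaceAttributes": "blur,exposure,glasses,headPose,noise,occlusion"},
        json={"url": image_url},
    )
    response.raise_for_status()
    # Each detected face includes a faceRectangle and the requested faceAttributes
    return response.json()

for face in detect_faces("https://example.com/photo.jpg"):
    print(face["faceRectangle"], face["faceAttributes"]["glasses"])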

Read text with the Computer Vision service


Introduction

The ability for computer systems to process written or printed text is an area of artificial intelligence
(AI) where computer vision intersects with natural language processing. You need computer vision
capabilities to "read" the text, and then you need natural language processing capabilities to make
sense of it.

The basic foundation of processing printed text is optical character recognition (OCR), in which a
model can be trained to recognize individual shapes as letters, numerals, punctuation, or other
elements of text. Much of the early work on implementing this kind of capability was performed by
postal services to support automatic sorting of mail based on postal codes. Since then, the state-of-the-
art for reading text has moved on, and it's now possible to build models that can detect printed or
handwritten text in an image and read it line-by-line or even word-by-word.

In this module, we'll focus on the use of OCR technologies to detect text in images and convert it into
a text-based data format, which can then be stored, printed, or used as the input for further processing
or analysis.

Uses of OCR

The ability to recognize printed and handwritten text in images is beneficial in many scenarios such
as:

 note taking
 digitizing forms, such as medical records or historical documents
 scanning printed or handwritten checks for bank deposits

Get started with the Read API on Azure

The ability to extract text from images is handled by the Computer Vision service, which also
provides image analysis capabilities.

Azure resources for Computer Vision

The first step towards using the Computer Vision service is to create a resource for it in your Azure
subscription. You can use either of the following resource types:

 Computer Vision: A specific resource for the Computer Vision service. Use this resource type if
you don't intend to use any other cognitive services, or if you want to track utilization and
costs for your Computer Vision resource separately.
 Cognitive Services: A general cognitive services resource that includes Computer Vision along
with many other cognitive services; such as Text Analytics, Translator Text, and others. Use
this resource type if you plan to use multiple cognitive services and want to simplify
administration and development.

Whichever type of resource you choose to create, it will provide two pieces of information that you
will need to use it:

 A key that is used to authenticate client applications.


 An endpoint that provides the HTTP address at which your resource can be accessed.
Note

If you create a Cognitive Services resource, client applications use the same key and endpoint
regardless of the specific service they are using.

Use the Computer Vision service to read text

Many times an image contains text, which can be typewritten or handwritten. Some common examples are images of road signs, scanned documents saved in an image format such as JPEG or PNG, or even just a picture taken of a whiteboard that was used during a meeting.

The Computer Vision service provides one application programming interface (API) that you can use
to read text in images: the Read API.

The Read API

The Read API uses the latest recognition models and is optimized for images that have a significant
amount of text or have considerable visual noise.

The Read API can handle scanned documents that have a lot of text. It can also automatically determine the proper recognition model to use, taking lines of text into consideration and supporting images with printed text as well as recognizing handwriting.

Because the Read API can work with large documents, it works asynchronously so as not to block
your application while it is reading the content and returning results to your application. This means
that to use the Read API, your application must use a three-step process:

1. Submit an image to the API, and retrieve an operation ID in response.
2. Use the operation ID to check on the status of the image analysis operation, and wait until it
has completed.
3. Retrieve the results of the operation.

The results from the Read API are arranged into the following hierarchy:

 Pages - One for each page of text, including information about the page size and orientation.
 Lines - The lines of text on a page.
 Words - The words in a line of text, including the bounding box coordinates and text itself.

Each line and word includes bounding box coordinates indicating its position on the page.
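
A minimal Python sketch of this three-step process is shown below, using the REST interface of the Read API. The key, endpoint, and image URL are assumed placeholders, and the API version shown (v3.2) may differ from the one available in your subscription.

import time
import requests

# Assumed placeholders - use the key and endpoint from your Computer Vision (or Cognitive Services) resource
CV_KEY = "<your-computer-vision-key>"
CV_ENDPOINT = "https://<your-resource-name>.cognitiveservices.azure.com"

def read_text(image_url):
    # Step 1: submit the image; the operation URL is returned in a response header
    submit = requests.post(
        f"{CV_ENDPOINT}/vision/v3.2/read/analyze",
        headers={"Ocp-Apim-Subscription-Key": CV_KEY},
        json={"url": image_url},
    )
    submit.raise_for_status()
    operation_url = submit.headers["Operation-Location"]

    # Step 2: poll the operation until the analysis has completed
    while True:
        result = requests.get(operation_url, headers={"Ocp-Apim-Subscription-Key": CV_KEY}).json()
        if result["status"] in ("succeeded", "failed"):
            break
        time.sleep(1)

    # Step 3: walk the pages > lines > words hierarchy of the results
    for page in result["analyzeResult"]["readResults"]:
        for line in page["lines"]:
            print(line["text"])

read_text("https://example.com/scanned-letter.jpg")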

3. Natural Language Processing


Analysing text with the Language service

Introduction

Analyzing text is a process where you evaluate different aspects of a document or phrase, in order to
gain insights into the content of that text. For the most part, humans are able to read some text and
understand the meaning behind it. Even without considering grammar rules for the language the text
is written in, specific insights can be identified in the text.

As an example, you might read some text and identify some key phrases that indicate the main talking
points of the text. You might also recognize names of people or well-known landmarks such as the
Eiffel Tower. Although difficult at times, you might also be able to get a sense for how the person was
feeling when they wrote the text, also commonly known as sentiment.

Text Analytics Techniques

Text analytics is a process where an artificial intelligence (AI) algorithm, running on a computer,
evaluates these same attributes in text, to determine specific insights. A person will typically rely on
their own experiences and knowledge to achieve the insights. A computer must be provided with
similar knowledge to be able to perform the task. There are some commonly used techniques that can
be used to build software to analyze text, including:

 Statistical analysis of terms used in the text. For example, removing common "stop words"
(words like "the" or "a", which reveal little semantic information about the text), and
performing frequency analysis of the remaining words (counting how often each word
appears) can provide clues about the main subject of the text.
 Extending frequency analysis to multi-term phrases, commonly known as N-grams (a two-
word phrase is a bi-gram, a three-word phrase is a tri-gram, and so on).
 Applying stemming or lemmatization algorithms to normalize words before counting them -
for example, so that words like "power", "powered", and "powerful" are interpreted as being
the same word.
 Applying linguistic structure rules to analyze sentences - for example, breaking down
sentences into tree-like structures such as a noun phrase, which itself
contains nouns, verbs, adjectives, and so on.
 Encoding words or terms as numeric features that can be used to train a machine learning
model. For example, to classify a text document based on the terms it contains. This technique

is often used to perform sentiment analysis, in which a document is classified as positive or
negative.
 Creating vectorized models that capture semantic relationships between words by assigning
them to locations in n-dimensional space. This modeling technique might, for example, assign
values to the words "flower" and "plant" that locate them close to one another, while
"skateboard" might be given a value that positions it much further away.

While these techniques can be used to great effect, programming them can be complex. In Microsoft
Azure, the Language cognitive service can help simplify application development by using pre-
trained models that can:

 Determine the language of a document or text (for example, French or English).


 Perform sentiment analysis on text to determine a positive or negative sentiment.
 Extract key phrases from text that might indicate its main talking points.
 Identify and categorize entities in the text. Entities can be people, places, organizations, or
even everyday items such as dates, times, quantities, and so on.

The Language service is a part of the Azure Cognitive Services offerings that can perform advanced
natural language processing over raw text.

Azure resources for the Language service

To use the Language service in an application, you must provision an appropriate resource in your
Azure subscription. You can choose to provision either of the following types of resource:

 A Language resource - choose this resource type if you only plan to use natural language
processing services, or if you want to manage access and billing for the resource separately
from other services.
 A Cognitive Services resource - choose this resource type if you plan to use the Language
service in combination with other cognitive services, and you want to manage access and
billing for these services together.

Language detection

Use the language detection capability of the Language service to identify the language in which text is
written. You can submit multiple documents at a time for analysis. For each document submitted to it,
the service will detect:

 The language name (for example "English").


 The ISO 639-1 language code (for example, "en").
 A score indicating a level of confidence in the language detection.

For example, consider a scenario where you own and operate a restaurant where customers can
complete surveys and provide feedback on the food, the service, staff, and so on. Suppose you have
received the following reviews from customers:

Review 1: "A fantastic place for lunch. The soup was delicious."

Review 2: "Comida maravillosa y gran servicio."

Review 3: "The croque monsieur avec frites was terrific. Bon appetit!"

You can use the text analytics capabilities in the Language service to detect the language for each of
these reviews; and it might respond with the following results:

Document    Language Name    ISO 639-1 Code
Review 1    English          en
Review 2    Spanish          es
Review 3    English          en

Notice that the language detected for review 3 is English, despite the text containing a mix of English and French. The language detection service focuses on the predominant language in the text, which it determines from factors such as the length of phrases and the total amount of text in each language. The predominant language is the value returned, along with its language code. The confidence score may be less than 1 as a result of the mixed-language text.
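
The following Python sketch shows how these reviews might be analyzed with the Language service's client library (azure-ai-textanalytics); the endpoint and key are assumed placeholders, and property names can vary between library versions.

from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

# Assumed placeholders - use the key and endpoint from your Language (or Cognitive Services) resource
client = TextAnalyticsClient(
    endpoint="https://<your-resource-name>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-language-key>"),
)

reviews = [
    "A fantastic place for lunch. The soup was delicious.",
    "Comida maravillosa y gran servicio.",
    "The croque monsieur avec frites was terrific. Bon appetit!",
]

# Each result reports the predominant language, its ISO 639-1 code, and a confidence score
for review, result in zip(reviews, client.detect_language(reviews)):
    lang = result.primary_language
    print(lang.name, lang.iso6391_name, lang.confidence_score, "-", review)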

Ambiguous or mixed language content

There may be text that is ambiguous in nature, or that has mixed language content. These situations
can present a challenge to the service. An ambiguous content example would be a case where the
document contains limited text, or only punctuation. For example, using the service to analyze the text
":-)", results in a value of unknown for the language name and the language identifier, and a score
of NaN (which is used to indicate not a number).

Sentiment analysis

The text analytics capabilities in the Language service can evaluate text and return sentiment scores
and labels for each sentence. This capability is useful for detecting positive and negative sentiment in
social media, customer reviews, discussion forums and more.

Using the pre-built machine learning classification model, the service evaluates the text and returns a
sentiment score in the range of 0 to 1, with values closer to 1 being a positive sentiment. Scores that
are close to the middle of the range (0.5) are considered neutral or indeterminate.

For example, the following two restaurant reviews could be analyzed for sentiment:

"We had dinner at this restaurant last night and the first thing I noticed was how courteous the staff
was. We were greeted in a friendly manner and taken to our table right away. The table was clean,
the chairs were comfortable, and the food was amazing."

and

"Our dining experience at this restaurant was one of the worst I've ever had. The service was slow,
and the food was awful. I'll never eat at this establishment again."

The sentiment score for the first review might be around 0.9, indicating a positive sentiment; while
the score for the second review might be closer to 0.1, indicating a negative sentiment.
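
A short sketch of sentiment analysis with the same client library is shown below. Note that newer versions of the library return a sentiment label (positive, neutral, negative, or mixed) with per-class confidence scores rather than the single 0-1 score described above; the endpoint and key remain assumed placeholders.

from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-resource-name>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-language-key>"),
)

reviews = [
    "We were greeted in a friendly manner and the food was amazing.",
    "The service was slow, and the food was awful.",
]

for doc in client.analyze_sentiment(reviews):
    # e.g. "positive" with a high positive confidence for the first review
    print(doc.sentiment, doc.confidence_scores.positive, doc.confidence_scores.negative)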

Indeterminate sentiment

A score of 0.5 might indicate that the sentiment of the text is indeterminate, and could result from text that does not have sufficient context or phrasing to discern a sentiment. For example, a list of words in a sentence that has no structure could result in an indeterminate score. Another example
where a score may be 0.5 is in the case where the wrong language code was used. A language code
(such as "en" for English, or "fr" for French) is used to inform the service which language the text is
in. If you pass text in French but tell the service the language code is en for English, the service will
return a score of precisely 0.5.

Key phrase extraction

Key phrase extraction is the concept of evaluating the text of a document, or documents, and then
identifying the main talking points of the document(s). Consider the restaurant scenario discussed
previously. Depending on the volume of surveys that you have collected, it can take a long time to
read through the reviews. Instead, you can use the key phrase extraction capabilities of the Language
service to summarize the main points.

You might receive a review such as:

"We had dinner here for a birthday celebration and had a fantastic experience. We were greeted by a
friendly hostess and taken to our table right away. The ambiance was relaxed, the food was amazing,
and service was terrific. If you like great food and attentive service, you should try this place."

Key phrase extraction can provide some context to this review by extracting the following phrases:

 attentive service
 great food
 birthday celebration
 fantastic experience
 table
 friendly hostess
 dinner
 ambiance
 place

Not only can you use sentiment analysis to determine that this review is positive, you can use the key
phrases to identify important elements of the review.
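
Using the same client library, key phrase extraction might look like the sketch below (the endpoint and key are assumed placeholders):

from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-resource-name>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-language-key>"),
)

review = (
    "We had dinner here for a birthday celebration and had a fantastic experience. "
    "We were greeted by a friendly hostess and taken to our table right away."
)

# Returns the main talking points of the document as a list of phrases
result = client.extract_key_phrases([review])[0]
print(result.key_phrases)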

Entity recognition

You can provide the Language service with unstructured text and it will return a list of entities in the
text that it recognizes. The service can also provide links to more information about that entity on the
web. An entity is essentially an item of a particular type or category and, in some cases, subtype, such as those shown in the following table.

Type                    SubType        Example
Person                                 "Bill Gates", "John"
Location                               "Paris", "New York"
Organization                           "Microsoft"
Quantity                Number         "6" or "six"
Quantity                Percentage     "25%" or "fifty percent"
Quantity                Ordinal        "1st" or "first"
Quantity                Age            "90 day old" or "30 years old"
Quantity                Currency       "10.99"
Quantity                Dimension      "10 miles", "40 cm"
Quantity                Temperature    "45 degrees"
DateTime                               "6:30PM February 4, 2012"
DateTime                Date           "May 2nd, 2017" or "05/02/2017"
DateTime                Time           "8am" or "8:00"
DateTime                DateRange      "May 2nd to May 5th"
DateTime                TimeRange      "6pm to 7pm"
DateTime                Duration       "1 minute and 45 seconds"
DateTime                Set            "every Tuesday"
URL                                    "https://www.bing.com"
Email                                  "support@microsoft.com"
US-based Phone Number                  "(312) 555-0176"
IP Address                             "10.0.1.125"
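
A hedged sketch of entity recognition with the same client library is shown below (assumed placeholders for the endpoint and key; the sample sentence is invented for the example):

from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-resource-name>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<your-language-key>"),
)

document = "Bill Gates founded Microsoft in Albuquerque on April 4, 1975."

result = client.recognize_entities([document])[0]
for entity in result.entities:
    # e.g. "Bill Gates" as Person, "Microsoft" as Organization, "April 4, 1975" as DateTime/Date
    print(entity.text, entity.category, entity.subcategory, entity.confidence_score)

# recognize_linked_entities() can additionally return web links for recognized entities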

Speech Recognition

Introduction
Increasingly, we expect artificial intelligence (AI) solutions to accept vocal commands and provide
spoken responses. Consider the growing number of home and auto systems that you can control by
speaking to them - issuing commands such as "turn off the lights", and soliciting verbal answers to
questions such as "will it rain today?"

To enable this kind of interaction, the AI system must support two capabilities:

 Speech recognition - the ability to detect and interpret spoken input.


 Speech synthesis - the ability to generate spoken output.

Speech recognition

Speech recognition is concerned with taking the spoken word and converting it into data that can be
processed - often by transcribing it into a text representation. The spoken words can be in the form of
a recorded voice in an audio file, or live audio from a microphone. Speech patterns are analyzed in the
audio to determine recognizable patterns that are mapped to words. To accomplish this feat, the
software typically uses multiple types of models, including:

 An acoustic model that converts the audio signal into phonemes (representations of specific
sounds).
 A language model that maps phonemes to words, usually using a statistical algorithm that
predicts the most probable sequence of words based on the phonemes.

The recognized words are typically converted to text, which you can use for various purposes, such as:

 Providing closed captions for recorded or live videos


 Creating a transcript of a phone call or meeting
 Automated note dictation
 Determining intended user input for further processing

Speech synthesis

Speech synthesis is in many respects the reverse of speech recognition. It is concerned with vocalizing
data, usually by converting text to speech. A speech synthesis solution typically requires the following
information:

 The text to be spoken.


 The voice to be used to vocalize the speech.

To synthesize speech, the system typically tokenizes the text to break it down into individual words,
and assigns phonetic sounds to each word. It then breaks the phonetic transcription into prosodic units
(such as phrases, clauses, or sentences) to create phonemes that will be converted to audio format.
These phonemes are then synthesized as audio by applying a voice, which will determine parameters
such as pitch and timbre; and generating an audio wave form that can be output to a speaker or written
to a file.

You can use the output of speech synthesis for many purposes, including:

 Generating spoken responses to user input.


 Creating voice menus for telephone systems.
 Reading email or text messages aloud in hands-free scenarios.
 Broadcasting announcements in public locations, such as railway stations or airports.

Get started with speech on Azure

Microsoft Azure offers both speech recognition and speech synthesis capabilities through
the Speech cognitive service, which includes the following application programming interfaces
(APIs):

 The Speech-to-Text API


 The Text-to-Speech API

Azure resources for the Speech service

To use the Speech service in an application, you must create an appropriate resource in your Azure
subscription. You can choose to create either of the following types of resource:

 A Speech resource - choose this resource type if you only plan to use the Speech service, or if
you want to manage access and billing for the resource separately from other services.
 A Cognitive Services resource - choose this resource type if you plan to use the Speech service
in combination with other cognitive services, and you want to manage access and billing for
these services together.

The speech-to-text API

You can use the speech-to-text API to perform real-time or batch transcription of audio into a text
format. The audio source for transcription can be a real-time audio stream from a microphone or an
audio file.

The model that is used by the speech-to-text API is based on the Universal Language Model that was trained by Microsoft. The data for the model is Microsoft-owned and deployed to Microsoft Azure.
The model is optimized for two scenarios, conversational and dictation. You can also create and train
your own custom models including acoustics, language, and pronunciation if the pre-built models
from Microsoft do not provide what you need.

Real-time transcription

Real-time speech-to-text allows you to transcribe text in audio streams. You can use real-time
transcription for presentations, demos, or any other scenario where a person is speaking.

In order for real-time transcription to work, your application will need to be listening for incoming
audio from a microphone, or other audio input source such as an audio file. Your application code
streams the audio to the service, which returns the transcribed text.
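
A minimal real-time transcription sketch using the Speech SDK for Python (azure-cognitiveservices-speech) is shown below; the key and region are assumed placeholders, and the call listens for a single utterance from the default microphone.

import azure.cognitiveservices.speech as speechsdk

# Assumed placeholders - use the key and region from your Speech (or Cognitive Services) resource
speech_config = speechsdk.SpeechConfig(subscription="<your-speech-key>", region="<your-region>")
audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)

recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

# recognize_once() listens for a single utterance and returns the transcription
result = recognizer.recognize_once()
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print(result.text)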

Batch transcription

Not all speech-to-text scenarios are real time. You may have audio recordings stored on a file share, a
remote server, or even on Azure storage. You can point to audio files with a shared access signature
(SAS) URI and asynchronously receive transcription results.

Batch transcription should be run in an asynchronous manner because the batch jobs are scheduled on
a best-effort basis. Normally a job will start executing within minutes of the request but there is no
estimate for when a job changes into the running state.

The text-to-speech API

The text-to-speech API enables you to convert text input to audible speech, which can either be
played directly through a computer speaker or written to an audio file.

Speech synthesis voices

When you use the text-to-speech API, you can specify the voice to be used to vocalize the text. This
capability offers you the flexibility to personalize your speech synthesis solution and give it a specific
character.

The service includes multiple pre-defined voices with support for multiple languages and regional pronunciation, including standard voices as well as neural voices that leverage neural networks to overcome common limitations in speech synthesis with regard to intonation, resulting in a more natural-sounding voice. You can also develop custom voices and use them with the text-to-speech API.
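
A short text-to-speech sketch with the same Speech SDK is shown below; the key, region, and voice name are assumptions you would replace after checking the list of available voices.

import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(subscription="<your-speech-key>", region="<your-region>")
# Assumed voice name - pick any pre-defined standard or neural voice from the service's voice list
speech_config.speech_synthesis_voice_name = "en-US-JennyNeural"

# With no audio configuration, output is played through the default speaker
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
result = synthesizer.speak_text_async("Your order is ready for collection.").get()
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized successfully.")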

Translate text and speech

As organizations and individuals increasingly need to collaborate with people in other cultures and geographic locations, the removal of language barriers has become a significant problem.

One solution is to find bilingual, or even multilingual, people to translate between languages. However, the scarcity of such skills, and the number of possible language combinations, can make this approach difficult to scale. Increasingly, automated translation, sometimes known as machine translation, is being employed to solve this problem.

Literal and semantic translation

Early attempts at machine translation applied literal translations. A literal translation is where each
word is translated to the corresponding word in the target language. This approach presents some
issues. In some cases, there may not be an equivalent word in the target language. In other cases, literal translation can change the meaning of the phrase or fail to capture the correct context.

For example, the French phrase "éteindre la lumière" can be translated to English as "turn off the light". However, in French you might also say "fermer la lumière" to mean the same thing. The French verb fermer literally means "to close", so a literal translation based only on the words would produce the English phrase "close the light", which doesn't really make sense to the average English speaker. To be useful, a translation service should take the semantic context into account and return the English translation "turn off the light".

Artificial intelligence systems must be able to understand, not only the words, but also
the semantic context in which they are used. In this way, the service can return a more accurate
translation of the input phrase or phrases. The grammar rules, formal versus informal, and
colloquialisms all need to be considered.

Text and speech translation

Text translation can be used to translate documents from one language to another, translate email communications that come from foreign governments, and even provide the ability to translate web pages on the Internet. Many times you will see a Translate option for posts on social media sites, and the Bing search engine can offer to translate entire web pages that are returned in search results.

Speech translation is used to translate between spoken languages, sometimes directly (speech-to-
speech translation) and sometimes by translating to an intermediary text format (speech-to-text
translation).

Microsoft Azure provides cognitive services that support translation. Specifically, you can
use the following services:

 The Translator service, which supports text-to-text translation.


 The Speech service, which enables speech-to-text and speech-to-speech translation.

Azure resources for Translator and Speech


Before you can use the Translator or Speech services, you must provision appropriate
resources in your Azure subscription.

There are dedicated Translator and Speech resource types for these services, which you can
use if you want to manage access and billing for each service individually.

Alternatively, you can create a Cognitive Services resource that provides access to both
services through a single Azure resource, consolidating billing and enabling applications to
access both services through a single endpoint and authentication key.

Text translation with the Translator service


The Translator service is easy to integrate in your applications, websites, tools, and solutions.
The service uses a Neural Machine Translation (NMT) model for translation, which analyzes
the semantic context of the text and renders a more accurate and complete translation as a
result.

Translator service language support

The Translator service supports text-to-text translation between more than 60 languages.
When using the service, you must specify the language you are translating from and the
language you are translating to using ISO 639-1 language codes, such as en for English, fr for
French, and zh for Chinese. Alternatively, you can specify cultural variants of languages by extending the language code with the appropriate ISO 3166-1 culture code - for example, en-US for US English, en-GB for British English, or fr-CA for Canadian French.

When using the Translator service, you can specify one from language with
multiple to languages, enabling you to simultaneously translate a source document into
multiple languages.
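
For example, the Translator REST API can translate one source text into several target languages in a single call. The sketch below is illustrative; the key, region, and global endpoint shown are assumptions to be checked against your own resource.

import uuid
import requests

# Assumed placeholders - key and region from your Translator (or Cognitive Services) resource
TRANSLATOR_KEY = "<your-translator-key>"
TRANSLATOR_REGION = "<your-resource-region>"

response = requests.post(
    "https://api.cognitive.microsofttranslator.com/translate",
    params={"api-version": "3.0", "from": "en", "to": ["fr", "es"]},  # one from, multiple to
    headers={
        "Ocp-Apim-Subscription-Key": TRANSLATOR_KEY,
        "Ocp-Apim-Subscription-Region": TRANSLATOR_REGION,
        "X-ClientTraceId": str(uuid.uuid4()),
    },
    json=[{"text": "Turn off the light"}],
)
response.raise_for_status()
for translation in response.json()[0]["translations"]:
    print(translation["to"], ":", translation["text"])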

Optional Configurations

The Translator API offers some optional configuration to help you fine-tune the results that
are returned, including:

 Profanity filtering. Without any configuration, the service will translate the input text,
without filtering out profanity. Profanity levels are typically culture-specific but you
can control profanity translation by either marking the translated text as profane or by
omitting it in the results.
 Selective translation. You can tag content so that it isn't translated. For example, you
may want to tag code, a brand name, or a word/phrase that doesn't make sense when
localized.

Speech translation with the Speech service


The Speech service includes the following application programming interfaces (APIs):

 Speech-to-text - used to transcribe speech from an audio source to text format.


 Text-to-speech - used to generate spoken audio from a text source.
 Speech Translation - used to translate speech in one language to text or speech in
another.

You can use the Speech Translation API to translate spoken audio from a streaming source,
such as a microphone or audio file, and return the translation as text or an audio stream. This
enables scenarios such as real-time closed captioning for a speech or simultaneous two-way
translation of a spoken conversation.

Speech service language support

As with the Translator service, you can specify one source language and one or more target
languages to which the source should be translated. You can translate speech into over 60
languages.

The source language must be specified using the extended language and culture code format,
such as es-US for American Spanish. This requirement helps ensure that the source is
understood properly, allowing for localized pronunciation and linguistic idioms.

The target languages must be specified using a two-character language code, such as en for
English or de for German.
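
The sketch below shows one way to perform speech translation with the Speech SDK, specifying an extended source language code and two-character target codes; the key and region are assumed placeholders.

import azure.cognitiveservices.speech as speechsdk

translation_config = speechsdk.translation.SpeechTranslationConfig(
    subscription="<your-speech-key>", region="<your-region>"
)
translation_config.speech_recognition_language = "es-US"   # source: extended language-culture code
translation_config.add_target_language("en")               # targets: two-character language codes
translation_config.add_target_language("de")

recognizer = speechsdk.translation.TranslationRecognizer(translation_config=translation_config)

# Listen for a single utterance from the default microphone and print its translations
result = recognizer.recognize_once()
if result.reason == speechsdk.ResultReason.TranslatedSpeech:
    print("Recognized:", result.text)
    for language, translation in result.translations.items():
        print(language, ":", translation)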

In today's connected world, people use a variety of technologies to communicate. For example:

 Voice calls
 Messaging services
 Online chat applications
 Email
 Social media platforms
 Collaborative workplace tools

We've become so used to ubiquitous connectivity, that we expect the organizations we deal
with to be easily contactable and immediately responsive through the channels we already
use. Additionally, we expect these organizations to engage with us individually, and be able
to answer complex questions at a personal level.

Conversational AI
Many organizations publish support information and answers to frequently asked questions (FAQs) that can be accessed through a web browser or dedicated app. However, the complexity of the systems and services they offer means that answers to specific questions are hard to find. Often, these organizations find their support personnel being overloaded
with requests for help through phone calls, email, text messages, social media, and other
channels.

Increasingly, organizations are turning to artificial intelligence (AI) solutions that make use of AI agents, commonly known as bots, to provide a first line of automated support through the full range of channels that we use to communicate. Bots are designed to interact with users in a conversational manner, typically through a chat interface.

Conversations typically take the form of messages exchanged in turns; and one of the most
common kinds of conversational exchange is a question followed by an answer. This pattern
forms the basis for many user support bots, and can often be based on existing FAQ
documentation. To implement this kind of solution, you need:

 A knowledge base of question and answer pairs - usually with some built-in natural
language processing model to enable questions that can be phrased in multiple ways
to be understood with the same semantic meaning.
 A bot service that provides an interface to the knowledge base through one or more
channels.

Get started with the Language service and Azure Bot Service

You can easily create a user support bot solution on Microsoft Azure using a combination of
two core services:

 Language service. The Language service includes a custom question answering feature
that enables you to create a knowledge base of question and answer pairs that can be
queried using natural language input.
Note

The question answering capability in the Language service is a newer version of the
QnA Maker service - which is still available as a separate service.

 Azure Bot service. This service provides a framework for developing, publishing, and
managing bots on Azure.

Creating a custom question answering knowledge base


The first challenge in creating a user support bot is to use the Language service to create a
knowledge base. You can use the Language Studio's custom question answering feature to
create, train, publish, and manage knowledge bases.

Provision a Language service Azure resource

To create a knowledge base, you must first provision a Language service resource in your
Azure subscription.
Define questions and answers

After provisioning a Language service resource, you can use the Language Studio's custom
question answering feature to create a knowledge base that consists of question-and-answer
pairs. These questions and answers can be:

 Generated from an existing FAQ document or web page.


 Entered and edited manually.

In many cases, a knowledge base is created using a combination of both of these techniques,
starting with a base dataset of questions and answers from an existing FAQ document and
extending the knowledge base with additional manual entries.

Questions in the knowledge base can be assigned alternative phrasing to help consolidate
questions with the same meaning. For example, you might include a question like:

What is your head office location?

You can anticipate different ways this question could be asked by adding an alternative
phrasing such as:

Where is your head office located?

Test the knowledge base

After creating a set of question-and-answer pairs, you must save it. This process analyzes
your literal questions and answers and applies a built-in natural language processing model to
match appropriate answers to questions, even when they are not phrased exactly as specified
in your question definitions. Then you can use the built-in test interface in the Language
Studio to test your knowledge base by submitting questions and reviewing the answers that
are returned.

Use the knowledge base

When you're satisfied with your knowledge base, deploy it. Then you can use it over its
REST interface. To access the knowledge base, client applications require:

 The knowledge base ID


 The knowledge base endpoint
 The knowledge base authorization key
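
As an illustration, a client application might query a deployed custom question answering project over REST as sketched below; the endpoint, key, project name, and API version shown are assumptions to be verified against your own deployment.

import requests

# Assumed placeholders - endpoint and key from your Language resource, plus your deployed project name
ENDPOINT = "https://<your-language-resource>.cognitiveservices.azure.com"
KEY = "<your-language-key>"
PROJECT = "<your-project-name>"

response = requests.post(
    f"{ENDPOINT}/language/:query-knowledgebases",
    params={
        "projectName": PROJECT,
        "deploymentName": "production",
        "api-version": "2021-10-01",
    },
    headers={"Ocp-Apim-Subscription-Key": KEY},
    json={"question": "Where is your head office located?"},
)
response.raise_for_status()
for answer in response.json()["answers"]:
    print(answer["answer"], answer["confidenceScore"])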

Build a bot with the Azure Bot Service


After you've created and deployed a knowledge base, you can deliver it to users through a
bot.

Create a bot for your knowledge base

You can create a custom bot by using the Microsoft Bot Framework SDK to write code that
controls conversation flow and integrates with your knowledge base. However, an easier
approach is to use the automatic bot creation functionality, which enables you to create a bot for
your deployed knowledge base and publish it as an Azure Bot Service application with just a
few clicks.

Extend and configure the bot

After creating your bot, you can manage it in the Azure portal, where you can:

 Extend the bot's functionality by adding custom code.


 Test the bot in an interactive test interface.
 Configure logging, analytics, and integration with other services.

For simple updates, you can edit bot code directly in the Azure portal. However, for more
comprehensive customization, you can download the source code and edit it locally;
republishing the bot directly to Azure when you're ready.

Connect channels

When your bot is ready to be delivered to users, you can connect it to multiple channels;
making it possible for users to interact with it through web chat, email, Microsoft Teams, and
other common communication media.

Users can submit questions to the bot through any of its channels, and receive an appropriate
answer from the knowledge base on which the bot is based.

4. Knowledge Mining

Knowledge mining is the term used to describe solutions that involve extracting information from large volumes of often unstructured data. One of these knowledge mining solutions is Azure Cognitive Search, a cloud search service that has tools for building user-managed indexes.

What is Azure Cognitive Search?


Azure Cognitive Search provides the infrastructure and tools to create search solutions that extract data from various structured, semi-structured, and unstructured documents.

Azure Cognitive Search results contain only your data, which can include text inferred or extracted from images, or entities and key phrases detected through text analytics. It's a Platform as a Service (PaaS) solution. Microsoft manages the infrastructure and availability,
allowing your organization to benefit without the need to purchase or manage dedicated
hardware resources.

Azure Cognitive Search features


Azure Cognitive Search exists to complement existing technologies and provides a
programmable search engine built on Apache Lucene, an open-source software library. It's a
highly available platform offering a 99.9% uptime SLA available for cloud and on-premises
assets.

Azure Cognitive Search comes with the following features:

 Data from any source: Azure Cognitive Search accepts data from any source provided
in JSON format, with auto crawling support for selected data sources in Azure.
 Full text search and analysis: Azure Cognitive Search offers full text search capabilities
supporting both simple query and full Lucene query syntax.
 AI powered search: Azure Cognitive Search has Cognitive AI capabilities built in for
image and text analysis from raw content.
 Multi-lingual: Azure Cognitive Search offers linguistic analysis for 56 languages to
intelligently handle phonetic matching or language-specific linguistics. Natural
language processors available in Azure Cognitive Search are also used by Bing and
Office.
 Geo-enabled: Azure Cognitive Search supports geo-search filtering based on proximity
to a physical location.
 Configurable user experience: Azure Cognitive Search has several features to improve
the user experience including autocomplete, autosuggest, pagination, and hit
highlighting.
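
Once an index has been populated, client applications can query it with the azure-search-documents client library, as in the hedged sketch below; the service endpoint, index name, query key, and field name are all assumptions for illustration.

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# Assumed placeholders - use the endpoint, index name, and query key from your own search service
search_client = SearchClient(
    endpoint="https://<your-search-service>.search.windows.net",
    index_name="<your-index-name>",
    credential=AzureKeyCredential("<your-query-key>"),
)

# A simple full text query; filters, facets, and ordering can be added as keyword arguments
results = search_client.search(search_text="grand hotel", top=5)
for doc in results:
    print(doc["@search.score"], doc.get("hotelName"))  # "hotelName" is a hypothetical index field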

Identify elements of a search solution

Use a skillset to define an enrichment pipeline


AI enrichment refers to embedded image and natural language processing in a pipeline that
extracts text and information from content that can't otherwise be indexed for full text search.

AI processing is achieved by adding and combining skills in a skillset. A skillset defines the
operations that extract and enrich data to make it searchable. These AI skills can be either
built-in skills, such as text translation or Optical Character Recognition (OCR), or custom
skills that you provide.

Built-in skills

Built-in skills are based on pre-trained models from Microsoft, which means you can't train
the model using your own training data. Skills that call the Cognitive Services APIs have a dependency on those services and are billed at the Cognitive Services pay-as-you-go price
when you attach a resource. Other skills are metered by Azure Cognitive Search, or are utility
skills that are available at no charge.

Built-in skills fall into these categories:

Natural language processing skills: with these skills, unstructured text is mapped as
searchable and filterable fields in an index.

Some examples include:

 Key Phrase Extraction: uses a pre-trained model to detect important phrases based on
term placement, linguistic rules, proximity to other terms, and how unusual the term is
within the source data.
 Text Translation Skill: uses a pre-trained model to translate the input text into various
languages for normalization or localization use cases.

Image processing skills: creates text representations of image content, making it searchable
using the query capabilities of Azure Cognitive Search.

Some examples include:

 Image Analysis Skill: uses an image detection algorithm to identify the content of an
image and generate a text description.
 Optical Character Recognition Skill: allows you to extract printed or handwritten text
from images, such as photos of street signs and products, as well as from documents—
invoices, bills, financial reports, articles, and more.

A knowledge store is persistent storage of enriched content. The purpose of a knowledge store is to store the data generated from AI enrichment in a container. For example, you may want to save the results of an AI skillset that generates captions from images.

Important Questions
2 Marks
What is AI?
What is ML?
What is computer vision?
What is NLP?
What is knowledge mining?
What are the two types of AI?
What is Anomaly detection?
What is Machine learning in Azure?

What is automated ML in Azure?
What is Azure ML designer?
What are datasets?
What are pipelines?
What is cognitive search? And what are all the services available under cognitive search?
What is supervised ML?
What is unsupervised ML?
What is categorization?
What is classification?
What is regression?
What is resource?
What is workspace?
What are components?
What is Machine learning Jobs?
What is describing in computer vision?
What is tagging in computer vision?
What is detecting objects?
What is detecting brands?
What is a cognitive service?
Differentiate between computer vision and custom vision
What is key and endpoints?
What is precision in model evaluation?
What is recall in model evaluation?
What is object detection?
What is natural language processing?
What is speech recognition?
What is speech synthesis?
What is speech to text API?
What is text to speech API?
What is real time transcription?
What is batch transcription?
What is literal and semantic translation?
What is conversational AI?
What is knowledge mining?
What is cognitive search?

5 Marks
What is AI? What are the applications of AI in business?
What is workloads? What are the key workloads in AI?
What are the steps in the wildflower identification ML model?
What are the components of computer vision?
What are the steps in natural language processing?
Explain the six components of responsible AI?
What are the challenges and issues with AI?
Explain the basis of mathematical model in ML with example?
Discuss the types of ML with examples?
What are the different steps / tasks involved in machine learning in Azure?

Discuss the steps involved in creating a resource in Azure Machine Learning
Discuss the steps involved in creating and loading the data set in Azure Machine Learning
studio.
Discuss the steps involved in creating the job configuration of classification training model
in Azure Machine Learning studio using Automated ML.
Discuss the steps involved in creating the job configuration of regression training model in
Azure Machine Learning studio using Automated ML.
Discuss the steps involved in creating the job configuration of regression training model in
Azure Machine Learning designer.
What are the Uses of computer vision
What are the different ways of analysing images
What are the different ways of creating resources in Computer vision?
What are the uses of image classification?
What are the steps involved in image classification in custom vision?
What are the steps involved in object detection in custom vision?
What are the uses of object detection?
What are the steps involved in image classification in custom vision?
What are the uses of face detection and Analysis?
What are the attributes of face services?
What are the steps involved in reading text using OCR in computer vision?
What are the different statistical analysis can be performed in the text analysis?
What are the different aspects of text analysis in Azure?
Explain briefly about speech synthesis and speech recognition?
What are the steps involved in speech synthesis and speech recognition?
What are the different ways of creating resources in Natural language processing?
What are the steps involved in speech and text translation?
What are the steps involved in creating a conversational AI?
What are the steps involved in Knowledge mining?
