Advanced Concepts of Modelling Notes.
1. Identify the model: Predicting whether a customer is eligible for a bank loan or not?
a. Classification
b. Regression
c. Both a. and b.
d. None of the above
**
a. Classification
**
b. Regression
3. In which type of machine learning is the data labeled with the desired output?
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
d. Deep Learning
**
a. Supervised Learning
4. An email spam filter that learns to identify spam emails based on labeled examples is an
application of:
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
d. Transfer Learning
**
a. Supervised Learning
5. A machine learning algorithm that groups similar customer purchases into clusters for
recommendation systems uses:
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
d. Neural Networks
**
b. Unsupervised Learning
6. An AI agent playing a game and learning from its rewards and penalties is an example of:
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
d. Evolutionary Learning
**
c. Reinforcement Learning
11. Imagine an AI playing a game and learning to win by trial and error. This is an example of:
a. Supervised Learning
b. Unsupervised Learning
c. Reinforcement Learning
d. Natural Language Processing
**
c. Reinforcement Learning
12. Artificial neural networks are inspired by the structure and function of:
a. The human brain
b. Quantum computers
c. Complex mathematical models
d. High-speed processors
**
a. The human brain
13. The process of adjusting the weights in a neural network to improve performance is called:
a. Activation
b. Learning
c. Optimization
d. Training
**
d. Training
17. Assertion (A): Unsupervised Learning is a type of learning without any guidance.
Reasoning (R): Unsupervised learning models work on unlabeled datasets, where the data fed into
the machine is random and the person training the model may not have any prior information
about it.
a. Both A and R are true and R is the correct explanation for A
b. Both A and R are true and R is not the correct explanation for A
c. A is True but R is False
d. A is false but R is True
**
18. Assertion (A): Information processing in a neural network relies on weights and biases
assigned to nodes.
Reasoning (R): These weights and biases determine how strongly a node is influenced by its inputs
and its overall contribution to the next layer.
a. Both A and R are true and R is the correct explanation for A
b. Both A and R are true and R is not the correct explanation for A
c. A is True but R is False
d. A is false but R is True
**
19. In this learning model, the data set which is fed to the machine is labelled. Name the
model. (CBSE 2022 – 2023)
**
Supervised Learning
22. Which form of unsupervised learning does the following diagram indicate ? (CBSE 2023 – 2024)
a. Clustering
b. Regression
c. Reinforcement learning
d. Classification
**
a. Clustering
23. For Data Science, usually the data is collected in the form of tables. These tabular datasets can
be stored in different formats. Which of the following formats is not used for storing data in a
tabular format? (CBSE 2023 – 2024)
a. CSV
b. Website
c. SQL
d. Spreadsheet
**
b. Website
**
c. iii and iv
25. Regression is one of the types of supervised learning model, where data is classified according
to labels and data need not be continuous. (True / False) (CBSE 2022 – 2023)
**
False
26. Aditi, a student of class XII developed a chatbot that clarifies the doubts of Economics
students. She trained the software with lots of data sets catering to all difficulty levels. If any
student would type or ask questions related to Economics, the software would give an instant
reply. Identify the domain of AI in the given scenario.
a. Computer Vision
b. Data Science
c. Natural Language Processing
d. None of these
**
c. Natural Language Processing
27. Which of the following applications is not associated with Natural Language Processing
(NLP)? (CBSE 2023 – 2024)
a. Sentiment Analysis
b. Speech Recognition
c. Spam Filtering in emails
d. Stock Market Analysis
**
d. Stock Market Analysis
28. _________helps to find the best model that represents our data and how well the chosen
model will work in future. (CBSE 2022 – 2023)
**
Model Evaluation
29. Assertion (A): Neural networks are the backbone of deep learning algorithms
Reasoning (R): Neural networks use vast amounts of data
a. Both A and R are correct and R is the correct explanation of A
b. Both A and R are correct but R is NOT the correct explanation of A
c. A is correct but R is not correct
d. A is not correct but R is correct.
**
30. Assertion (A): The term used to refer to the number of pixels in an image is resolution.
Reasoning (R): Resolution in an image denotes the total number of pixels it contains, usually
represented as height x width. (CBSE 2023 – 2024)
a. Both A and R are true and R is the correct explanation for A
b. Both A and R are true and R is not the correct explanation for A
c. A is true, but R is false
d. A is false, but R is true
**
31. It refers to the unsupervised learning algorithm which can cluster the unknown data according
to the patterns or trends identified out of it. (CBSE 2022 – 2023)
a. Regression
b. Classification
c. Clustering
d. Dimensionality reduction
**
c. Clustering
32. Infrared sensors detect infrared energy that is emitted by one’s body heat. When hands are
placed in the proximity of the sensor, the infrared energy quickly fluctuates. This fluctuation
triggers the pump to activate and dispense the designated amount of sanitizer. This is an example
of
a. Automated machine
b. AI machine
c. Semi-automatic machine
d. Deep Learning machine
**
a. Automated machine
33. Google Translate is Google’s free service that instantly translates words, phrases, and web
pages between English and over 100 other languages. Google Translate uses ______
a. 4w problem canvas
b. Neural Networks
c. KWLH chart
d. System maps
**
b. Neural Networks
34. Data about the houses such as square footage, number of rooms, features, whether a house
has a garden or not, and the prices of these houses, i.e., the corresponding labels are fed into an AI
machine. By leveraging data coming from thousands of houses, their features and prices, we can
now train the model to predict a new house’s price. This is an example of
a. Reinforcement learning
b. Supervised learning
c. Unsupervised learning
d. None of the above
**
b. Supervised learning
**
c. Input Layer -> Answer; Output layer -> Processing; Hidden Layer -> Data
**
c. In NLP, modelling requires data pre-processing, only after which the data is fed to the machine.
1. Differentiate between AI, ML, and DL
Artificial intelligence (AI) is the ability of machines to do cognitive tasks such as thinking,
perceiving, learning, problem-solving, and decision-making. ML and DL are subsets of Artificial
Intelligence.
AI:
• AI has predefined rules.
• AI can be rule-based and require human programming.
• AI is used in virtual assistants, recommendation systems, etc.
ML:
• ML depends on labeled data for making predictions.
• ML can learn automatically with less human intervention.
• ML is used in spam filtering, image recognition, etc.
DL:
• DL requires large labeled data to perform tasks.
• DL automates feature extraction and lessens human intervention.
• DL is used in speech recognition, autonomous vehicles, etc.
Artificial Intelligence
Artificial intelligence (AI) is the simulation of human intelligence in machines that have been
trained to think and act like humans. The term can also refer to any machine that demonstrates
human-like abilities such as learning and problem-solving.
Machine learning is a part of Artificial Intelligence in which we give data to the machine and allow
it to learn for itself. It is essentially getting a machine to accomplish something without being
specifically programmed to do so. The machine learns from its mistakes and takes them into
consideration in the next execution. It improves itself using its own experiences.
For example, labelled images (every image is tagged either as apple or strawberry) are given as
input to the ML model. The ML model learns from the input data to distinguish apples from
strawberries, and can then predict the correct label for a new image.
• Anomaly Detection – Anomaly detection helps us find the unexpected things hiding in our
data. For example, tracking your heart rate, and finding a sudden spike could be an anomaly,
flagging a potential issue.
Deep learning is a part of Artificial Intelligence that uses neural networks with multiple layers.
Deep learning analyzes the data, learns from it, and solves the problem in a way similar to a human.
Deep learning requires the machine to be trained with a large quantity of data.
Deep Learning is the most advanced form of Artificial Intelligence out of these three.
For example, the pixels of a bird image are given as input to the DL model, and the model is able to
analyze them and correctly predict that it is a bird using a deep learning algorithm (ANN).
What is Data?
Data is information in any form. For example, a table with information about fruits is data. Each
row contains information about a different fruit, and each fruit is described by certain features.
The columns of the table are called features. In the fruit dataset example, the features may be
name, color, size, etc. Some features are special; they are called labels.
Data Labeling is the process of attaching meaning to data. For example, if we are trying to predict
which fruit it is based on the color of the fruit, then color is the feature and fruit name is the
label. Data can be of two types: Labeled and Unlabeled.
Data to which some tag/label is attached is known as labeled data; the tag may be, for example, a
name, type, or number. Unlabeled data is a raw form of data that has no tag attached.
What do you mean by a training data set?
The training data set is a collection of examples given to the model to analyze and learn from. Just
as a teacher teaches a topic to the class through a lot of examples and illustrations, a set of
labeled data is used to train the AI model.
The testing data set is used to test the accuracy of the model, just as a teacher gives a class test
on a topic to evaluate the understanding level of students. During testing, the model makes its
predictions without seeing the labels, and the results are then verified against the labels.
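The idea of keeping separate training and testing sets can be sketched in a few lines of Python. This is only an illustration: the fruit records, the 25% test fraction, and the function name `split_dataset` are assumptions made for the example, not part of the notes.

```python
import random

def split_dataset(examples, test_fraction=0.25, seed=0):
    """Shuffle labelled examples and split them into training and testing sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    # The first n_test shuffled examples are held back for testing.
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labelled data: (color feature, fruit-name label)
fruits = [
    ("red", "apple"), ("yellow", "banana"), ("red", "apple"), ("yellow", "banana"),
    ("red", "apple"), ("yellow", "banana"), ("red", "apple"), ("yellow", "banana"),
]

train_set, test_set = split_dataset(fruits)
print(len(train_set), "training examples,", len(test_set), "testing examples")
```

The model would be trained only on `train_set`; `test_set` is kept aside so that accuracy is measured on examples the model has not seen.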
Modelling
AI Modelling refers to developing algorithms, also called models which can be trained to get
intelligent outputs. An AI model is a program that uses algorithms to analyze data and make
decisions without human intervention. AI models are trained on data sets to recognize patterns and
perform tasks.
• Once trained, the machine will not make any changes in the training dataset.
• If you try testing the machine on a dataset which is different from the rules and data you fed
it at the training stage, the machine will fail and will not learn from its mistake.
• Once the model is trained, the model cannot improve itself on the basis of feedback.
Learning Based – Refers to the AI modelling where the machine learns by itself. Under the Learning
Based approach, the AI model gets trained on the data fed to it and then is able to design a model
which is adaptive to the change in data. Random data is provided to the computer in this method,
and the system is left to figure out patterns and trends from it.
1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement Learning
1. Supervised Learning
Supervised learning is a machine learning technique that uses labeled data to train algorithms to
predict outcomes. In a supervised learning model, the dataset which is fed to the machine is
labelled. In other words, the dataset is known to the person who is training the machine; only then
is he/she able to label the data.
Let’s consider the example of currency coins. Problem Statement: Build a model to predict the coin
based on its weight. Assume that we have different currency coins (dataset) having different
weights. 1 Euro weighs 5 grams, 1 Dirham weighs 7 grams, 1 Dollar weighs 3 grams, 1 Rupee weighs
4 grams and so on.
• Feature – Weights,
• Label – Currency
So, if a model is trained to associate the features (the weights of the coins) with the targets
(the currencies), the trained model can then be used to identify a coin based on its weight, since
it has already learnt the relationship.
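As a rough sketch of the coin example, the snippet below "trains" on the four weights given above and labels a new coin with the currency of the closest known weight. The nearest-match rule and the function name `predict_coin` are assumptions chosen to keep the example short; the notes do not prescribe a particular algorithm.

```python
# Training data from the coin example: (weight in grams, currency label)
training_data = [(5, "Euro"), (7, "Dirham"), (3, "Dollar"), (4, "Rupee")]

def predict_coin(weight, examples=training_data):
    """Return the label of the training example whose weight is closest."""
    closest = min(examples, key=lambda pair: abs(pair[0] - weight))
    return closest[1]

print(predict_coin(6.8))  # Dirham
print(predict_coin(3.1))  # Dollar
```

Here the weight is the feature and the currency is the label, exactly as listed in the example.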
a. Classification
Here, the data is classified according to the labels. For example, in the grading system, students
are classified on the basis of the grades they obtain with respect to their marks in the
examination. This model works on a discrete dataset, which means the data need not be continuous.
Classifying emails as spam or not: The model is shown tons of emails, both real ones (like from
friends or colleagues) and spam. The model learns what makes an email look like spam. Once
trained, the model sees a new email. It analyzes the clues in the email and decides: is this spam or
not? It assigns a category – “spam” or “not spam” – just like sorting your mail.
b. Regression
Such models work on continuous data. For example, if you wish to predict your next salary, then you
would put in the data of your previous salary, any increments, etc., and would train the model. Here,
the data which has been fed to the machine is continuous.
Examples of the Regression Model
Predicting temperature: Temperature is a continuous variable, meaning it can take on any value
within a range. Regression models are well-suited for predicting continuous outputs.
Used Car Price Prediction: This model predicts the selling price of the car with the help of a few
parameters like fuel type, years of service, the number of previous owners, kilometers driven, and
transmission type (manual/automatic). This type of model will be of type regression since it will
predict an approximate price (a continuous value) of the car based on the training dataset.
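A minimal regression sketch is shown here, assuming a single feature (years of service) and a straight-line relationship fitted by ordinary least squares. The numbers are invented for illustration, and a real price model would use several of the parameters listed above.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: years of service and selling price (in thousands)
years = [1, 2, 3, 4, 5]
prices = [90, 80, 70, 60, 50]

slope, intercept = fit_line(years, prices)
predicted_price = slope * 6 + intercept  # continuous prediction for a 6-year-old car
print(predicted_price)
```

The output is a number on a continuous scale rather than a category, which is what separates regression from classification.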
2. Unsupervised Learning
An unsupervised learning model works on unlabeled dataset. This means that the data which is fed
to the machine is random and there is a possibility that the person who is training the model does
not have any information regarding it. It helps the user in understanding what the data is about and
what are the major features identified by the machine in it.
Assume that we have a customer database with records of the products bought over a period. Now
you, being the marketing manager, decide to send a grocery offer message to those customers who
buy groceries regularly. The model could discover patterns on its own and come up with two
different groups: a) Grocery Shoppers and b) Non-grocery Shoppers.
a. Clustering
Refers to the unsupervised learning algorithm which can cluster the unknown data according to the
patterns or trends identified out of it. The patterns observed might be the ones which are known to
the developer, or it might even come up with some unique patterns out of it.
What is the difference between Clustering and Classification?
• Classification uses predefined classes in which objects are assigned.
• Clustering finds similarities between objects and places them in the same cluster and it
differentiates them from objects in other clusters.
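To make clustering concrete, here is a small sketch that splits unlabelled numbers into two groups by repeatedly moving two cluster centres (a two-cluster version of k-means). The data, the starting centres, and the function name `two_means` are assumptions for illustration only.

```python
def two_means(values, iterations=10):
    """Group values into two clusters around two moving centres."""
    centres = [min(values), max(values)]  # simple starting guess
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Place each value in the cluster with the nearer centre.
            nearest = 0 if abs(v - centres[0]) <= abs(v - centres[1]) else 1
            groups[nearest].append(v)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups, centres

# Hypothetical unlabelled data: grocery purchases per month for eight customers
grocery_purchases = [1, 2, 1, 9, 10, 8, 0, 11]
groups, centres = two_means(grocery_purchases)
print(groups)
print(centres)
```

No labels are given to the function; the two groups emerge from the similarity of the values alone.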
b. Association
Association is another type of unsupervised learning method that uses different rules to find
relationships between variables in a given data set. This is a data mining technique used for better
understanding of customer purchasing patterns based on relationships between various products.
Example:
Based on the purchase patterns of customers A and B, can you predict what a Customer X who buys
bread will most probably also buy?
Based on the purchase pattern of other customers, we can predict that there is a high probability
that any customer X who buys bread will also buy butter. Such meaningful associations can be used
to recommend items to customers. This is called an Association Rule.
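One common way to measure how strong such a rule is (an assumption here, since the notes do not name a measure) is its confidence: of the baskets containing bread, what fraction also contain butter? The baskets below are invented for the example.

```python
# Hypothetical purchase records, one set of items per customer
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def confidence(baskets, if_item, then_item):
    """Fraction of baskets containing if_item that also contain then_item."""
    with_if = [b for b in baskets if if_item in b]
    if not with_if:
        return 0.0
    with_both = [b for b in with_if if then_item in b]
    return len(with_both) / len(with_if)

print(confidence(baskets, "bread", "butter"))  # 0.75
```

A high value for the rule "bread -> butter" is what would justify recommending butter to a customer who buys bread.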
3. Reinforcement Learning
This learning approach enables the computer to make a series of decisions that maximize a reward
metric for the task without human intervention and without being explicitly programmed to achieve
the task. It’s based on a trial-and-error learning process to achieve the goals. A typical example is
an AI agent that learns to play a game from rewards and penalties; reinforcement learning has also
been applied to tasks such as question answering, machine translation, and text summarization.
Reinforcement learning is a type of learning in which a machine learns to perform a task through a
repeated trial-and-error method. Let’s say you provide an image of an apple to the machine and ask
the machine to predict it. The machine first predicts it as ‘cherry’ and you give negative feedback
that it’s incorrect. Now, the machine learns that it’s not a cherry.
Then again, you ask the machine to predict the fruit by giving an image of an apple as input; Now, it
knows it is not a cherry. It predicts it as an apple, and you give positive feedback that it’s correct. So,
now the machine learns that this is an apple.
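The apple and cherry example can be sketched as a loop in which the machine guesses, receives a reward of +1 or -1, and updates a score for the label it guessed. This is a deliberately simplified illustration under assumed names (`learn_by_feedback`, `scores`); practical reinforcement learning also needs exploration and works with states and actions.

```python
def learn_by_feedback(correct_label, choices, rounds=5):
    """Guess repeatedly and adjust label scores from positive/negative feedback."""
    scores = {label: 0 for label in choices}
    history = []
    for _ in range(rounds):
        # Pick the label with the highest score so far (the first one wins ties).
        guess = max(choices, key=lambda label: scores[label])
        reward = 1 if guess == correct_label else -1  # feedback from the trainer
        scores[guess] += reward
        history.append(guess)
    return history, scores

history, scores = learn_by_feedback("apple", ["cherry", "apple"])
print(history)  # ['cherry', 'apple', 'apple', 'apple', 'apple']
print(scores)   # {'cherry': -1, 'apple': 4}
```

The first guess is wrong and is penalised; from the second round onward the machine prefers "apple" because that guess keeps earning positive feedback.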
Supervised learning: Useful in real-world problems like predicting the price of an item based on
past trends.
Unsupervised learning: Useful in finding unknown patterns within data, like making sense of a large
number of observations from an experimental device.
Summary of ML Models
• Supervised learning models are used when we want to determine relationships through
training.
• Unsupervised learning models are used when we want to discover new patterns from data.
• Reinforcement learning models are used when we want to implement machine learning
through a reward mechanism.
Deep learning is a subset of machine learning that uses artificial neural networks to learn from data.
In deep learning, the machine is trained with huge amounts of data, which helps it in training itself
around the data. Such machines are intelligent enough to develop algorithms for themselves. Deep
learning is the most advanced form of artificial intelligence.
Neural networks are modelled on the human brain and nervous system. They are able to
automatically extract features without input from the programmer. Every neural network node is
essentially a machine learning algorithm. It is useful when solving problems for which the data set is
very large.
An artificial neural network is a type of machine learning algorithm derived from biological neural
network principles; it works similarly to the human brain by using interconnected neurons. These
neurons are known as nodes. An artificial neural network has an input layer, an output layer, and
hidden layers. The input layer is responsible for receiving the data from the real world; the input
data then passes through one or more hidden layers, and the output layer produces the result.
• Input layer – takes input data and transfers it to the hidden layer of neurons using synapses.
• Hidden layer – takes data from the input layer to categorize the data and send it to more
hidden layers and finally send it to the output layer.
• Output layer – takes the data from the hidden layer and generates the result.
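The flow of data through the three kinds of layer can be sketched as follows. The weights, biases, and the choice of a sigmoid activation are assumptions made for illustration; in a real network the weights and biases are learned during training.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0 to 1."""
    return 1 / (1 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of its inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# Input layer: two values received from the outside world (hypothetical)
inputs = [0.5, 0.8]
# Hidden layer: two nodes, each with one weight per input (hypothetical values)
hidden = layer(inputs, weights=[[0.4, -0.6], [0.3, 0.9]], biases=[0.0, -0.1])
# Output layer: one node that combines the hidden-layer values into the result
output = layer(hidden, weights=[[1.2, -0.7]], biases=[0.05])
print(output)
```

Each call to `layer` plays the role of one bullet above: the values computed by one layer become the inputs of the next.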
Artificial neural networks are trained using a training set. For example, suppose you want to teach
an ANN to recognize a dog: thousands of different dog images are shown to the neural network through
the input layer. For each image, the output generated by the ANN is checked against the
human-provided description (the label). If the output is incorrect, backpropagation is used to
adjust the weights of the network during training. This process continues until the ANN can
correctly recognize a dog in an image.
2. Convolutional Neural Network (CNN)
A Convolutional Neural Network (CNN) is a Deep Learning algorithm which can take in an input
image, assign importance (learnable weights and biases) to various aspects/objects in the image and
be able to differentiate one from the other.
An input image is processed through the CNN, which then gives a prediction on the basis of the
labels given in the particular dataset.
• Convolution Layer
• Pooling Layer
Convolution Layer – A convolutional layer is the first layer and the main building block of CNN that
extracts features from images. In the convolution layer, there are several kernels that are used to
produce several features. The output of this layer is called the feature map. A feature map is also
called the activation map. We can use these terms interchangeably.
• We only focus on the features of the image that can help us in processing the image further.
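How a kernel produces a feature map can be sketched with plain lists. The image and kernel values below are made up; the kernel is slid over the image without flipping, which is the form commonly used in CNNs.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and record one weighted sum per position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(kh) for j in range(kw))
            row.append(total)
        feature_map.append(row)
    return feature_map

# Hypothetical 3x4 image: dark on the left (0), bright on the right (1)
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge
kernel = [
    [-1, 1],
    [-1, 1],
]
print(convolve2d(image, kernel))  # [[0, 2, 0], [0, 2, 0]]
```

The feature map is large only in the column where the edge lies, which is the sense in which the layer extracts a feature from the image.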
Rectified Linear Unit Function – After we get the feature map, it is then passed onto the ReLU layer.
This layer simply gets rid of all the negative numbers in the feature map and lets the positive number
stay as it is.
If we see the two graphs side by side, the one on the left is a linear graph. This graph when passed
through the ReLU layer, gives the one on the right. The ReLU graph starts with a horizontal straight
line and then increases linearly as it reaches a positive number.
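In code, the ReLU step is a one-line function applied to every value of the feature map. The feature-map values here are hypothetical.

```python
def relu(x):
    """Replace negative numbers with 0 and keep positive numbers unchanged."""
    return max(0, x)

# Hypothetical feature map containing negative and positive values
feature_map = [[-3, 2], [0, -1.5], [4, 1]]
activated = [[relu(value) for value in row] for row in feature_map]
print(activated)  # [[0, 2], [0, 0], [4, 1]]
```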
Pooling Layer – Similar to the Convolutional Layer, the Pooling layer is responsible for reducing the
spatial size of the Convolved Feature while still retaining the important features.
• Max Pooling: Max Pooling returns the maximum value from the portion of the image
covered by the Kernel.
• Average Pooling: Average Pooling returns the average of all the values from the portion of the
image covered by the Kernel.
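Both kinds of pooling can be sketched with one function that looks at non-overlapping windows of the feature map. The 2x2 window size and the sample values are assumptions for illustration.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Shrink a feature map by summarising each size x size window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            # Max pooling keeps the largest value; average pooling keeps the mean.
            row.append(max(window) if mode == "max" else sum(window) / len(window))
        pooled.append(row)
    return pooled

# Hypothetical 4x4 feature map
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 5, 7],
    [1, 1, 3, 9],
]
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
print(max_pooled)  # [[6, 2], [2, 9]]
print(avg_pooled)  # [[3.5, 1.0], [1.0, 6.0]]
```

In both cases the 4x4 map shrinks to 2x2, which is the reduction in spatial size described above.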
The pooling layer is an important layer in the CNN as it performs a series of tasks which are as
follows:
• Makes the image more resistant to small transformations, distortions and translations in the
input image.
Fully Connected Layer – The final layer in the CNN is the Fully Connected Layer (FCP). The objective
of a fully connected layer is to take the results of the convolution/pooling process and use them to
classify the image into a label (in a simple classification example). For example, if the image is of a
cat, features representing things like whiskers or fur should have high probabilities for the label
“cat”.
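A rough sketch of this final step: the features are combined into one score per label, and the scores are turned into probabilities. The softmax function used for that conversion, the feature values, and the weights are all assumptions for illustration; the notes do not specify them.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical feature strengths from the earlier layers: [whiskers, fur, feathers]
features = [0.9, 0.8, 0.1]
# Hypothetical weights: one row per label, one weight per feature
weights = {
    "cat": [2.0, 1.5, -1.0],
    "bird": [-1.0, -0.5, 2.5],
}

labels = list(weights)
scores = [sum(w * f for w, f in zip(weights[label], features)) for label in labels]
probabilities = dict(zip(labels, softmax(scores)))
predicted = max(probabilities, key=probabilities.get)
print(predicted, probabilities)
```

Strong whisker and fur features push the probability of "cat" up, matching the description above.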
TensorFlow Playground is a browser-based tool that allows users to experiment with neural networks
and machine learning algorithms. TensorFlow Playground can simulate small neural networks in real
time in your browser, so you can see the result instantly.
Answer: The difference between rule based and learning based AI models are –
Supervised learning:
• Useful in real-world problems like predicting the price of an item based on past trends.
• The computing power required is simpler as clean labelled data is used as input.
Unsupervised learning:
• Useful in finding unknown patterns within data, like making sense of a large number of
observations from an experimental device.
• The computing power required is more complex as unsorted and messy data is used as input.
Q2. What is supervised, unsupervised and reinforcement learning? Explain with examples.
Answer:
Supervised Learning – Supervised learning is a machine learning technique that uses labeled data to
train algorithms to predict outcomes. Examples of supervised learning are email spam filtering and
image recognition.
Unsupervised Learning – An unsupervised learning model works on unlabeled dataset. This means
that the data which is fed to the machine is random and there is a possibility that the person who is
training the model does not have any information regarding it. It helps the user in understanding
what the data is about and what are the major features identified by the machine in it. Examples of
unsupervised learning are medical image analysis and data exploration.
Reinforcement Learning – This learning approach enables the computer to make a series of
decisions that maximize a reward metric for the task without human intervention and without being
explicitly programmed to achieve the task. It’s based on a trial-and-error learning process to achieve
the goals. A typical example is an AI agent that learns to play a game from rewards and penalties;
reinforcement learning has also been applied to tasks such as question answering, machine
translation, and text summarization.
Answer: Classification uses predefined classes in which objects are assigned. Clustering finds
similarities between objects and places them in the same cluster and it differentiates them from
objects in other clusters.
Q4. Explain neural networks. Also give functions of three layers of neural networks.
Answer: Neural networks are modelled on the human brain and nervous system. They are able to
automatically extract features without input from the programmer.
The neurons of the network are known as nodes. An artificial neural network has an input layer, an
output layer, and hidden layers. The input layer is responsible for receiving the data from the real
world; the input data then passes through one or more hidden layers, and the output layer produces
the result.
Answer: In classification, the data is classified according to the labels. For example, in the
grading system, students are classified on the basis of the grades they obtain with respect to their
marks in the examination.
Regression algorithms predict a continuous value based on the input variables. Examples of
continuous values are temperature, price, income, age, etc.
Q6. Identify the type of learning (supervised, unsupervised, reinforcement learning) on which the
following case studies are most likely based.
a) Case Study 1: A company wants to predict customer churn based on past purchasing behavior,
demographics, and customer interactions. They have a dataset with labeled examples of customers
who churned and those who did not.
b) Case Study 2: A social media platform wants to group users based on their interests and behavior
to recommend relevant content. They have a large dataset of user interactions but no predefined
categories. Which type of learning is this case study most likely based on?
d) Case Study 4: A healthcare provider wants to identify patterns in patient data to personalize
treatment plans. They have a dataset with various patient attributes but no predefined labels
indicating specific treatment plans. Which type of learning is this case study most likely based on?
e) Case Study 5: A manufacturing company wants to optimize its production process by detecting
anomalies in sensor data from machinery. They have a dataset with examples of normal and
anomalous behavior. Which type of learning is this case study most likely based on?
Q7. Identify the type of model (classification, regression, clustering, association model) on which
the following case studies are most likely based.
a) A bank wants to predict whether a loan applicant will “default” or “non-default” on their loan
payments. They have a dataset containing information such as income, credit score, loan amount,
and employment status.
Answer: Classification
b) A real estate agency wants to predict the selling price of houses based on various features such
as size, location, number of bedrooms, and bathrooms. They have a dataset containing historical
sales data.
Answer: Regression
c) A marketing company wants to segment its customer base into distinct groups based on
purchasing behavior for targeted marketing campaigns. They have a dataset containing
information such as purchase history, frequency of purchases, and amount spent.
Answer: Clustering
Q8. A healthcare provider wants to improve patient care by predicting the length of hospital stays
for different medical conditions. They have a dataset containing patient demographics, medical
history, and treatment details. The task involves:
a) To predict whether a patient will have a short or long hospital stay.
Answer: Classification
Answer: Regression
c) To segment patients into groups with similar characteristics for personalized treatment plans.
Answer: Clustering
Identify the type of model (classification, regression, clustering, and association model) in the above
tasks.
Answer:
b) Context: A homeowner is deciding whether to invest in solar panels for their house.
Factors: – Do I have a sufficient average amount of sunlight in my area? – Are there any available
incentives or rebates for installing solar panels? – Does installing solar panels impact the value of
my home? – Does solar energy lead to environmental benefits?
Answer:
Q10. Sirisha and Divisha want to make a model which will organize the unlabeled input data into
groups based on features. Which learning model should they use and why?
Answer: Clustering model/Unsupervised learning is used to organize the unlabeled input data into
groups based on features. Clustering is an unsupervised learning algorithm which can cluster unknown
data according to the patterns or trends identified out of it. The patterns observed might be the
ones which are known to the developer or it might even come up with some unique patterns out of it.
Q11. Identify and explain the types of the learning-based approaches in the figures given below.
Answer: The learning-based approaches shown in the given figures are Supervised learning and
Unsupervised learning.
Figure 1: In a supervised learning model, the dataset which is fed to the machine is labelled. In
other words, the dataset is known to the person who is training the machine; only then is he/she
able to label the data. A label is some information which can be used as a tag for data. Here,
labelled images of dog and cat are fed into the model and trained. The model correctly identifies
the given input as dog.
Figure 2: An unsupervised learning model works on an unlabelled dataset. This means that the data
which is fed to the machine is random and there is a possibility that the person who is training the
model does not have any information regarding it. Unsupervised learning models are used to identify
relationships, patterns and trends out of the data which is fed into them. They help the user in
understanding what the data is about and what major features the machine has identified in it. Here,
images of a set of animals are fed into the AI model and the model clusters them based on similar
features.
Q12. Neural networks are said to be modelled on the way neurons in the human brain behave. A
similar system is mimicked by the AI machine to perform certain tasks. Explain how neural
networks work in an AI model and mention any three features of Neural Networks.
Answer: Neural networks are loosely modelled after how neurons in the human brain behave. The
features of a neural network are :
• It is a fast and efficient way to solve problems for which the dataset is very large, such as in
images.
• They are able to extract data features automatically without needing the input of the
programmer.
Q13. Why should we avoid using the training data for evaluation?
Answer: This is because our model will simply remember the whole training set, and will therefore
always predict the correct label for any point in the training set.
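The point can be demonstrated with a deliberately naive model that only memorises its training examples. The class name `MemorizingModel` and the coin weights used as data are assumptions for illustration.

```python
class MemorizingModel:
    """A 'model' that just stores every training example it has seen."""

    def __init__(self):
        self.memory = {}

    def fit(self, examples):
        for feature, label in examples:
            self.memory[feature] = label

    def predict(self, feature):
        # Only exact matches with the training data can be answered.
        return self.memory.get(feature, "unknown")

def accuracy(model, examples):
    """Fraction of examples for which the model predicts the right label."""
    correct = sum(model.predict(feature) == label for feature, label in examples)
    return correct / len(examples)

# Hypothetical data: coin weight in grams and currency label
training_set = [(5.0, "Euro"), (7.0, "Dirham"), (3.0, "Dollar")]
testing_set = [(5.1, "Euro"), (6.9, "Dirham")]

model = MemorizingModel()
model.fit(training_set)
print(accuracy(model, training_set))  # 1.0
print(accuracy(model, testing_set))   # 0.0
```

Evaluating on the training set reports a perfect score even though the model fails on every new example, which is why a separate testing set is needed.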
Answer:
Classification Regression