Machine Learning Assignment-7.Sol

The document consists of multiple-choice and subjective questions related to machine learning concepts, particularly focusing on scikit-learn library functionalities, hyperparameter tuning, ensemble techniques, and model evaluation metrics. Key topics include the use of GridSearchCV for hyperparameter tuning, the characteristics of Random Forests, and the importance of scaling features in datasets. Additionally, it discusses the implications of model complexity on bias and variance, and the limitations of accuracy as a performance metric in imbalanced datasets.

Uploaded by

faltukaamdone

Q1 to Q8, Choose the correct option:

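1. Which of the following in the scikit-learn library is used for hyperparameter tuning?
a. GridSearchCV
b. RandomizedCV
c. K-fold Cross Validation
d. All of the above
Answer: a. GridSearchCV and b. RandomizedCV (the scikit-learn class is named RandomizedSearchCV). K-fold cross-validation is the evaluation scheme these searches use internally, not a tuning tool by itself.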
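
As a minimal sketch of the two search classes named in the answer, the snippet below tunes max_depth of a decision tree. The Iris dataset, the estimator, the candidate values, and the variable names are assumptions chosen for illustration; they are not part of the assignment.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
param_grid = {"max_depth": [1, 2, 3, 4]}  # illustrative candidate values only

# GridSearchCV evaluates every candidate in the grid with 3-fold cross-validation.
grid = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=3)
grid.fit(X, y)

# RandomizedSearchCV samples only n_iter candidates instead of trying all of them.
rand = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions=param_grid,
    n_iter=2,
    cv=3,
    random_state=0,
)
rand.fit(X, y)

print(grid.best_params_, round(grid.best_score_, 3))
print(rand.best_params_, round(rand.best_score_, 3))
```

Both searches rely on cross-validation to score each candidate, which is why K-fold cross-validation appears inside them through the cv argument.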

2. In which of the below ensemble techniques are trees trained in parallel?
a. Random Forest
b. AdaBoost
c. Gradient Boosting
d. All of the above
Answer: a. Random Forest

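3. In machine learning, if in the below line of code:

sklearn.svm.SVC(C=1.0, kernel='rbf', degree=3)
we increase the C hyperparameter, what will happen?
a. The regularization will increase
b. The regularization will decrease
c. No effect on regularization
d. Kernel will be changed to linear
Answer: b. The regularization will decrease. In scikit-learn's SVC, the strength of the regularization is inversely proportional to C, so a larger C penalizes misclassified training points more heavily and regularizes less.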
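
The role of C can be read off the standard soft-margin SVM objective, sketched below. The symbols w, b, ξ and φ (weights, bias, slack variables and the kernel feature map) are notation assumed here for illustration; they do not appear in the assignment.

```latex
\min_{w,\,b,\,\xi}\; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \left( w^\top \phi(x_i) + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0
```

Dividing the objective by C gives \frac{1}{2C}\lVert w \rVert^2 + \sum_i \xi_i, so the weight on the regularization term \lVert w \rVert^2 is 1/(2C) and shrinks as C grows.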

4. Check the below line of code and answer the following question:

sklearn.tree.DecisionTreeClassifier(*, criterion='gini', splitter='best',
max_depth=None, min_samples_split=2)
Which of the following is true regarding the max_depth hyperparameter?
a. It regularizes the decision tree by limiting the maximum depth up to which a tree can be grown
b. It denotes the number of children a node can have
c. Both A & B
d. None of the above
Answer: a. It regularizes the decision tree by limiting the maximum depth up to which a tree can be grown
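
The effect of max_depth can be checked directly, as in the short sketch below. The Iris dataset and the depth value 2 are assumptions picked for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# max_depth=None (the default) places no limit on how deep the tree may grow.
full_tree = DecisionTreeClassifier(max_depth=None, random_state=0).fit(X, y)

# max_depth=2 stops growth after two levels of splits, giving a simpler model.
shallow_tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

print(full_tree.get_depth(), shallow_tree.get_depth())
```

The shallow tree can have at most four leaves, which is what limits its ability to memorize the training data.
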
5. Which of the following is true regarding Random Forests?
a. It's an ensemble of weak learners
b. The component trees are trained in series
c. In case of a classification problem, the prediction is made by taking the mode of the class labels predicted by the component trees
d. None of the above
Answer: a. It's an ensemble of weak learners and c. In case of a classification problem, the prediction is made by taking the mode of the class labels predicted by the component trees
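
Taking the mode of the class labels, as option c describes, can be sketched in a few lines. The function name majority_vote and the five votes are hypothetical, made up for this example.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the most common class label among the trees' predictions."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical predictions from five component trees for one sample.
votes = ["A", "B", "A", "A", "B"]
forest_prediction = majority_vote(votes)
print(forest_prediction)
```

Note that scikit-learn's RandomForestClassifier averages the trees' predicted class probabilities rather than counting hard votes, so this is a sketch of the idea, not of that implementation.
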
6. What can be the disadvantage if the learning rate is very high in gradient descent?
a. Gradient Descent algorithm can diverge from the optimal solution.
b. Gradient Descent algorithm can keep oscillating around the optimal solution and may not settle
c. Both of them
d. None of them
Answer: c. Both of them
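
Both failure modes show up on a one-dimensional example. The sketch below minimizes f(x) = x**2; the function, the starting point, and the three learning rates are assumptions chosen so that each behaviour is easy to see.

```python
def gradient_descent(learning_rate, steps=20, x0=1.0):
    """Minimize f(x) = x**2 (gradient 2*x) and return the final x."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * 2 * x
    return x


settled = gradient_descent(0.1)      # x shrinks by a factor of 0.8 per step
oscillating = gradient_descent(1.0)  # x flips between +1 and -1 and never settles
diverged = gradient_descent(1.1)     # |x| grows by a factor of 1.2 per step

print(settled, oscillating, diverged)
```
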
7. As the model complexity increases, what will happen?
a. Bias will increase, variance will decrease
b. Bias will decrease, variance will increase
c. Both bias and variance increase
d. Both bias and variance decrease
Answer: b. Bias will decrease, variance will increase

8. Suppose I have a linear regression model which is performing as follows: Train accuracy = 0.95 and Test accuracy = 0.75. Which of the following is true regarding the model?
a. Model is underfitting
b. Model is overfitting
c. Model is performing well
d. None of the above
Answer: b. Model is overfitting

Q9 to Q15 are subjective answer type questions. Answer them briefly.

9. Suppose we have a dataset which has two classes, A and B. The percentage of class A is 40% and the percentage of class B is 60%. Calculate the Gini index and entropy of the dataset.
Answer: To calculate the Gini index of the dataset, we use the formula:
$\mathrm{Gini} = 1 - (p_1^2 + p_2^2)$
Where:
• p1 is the probability of class A
• p2 is the probability of class B.

In this case, p1 = 0.4 and p2 = 0.6, so,
$\mathrm{Gini} = 1 - (0.4^2 + 0.6^2) = 0.48$
To calculate the entropy of the dataset, we use the formula:
$\mathrm{Entropy} = -p_1 \log_2(p_1) - p_2 \log_2(p_2)$
In this case, p1 = 0.4 and p2 = 0.6, so,
$\mathrm{Entropy} = -0.4 \log_2(0.4) - 0.6 \log_2(0.6) \approx 0.97$
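
The two results can be verified numerically. This is a small sketch; the function names are chosen here for illustration, and the logarithm is taken in base 2, which is the base that reproduces the 0.97 above.

```python
import math


def gini_index(probabilities):
    """Gini index: 1 minus the sum of squared class probabilities."""
    return 1.0 - sum(p ** 2 for p in probabilities)


def entropy(probabilities):
    """Shannon entropy in bits; classes with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


class_probabilities = [0.4, 0.6]
print(round(gini_index(class_probabilities), 2))  # 0.48
print(round(entropy(class_probabilities), 2))     # 0.97
```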

10. What are the advantages of Random Forests over a Decision Tree?
Answer: Some advantages of Random Forests over a Decision Tree are:
• Random forests are less prone to overfitting than a single
decision tree.
• Random forests tend to stay accurate when some values are missing, although support for missing values depends on the implementation.
• Random forests can be used for both classification and
regression problems.
• Random forests can provide feature importance, which can
be used for feature selection.

11. What is the need for scaling all numerical features in a dataset? Name any two techniques used for scaling.
Answer: Scaling all numerical features in a dataset is important because many machine learning algorithms, such as those based on distance measures, are sensitive to the scale of the features. Without scaling, features with large numeric ranges dominate the distance computation and features with small ranges are effectively ignored. Two techniques used for scaling are Min-Max Scaling and Standardization.
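
The two techniques can be sketched in plain Python as follows. The function names and the income values are made up for illustration, and both functions assume the values are not all identical (otherwise the denominator would be zero).

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and divide by the population standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


incomes = [20000.0, 40000.0, 60000.0]  # hypothetical feature column
scaled = min_max_scale(incomes)
standardized = standardize(incomes)
print(scaled)  # [0.0, 0.5, 1.0]
print(standardized)
```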

12. Write down some advantages which scaling provides in optimization using the gradient descent algorithm.
Answer: Scaling provides the following advantages in optimization using the gradient descent algorithm:
• It helps to converge faster by reducing the oscillations in the optimization path.
• It helps to find the minimum more reliably, because a single learning rate then suits every feature instead of being too large for some and too small for others.
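
The speed-up can be illustrated on a toy quadratic loss. In the sketch below, the loss f(w) = 0.5 * sum(c_i * w_i**2) and its curvature values are assumptions standing in for features on very different scales (curvatures 1 and 100) versus scaled features (both 1); the learning rate is set to 1 / (largest curvature) in each case so that the steepest direction stays stable.

```python
def steps_to_converge(curvatures, learning_rate, tol=1e-6, max_steps=100_000):
    """Run gradient descent on f(w) = 0.5 * sum(c_i * w_i**2) from w_i = 1.

    Returns the number of steps until every |w_i| is below tol.
    """
    w = [1.0] * len(curvatures)
    for step in range(1, max_steps + 1):
        # The gradient along coordinate i is c_i * w_i.
        w = [wi - learning_rate * c * wi for wi, c in zip(w, curvatures)]
        if max(abs(wi) for wi in w) < tol:
            return step
    return max_steps


# Unscaled: the step size is capped by the steep direction, so the flat one crawls.
unscaled_steps = steps_to_converge([1.0, 100.0], learning_rate=1 / 100)

# Scaled: equal curvatures allow a much larger step size.
scaled_steps = steps_to_converge([1.0, 1.0], learning_rate=1.0)

print(unscaled_steps, scaled_steps)
```

On this toy problem the scaled version reaches the minimum in a single step, while the unscaled one needs more than a thousand.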

13. In case of a highly imbalanced dataset for a classification problem, is accuracy a good metric to measure the performance of the model? If not, why?
Answer: In case of a highly imbalanced dataset for a classification problem, accuracy is not a good metric to measure the performance of the model. This is because accuracy does not take the imbalance of the classes into account and can be misleading: a model that always predicts the majority class scores a high accuracy while never detecting the minority class. Other metrics such as precision, recall, F1-score, and AUC-ROC are more suitable for such cases.
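
A small numerical sketch makes the problem concrete. The 95/5 class split and the always-negative model below are hypothetical, chosen only to show how accuracy and recall can disagree.

```python
# Hypothetical imbalanced test set: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A useless model that always predicts the majority class.
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(t == 1 for t in y_true)

print(accuracy, recall)  # 0.95 0.0
```

The model looks strong by accuracy yet finds none of the positive cases, which recall exposes immediately.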

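14. What is the "F-score" metric? Write its mathematical formula.

Answer: F-score is a metric that balances precision and recall and is commonly used in classification problems. It is the harmonic mean of the two. The mathematical formula for F-score is:
$\text{F-score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$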
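
The formula translates directly into code. In this sketch the function name f_score is chosen for illustration, and returning 0.0 when both inputs are zero is a convention assumed here to avoid dividing by zero.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0  # assumed convention for the undefined 0/0 case
    return 2 * precision * recall / (precision + recall)


print(f_score(0.5, 0.5))            # 0.5
print(round(f_score(0.8, 0.5), 3))  # 0.615
```

Because it is a harmonic mean, the score is pulled toward the smaller of the two inputs, so a model cannot score well by excelling at only one of them.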

15. What is the difference between fit(), transform() and fit_transform()?
Answer: fit() learns parameters from the training data (for example, the mean and standard deviation of each feature), transform() applies those already learned parameters to data, which may be the training data or new data, and fit_transform() learns the parameters from the data it is given and transforms that same data in one step.

The difference between fit() and transform() is that fit() only learns the parameters and changes no data, while transform() applies the already learned parameters and learns nothing new. fit_transform() combines the functionality of fit() and transform() in one step; it is normally called on the training data, after which transform() alone is applied to the test data so that the test data does not influence the learned parameters.
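
The three methods can be sketched with a tiny stand-in class. StandardizerSketch is a hypothetical class written for this example; it only mimics the method names of scikit-learn transformers and is not the library's implementation.

```python
import statistics


class StandardizerSketch:
    """Hypothetical stand-in that mimics the scikit-learn transformer methods."""

    def fit(self, values):
        # Learn parameters from the data; return self so calls can be chained.
        self.mean_ = statistics.mean(values)
        self.std_ = statistics.pstdev(values)
        return self

    def transform(self, values):
        # Apply the parameters learned by fit(); nothing is re-learned here.
        return [(v - self.mean_) / self.std_ for v in values]

    def fit_transform(self, values):
        # Learn from the data and transform that same data in one call.
        return self.fit(values).transform(values)


train = [1.0, 2.0, 3.0]  # made-up training column
test = [2.0, 4.0]        # made-up test column

scaler = StandardizerSketch()
train_scaled = scaler.fit_transform(train)  # parameters come from train
test_scaled = scaler.transform(test)        # train's parameters are reused
print(train_scaled, test_scaled)
```

The test column is shifted by the training mean (2.0), not by its own mean, which is the point of calling transform() instead of fit_transform() on test data.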
