ML LAB - V SEM - BCA
Description :
Exploratory Data Analysis (EDA) is a crucial step in any machine learning project. It involves
analyzing and visualizing data to understand its characteristics, patterns, and relationships.
Below is a simple example program in Python using popular libraries like Pandas, Matplotlib,
and Seaborn for EDA.
Source code :
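import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load your dataset (replace 'your_dataset.csv' with your actual file path)
df = pd.read_csv('your_dataset.csv')
# Structure of the dataset and the first few rows
print(df.info())
print(df.head())
# Summary statistics
print(df.describe())
# Missing values per column
print(df.isnull().sum())
# Univariate analysis - Histogram of a numerical feature
plt.figure(figsize=(10, 6))
sns.histplot(df['numerical_feature'], kde=True)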
P.V.V.SANDEEP MCA 1
PRAGATI WOMENS DEGREE COLLEGE ML USING PYTHON III BCA – V SEM
plt.xlabel('Numerical Feature')
plt.ylabel('Frequency')
plt.show()
plt.figure(figsize=(10, 6))
sns.countplot(x='categorical_feature', data=df)
plt.xlabel('Categorical Feature')
plt.ylabel('Count')
plt.show()
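# Bivariate analysis - Scatter plot of two numerical features (assumed plot type)
plt.figure(figsize=(10, 6))
sns.scatterplot(x='numerical_feature_1', y='numerical_feature_2', data=df)
plt.show()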
# Bivariate analysis - Box plot for a numerical feature and a categorical feature
plt.figure(figsize=(12, 8))
sns.boxplot(x='categorical_feature', y='numerical_feature', data=df)
plt.xlabel('Categorical Feature')
plt.ylabel('Numerical Feature')
plt.show()
# Correlation matrix
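# numeric_only=True skips non-numeric columns such as categorical_feature
correlation_matrix = df.corr(numeric_only=True)
plt.figure(figsize=(12, 8))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm')
plt.title('Correlation Matrix')
plt.show()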
Output :
<class 'pandas.core.frame.DataFrame'>
None
numerical_feature 0
categorical_feature 0
numerical_feature_1 0
numerical_feature_2 0
target 0
dtype: int64
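The same checks can be tried without a CSV file. The sketch below builds a small in-memory DataFrame (the column names mirror the placeholders used above, and the values are made up for illustration), then counts missing values and summarizes the numerical feature per category.

```python
import pandas as pd

# Hypothetical toy data; values are invented for illustration only.
df = pd.DataFrame({
    'numerical_feature': [1.0, 2.5, None, 4.0, 5.0],
    'categorical_feature': ['a', 'b', 'a', 'b', 'a'],
})

# Count missing values per column.
missing_counts = df.isnull().sum()
print(missing_counts)

# Mean of the numerical feature within each category (missing values are skipped).
group_means = df.groupby('categorical_feature')['numerical_feature'].mean()
print(group_means)
```

A per-category summary like this is often a quick way to decide which box plots are worth drawing.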
Description :
Feature selection is an essential step in machine learning, helping to identify the most
relevant features for building a predictive model. There are various methods for feature
selection, and they can be broadly categorized into "filter" methods, "wrapper" methods, and
"embedded" methods. In this example, I'll provide a simple program that demonstrates both
ranking and wrapper methods for feature selection using scikit-learn in Python.
First, make sure you have scikit-learn installed. You can install it using: pip install scikit-learn
Source code :
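import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split
iris = load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Wrapper method: recursive feature elimination (RFE) around a random forest
rf_classifier = RandomForestClassifier(random_state=42)
rfe_selector = RFE(estimator=rf_classifier, n_features_to_select=2)  # 2 is an assumed value
X_train_rfe = rfe_selector.fit_transform(X_train, y_train)
# Rank the features (rank 1 means the feature was selected)
selected_features = pd.DataFrame({'Feature': iris.feature_names, 'Rank': rfe_selector.ranking_})
selected_features = selected_features.sort_values(by='Rank')
print(selected_features)
# Train a model with the selected features and evaluate its performance
rf_classifier.fit(X_train_rfe, y_train)
X_test_rfe = rfe_selector.transform(X_test)
print(f'Accuracy: {rf_classifier.score(X_test_rfe, y_test):.2f}')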
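The description also mentions filter (ranking) methods, which score each feature on its own without training the final model. A minimal sketch of one such method, assuming the ANOVA F-test (f_classif) as the scoring function and k=2 as an arbitrary number of features to keep:

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

iris = load_iris()
X, y = iris.data, iris.target

# Score every feature independently with the ANOVA F-test and keep the top k.
# k=2 is an arbitrary choice for illustration.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

# Rank features from highest to lowest score.
ranking = sorted(zip(iris.feature_names, selector.scores_),
                 key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f'{name}: {score:.2f}')

# Names of the features that were kept.
selected_names = [name for name, keep
                  in zip(iris.feature_names, selector.get_support()) if keep]
print('Selected:', selected_names)
```

Filter methods are cheap because no classifier is fitted, but they ignore interactions between features, which is where a wrapper method such as RFE can do better.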
Output :
Description :
Source code :
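import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Standardize the features, then project onto the first two principal components
X_standardized = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_standardized)
# Column names 'PC1' and 'PC2' are assumed labels
df_pca = pd.DataFrame(X_pca, columns=['PC1', 'PC2'])
df_pca['Target'] = y
print(df_pca.head())
# Scatter plot of the two components, one color per class
plt.figure(figsize=(10, 6))
targets = np.unique(y)
colors = ['r', 'g', 'b']
for target, color in zip(targets, colors):
    plt.scatter(df_pca.loc[df_pca['Target'] == target, 'PC1'],
                df_pca.loc[df_pca['Target'] == target, 'PC2'],
                c=color,
                label=f'Target {target}',
                alpha=0.7)
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.legend()
plt.show()
# Fraction of the total variance captured by each component
explained_variance_ratio = pca.explained_variance_ratio_
print(f'Explained variance ratio: {explained_variance_ratio}')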
Output :
0 -2.264542 0.505704 0
1 -2.086426 -0.655405 0
2 -2.367950 -0.318477 0
3 -2.304197 -0.575368 0
4 -2.388777 0.674767 0
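To see where explained_variance_ratio_ comes from, here is a minimal NumPy sketch of the underlying idea: eigendecomposition of the covariance matrix of standardized data, with each eigenvalue divided by their sum. The random data is only a stand-in, and this illustrates the concept rather than scikit-learn's internal computation, which is based on an SVD.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 100 samples, 4 features, two of them strongly correlated.
X = rng.normal(size=(100, 4))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=100)

# Standardize each feature to zero mean and unit variance.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Eigendecompose the covariance matrix; eigh returns eigenvalues in ascending order,
# so reorder them from largest to smallest.
cov = np.cov(X_std, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Share of the total variance captured by each component.
explained_variance_ratio = eigenvalues / eigenvalues.sum()
print(explained_variance_ratio)

# Project onto the first two principal components.
X_pca = X_std @ eigenvectors[:, :2]
print(X_pca.shape)
```

Because two of the four features are nearly identical here, the first component should capture a clearly larger share of the variance than the others.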
Description :
Hyperparameter tuning involves finding the best set of hyperparameters for your model to
optimize its performance. Below is a simple example using scikit-learn's GridSearchCV to tune
hyperparameters for a Support Vector Machine (SVM) classifier on the Iris dataset.
Source code :
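from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC
iris = load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
svm_classifier = SVC()
# Hyperparameter grid to search (the values listed here are assumed)
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
}
grid_search = GridSearchCV(svm_classifier, param_grid, cv=5)
grid_search.fit(X_train, y_train)
best_params = grid_search.best_params_
print(f'Best Hyperparameters: {best_params}')
y_pred = grid_search.predict(X_test)
print(f'Accuracy: {accuracy_score(y_test, y_pred):.2f}')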
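Conceptually, a grid search tries every combination in the grid and keeps the one with the best cross-validated score. The loop below is a minimal sketch of that idea, not GridSearchCV's actual implementation; the grid values are illustrative assumptions.

```python
from itertools import product

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Illustrative grid; these values are assumptions, not taken from the lab.
grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}

best_score, best_params = -1.0, None
for C, kernel in product(grid['C'], grid['kernel']):
    # Mean accuracy over 5 cross-validation folds for this combination.
    score = cross_val_score(SVC(C=C, kernel=kernel), X, y, cv=5).mean()
    if score > best_score:
        best_score, best_params = score, {'C': C, 'kernel': kernel}

print(best_params, round(best_score, 3))
```

The cost grows with the product of the grid sizes (here 3 x 2 = 6 combinations, each fitted 5 times), which is why grids are usually kept small.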
Output :
Description :
This program uses the Gaussian Naive Bayes classifier (GaussianNB) for probabilistic
classification on the Iris dataset. The fit method is used to train the model, and the predict
method is used to make predictions on the test set. The accuracy, classification report, and
confusion matrix are then printed to evaluate the performance of the classifier.
Make sure you have scikit-learn installed (pip install scikit-learn) before running this program.
Source code:
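from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
iris = load_iris()
X = iris.data
y = iris.target
# Hold out 20% of the data for testing (split settings are assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train the Gaussian Naive Bayes classifier and predict on the test set
naive_bayes_classifier = GaussianNB()
naive_bayes_classifier.fit(X_train, y_train)
y_pred = naive_bayes_classifier.predict(X_test)
# Evaluate the predictions
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')
print(f'Classification Report:\n{classification_report(y_test, y_pred)}')
print(f'Confusion Matrix:\n{confusion_matrix(y_test, y_pred)}')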
Output :
Accuracy: 0.97
Classification Report:
accuracy 0.97 30
Confusion Matrix:
[[10 0 0]
[ 0 10 0]
[ 0 1 9]]
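The reported accuracy can be checked directly from the confusion matrix printed above. The sketch below assumes scikit-learn's convention that rows are true classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix from the output above (rows: true class, columns: predicted class).
conf_matrix = np.array([[10, 0, 0],
                        [0, 10, 0],
                        [0, 1, 9]])

# Accuracy: correctly classified samples (the diagonal) over all samples.
accuracy = np.trace(conf_matrix) / conf_matrix.sum()

# Recall per class: diagonal over row sums.
recall = np.diag(conf_matrix) / conf_matrix.sum(axis=1)
# Precision per class: diagonal over column sums.
precision = np.diag(conf_matrix) / conf_matrix.sum(axis=0)

print(f'Accuracy: {accuracy:.2f}')
print('Recall:', np.round(recall, 2))
print('Precision:', np.round(precision, 2))
```

Here 29 of the 30 test samples lie on the diagonal, so the accuracy is 29/30, which rounds to the 0.97 shown in the output.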
Description :
This lab covers both linear regression for predicting continuous target variables and logistic
regression for binary classification. It includes training the models, making predictions, and
evaluating their performance using metrics like mean squared error for regression and
accuracy for classification. The visualizations show the predicted values compared to the
actual values in the case of linear regression. Remember to replace 'regression_data.csv',
'target_regression', 'target_classification' with your actual data file name and target column
names. Adjust the code based on your specific dataset and requirements.
Source code:
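import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
# Generate synthetic data: y = 4 + 3x + Gaussian noise
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
# Hold out 20% of the data for testing (split settings are assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
linear_reg_model = LinearRegression()
linear_reg_model.fit(X_train, y_train)
y_pred = linear_reg_model.predict(X_test)
# Evaluate with mean squared error and R-squared
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f'Mean Squared Error: {mse:.2f}')
print(f'R-squared: {r2:.2f}')
# Plot the actual values against the fitted line
plt.scatter(X_test, y_test, color='black', label='Actual')
plt.plot(X_test, y_pred, color='blue', linewidth=3, label='Predicted')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()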
Output :
R-squared: 0.98
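For intuition about what LinearRegression estimates, the sketch below fits the same kind of synthetic data with an ordinary least squares solve in NumPy and computes the mean squared error and R-squared by hand. It fits on all 100 points rather than a train/test split, which is a simplification for illustration.

```python
import numpy as np

# Same synthetic data recipe as the lab: y = 4 + 3x + Gaussian noise.
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)

# Add a column of ones so the intercept is estimated together with the slope.
X_b = np.hstack([np.ones((100, 1)), X])

# Ordinary least squares solution.
theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
intercept, slope = theta.ravel()

# Mean squared error and R-squared on the same data.
y_hat = X_b @ theta
mse = np.mean((y - y_hat) ** 2)
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(f'intercept={intercept:.2f}, slope={slope:.2f}, MSE={mse:.2f}, R2={r2:.2f}')
```

The estimates should land near the true values 4 and 3 used to generate the data, with the gap explained by the added noise.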
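Source code :
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
# Generate synthetic data with a binary target (the labeling rule is an assumption)
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = (X > 1).astype(int)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
logistic_reg_model = LogisticRegression()
logistic_reg_model.fit(X_train, y_train.ravel())
y_pred = logistic_reg_model.predict(X_test)
# Evaluate the predictions
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print(f'Accuracy: {accuracy:.2f}')
print(f'Confusion Matrix:\n{conf_matrix}')
print(f'Classification Report:\n{class_report}')
# Plot the actual labels and the predicted labels
plt.scatter(X_test, y_test, color='black', label='Actual')
plt.scatter(X_test, y_pred, color='blue', marker='x', label='Predicted')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()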
Output :
Accuracy: 0.95
Confusion Matrix:
[[9 1]
[0 10]]
Classification Report:
accuracy 0.95 20
Description :
Tree-based algorithms, such as Decision Trees, Random Forests, and Gradient Boosted Trees,
are powerful techniques for classification tasks. Below, I'll provide an example using
scikit-learn to demonstrate a simple classification task using a Decision Tree and a Random
Forest classifier.
In this example, we use the Iris dataset for a classification task. We train a Decision Tree
classifier and a Random Forest classifier on the training set and evaluate their performance on
the test set using accuracy, confusion matrix, and classification report.
Make sure you have scikit-learn installed (pip install scikit-learn) before running this program.
Adjust the data and model parameters as needed for your specific use case.
Source code :
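from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
iris = load_iris()
X = iris.data
y = iris.target
# Hold out 20% of the data for testing (split settings are assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Decision Tree
decision_tree_model = DecisionTreeClassifier()
decision_tree_model.fit(X_train, y_train)
y_pred_dt = decision_tree_model.predict(X_test)
accuracy_dt = accuracy_score(y_test, y_pred_dt)
conf_matrix_dt = confusion_matrix(y_test, y_pred_dt)
class_report_dt = classification_report(y_test, y_pred_dt)
print(f'Accuracy: {accuracy_dt:.2f}')
print(f'Confusion Matrix:\n{conf_matrix_dt}')
print(f'Classification Report:\n{class_report_dt}')
# Random Forest (n_estimators=100 is an assumed setting)
random_forest_model = RandomForestClassifier(n_estimators=100, random_state=42)
random_forest_model.fit(X_train, y_train)
y_pred_rf = random_forest_model.predict(X_test)
accuracy_rf = accuracy_score(y_test, y_pred_rf)
conf_matrix_rf = confusion_matrix(y_test, y_pred_rf)
class_report_rf = classification_report(y_test, y_pred_rf)
print(f'Accuracy: {accuracy_rf:.2f}')
print(f'Confusion Matrix:\n{conf_matrix_rf}')
print(f'Classification Report:\n{class_report_rf}')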
Output:
Accuracy: 1.00
Confusion Matrix:
[[10 0 0]
[ 0 9 0]
[ 0 0 11]]
Classification Report:
accuracy 1.00 30
Accuracy: 1.00
Confusion Matrix:
[[10 0 0]
[ 0 9 0]
[ 0 0 11]]
Classification Report:
accuracy 1.00 30
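A single 30-sample test set can make both models look perfect. As a complementary check, the sketch below compares the two classifiers with 5-fold cross-validation, which averages accuracy over several splits. The fold count and model settings are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Model settings are illustrative; random_state only makes the run repeatable.
models = {
    'Decision Tree': DecisionTreeClassifier(random_state=42),
    'Random Forest': RandomForestClassifier(n_estimators=100, random_state=42),
}

cv_scores = {}
for name, model in models.items():
    # Mean accuracy over 5 folds, each fold serving once as the held-out set.
    cv_scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f'{name}: {cv_scores[name]:.3f}')
```

Averaged scores are usually a little below a lucky single split and give a steadier basis for comparing the two models.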
Description :
This lab demonstrates a simple neural network for classification using the popular deep
learning library Keras. We build a network with one hidden layer, compile it with the Adam
optimizer and categorical crossentropy loss, train it on the Iris dataset, and evaluate its
performance on the test set.
Feel free to adjust the architecture of the neural network, the number of epochs, or other
hyperparameters based on your specific needs.
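Categorical crossentropy expects one-hot labels and class probabilities from a softmax output. The following is a minimal NumPy sketch of both pieces, independent of Keras; the logits are hypothetical values and the helper names are introduced here only for illustration.

```python
import numpy as np

# Integer class labels (as in iris.target) converted to one-hot rows.
labels = np.array([0, 2, 1])
one_hot = np.eye(3)[labels]

def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob):
    # Average negative log-probability assigned to the true class.
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))

# Hypothetical raw outputs of the final layer for three samples.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.1, 1.0, 2.0],
                   [1.0, 2.0, 0.1]])
probs = softmax(logits)
loss = categorical_crossentropy(one_hot, probs)
print(probs.round(3))
print(f'loss = {loss:.4f}')
```

The loss shrinks toward zero as the probability assigned to the true class approaches 1, which is what training with the Adam optimizer tries to achieve.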
Source code :
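import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
iris = load_iris()
X = iris.data
y = iris.target
# Hold out 20% of the data for testing (split settings are assumed)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# One-hot encode the training labels for categorical crossentropy
y_train_onehot = to_categorical(y_train, num_classes=3)
# One hidden layer (8 units is an assumed size) and a softmax output layer
model = Sequential()
model.add(Input(shape=(4,)))
model.add(Dense(8, activation='relu'))
model.add(Dense(3, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train_onehot, epochs=50, batch_size=8, verbose=1)
# predict_classes was removed from recent Keras versions; take the argmax of the probabilities
y_pred_nn = np.argmax(model.predict(X_test), axis=1)
accuracy_nn = accuracy_score(y_test, y_pred_nn)
conf_matrix_nn = confusion_matrix(y_test, y_pred_nn)
class_report_nn = classification_report(y_test, y_pred_nn)
print(f'Accuracy: {accuracy_nn:.2f}')
print(f'Confusion Matrix:\n{conf_matrix_nn}')
print(f'Classification Report:\n{class_report_nn}')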
Output :
Epoch 1/50
Epoch 2/50
...
Epoch 49/50
Epoch 50/50
Accuracy: 0.97
Confusion Matrix:
[[10 0 0]
[ 0 9 1]
[ 0 0 10]]
Classification Report:
accuracy 0.97 30