Modelling and Simulation Assignment - Ipynb - Colab

Student Dropout Prediction Using Decision Tree Classifier

Uploaded by

Muhammad Ali


Step 1: Exploratory Data Analysis (EDA)

Let's begin by examining the dataset to understand its structure and the relationships between the features and the target variable
(Dropout/Enrolled/Graduate).

import pandas as pd

# Path to the dataset on the mounted Google Drive
path = '/content/drive/MyDrive/dataset.csv'

# Load the dataset and display it
data = pd.read_csv(path)

data

      Marital status  Application mode  Application order  Course  Daytime/evening attendance  Previous qualification  ...
   0               1                 8                  5       2                           1                       1  ...
   1               1                 6                  1      11                           1                       1  ...
   2               1                 1                  5       5                           1                       1  ...
   3               1                 8                  2      15                           1                       1  ...
   4               2                12                  1       3                           0                       1  ...
 ...             ...               ...                ...     ...                         ...                     ...  ...
4419               1                 1                  6      15                           1                       1  ...
4420               1                 1                  2      15                           1                       1  ...
4421               1                 1                  1      12                           1                       1  ...
4422               1                 1                  1       9                           1                       1  ...
4423               1                 5                  1      15                           1                       1  ...

4424 rows × 35 columns

# Display summary statistics

data.describe()

       Marital status  Application mode  Application order       Course  Daytime/evening attendance  Previous qualification  ...
count     4424.000000       4424.000000        4424.000000  4424.000000                 4424.000000              4424.00000  ...
mean         1.178571          6.886980           1.727848     9.899186                    0.890823                 2.53142  ...
std          0.605747          5.298964           1.313793     4.331792                    0.311897                 3.96370  ...
min          1.000000          1.000000           0.000000     1.000000                    0.000000                 1.00000  ...
25%          1.000000          1.000000           1.000000     6.000000                    1.000000                 1.00000  ...
50%          1.000000          8.000000           1.000000    10.000000                    1.000000                 1.00000  ...
75%          1.000000         12.000000           2.000000    13.000000                    1.000000                 1.00000  ...
max          6.000000         18.000000           9.000000    17.000000                    1.000000                17.00000  ...

8 rows × 34 columns

# Display data types of each column

data.dtypes

Marital status int64

Application mode int64

Application order int64

Course int64

Daytime/evening attendance int64

Previous qualification int64

Nacionality int64

Mother's qualification int64

Father's qualification int64

Mother's occupation int64

Father's occupation int64

Displaced int64

Educational special needs int64

Debtor int64

Tuition fees up to date int64

Gender int64

Scholarship holder int64

Age at enrollment int64

International int64

Curricular units 1st sem (credited) int64

Curricular units 1st sem (enrolled) int64

Curricular units 1st sem (evaluations) int64

Curricular units 1st sem (approved) int64

Curricular units 1st sem (grade) float64

Curricular units 1st sem (without evaluations) int64

Curricular units 2nd sem (credited) int64

Curricular units 2nd sem (enrolled) int64

Curricular units 2nd sem (evaluations) int64

Curricular units 2nd sem (approved) int64

Curricular units 2nd sem (grade) float64

Curricular units 2nd sem (without evaluations) int64

# Check for missing values

data.isnull().sum()
Application mode 0

Application order 0

Course 0

Daytime/evening attendance 0

Previous qualification 0

Nacionality 0

Mother's qualification 0

Father's qualification 0

Mother's occupation 0

Father's occupation 0

Displaced 0

Educational special needs 0

Debtor 0

Tuition fees up to date 0

Gender 0

Scholarship holder 0

Age at enrollment 0

International 0

Curricular units 1st sem (credited) 0

Curricular units 1st sem (enrolled) 0

Curricular units 1st sem (evaluations) 0

Curricular units 1st sem (approved) 0

Curricular units 1st sem (grade) 0

Curricular units 1st sem (without evaluations) 0

Curricular units 2nd sem (credited) 0

Curricular units 2nd sem (enrolled) 0

Curricular units 2nd sem (evaluations) 0

Curricular units 2nd sem (approved) 0

Curricular units 2nd sem (grade) 0

Curricular units 2nd sem (without evaluations) 0

Unemployment rate 0

Inflation rate 0
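The EDA above inspects the feature columns but does not look at how the target classes are distributed. A minimal sketch of that check follows; the `toy` DataFrame and its values are made up for illustration and stand in for `data`, so only the `value_counts` pattern carries over to the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset; only the 'Target' column matters here.
toy = pd.DataFrame({
    "Target": ["Dropout", "Graduate", "Graduate", "Enrolled",
               "Graduate", "Dropout", "Graduate", "Dropout"]
})

# Absolute number of rows per class
class_counts = toy["Target"].value_counts()

# Relative frequency per class (the shares sum to 1.0)
class_share = toy["Target"].value_counts(normalize=True)

print(class_counts)
print(class_share.round(3))
```

On the real data the same two calls would be applied to `data['Target']`. A strongly uneven split is worth knowing about before reading an overall accuracy figure.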

Step 2: Data Visualization

We will create various charts to visualize the data.

Scatter Plot

Let's create a scatter plot to see the relationship between "Curricular units 2nd sem (grade)" and "Target".

import matplotlib.pyplot as plt

# Scatter plot of the second-semester grade against the target class
plt.scatter(data['Curricular units 2nd sem (grade)'], data['Target'])
plt.xlabel('Curricular units 2nd sem (grade)')
plt.ylabel('Target')
plt.title('Scatter Plot of Curricular units 2nd sem (grade) vs. Target')
plt.show()
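Because "Target" is categorical, the scatter plot places every point on one of a few horizontal lines, which can make the relationship hard to read. A numeric complement is a per-class summary of the grade. The sketch below uses a small made-up DataFrame (`toy`, with invented grades) in place of `data`; only the `groupby` pattern is meant to carry over.

```python
import pandas as pd

grade_col = "Curricular units 2nd sem (grade)"

# Hypothetical values for illustration only; not taken from the real dataset.
toy = pd.DataFrame({
    grade_col: [0.0, 10.5, 12.0, 13.5, 11.0, 0.0, 14.0, 12.5],
    "Target": ["Dropout", "Dropout", "Enrolled", "Graduate",
               "Enrolled", "Dropout", "Graduate", "Graduate"],
})

# Count, mean and median grade within each target class
grade_by_target = toy.groupby("Target")[grade_col].agg(["count", "mean", "median"])

print(grade_by_target)
```

Applied to the real data, this would be `data.groupby('Target')['Curricular units 2nd sem (grade)']`, run before the target is label-encoded so the class names stay readable.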

Bar Chart

Let's create a bar chart for the "Marital status" feature.

data['Marital status'].value_counts().plot(kind='bar')
plt.xlabel('Marital Status')
plt.ylabel('Count')
plt.title('Bar Chart of Marital Status')
plt.show()

Box Plot

Let's create a box plot for the "Curricular units 2nd sem (grade)" feature.

data.boxplot(column='Curricular units 2nd sem (grade)')
plt.title('Box Plot of Curricular units 2nd sem (grade)')
plt.show()

Histogram

Let's create a histogram for the "Curricular units 2nd sem (grade)" feature.

data['Curricular units 2nd sem (grade)'].hist()

plt.xlabel('Curricular units 2nd sem (grade)')
plt.ylabel('Frequency')
plt.title('Histogram of Curricular units 2nd sem (grade)')
plt.show()

Step 3: Data Preprocessing

Step 1 found no missing values, and every feature is already numeric, so preprocessing here consists of encoding the target labels as integers and splitting the data into training and testing sets.

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import LabelEncoder

# Encode the target variable

label_encoder = LabelEncoder()
data['Target'] = label_encoder.fit_transform(data['Target'])

# Define the features (X) and the target (y)

X = data.drop('Target', axis=1)
y = data['Target']

# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
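The split above is purely random, so the class proportions in the test set can drift from those of the full dataset. One option, not used in the notebook, is the `stratify` argument of `train_test_split`. The sketch below demonstrates it on made-up data (`X_demo`, `y_demo`, with an assumed 50/20/30 class imbalance).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 100 samples, 4 features, 3 classes with a 50/20/30 imbalance.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))
y_demo = np.array([0] * 50 + [1] * 20 + [2] * 30)

# stratify=y_demo keeps the class proportions the same in both subsets
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Class counts in each subset
train_counts = np.bincount(y_tr)
test_counts = np.bincount(y_te)
print(train_counts, test_counts)
```

For the notebook's variables this would amount to adding `stratify=y` to the existing call.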

Step 4: Model Building

We will build and train a decision tree classifier to predict each student's outcome (Dropout, Enrolled, or Graduate).

from sklearn.tree import DecisionTreeClassifier

from sklearn.metrics import accuracy_score, classification_report

# Build and train the model

model = DecisionTreeClassifier()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Evaluate the model

accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy}')
print(f'Classification Report:\n{report}')

Accuracy: 0.6813559322033899
Classification Report:
              precision    recall  f1-score   support

           0       0.77      0.66      0.71       316
           1       0.34      0.39      0.36       151
           2       0.76      0.81      0.78       418

    accuracy                           0.68       885
   macro avg       0.62      0.62      0.62       885
weighted avg       0.69      0.68      0.68       885
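The accuracy above comes from a single train/test split of a tree built with default settings (no `random_state`, unlimited depth), so the figure can shift between runs. A hedged sketch of a more stable estimate uses a fixed seed, a depth limit and 5-fold cross-validation. The data comes from `make_classification` and is synthetic, and `max_depth=5` is an arbitrary illustrative choice, not a tuned value.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class data standing in for the student dataset
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)

# Fixed seed for reproducibility; depth limit to reduce overfitting
tree = DecisionTreeClassifier(max_depth=5, random_state=42)

# Accuracy on each of 5 cross-validation folds
scores = cross_val_score(tree, X_demo, y_demo, cv=5, scoring="accuracy")

print(f"Mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```

With the notebook's variables, `X` and `y` would take the place of `X_demo` and `y_demo`.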

def get_user_input_and_predict(model, feature_columns):
    """Prompt for one value per feature and return the decoded predicted label."""
    # Collect one raw (string) value per feature column
    user_input = {}
    for column in feature_columns:
        user_input[column] = [input(f"Enter value for {column}: ")]

    # Create a single-row DataFrame for the user inputs
    input_df = pd.DataFrame(user_input)

    # Convert inputs to numeric wherever the training column is numeric
    for column in feature_columns:
        if X[column].dtype in ['int64', 'float64']:
            input_df[column] = pd.to_numeric(input_df[column])

    # Predict using the trained model
    prediction = model.predict(input_df)

    # Decode the prediction back to the original label
    decoded_prediction = label_encoder.inverse_transform(prediction)

    return decoded_prediction[0]

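# Predict on the test set
pred = model.predict(X_test)

# Dictionary for mapping encoded target values to original labels
target_mapping = {0: 'Dropout', 1: 'Enrolled', 2: 'Graduate'}

# Predicted label for the first test sample
output = target_mapping[pred[0]]

# True label for the first test sample (taken from y_test rather than y_pred,
# so the comparison is not a prediction compared with itself)
original = target_mapping[y_test.iloc[0]]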

Comparing Values

print(f"Original Value: '{original}' and Predicted Value: '{output}'")

Original Value: 'Dropout' and Predicted Value: 'Dropout'
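Comparing a single sample says little about the model as a whole. A confusion matrix makes the same original-versus-predicted comparison for every test sample at once. The sketch below uses short made-up label lists (`y_true_demo`, `y_pred_demo`) rather than the notebook's `y_test` and `y_pred`; the class names follow the `target_mapping` dictionary above.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

target_names = ["Dropout", "Enrolled", "Graduate"]

# Made-up encoded labels for illustration only
y_true_demo = [0, 0, 1, 2, 2, 2, 1, 0]
y_pred_demo = [0, 1, 1, 2, 2, 1, 1, 0]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1, 2])

# Wrap in a DataFrame so the rows and columns carry readable names
cm_table = pd.DataFrame(
    cm,
    index=[f"true {name}" for name in target_names],
    columns=[f"pred {name}" for name in target_names],
)

print(cm_table)
```

The diagonal holds the correctly classified samples; off-diagonal cells show which classes are confused with each other.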

feature_columns = X.columns

# Predict on user inputs. The function already decodes the prediction with
# label_encoder.inverse_transform, so no further mapping is applied here.
predicted_class = get_user_input_and_predict(model, feature_columns)
print(f"The predicted class is: {predicted_class}")

Enter value for Marital status: 1

Enter value for Application mode: 8
Enter value for Application order: 5
Enter value for Course: 2
Enter value for Daytime/evening attendance: 1
Enter value for Previous qualification: 1
Enter value for Nacionality: 1
Enter value for Mother's qualification: 1
Enter value for Father's qualification: 10
Enter value for Mother's occupation: 6
Enter value for Father's occupation: 10
Enter value for Displaced: 1
Enter value for Educational special needs: 0
Enter value for Debtor: 0
Enter value for Tuition fees up to date: 1
Enter value for Gender: 1
Enter value for Scholarship holder: 0
Enter value for Age at enrollment: 20
Enter value for International: 0
Enter value for Curricular units 1st sem (credited): 0
Enter value for Curricular units 1st sem (enrolled): 0
Enter value for Curricular units 1st sem (evaluations): 0
Enter value for Curricular units 1st sem (approved): 0
Enter value for Curricular units 1st sem (grade): 0
Enter value for Curricular units 1st sem (without evaluations): 0
Enter value for Curricular units 2nd sem (credited): 0
Enter value for Curricular units 2nd sem (enrolled): 0
Enter value for Curricular units 2nd sem (evaluations): 0
Enter value for Curricular units 2nd sem (approved): 0
Enter value for Curricular units 2nd sem (grade): 0
Enter value for Curricular units 2nd sem (without evaluations): 0
Enter value for Unemployment rate: 10.8
Enter value for Inflation rate: 1.4
Enter value for GDP: 1.74
The predicted class is: Dropout
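Typing in one value per feature by hand is slow and error-prone. A non-interactive variant can take the values from a dictionary, check that no feature is missing, and put the columns in training order before predicting. The sketch below is an assumption-laden illustration: `predict_from_dict` is a hypothetical helper, and the two-column toy model stands in for the notebook's `model` and `X.columns`.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Toy training data with two feature columns and made-up values
train = pd.DataFrame({"Age at enrollment": [18, 19, 35, 40],
                      "Debtor": [0, 0, 1, 1]})
labels = ["Graduate", "Graduate", "Dropout", "Dropout"]
toy_model = DecisionTreeClassifier(random_state=0).fit(train, labels)


def predict_from_dict(model, feature_columns, values):
    """Hypothetical helper: predict one sample given as {column: value}."""
    # Fail early if any feature is missing
    missing = [c for c in feature_columns if c not in values]
    if missing:
        raise ValueError(f"Missing values for: {missing}")
    # Build a single row, order columns as in training, convert to numeric
    row = pd.DataFrame([values])[list(feature_columns)].apply(pd.to_numeric)
    return model.predict(row)[0]


# The dictionary order does not need to match the training column order
result = predict_from_dict(
    toy_model, train.columns, {"Debtor": 1, "Age at enrollment": 38}
)
print(f"The predicted class is: {result}")
```

With the notebook's encoded target, the returned value would be an integer code that still needs decoding, for example through `label_encoder.inverse_transform`.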
