MINI PROJECT REPORT
CAR ENGINE ANOMALY DETECTOR
PREPARED BY:
YOGESH R-71772217155
PIYUSH KUMAR MAHTO-71772217158
PRINCE KUMAR MAHTO-71772217159
GNANESH G-71772217303
KAMALAKANNAN S-71772217L02
1. Introduction
Predictive maintenance is revolutionizing industries by minimizing downtime and reducing
repair costs. One critical aspect of this is determining the condition of an engine before failure
occurs. In this project, a smart system has been developed to classify an engine as Healthy or
Faulty based on real-time sensor readings using supervised machine learning.
This engine condition classifier uses sensor data such as oil pressure, RPM, fuel pressure, and
temperature to identify patterns and detect faults. The system incorporates data preprocessing,
class balancing with SMOTE, model training, hyperparameter tuning, and visual analytics. With
a user-friendly Streamlit app, this solution brings predictive insights directly to the user.
2. Objective
- To classify engine conditions (Healthy or Faulty) based on sensor input.
- To use machine learning models to accurately predict engine faults.
- To build a pipeline with data preprocessing, model comparison, tuning, and visualization.
- To deploy the trained model using an intuitive web interface.
3. Tools and Technologies Used
- Python – Programming language for development
- Pandas & NumPy – For data handling and numerical operations
- Scikit-learn – For building and tuning ML models
- Matplotlib & Seaborn – For data visualization
- imbalanced-learn (SMOTE) – For handling class imbalance
- Joblib – For model serialization
- Streamlit – For deploying a simple user interface
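All of the above are available from PyPI; a one-line setup (assuming a standard pip environment and current PyPI package names) is:
pip install pandas numpy scikit-learn matplotlib seaborn imbalanced-learn joblib streamlit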
4. Dataset
The dataset engine_data.csv includes real engine sensor readings and labels that indicate the
engine condition:
Features:
- Engine rpm
- Lubricant oil pressure
- Fuel pressure
- Coolant pressure
- Lubricant oil temperature
- Coolant temperature
Target:
- Engine Condition
- 0 – Healthy
- 1 – Faulty
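Because Section 5 balances the classes with SMOTE, it is worth inspecting the label distribution first. A minimal sketch, assuming the CSV headers match the names above (the target column name follows the training script in Section 11):
import pandas as pd

df = pd.read_csv('engine_data.csv')
# Show the share of each class (0 = Healthy, 1 = Faulty)
print(df['Engine Condition'].value_counts(normalize=True))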
5. Model Design
- Preprocessing: StandardScaler was used to scale features to improve model convergence.
- Train-Test Split: 80% for training and 20% for testing.
- Class Balancing: SMOTE was applied to the training data to address class imbalance (a leak-free pipeline sketch follows this list).
- Model Selection: Multiple classifiers were evaluated:
- Random Forest
- Logistic Regression
- Support Vector Machine
- K-Nearest Neighbors
- Tuning: GridSearchCV was used on the Random Forest classifier to identify optimal
hyperparameters.
- Final Evaluation: The best model was evaluated using accuracy, confusion matrix, and
classification report.
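The same steps can also be chained so that scaling and SMOTE are fit only on the training folds during cross-validation. A minimal sketch using imblearn's Pipeline (an alternative structure, not the exact script in Section 11):
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# The scaler and SMOTE are refit on each training fold, so no test-fold
# statistics leak into preprocessing.
pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('smote', SMOTE(random_state=42)),
    ('clf', RandomForestClassifier(random_state=42)),
])

param_grid = {
    'clf__n_estimators': [100, 200],
    'clf__max_depth': [None, 5, 10],
    'clf__min_samples_split': [2, 5],
}
grid = GridSearchCV(pipe, param_grid, cv=5, n_jobs=-1)
# grid.fit(X_train, y_train)  # fit on the raw (unscaled) training split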
6. Application Workflow
1. Load and explore the dataset.
2. Preprocess and scale features.
3. Split dataset into training and testing sets.
4. Apply SMOTE to balance training data.
5. Train and compare multiple classification models.
6. Tune the best-performing model (Random Forest).
7. Evaluate the final model on the test set.
8. Visualize feature importance and model metrics.
9. Save the trained model and scaler.
10. Use Streamlit to deploy the classifier for real-time predictions.
7. Code Explanation
- Data Loading: Read engine_data.csv using pandas.
- Preprocessing: StandardScaler used to normalize features.
- Model Training: Loop over four models and compare test accuracy.
- SMOTE: Synthetic data generated to oversample the minority class.
- Grid Search: GridSearchCV fine-tunes the Random Forest; the best parameters found were:
{'max_depth': None, 'min_samples_split': 2, 'n_estimators': 200}
- Evaluation Metrics: Accuracy, classification report, confusion matrix, cross-validation score.
- Feature Importance: Bar plot generated using seaborn.
- Model Saving: Trained model and scaler saved as .pkl files (a reload-and-predict sketch follows this list).
- Web Interface: Streamlit app developed for real-time predictions (a minimal app sketch follows the training script in Section 11).
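To show how the saved artifacts are reused at prediction time, here is a minimal sketch; the single sensor reading below is made up purely for illustration:
import joblib
import numpy as np

# Load the artifacts written by the training script
model = joblib.load('engine_condition_model.pkl')
scaler = joblib.load('scaler.pkl')

# One hypothetical reading, in the training column order:
# [rpm, lub oil pressure, fuel pressure, coolant pressure,
#  lub oil temperature, coolant temperature]
reading = np.array([[790.0, 2.9, 11.8, 3.1, 84.1, 81.6]])

prediction = model.predict(scaler.transform(reading))[0]
print("Faulty" if prediction == 1 else "Healthy")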
8. Visual Analytics
- Confusion Matrix: Highlights the model's performance on test data (a plotting sketch follows this list).
- Classification Report: Shows precision, recall, and F1-score for both classes.
- Feature Importance Plot: Visualizes how much each sensor contributes to prediction.
- Cross-Validation: Accuracy averaged over 5 folds provides a check on model generalization.
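A minimal sketch of how the confusion-matrix heatmap can be drawn with seaborn (y_test and y_pred come from the training script in Section 11; class 0 is Healthy, class 1 is Faulty):
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Healthy', 'Faulty'],
            yticklabels=['Healthy', 'Faulty'])
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.tight_layout()
plt.show()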
9. Strengths
- Handles imbalanced datasets using SMOTE
- Compares multiple models for robustness
- Uses hyperparameter tuning for better accuracy
- Saves model and scaler for future use
- Streamlit integration allows easy interaction
- Provides visual understanding of model and data
10. Limitations & Future Work
- The model is trained on a limited dataset; performance can be improved with more real-world
data.
- Only binary classification is supported (Healthy/Faulty).
- Deep learning models can be explored for further improvement.
- Additional sensor inputs (vibration, acoustic signals) can enhance accuracy.
- Future versions can include real-time sensor data input and dashboard integration.
11. Source Code
GitHub Repository: https://github.com/Gnanesh-Nani/EngineSense
# train_engine_model.py
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from imblearn.over_sampling import SMOTE
import joblib
# Load dataset
df = pd.read_csv('engine_data.csv')
print("📄 First 5 rows:")
print(df.head())
print("\n🔍 Info:")
df.info()  # prints directly; wrapping it in print() would also emit "None"
# Features and target
X = df.drop('Engine Condition', axis=1)
y = df['Engine Condition']
# Scale features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
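# Note: the scaler is fit on the full dataset before splitting, which can leak
# test-set statistics into preprocessing; the Pipeline sketch in Section 5
# shows a leak-free alternative.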
# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)
# Apply SMOTE
sm = SMOTE(random_state=42)
X_train_res, y_train_res = sm.fit_resample(X_train, y_train)
print("\n⚖️Class distribution after SMOTE:")
print(pd.Series(y_train_res).value_counts())
# Try different models
models = {
    "Random Forest": RandomForestClassifier(random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier()
}
print("\n📊 Model Comparison:")
for name, model in models.items():
    model.fit(X_train_res, y_train_res)
    acc = model.score(X_test, y_test)
    print(f"{name}: {acc:.2f}")
# Grid search on Random Forest
param_grid = {
    'n_estimators': [100, 200],
    'max_depth': [None, 5, 10],
    'min_samples_split': [2, 5],
}
grid = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5, n_jobs=-1)
grid.fit(X_train_res, y_train_res)
print("\n✅ Best Parameters:")
print(grid.best_params_)
# Final evaluation
best_model = grid.best_estimator_
y_pred = best_model.predict(X_test)
print("\n📈 Classification Report:")
print(classification_report(y_test, y_pred))
print("🧮 Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))
# Cross-validation score
cv_scores = cross_val_score(best_model, X_scaled, y, cv=5)
print("\n📉 Cross-Validated Accuracy: {:.2f}%".format(cv_scores.mean() * 100))
# Feature importance
importances = best_model.feature_importances_
feature_names = X.columns
plt.figure(figsize=(8, 5))
sns.barplot(x=importances, y=feature_names)
plt.title("Feature Importance")
plt.xlabel("Importance Score")
plt.tight_layout()
plt.show()
# Save model and scaler
joblib.dump(best_model, 'engine_condition_model.pkl')
joblib.dump(scaler, 'scaler.pkl')
print("\n💾 Model and scaler saved as 'engine_condition_model.pkl' and 'scaler.pkl'")
12. Output
The following results were captured as screenshots (images not reproduced here):
- Faulty Condition prediction
- Normal Condition prediction
- Accuracy
- Feature Importance plot
Conclusion
This project demonstrates how machine learning can be effectively applied to monitor and
classify engine conditions. By building a robust classification pipeline with proper
preprocessing, class balancing, and model tuning, we can predict engine faults with promising
accuracy. The deployment-ready solution using Streamlit provides a strong foundation for
industrial applications in predictive maintenance and diagnostics. Future enhancements can make
the system more intelligent, scalable, and adaptive to new data streams.