Contents
01 Data Description
02 Problem Statement
03 Methodology
(i) Handling Null Values
(ii) Handling Missing Values
(iii) Noise Removal
(iv) Split
(v) Classification
04 Coding
05 Result and Discussion
06 Conclusion
07 References
Data Description
The Influenza dataset, extracted from the UCI Machine Learning
Repository, contains information on patients with influenza, detailing
both clinical symptoms and lab results.
Total Features (Attributes): 6, namely flu_X_tr, flu_Y_tr, flu_X_te,
flu_Y_te, flu_locs, and flu_keywords.
Target Variable: Class, which indicates whether influenza is
present (1) or absent (0).
Problem Statement
The purpose of this project is to build a classification model that
predicts the presence of influenza in patients based on their clinical
and laboratory data. By predicting influenza early, healthcare
professionals can make timely decisions, potentially improving
patient outcomes.
Methodology
The methodology outlines steps used to prepare the data, clean it,
and build a classification model.
I. Null Value Method:
The dataset is checked for null values. Any null or empty cells are
identified and replaced using statistical methods, typically the mean
or median of that feature's column, which preserves data completeness
and the integrity of the model without introducing bias.
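As an illustration, a minimal pandas sketch of median filling (the
column name and values here are hypothetical, not from the dataset):

import pandas as pd

# Hypothetical toy frame with a null value in a numeric column
df = pd.DataFrame({"temperature": [101.2, None, 99.8, 102.5]})
# Fill nulls in each numeric column with that column's median
df = df.fillna(df.median(numeric_only=True))
print(df)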
II. Missing Value Method:
The dataset may contain missing entries denoted by "?". These entries
are replaced with NaN values, which are then filled with the median
value for numerical features. This fills the gaps while preserving each
feature's distribution and the integrity of the dataset.
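A sketch of this step (the column name and values are hypothetical):

import numpy as np
import pandas as pd

# Hypothetical column where missing entries are recorded as "?"
df = pd.DataFrame({"wbc_count": ["5.1", "?", "6.3", "?"]})
# Replace "?" with NaN, convert to numeric, then fill with the median
df["wbc_count"] = pd.to_numeric(df["wbc_count"].replace("?", np.nan))
df["wbc_count"] = df["wbc_count"].fillna(df["wbc_count"].median())
print(df)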
III. Noise Removal Method:
Noise (outliers or inconsistent values) in the data may affect the
model's performance. To minimize its impact, the dataset is reviewed
and numerical features are scaled where necessary; for example, very
high or low test values are adjusted by scaling so that the data is
uniform. Since the dataset is relatively clean, minimal processing is
needed.
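A minimal scaling sketch, assuming the numeric features sit in a
DataFrame (the column name is hypothetical; StandardScaler is one
common choice, not necessarily the exact method used here):

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric feature containing one extreme value
X = pd.DataFrame({"test_value": [1.0, 1.2, 0.9, 25.0]})
# Rescale to zero mean and unit variance so extreme values
# no longer dominate the feature's range
X_scaled = StandardScaler().fit_transform(X)
print(X_scaled)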
IV. Split:
The data is divided into two sets:
Training Set (80%): Used to train the SVM model.
Test Set (20%): Used to evaluate the model’s performance. This split
helps the model generalize and perform well on unseen data.
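A hedged sketch of this split on toy data (the arrays are synthetic
stand-ins; stratify=y is an added refinement, not stated above, that
keeps the class proportions equal in both sets):

import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the influenza features and labels
X = np.random.rand(100, 6)
y = np.array([0] * 50 + [1] * 50)
# 80% training / 20% test; stratify=y (an added assumption) keeps the
# class balance identical across the two sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)
print(X_train.shape, X_test.shape)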
V. Method For Classification:
A Support Vector Machine (SVM) is used to classify patients based on
their medical attributes. SVM finds the best line or decision boundary
that segregates n-dimensional space into classes, so that new data
points can easily be placed in the correct category in the future.
This best decision boundary is called a hyperplane.
An SVM chooses the extreme points/vectors that help in creating the
hyperplane. These extreme cases are called support vectors, and hence
the algorithm is termed a Support Vector Machine (SVM).
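A minimal sketch of this idea on synthetic points (the data below is
illustrative only; support_vectors_ is the scikit-learn attribute that
exposes the extreme points defining the hyperplane):

import numpy as np
from sklearn.svm import SVC

# Two small synthetic clusters standing in for two patient classes
X = np.array([[1, 1], [1, 2], [2, 1], [6, 6], [6, 7], [7, 6]])
y = np.array([0, 0, 0, 1, 1, 1])
# A linear-kernel SVM; after fitting, the model keeps only the
# extreme points (support vectors) that define the hyperplane
model = SVC(kernel="linear").fit(X, y)
print("Support vectors:\n", model.support_vectors_)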
Coding
Here is the Python code implementing the above steps for data
preparation, model training, and evaluation.
Code:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
from sklearn.inspection import permutation_importance
# Step 1: Load the data
df = pd.read_csv("C:/Users/KIIT0001/Downloads/influenza_outbreak_dataset.csv")
print(df.head())  # preview the first rows
# Step 2: Assign column names
df.columns = ["flu_X_tr", "flu_Y_tr", "flu_X_te",
              "flu_Y_te", "flu_locs", "flu_keywords"]
# Step 3: Data Cleaning
# Fill null values with a single value using fillna(); the result is
# assigned back, since fillna() does not modify the frame in place
ndf = df.fillna(0)
# Drop any rows that still contain at least one null value
ndf = ndf.dropna()
# Step 4: Split data into features (X) and target (y),
# then into training and testing sets
X = ndf.drop('target', axis=1)  # 'target' is the label column (name assumed)
y = ndf['target']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)
# Step 5: Train the Support Vector Machine (SVM)
svc_model = SVC()
svc_model.fit(X_train, y_train)  # fit the classifier on the training set
# Step 6: Evaluate the model
y_pred = svc_model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))
# Step 7: Feature Importance
# SVC has no feature_importances_ attribute (that belongs to tree
# ensembles), so permutation importance is used here instead
result = permutation_importance(svc_model, X_test, y_test,
                                n_repeats=10, random_state=42)
feature_importance = result.importances_mean
indices = np.argsort(feature_importance)[::-1]
plt.figure(figsize=(10, 6))
plt.title("Feature Importances")
plt.bar(range(X.shape[1]), feature_importance[indices], align="center")
plt.xticks(range(X.shape[1]), X.columns[indices], rotation=90)
plt.show()
Result and Discussion
Accuracy: The model's accuracy on the test set, indicating its ability to
classify influenza cases correctly.
Classification Report: The classification report includes metrics like
precision, recall, and F1-score for each class (influenza and
non-influenza). High F1-scores show balanced performance.
Feature Importance: Permutation importance highlights which features
most influence the SVM's predictions, revealing key health indicators
for predicting influenza.
Conclusion
This project demonstrates how an SVM model can effectively classify
influenza cases based on clinical and laboratory data. With good
accuracy and interpretability (via permutation-based feature
importance), this model can help medical professionals understand and
diagnose influenza. However, further refinement or alternative models
(e.g., boosting techniques) could improve performance.
References
UCI Machine Learning Repository: Influenza Outbreak Dataset.
Scikit-learn Documentation: for the model functions and metrics used.
General resources on data preprocessing, classification models, and
Support Vector Machine methodology.