Ijirt156181 Paper

This paper discusses the growing issue of spam emails and presents a method for spam detection using various machine learning algorithms, including Naïve Bayes, SVM, and Genetic Algorithms. It highlights the effectiveness of automatic email filtering and compares the performance of different classifiers on spam datasets. The proposed system aims to enhance email security by optimizing spam detection through content and URL analysis, ultimately improving classification accuracy.


© July 2022| IJIRT | Volume 9 Issue 2 | ISSN: 2349-6002

Paper on Spam Email Detection with Classification Using Machine Learning

Naresh Vinod Wankhade1, Dr. Ranjit R. Keole2, Prof. Tushar R. Mahore3

1 ME (Computer Science and Engineering) Second Year, Dr.RGIT&R, Amravati, India
2 Head of the Department, Information Technology, HVPM’s CET, Amravati, India
3 Head of the Department, Computer Science & Engineering, DRGIT&R, Amravati, India

Abstract- The increasing volume of unsolicited bulk e-mail (also known as spam) has generated a need for reliable anti-spam filters. Machine learning techniques are nowadays used to filter spam e-mail automatically with a high success rate. In this paper we review some of the most popular machine learning methods (Bayesian classification, k-NN, ANNs, SVMs, artificial immune systems and rough sets) and their applicability to the problem of spam e-mail classification. Descriptions of the algorithms are presented, along with a comparison of their performance on the SpamAssassin spam corpus. Electronic mail has eased communication for many organizations as well as individuals, but it is exploited for fraudulent gain by spammers who send unsolicited emails. This article aims to present a method for detecting spam emails with machine learning algorithms that are optimized with bio-inspired methods. A literature review is carried out to explore the efficient methods applied on different datasets to achieve good results. Extensive research was done to implement machine learning models using Naïve Bayes, Support Vector Machine, Random Forest, Decision Tree and Multi-Layer Perceptron on seven different email datasets, along with feature extraction and pre-processing. Bio-inspired algorithms such as Particle Swarm Optimization and Genetic Algorithm were implemented to optimize the performance of the classifiers. Multinomial Naïve Bayes with Genetic Algorithm performed the best overall. A comparison of our results with other machine learning and bio-inspired models, to show the most suitable model, is also discussed.

Index Terms- ANN, Data Extraction, IP Filtration, Machine Learning, URL

1. INTRODUCTION

Recently, unsolicited commercial/bulk e-mail, also known as spam, has become a big problem on the internet. Spam is a waste of time, storage space and communication bandwidth. The problem of spam e-mail has been increasing for years. According to recent statistics, 40% of all emails are spam, which amounts to about 15.4 billion emails per day and costs internet users about $355 million per year. Automatic e-mail filtering seems to be the most effective method for countering spam at the moment, and a tight competition between spammers and spam-filtering methods is going on. Only several years ago, most spam could be reliably dealt with by blocking e-mails coming from certain addresses or filtering out messages with certain subject lines. Spammers then began to use several tricky methods to overcome these filters, such as using random sender addresses and/or appending random characters to the beginning or the end of the message subject line [11].

Knowledge engineering and machine learning are the two general approaches used in e-mail filtering. In the knowledge engineering approach, a set of rules has to be specified according to which emails are categorized as spam or ham. Such a set of rules should be created either by the user of the filter or by some other authority (e.g., the software company that provides a particular rule-based spam-filtering tool). This method shows no promising results because the rules must be constantly updated and maintained, which is a waste of time and is not convenient for most users. The machine learning approach is more efficient than the knowledge engineering approach; it does not require specifying any rules [4]. Instead, a set of training samples is used; these samples are pre-classified e-mail messages. A specific algorithm is then used to learn the classification rules from these e-mail messages. The machine learning approach has been widely studied, and there are many algorithms that can be used in e-mail filtering. They include Naïve Bayes, support vector machines, Neural Networks, K-nearest neighbor, rough sets and the artificial immune system.

IJIRT 156181 INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN TECHNOLOGY 1055

The proposed system will help to enhance user security by checking email in advance. The evolutionary mechanism first checks the content of the mail, which is passed through various machine learning techniques. The proposed methodology also performs various checks on the links, which helps to enhance security. It will handle cyber security attacks by stopping them at entry.

2. EXISTING SYSTEM & ALGORITHM

Several research works apply machine learning methods to e-mail classification. Muhammad N. Marsono, M. Watheq El-Kharashi and Fayez Gebali [2] demonstrated that naïve Bayes e-mail content classification could be adapted for layer-3 processing, without the need for reassembly. They also presented suggestions on redetecting e-mail packets on spam control middleboxes to support timely spam detection at receiving e-mail servers. M. N. Marsono, M. W. El-Kharashi and F. Gebali [1] presented a hardware architecture of a naïve Bayes inference engine for spam control using two-class e-mail classification, which can classify more than 117 million features per second given a stream of probabilities as inputs. This work can be extended to investigate proactive spam handling schemes on receiving e-mail servers and spam throttling on network gateways.

Y. Tang, S. Krasser, Y. He, W. Yang and D. Alperovitch [3] proposed a system that uses an SVM for classification. The system extracts email sender behavior data based on global sending distribution, analyzes it and assigns a trust value to each IP address sending email messages. The experimental results show that the SVM classifier is effective, accurate and much faster than the Random Forests (RF) classifier. Yoo, S., Yang, Y., Lin, F., and Moon [11] developed a personalized email prioritization (PEP) method that focuses on the analysis of personal social networks to capture user groups and to obtain rich features that represent the social roles from the viewpoint of a particular user. They also developed a supervised classification framework for modeling personal priorities over email messages and for predicting importance levels for new messages. Guzella, Mota-Santos, J.Q. Uchôa and W.M. Caminhas [4] proposed an immune-inspired model, named innate and adaptive artificial immune system (IA-AIS), and applied it to the problem of identifying unsolicited bulk e-mail messages (spam).

3. PROPOSED METHODOLOGY

Based on the above, it is necessary to propose a mechanism that cross-verifies the mail by filtering both the content and the links of a shared email. Spam mails most probably contain malicious links, for which URL classification or parsing needs to be worked out. Therefore, the proposed system analyzes the URL data as well as the mail content.

[Flow chart: Start; Retrieve Email; Apply Content Filter; split into Mail Content and Extracted URL; Data Extraction, Parsing and Extraction, Content Analysis; Apply Sentimental Analysis; Is Malicious Content? (Yes: Mark Spam Mail; No); Finish]

Fig 1 - Flow chart of Proposed Methodology for spam email detection

The above diagram represents the flow chart of the proposed methodology. Mails are given as input to the system, and content extraction is performed on the mail content, followed by breaking it into links and data. The mail is then filtered in various aspects, such as content filtration, which counts the malicious words and shows the count in an appropriate manner. First, the link and data classification is worked out; next, the data is processed with sentimental analysis, in which the various keywords are compared and evaluated. Then the IP check step is performed, in which the sender email id is retrieved and evaluated. This process is followed by result evaluation. At the end, the spam email detection is concluded.

Architecture

Fig 2 - Execution Spam email

The above diagram represents the architecture of spam email execution. The first step performs content filtration, URL extraction and separation of the data. After the link-based evaluation is done, the content of the mail is compared with existing keywords and IPs, so that spam email detection can be performed.

Implementation

The spam email detection system has the following stages in order to detect spam email. Admin has access to all contents in the spam email detection system. A new user has to register using a username, email id, contact details and a password chosen by the user. After registering, the user can access his account by logging in with the email id as username and the password. A registered user (Sender) can send mail to another registered user (Receiver) by selecting the appropriate email id. At the receiver end, each mail has to go through the stages discussed in the system design. The implementation of all stages is as follows.

Steps in Spam Email Detection System

Admin Login
Admin has access to all contents in the spam email detection system and can make certain changes. The following screenshot shows the login window for admin.

New Registration Window
A new user has to register using a username, email id, contact details and a password chosen by the user. The user has to remember all credentials in order to access the account under user login. In this case, the email id shall be used as the username. The screenshot for New Registration is shown below.

User Login Window
A registered user has to use the email id as username to log into the account. Once logged in, the registered user (Sender) can send mail to another registered user (Receiver) using the appropriate email id.

Spam Email Detection System
All registered users can be accessed by admin. Admin can make certain changes to spam mail and users, can remove spam and can add training. However, the fields are confined to username, contact number, email id (which is later used as username for login) and password. In this system one user can have the same username with different email ids. In this case, the email id acts as a primary key. Duplication of email ids is strictly restricted in the Spam Email Detection System.

Spam Detection Mail Window
After logging into the user account, registered users can make certain changes to spam mail and users, can remove spam and can add training. As above, the fields are confined to username, contact number, email id and password; one user can have the same username with different email ids, and the email id acts as a primary key. After receiving a mail, the user can check the mail body; if he finds an inappropriate word, he can report that mail as spam, or otherwise as non-spam. Under this mechanism, the system performs actions such as Mail Splitting, Content Extraction, URL Filtration and IP Extraction to detect spam.

Result Analysis
We pass certain email content to both the existing and the proposed systems, and all mails that are non-spam are passed through. The detections of mails by each system are shown below.

[Chart: "Precision Existing Vs Proposed"; vertical axis from 0 to 0.3; category label: Recall; legend: By Existing, Proposed]

Fig. 3 - Precision Existing Vs Proposed System

4. CONCLUSION

Spam mail classification is a major area of concern these days, as it helps in the detection of unwanted emails and threats. Nowadays most researchers are working in this area in order to find the best classifier for detecting spam mails. A filter with high accuracy is therefore required to filter unwanted or spam mails. In this paper we focused on finding the best classifier for spam mail classification using data mining techniques. We applied various classification algorithms on the given input data set and checked the results. From this study we find that a classifier works well when a feature selection approach is embedded in the classification process; that is, accuracy improves drastically when classifiers are applied on the reduced data set instead of the entire data set. In the proposed system, spam classification is done on all parameters, such as IP, previous history, and the content of shared URLs and data, so the proposed mechanism will help considerably to improve spam mail detection.

REFERENCES

[1] Shukor Bin Abd Razak, Ahmad Fahrulrazie Bin Mohamad, “Identification of Spam Email Based on Information from Email Header”, 13th International Conference on Intelligent Systems Design and Applications (ISDA), 2013.
[2] Mohammed Reza Parsei, Mohammed Salehi, “E-Mail Spam Detection Based on Part of Speech Tagging”, 2nd International Conference on Knowledge Based Engineering and Innovation (KBEI), 2015.
[3] Sunil B. Rathod, Tareek M. Pattewar, “Content Based Spam Detection in Email using Bayesian Classifier”, presented at the IEEE ICCSP 2015 conference.
[4] Aakash Atul Alurkar, Sourabh Bharat Ranade, Shreeya Vijay Joshi, Siddhesh Sanjay Ranade, Piyush A. Sonewa, Parikshit N. Mahalle, Arvind V. Deshpande, “A Proposed Data Science Approach for Email Spam Classification using Machine Learning Techniques”, 2017.
[5] Kriti Agarwal, Tarun Kumar, “Email Spam Detection using integrated approach of Naïve Bayes and Particle Swarm Optimization”, Proceedings of the Second International Conference on Intelligent Computing and Control Systems (ICICCS), 2018.
[6] Cihan Varol, Hezha M. Tareq Abdulhadi, “Comparison of String-Matching Algorithms on Spam Email Detection”, International Congress on Big Data, Deep Learning and Fighting CyberTerrorism, Dec 2018.
[7] Duan, Lixin, Dong Xu, and Ivor Wai-Hung Tsang. "Domain adaptation from multiple sources: A domain dependent regularization approach." IEEE Transactions on Neural Networks and Learning Systems 23.3 (2012).
[8] Anitha, P. U., Rao, Chakunta, & Sireesha, T. (2013). A Survey On: E-mail Spam Messages and Bayesian Approach for Spam Filtering. International Journal of Advanced Engineering and Global Technology (IJAEGT), 1, 124-136.
[9] Attenberg, J., Weinberger, K., Dasgupta, A., Smola, A., & Zinkevich, M. (2009, July). Collaborative email-spam filtering with the hashing trick. In Proceedings of the Sixth Conference on Email and Anti-Spam.
[10] Awad, W. A., & ELseuofi, S. M. (2011). Machine learning methods for spam e-mail classification. International Journal of Computer Science & Information Technology (IJCSIT), 3(1), 173-184.
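As a rough illustration of the detection flow described under the proposed methodology (mail splitting, content extraction, URL filtration and the sender IP check), the following Python sketch may help. It is not the authors' implementation. The word list, the blocklists, the threshold and every function name here are assumptions made for the example, and a plain keyword count stands in for the sentimental analysis and the trained classifiers mentioned in the paper.

```python
import re
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Hypothetical configuration: the paper does not publish its keyword list,
# blocklists or decision threshold, so these values are placeholders.
MALICIOUS_WORDS = {"lottery", "winner", "free", "urgent", "prize"}
BLOCKED_HOSTS = {"malicious.example"}
BLOCKED_IPS = {"203.0.113.7"}
WORD_THRESHOLD = 2

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


@dataclass
class Verdict:
    is_spam: bool
    malicious_word_count: int
    flagged_urls: list = field(default_factory=list)
    sender_ip_blocked: bool = False


def split_mail(body):
    """Mail splitting: separate the URLs from the remaining text content."""
    urls = [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(body)]
    text = URL_PATTERN.sub(" ", body)
    return text, urls


def count_malicious_words(text):
    """Content filtration: count words that appear in the keyword list."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(1 for word in words if word in MALICIOUS_WORDS)


def flag_urls(urls):
    """URL filtration: keep the URLs whose host is on the blocklist."""
    flagged = []
    for url in urls:
        host = urlparse(url).hostname or ""
        if host in BLOCKED_HOSTS:
            flagged.append(url)
    return flagged


def check_mail(body, sender_ip):
    """Run the content, URL and IP checks and combine them into a verdict."""
    text, urls = split_mail(body)
    count = count_malicious_words(text)
    flagged = flag_urls(urls)
    ip_blocked = sender_ip in BLOCKED_IPS
    is_spam = count >= WORD_THRESHOLD or bool(flagged) or ip_blocked
    return Verdict(is_spam, count, flagged, ip_blocked)
```

Calling `check_mail(body, sender_ip)` returns a `Verdict` that records which check fired. A mail is marked as spam if any single check fails, which is one simple way to combine the three signals; a trained classifier such as Naïve Bayes could replace the keyword count without changing the surrounding flow.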


