G.H. RAISONI COLLEGE OF ENGINEERING & MANAGEMENT, PUNE
Department of Computer Engineering
Phishing Website Detector Using ML
Guide: Prof. Nivedita Kadam
Co-Guide: Prof. Gayatree Bedre
Name of Projectees
BCOB81-Safin Tamboli
BCOB82-Sairaj Jadhav
BCOB85-Sarthak Pawar
BCOB90-Shoeb Shah
Contents
Introduction
Justifications for Selecting the Title
Problem Statement
Examples of Phishing Websites
Literature Survey
Block Diagram
Expected Result
Work plan
Future Scope
References
Introduction
Phishing has become a major area of concern for security researchers because it
is easy to create a fake website that looks very close to a legitimate one.
Experts can identify fake websites, but many users cannot, and such users become
victims of phishing attacks. The attacker's main aim is to steal bank account
credentials.
Phishing attacks succeed largely because of a lack of user awareness. Since a
phishing attack exploits weaknesses found in users, it is very difficult to
mitigate, which makes it very important to enhance phishing detection techniques.
Justification For Selecting The Title
The main purpose of the project is to detect websites created by phishers to
steal data from users, and to make users aware of such threats once they are
detected. It is intended to help users browse safely and to keep their data out
of reach of any phisher trying to use their credentials illegally.
The title therefore clearly conveys the idea and goal of our project,
i.e. “Phishing Website Detector Using ML”.
Problem Statement
Machine learning comprises many algorithms that require past data to make a
decision or prediction on future data. Using this technique, the algorithm will
analyze blacklisted and legitimate URLs and their features to accurately detect
phishing websites, including zero-hour phishing websites.
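As a rough illustration of what "URLs and their features" can mean in practice, the sketch below computes a few lexical features from a URL string using only the Python standard library. The feature names and the choice of features are assumptions made for illustration; the slides do not list the project's final feature set.

```python
import re
from urllib.parse import urlparse

# Assumed pattern for a dotted IPv4 host such as 192.0.2.10 (illustrative only).
IPV4_PATTERN = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def extract_url_features(url: str) -> dict:
    """Return simple lexical features computed from the URL string alone.

    The feature set is hypothetical; it is not taken from the slides.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "host_is_ip_address": bool(IPV4_PATTERN.match(host)),
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    # Both URLs are made-up examples using reserved documentation names.
    for example in ("https://www.example.com/login",
                    "http://192.0.2.10/secure-bank/update@verify"):
        print(example, extract_url_features(example))
```

Each URL becomes a fixed set of values, which is the form a classifier needs as input. Which features actually separate phishing from legitimate URLs would have to be established on real data.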
The questions we propose to answer through this project are:
1. How do URL detectors identify phishing URLs or websites?
2. How can ML methods be applied to classify malicious and legitimate websites?
3. How can a URL detector's performance be evaluated?
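For the third question, one common way to evaluate a detector is to compare its predictions with known labels and compute accuracy, precision, recall, and F1 score. The sketch below does this in plain Python. The label encoding (1 for phishing, 0 for legitimate) and the function name are assumptions, and the slides do not state which metrics the project will report.

```python
def evaluate_detector(true_labels, predicted_labels):
    """Compute confusion-matrix counts and common metrics.

    Assumed encoding: 1 = phishing, 0 = legitimate.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == 1 and truth == 1:
            tp += 1          # phishing correctly flagged
        elif guess == 1 and truth == 0:
            fp += 1          # legitimate site wrongly flagged
        elif guess == 0 and truth == 0:
            tn += 1          # legitimate site correctly passed
        else:
            fn += 1          # phishing site missed

    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    print(evaluate_detector([1, 1, 1, 0, 0, 0, 0, 1],
                            [1, 0, 1, 0, 1, 0, 0, 1]))
```

Recall matters here because a missed phishing site (a false negative) exposes the user, while precision reflects how often legitimate sites would be blocked by mistake.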
Examples of Phishing Websites-
1. Phishing Website sent via mail-
2. Phishing Website sent via SMS-
Literature Survey
1. Title: Detecting phishing websites using machine learning technique
   Author: Ashit Kumar Dutta
   Methodology: The proposed framework employs RNN-LSTM to identify the
   properties Pm and Pl in order to declare a URL as malicious or legitimate.
   Advantages: The outcome of this study reveals that the proposed method
   presents superior results compared with the existing deep learning methods.
   Future Scope: The future direction of this study is to develop an
   unsupervised deep learning method to generate insight from a URL.
2. Title: Phishing Website Detection using Machine Learning Algorithms
   Author: Rishikesh Mahajan
   Methodology: URLs of benign websites were collected from www.alexa.com and
   URLs of phishing websites were collected from www.phishtank.com. The URLs
   are classified using a Decision Tree as the splitter.
   Advantages: A Python program was implemented to extract features from the
   URL for the detection of phishing URLs.
   Future Scope: In future, hybrid technology will be implemented to detect
   phishing websites more accurately, using the random forest algorithm
   together with the blacklist method.
3. Title: Phishing Website Detection Using Machine Learning Classifiers
   Optimized by Feature Selection
   Authors: Dželila Mehanović, Jasmin Kevrić
   Methodology: The Weka tool and its algorithms were used for feature
   selection. K-Nearest Neighbor (KNN) was applied to perform phishing website
   detection.
   Advantages: Multiple feature selection filters were used to find the most
   valuable features. The outputs of these filters are analyzed to find the
   features proposed as most important.
   Future Scope: Reducing the number of features is important, as it is
   expected to decrease the time needed to build a model.
4. Title: An Approach for Detecting Phishing Attacks Using Machine Learning
   Techniques
   Author: K. Venkateshwara Rao
   Methodology: It consists of parallel decision trees, each of which takes the
   input and produces a specific class. Thus, n trees produce different
   classes.
   Advantages: The support vector machine gives an accuracy of 91.3% on the
   test data set.
   Future Scope: In the future, a better way to find a phishing website may be
   found by using advanced features of the URL.
Block Diagram
Block diagram of various stages of project-
[Figure: block diagram "Detecting Phishing Websites using ML Algorithms".
Labelled components: PhishTank, Crawler, Emails/SMS/Enterprises, Data,
Malicious URLs, Legitimate URLs, Feature Extraction, RNN, Random Forest,
Training Phase, Testing Phase, Evaluating the Result.]
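The training, testing, and evaluation stages named in the block diagram can be sketched in a few lines of plain Python. This is only a toy under stated assumptions: a one-feature threshold rule stands in for the RNN and Random Forest models named in the diagram, the single feature is URL length, and the data points are made up for illustration. None of the function names come from the project.

```python
import random

# Made-up (url_length, label) pairs; label 1 = phishing, 0 = legitimate.
TOY_DATA = [(22, 0), (25, 0), (28, 0), (31, 0),
            (64, 1), (71, 1), (83, 1), (95, 1)]


def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle with a fixed seed and split into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_stump(train):
    """Training phase: pick the threshold with the best training accuracy.

    The rule predicts phishing when the feature value is >= threshold.
    """
    best_threshold, best_correct = None, -1
    for threshold in sorted({value for value, _ in train}):
        correct = sum((value >= threshold) == bool(label)
                      for value, label in train)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def predict(threshold, value):
    """Return 1 (phishing) or 0 (legitimate) for one feature value."""
    return 1 if value >= threshold else 0


def accuracy(threshold, samples):
    """Evaluation: fraction of samples the rule labels correctly."""
    hits = sum(predict(threshold, value) == label for value, label in samples)
    return hits / len(samples)


if __name__ == "__main__":
    train, test = train_test_split(TOY_DATA)
    threshold = train_stump(train)                            # training phase
    print("threshold:", threshold)
    print("test accuracy:", accuracy(threshold, test))        # testing phase
```

The point of the split is that the rule is chosen using only the training samples and then scored on samples it has not seen, which is the separation between the training and testing phases in the diagram.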
Future Scope-
1. Creating a safe, user-friendly environment that can detect illegitimate activities.
2. Phishing website URLs could be used to report and block attackers and to trace the
location of such anonymous attackers.
3. Awareness can be created among users by showing them the types of phishing URLs in
circulation, including those that cause more harm to a system, such as zero-hour
phishing websites.
Expected Result
System Description: Detecting Websites/URLs
Input: URLs, Random websites, Transaction IDs, Suspicious Mails
Output: Safe for Browsing (Continue) / Unsafe For Browsing (Block Website)
Possible Success Conditions: Users develop a cautious way of browsing the internet
by checking random URLs forwarded through mail or social media.
Failure Conditions: A new format of phishing website may go undetected.
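The input and output described above can be pictured as a single decision function. The sketch below is a hypothetical illustration, not the project's design: it assumes a set of blacklisted hosts and a phishing score between 0 and 1 produced by some trained model, and the function name, parameters, and 0.5 threshold are all assumptions.

```python
from urllib.parse import urlparse

SAFE_VERDICT = "Safe for Browsing (Continue)"
UNSAFE_VERDICT = "Unsafe For Browsing (Block Website)"


def browsing_verdict(url, blacklisted_hosts, phishing_score, threshold=0.5):
    """Map a URL to one of the two outputs listed in the slide.

    blacklisted_hosts: set of known-bad host names (assumed input).
    phishing_score: model output in [0, 1] (assumed to come from a classifier).
    """
    host = (urlparse(url).hostname or "").lower()
    if host in blacklisted_hosts:
        return UNSAFE_VERDICT        # known phishing host
    if phishing_score >= threshold:
        return UNSAFE_VERDICT        # model considers the URL suspicious
    return SAFE_VERDICT


if __name__ == "__main__":
    # Made-up hosts and scores, for illustration only.
    print(browsing_verdict("http://bad.example/login", {"bad.example"}, 0.1))
    print(browsing_verdict("https://www.example.com/", set(), 0.2))
```

The failure condition shows up directly in this sketch: a new kind of phishing site that is on no blacklist and receives a low score would be returned as safe.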
Work – Plan
Months Activities AUG’22 SEP’22 OCT’22 NOV’22 DEC’22 JAN’23 FEB’23
Literature Reviews √ √
Component Identification & Selection √
Designing √ √ √
Experimental Analysis √ √
Fabrication
Testing and Debugging
Preparation of Project Report
References
1. Anti-Phishing Working Group (APWG), https://docs.apwg.org//reports/apwg_trends_report_q4_2019.pdf
2. Jain A.K., Gupta B.B., “PHISH-SAFE: URL Features-Based Phishing Detection System Using Machine Learning”, Cyber
Security. Advances in Intelligent Systems and Computing, vol. 729, 2018,
https://doi.org/10.1007/978-981-10-8536-9_44
3. Purbay M., Kumar D., “Split Behavior of Supervised Machine Learning Algorithms for Phishing URL Detection”,
Lecture Notes in Electrical Engineering, vol. 683, 2021, https://doi.org/10.1007/978-981-15-6840-4_40
4. Gandotra E., Gupta D., “An Efficient Approach for Phishing Detection using Machine Learning”, Algorithms for
Intelligent Systems, Springer, Singapore, 2021, https://doi.org/10.1007/978-981-15-8711-5_12
5. Hung Le, Quang Pham, Doyen Sahoo, and Steven C.H. Hoi, “URLNet: Learning a URL Representation with Deep
Learning for Malicious URL Detection”, Conference’17, Washington, DC, USA, arXiv:1802.03162, July 2017.
6. Hong J., Kim T., Liu J., Park N., Kim S.W., “Phishing URL Detection with Lexical Features and Blacklisted Domains”,
Autonomous Secure Cyber Systems, Springer, https://doi.org/10.1007/978-3-030-33432-1_12
7. J. Kumar, A. Santhanavijayan, B. Janet, B. Rajendran and B. S. Bindhumadhava, “Phishing Website Classification
and Detection Using Machine Learning”, 2020 International Conference on Computer Communication and
Informatics (ICCCI), Coimbatore, India, 2020, pp. 1–6, doi: 10.1109/ICCCI48352.2020.9104161
Thank you!