Chapter 6

The document outlines the process for developing an SMS spam filter, including dataset collection, preprocessing, feature engineering, model selection, training, evaluation, testing, and deployment. Key steps involve cleaning text data, converting it into numerical formats, selecting appropriate machine learning or deep learning models, and evaluating performance using metrics like accuracy and F1 score. The final goal is to deploy the model for real-world SMS classification.

Uploaded by

shalinigowda004

SMS spam filter 2024-25

Chapter 6
Testing
1. Dataset Collection

- Obtain a dataset: Use an existing SMS dataset such as the "SMS Spam Collection" dataset, available from the UCI Machine Learning Repository or Kaggle.

- Create a dataset: Collect SMS messages and label them as "spam" or "ham" (not spam).
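As a minimal sketch of the loading step, the snippet below assumes the tab-separated layout used by the UCI SMS Spam Collection file (one message per line: the label, a tab, then the text). The function name `parse_sms_lines` and the sample messages are illustrative and not part of the original project.

```python
from collections import Counter

def parse_sms_lines(lines):
    """Parse 'label<TAB>text' lines into (label, text) pairs, skipping malformed rows."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue  # blank or malformed line
        label, text = line.split("\t", 1)
        label = label.strip().lower()
        if label in ("ham", "spam"):
            rows.append((label, text.strip()))
    return rows

# Hypothetical sample in the assumed format; a real run would iterate over an open file.
sample = [
    "ham\tAre we still meeting at 5?\n",
    "spam\tWINNER! Claim your free prize now\n",
    "\n",
    "ham\tOk, see you then\n",
]
rows = parse_sms_lines(sample)
label_counts = Counter(label for label, _ in rows)
print(label_counts)
```

Counting the labels right after loading is a quick way to see how imbalanced the data is before choosing evaluation metrics.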

2. Preprocessing

- Text cleaning: Remove unnecessary characters (punctuation, special symbols, etc.).

- Tokenization: Split messages into words or tokens.

- Lowercasing: Convert all text to lowercase for uniformity.

- Stopword removal: Remove common words that don't add much meaning (e.g., "the", "and").

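- Stemming/Lemmatization: Reduce words to a common base form. Stemming strips suffixes (e.g., "calls" becomes "call"), while lemmatization maps each word to its dictionary form.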
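The preprocessing steps above can be sketched with the standard library alone. This is an illustration under stated assumptions: the stopword list is a tiny hand-picked set, and `crude_stem` is a hypothetical suffix stripper standing in for a real stemmer or lemmatizer. Lowercasing is applied first here so that the later steps only see lowercase text.

```python
import re

# Tiny illustrative stopword list (assumption); real lists are much longer.
STOPWORDS = {"the", "and", "a", "to", "is", "you", "your", "of", "in", "for"}

def crude_stem(word):
    """Strip one common suffix; a rough stand-in for a proper stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(message):
    text = message.lower()                      # lowercasing
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # text cleaning: drop punctuation and symbols
    tokens = text.split()                       # tokenization on whitespace
    tokens = [t for t in tokens if t not in STOPWORDS]  # stopword removal
    return [crude_stem(t) for t in tokens]      # stemming

print(preprocess("Calling the winners"))
print(preprocess("WINNER!! You have won a FREE prize, claim now."))
```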

3. Feature Engineering

- Convert text to numerical data:

- Bag of Words (BoW).

- TF-IDF (Term Frequency-Inverse Document Frequency).

- Word embeddings: Pre-trained embeddings like Word2Vec or GloVe, or embeddings from transformer models (e.g., BERT).
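To make the first two representations concrete, here is a small sketch that builds Bag of Words counts and TF-IDF weights from already-tokenized messages. It assumes the plain unsmoothed form, term frequency times ln(N / document frequency); libraries often use smoothed variants, so their numbers may differ. The function names and the toy documents are illustrative.

```python
import math
from collections import Counter

def bag_of_words(docs):
    """Return a sorted vocabulary and one count vector per tokenized document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

def tf_idf(docs):
    """Weight each count by tf * ln(N / df), an unsmoothed TF-IDF variant."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    weights = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        weights.append([(c / total) * math.log(n_docs / df[j]) for j, c in enumerate(row)])
    return vocab, weights

docs = [["free", "prize", "free"], ["call", "me"], ["free", "call"]]
vocab, counts = bag_of_words(docs)
_, weights = tf_idf(docs)
print(vocab)
print(counts[0])
print(weights[0])
```

In this toy example "prize" appears in only one message, so it receives a higher inverse document frequency than "free", which appears in two.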

4. Model Selection

- Use machine learning models like:

- Naive Bayes.

- Support Vector Machines (SVM).

- Logistic Regression.

- Random Forest.

- Or deep learning models:

- Recurrent Neural Networks (RNNs) or Long Short-Term Memory (LSTM) networks.

- Transformer-based models (e.g., BERT, DistilBERT).

5. Train/Test Split

- Split the dataset into training and testing subsets (e.g., 80/20 split).
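A minimal sketch of an 80/20 split follows. It adds one assumption not stated in the text: the split is stratified, so the spam/ham ratio is kept roughly equal in both subsets, which is a common choice when spam is the minority class. The function name, seed, and synthetic rows are illustrative.

```python
import random

def stratified_split(rows, test_fraction=0.2, seed=42):
    """Split (label, text) rows into train/test, preserving the label ratio."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_label = {}
    for row in rows:
        by_label.setdefault(row[0], []).append(row)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test

# Synthetic rows for illustration only: 80 ham and 20 spam messages.
rows = [("ham", f"ham message {i}") for i in range(80)]
rows += [("spam", f"spam message {i}") for i in range(20)]
train, test = stratified_split(rows)
print(len(train), len(test))
```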

Department of CS&BS P a g e | 53

6. Model Training

- Train the model using the training dataset.
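As one possible illustration of training, the sketch below fits a multinomial Naive Bayes classifier (one of the models listed in step 4) on tokenized messages, using Laplace smoothing. The class name `NaiveBayesSpamFilter`, the `alpha` parameter, and the toy training data are assumptions made for this example; the source does not say which model the project finally used.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes over token counts with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, token_lists, labels):
        self.class_counts = Counter(labels)       # messages per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for tokens, label in zip(token_lists, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {t for counts in self.word_counts.values() for t in counts}
        return self

    def log_scores(self, tokens):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                if tok in self.vocab:              # tokens never seen in training are ignored
                    score += math.log((counts[tok] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)

# Toy training data for illustration only.
train_tokens = [
    ["free", "prize", "claim"],
    ["win", "free", "cash"],
    ["see", "you", "tonight"],
    ["call", "me", "later"],
]
train_labels = ["spam", "spam", "ham", "ham"]
model = NaiveBayesSpamFilter().fit(train_tokens, train_labels)
print(model.predict(["free", "cash", "prize"]))
print(model.predict(["see", "you", "later"]))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.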

7. Evaluation

- Use metrics like:

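- Accuracy: Percentage of correct predictions. It can be misleading when spam is rare, because a filter that labels every message "ham" still scores high.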

- Precision: Ratio of correctly predicted spam messages to total predicted spam messages.

- Recall (Sensitivity): Ratio of correctly predicted spam messages to actual spam messages.

- F1 Score: Harmonic mean of precision and recall.
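The four metrics can be computed directly from the counts of true/false positives and negatives, with "spam" treated as the positive class. The function name `spam_metrics` and the example labels below are illustrative.

```python
def spam_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up labels: 3 spam and 7 ham, with one missed spam and one false alarm.
y_true = ["spam", "spam", "spam"] + ["ham"] * 7
y_pred = ["spam", "spam", "ham", "spam"] + ["ham"] * 6
metrics = spam_metrics(y_true, y_pred)
print(metrics)
```

Here accuracy is 0.8 while precision, recall, and F1 are each 2/3, which shows how accuracy alone can look better than the filter's performance on the spam class.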

8. Testing

- Use the held-out test dataset, which the model has not seen during training, to evaluate its performance.

- Input example SMS texts with known labels to check how the filter behaves on individual messages.
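Spot-checking with example messages can be automated with a small harness like the one below. `keyword_classifier` is a hypothetical stand-in for the trained model, included only so the example runs on its own; any function that maps a message to "spam" or "ham" could be passed to `spot_check` instead.

```python
def keyword_classifier(message):
    """Hypothetical stand-in for a trained model: flags a few spam-like keywords."""
    spam_words = {"free", "winner", "prize", "claim", "urgent"}
    words = {w.strip(".,!?").lower() for w in message.split()}
    return "spam" if words & spam_words else "ham"

def spot_check(classify, examples):
    """Return the (message, expected, predicted) triples the classifier gets wrong."""
    failures = []
    for message, expected in examples:
        predicted = classify(message)
        if predicted != expected:
            failures.append((message, expected, predicted))
    return failures

# Hand-written example messages with the labels a human would assign.
examples = [
    ("URGENT! You are a winner, claim your prize", "spam"),
    ("Are you free for lunch tomorrow?", "ham"),
    ("See you at the station", "ham"),
]
failures = spot_check(keyword_classifier, examples)
for message, expected, predicted in failures:
    print(f"expected {expected}, got {predicted}: {message}")
```

The second message is a deliberate false-positive case: the word "free" appears in a legitimate text, which is exactly the kind of mistake that manual example inputs help surface.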

9. Deployment

- Deploy the model in a real-world application to classify incoming SMS messages.
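One way to picture the deployment step is a thin wrapper that receives each incoming message, asks the model for a label, and routes the message to a folder. Everything in this sketch is an assumption made for illustration: the `SmsFilter` class, the folder names, the `stand_in_model` function, and the choice to deliver a message to the inbox when the classifier fails, so that legitimate texts are not lost.

```python
from typing import Callable, Dict, List

class SmsFilter:
    """Routes incoming messages to 'inbox' or 'spam' using a classifier callable."""

    def __init__(self, classify: Callable[[str], str]):
        self.classify = classify
        self.folders: Dict[str, List[str]] = {"inbox": [], "spam": []}

    def receive(self, message: str) -> str:
        try:
            label = self.classify(message)
        except Exception:
            label = "ham"  # assumption: fail open, deliver the message on errors
        folder = "spam" if label == "spam" else "inbox"
        self.folders[folder].append(message)
        return folder

def stand_in_model(message):
    """Hypothetical model: rejects empty input and flags one spam phrase."""
    if not message.strip():
        raise ValueError("empty message")
    return "spam" if "free prize" in message.lower() else "ham"

sms_filter = SmsFilter(stand_in_model)
incoming = ["Claim your FREE PRIZE today", "Running late, sorry", "   "]
results = [sms_filter.receive(m) for m in incoming]
print(results)
```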


Chapter 7

Result Analysis

