Tweet Sentiment Classification Report

The mini project titled 'Tweet Sentiment Classification using NLP and VADER' by Naitik focuses on classifying tweets into positive and negative sentiments using Natural Language Processing and the VADER sentiment analyzer. The project utilizes a dataset of over 33,000 tweets sourced from Kaggle, applying data cleaning and visualization techniques to analyze sentiment trends. Future work may include enhancements such as sarcasm detection and multilingual support.

Uploaded by

naitikpawar22

Mini Project Report

Title: Tweet Sentiment Classification using NLP and VADER


Name: Naitik

Roll Number: [Your Roll Number]

Department: Computer Engineering

Institution: [Your College Name]

Guide: [Guide/Supervisor's Name]

Date: April 2025


Certificate
This is to certify that the mini project titled "Tweet Sentiment Classification using NLP and
VADER" has been carried out by Naitik under my guidance and supervision. This work is a
record of the student’s own efforts and has not been submitted elsewhere.

(Signature & Stamp)


Guide/Supervisor Name
Department of Computer Engineering
Acknowledgment
I would like to express my sincere thanks to my guide [Guide’s Name], for their valuable
guidance, consistent support, and encouragement throughout this project. I would also like
to thank my department and peers who contributed to this project directly or indirectly.
Abstract
This project presents a sentiment classification approach for tweets related to data science
using Natural Language Processing (NLP) and the VADER sentiment analyzer from the
NLTK library. The dataset was sourced from Kaggle, containing over 33,000 tweets. The
objective was to classify tweets into "positive" and "negative" sentiments, excluding neutral
ones. Data cleaning techniques and sentiment scoring were applied, followed by
visualization using Plotly to observe sentiment trends over time.
Table of Contents
1. Title Page

2. Certificate Page

3. Acknowledgment

4. Abstract

5. Table of Contents

6. List of Figures and Tables

7. Introduction

8. Literature Review

9. Methodology

10. Implementation

11. Results and Discussion

12. Conclusion and Future Work

13. References

14. Appendix
List of Figures and Tables
Figure 1: Sample Tweet Data

Figure 2: Sentiment Over Time Plot


Introduction
Background
With the rise of social media, understanding public sentiment through platforms like
Twitter has become important.

Problem Statement
To classify tweets into positive and negative sentiment classes using NLP techniques.

Objectives
- Clean raw Twitter data
- Analyze sentiment using VADER
- Visualize sentiment trends

Scope
This project is limited to English tweets and focuses only on binary sentiment classification.
Literature Review
Several tools and techniques exist for sentiment analysis, including TextBlob, VADER, and
machine learning models. VADER is a lexicon- and rule-based analyzer designed for social
media text: it accounts for cues such as capitalization, punctuation, and emoticons, and it
needs no training data. Combining rule-based sentiment analysis with domain-specific
dictionaries is reported to improve performance.

References:
- Hutto & Gilbert, "VADER: A Parsimonious Rule-based Model for Sentiment Analysis of
Social Media Text"
- NLTK Documentation
Methodology
Tools and Technologies Used
- Python
- Pandas
- NLTK (VADER)
- Plotly

System Design
1. Data collection from Kaggle
2. Text cleaning and preprocessing
3. Sentiment analysis with VADER
4. Visualization with Plotly
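
Step 2 above does not say which cleaning rules were applied. The following is a minimal
sketch of one plausible cleaning pass, assuming that URLs and @mentions are noise and that
hashtag words are worth keeping. The function name clean_tweet and the regular expressions
are illustrative and are not taken from the project code.

```python
import re

# Illustrative patterns (assumptions, not the project's actual rules).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip URLs and @mentions, keep hashtag words, collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", "")  # "#DataScience" becomes "DataScience"
    return WHITESPACE_RE.sub(" ", text).strip()


print(clean_tweet("@bob I LOVE #DataScience!!! https://t.co/abc"))
```

Case and punctuation are deliberately left untouched, because VADER uses capitalization and
exclamation marks as intensity cues.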

Architecture Diagram
Raw Dataset → Data Cleaning → Sentiment Scoring → Visualization
Implementation
Data Collection
Data was loaded using Pandas from the CSV file. df.info() showed 33,590 records and 36
columns.
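
The loading step might look like the sketch below. The report does not give the CSV file
name or its column names, so the example reads a small in-memory stand-in, and the column
names created_at and tweet are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for the Kaggle CSV; the real file has 36 columns.
SAMPLE_CSV = io.StringIO(
    "created_at,tweet\n"
    "2021-01-05 10:00:00,I love data science!\n"
    "2021-01-06 12:30:00,This model is terrible.\n"
)


def load_tweets(source) -> pd.DataFrame:
    """Read tweets from a path or file-like object and parse the timestamp column."""
    df = pd.read_csv(source)
    df["created_at"] = pd.to_datetime(df["created_at"])  # assumed column name
    return df


df = load_tweets(SAMPLE_CSV)
df.info()  # prints the row count, column names, and dtypes
```

With the real dataset, the same function would be called with the path to the downloaded
CSV file in place of SAMPLE_CSV.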

Sentiment Analysis
VADER's SentimentIntensityAnalyzer (from NLTK) was used to compute polarity scores for each
tweet. The compound score was then used to classify each tweet as positive or negative.
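
A hedged sketch of the scoring step follows. SentimentIntensityAnalyzer and its
polarity_scores method are part of NLTK; the helper names make_analyzer and compound_scores
are hypothetical. polarity_scores returns a dictionary with the keys neg, neu, pos, and
compound, where compound lies between -1 and 1.

```python
from typing import Iterable, List


def make_analyzer():
    """Build NLTK's VADER analyzer, downloading its lexicon if needed."""
    # Imported here so the scoring helper below can be used with any
    # object that offers a polarity_scores(text) method.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)
    return SentimentIntensityAnalyzer()


def compound_scores(texts: Iterable[str], analyzer) -> List[float]:
    """Return the VADER compound score for each text, in input order."""
    return [analyzer.polarity_scores(text)["compound"] for text in texts]
```

Assuming the tweets sit in a DataFrame column named tweet (an assumed name), the scores
could be attached with df["compound"] = compound_scores(df["tweet"], make_analyzer()).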

Categorization
Tweets were categorized as positive or negative based on the VADER compound score using a
custom function.
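
The custom function itself is not shown in the report. One way it could be written is
sketched below, assuming the cutoffs of 0.05 and -0.05 that the VADER documentation commonly
suggests. The name categorize and both threshold constants are assumptions, not the
project's confirmed values.

```python
from typing import Optional

POS_THRESHOLD = 0.05   # assumed cutoff, not taken from the report
NEG_THRESHOLD = -0.05  # assumed cutoff, not taken from the report


def categorize(compound: float) -> Optional[str]:
    """Map a compound score to 'positive' or 'negative'; None means neutral."""
    if compound >= POS_THRESHOLD:
        return "positive"
    if compound <= NEG_THRESHOLD:
        return "negative"
    return None  # neutral tweets are excluded from the analysis


print(categorize(0.62), categorize(-0.40), categorize(0.0))
```

Returning None for the neutral band makes it easy to drop those tweets afterwards, which
matches the binary scope of the project.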

Visualization
Positive and negative sentiment data was plotted over time using Plotly.
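
The report does not state how the data was aggregated before plotting. The sketch below
assumes daily tweet counts per sentiment label and a line chart drawn with Plotly Express.
The column names created_at and sentiment and both function names are illustrative.

```python
import pandas as pd


def daily_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Count tweets per calendar day and sentiment label (long format)."""
    return (
        df.assign(date=df["created_at"].dt.date)
        .groupby(["date", "sentiment"])
        .size()
        .reset_index(name="tweets")
    )


def plot_sentiment(counts: pd.DataFrame):
    """Draw one line per sentiment label and return the Plotly figure."""
    # Imported here so daily_counts can be used without Plotly installed.
    import plotly.express as px

    return px.line(
        counts, x="date", y="tweets", color="sentiment",
        title="Sentiment over time",
    )
```

Calling plot_sentiment(daily_counts(df)).show() would then open the interactive chart.
Weekly or monthly bins may suit a sparse dataset better than daily ones.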
Results and Discussion
- The classifier labeled tweets as positive or negative with decent accuracy based on the
compound score, although no quantitative accuracy figure is reported.
- The visualization showed spikes in sentiment around specific dates.
- Limitations: the project did not include a neutral class or sarcasm detection.
Conclusion and Future Work
This project demonstrated effective tweet classification using rule-based sentiment
analysis. Future work could involve:
- Adding sarcasm detection
- Training custom ML models
- Including multilingual support
References
- Hutto, C.J., & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment
Analysis of Social Media Text. Proceedings of the Eighth International AAAI Conference on
Weblogs and Social Media (ICWSM-14).

- NLTK Documentation

- Kaggle Dataset: https://www.kaggle.com/ruchi798/data-science-tweets


Appendix
Full cleaned dataset sample
Additional charts or plots
