AWS AI-ML Report
On
Submitted to
For
In
By
Eppili Jatin
(21311A6902)
This is to certify that this Summer Industry Internship – II Report on “AI-ML Virtual Internship”,
submitted by Eppili Jatin (21311A6902) in the year 2024 in partial fulfillment of the academic requirements
of Jawaharlal Nehru Technological University for the award of the degree of Bachelor of Technology in
Computer Science and Engineering-IOT, is a bonafide record of industry internship work carried out under
our guidance during the IV B.Tech CSE-IOT I Semester summer, and will be evaluated in the IV B.Tech
CSE-IOT I Semester. This report has not been submitted to any other institute or university for the award of
any degree.
External Examiner
Date:-
DECLARATION
It is declared, to the best of our knowledge, that the work reported does not form part of any dissertation submitted to
any other University or Institute for the award of any degree.
EPPILI JATIN
21311A6902
ACKNOWLEDGEMENT
I would like to express my gratitude to all the people behind the screen who helped me to transform an idea into a real
application.
I would like to thank my Project Coordinator, Mrs. C. Swetha, for her technical guidance, constant encouragement and
support in carrying out my project at college.
I profoundly thank Dr. T. Venkat Narayana Rao, Head of the Department of Computer Science & Engineering –IOT
who has been an excellent guide and also a great source of inspiration to my work.
I would like to express my heartfelt gratitude to my parents, without whom I would not have been privileged to achieve
and fulfill my dreams. I am grateful to our principal, Dr. T. Ch. Siva Reddy, who most ably runs the institution and has
had a major hand in enabling me to do my project.
The satisfaction and euphoria that accompany the successful completion of a task would be incomplete
without the mention of the people who made it possible, whose constant guidance and encouragement crowned all the
efforts with success. In this context, I would like to thank all the other staff members, both teaching and non-teaching,
who have extended their timely help and eased my task.
EPPILI JATIN
21311A6902
ABSTRACT
The AI-ML virtual internship offered a comprehensive and practical understanding of artificial
intelligence (AI) and machine learning (ML), focusing on the transformative power of these technologies
in real-world applications. Participants explored foundational AI concepts such as natural language
processing (NLP), generative models, and supervised learning, bridging theory with practice. A notable
highlight was the development of an email spam detector, utilizing tools like Scikit-learn and NLTK. This
project involved key processes such as data preprocessing, vectorization, and the application of Naive
Bayes classifiers to distinguish spam from legitimate emails. The internship emphasized hands-on
learning, fostering problem-solving skills through practical projects. Participants gained experience in
designing machine learning pipelines, understanding feature engineering, and evaluating model
performance.
By integrating AI technologies into meaningful applications, the program showcased how automation
enhances efficiency, such as in spam email filtering, while improving overall user experience. Beyond
technical skills, the internship cultivated critical thinking and encouraged participants to consider ethical
implications and real-world challenges of AI systems. It highlighted AI's role in automating tasks,
improving decision-making, and its potential across various industries. In conclusion, the AI-ML virtual
internship provided a transformative experience, equipping participants with the skills, confidence, and
knowledge to create scalable, impactful AI-driven solutions while understanding the broader societal
impact of these technologies.
TABLE OF CONTENTS
1. Executive Summary
1.1 Course Learning Objectives
1.2 Course Outcomes
2. Overview of the Organization
2.1 Introduction of the Organization
2.2 Vision
3. Internship Part
3.1 Intern’s Day-to-Day Responsibilities
4. Weekly Report
4.1 Activity Log for First Week
5. Project
5.1 Install Required Libraries
5.2 Python Code for Email Spam Detection
5.3 Explanation of Code
5.4 Improving the Model
5.5 Output Screens
6. Outcomes Description
6.1 Describe the work environment you have experienced.
6.2 Describe the real time technical skills you have acquired.
6.3 Describe the managerial skills you have acquired.
7. Conclusion
7.1 Bibliography
8. Appendix
1. EXECUTIVE SUMMARY
The internship involved gaining a good understanding of a Machine Learning model for
employee promotion. My task was to design and develop this model, which involved:
• Understanding the data set
• Cleaning the data set
• Getting to know how the metrics of the data are evaluated
• Creating a model suitable for this problem statement
One of the important achievements of this internship was the development of the model
object so that it adapts to the data given to it. The objective is for it to accept
whatever data is passed to it, even if that data is not sufficiently pre-processed, and
output the predicted labels.
A model was finally developed using the above object. It was a prototype solution to a
real-life problem: the promotion of employees based on their performance metrics.
I acquired many new technical skills while working with my team. I gained new
knowledge in the area of Machine Learning and brushed up my Python skills while
building the Machine Learning model. I was also introduced to the area of research and
how to approach it. Most importantly, the work experience was particularly good, and
included good fellowship, cooperative teamwork and accepting responsibilities.
Although I spent a lot of time learning new things, I found that I was well trained in
certain areas that helped me substantially in my projects. Many programming skills that I
used in my projects, such as programming style and design, were ones that I had acquired
during my studies in Computing Science. I also learnt work techniques during this
internship, such as completing work ahead of time even when it is not due that day.
The internship taught me how to solve a particular problem based
only on data as input. Here data means raw data, as in numbers. These techniques can be
used in my future job, as the whole role of an Analyst depends on them. This is the
internship report based on the two-month-long internship program that I successfully
completed in AICTE from 18/07/2022 to 24/09/2022 as a requirement of my B.Tech.
program in the Department of Computer Science and Engineering. I was completely new
to a practical, corporate-world setting. Every hour spent in the internship gave me some
experience, not all of which can be explained in words.
Nevertheless, it was all useful for my career.
The report covers background information on the internship I was involved in, as well
as details on how the projects or tasks were developed. It concludes with my overall work
experience and my opinion of the Industrial Internship Program in general.
1.1 COURSE LEARNING OBJECTIVES
• Internships are generally thought to be reserved for college students looking to gain
experience in a particular field. However, a wide array of people can benefit from training
internships in order to receive real-world experience and develop their skills.
• An objective for this position should emphasize the skills you already possess in the area
and your interest in learning more.
• Internships are utilized in a number of different career fields, including architecture.
2. OVERVIEW OF THE ORGANIZATION
WEBSITE: aicte-india.org
The company:
3. INTERNSHIP PART
4. ACTIVITY LOG AND WEEKLY REPORT
WEEK-1
WEEKLY REPORT:
WEEK–1 (From Date 18-04-2024 to Date 23-04-2024)
Objective of the Activity Done:
Cloud Concepts Overview & Cloud Economics
Detailed Report:
In this week, I have learned how to:
WEEK-2
WEEKLY REPORT
WEEK–2 (From Date 25-04-2024 to Date 30-04-2024)
Objective of the Activity Done:
AWS Global Infrastructure Overview
AWS Cloud Security & Network
Networking and Content Delivery
Detailed Report:
● Identify the difference between AWS Regions, Availability Zones, and edge locations
WEEK-3
WEEKLY REPORT
WEEK–3 (From Dt 01-05-2024 to Dt 06-05-2024)
Objective of the Activity Done:
● Differentiate between Amazon EBS, Amazon S3, Amazon EFS, and Amazon S3 Glacier
WEEK-4
WEEKLY REPORT
Objective of the Activity Done: Databases, Cloud Architecture and Introduction to Auto Scaling
and Monitoring.
Detailed Report:
● The basic process of how machine learning works in real-time projects.
WEEK-5
WEEKLY REPORT
WEEK–5 (From Dt 15-05-2024 to Dt 20-05-2024)
Objective of the Activity Done: Introduction to Machine Learning Pipeline
Detailed Report:
In this week, I have learned how to:
● Identify correlations.
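As a small illustration of identifying correlations, the sketch below computes a Pearson correlation matrix with pandas and ranks features by how strongly they correlate with a target column. The data frame and its column names (`hours_studied`, `exam_score`, `absences`) are made-up assumptions for illustration; they are not data from the internship labs.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 71, 78],
    "absences": [9, 7, 6, 4, 2],
})

# Pearson correlation between every pair of numeric columns.
corr_matrix = df.corr(method="pearson")

# Rank the remaining features by the absolute strength of their
# correlation with the chosen target column.
target = "exam_score"
ranked = corr_matrix[target].drop(target).abs().sort_values(ascending=False)

print(corr_matrix.round(2))
print(ranked.round(2))
```

Values close to +1 or -1 indicate a strong linear relationship, while values near 0 indicate little linear relationship; correlation alone does not show that one feature causes another.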
WEEK-6
Day–1: Implementing a Machine Learning pipeline with Amazon SageMaker. Cleaning your data, dealing with outliers, and selecting features.
Day–2: Implementing a Machine Learning pipeline with Amazon SageMaker. LAB: Amazon SageMaker - Encoding Categorical Data.
Day–3: Implementing a Machine Learning pipeline with Amazon SageMaker. Training a model using Amazon SageMaker. LAB: Training Model.
Day–4: Implementing a Machine Learning pipeline with Amazon SageMaker. Hosting and using the model. LAB: Amazon SageMaker - Deploying a model.
Day–5: Implementing a Machine Learning pipeline with Amazon SageMaker. Evaluating the accuracy of the model. Calculating classification metrics. Selecting classification thresholds. LAB: Generating model performance metrics.
Day–6: Implementing a Machine Learning pipeline with Amazon SageMaker. Hyperparameter and model tuning. LAB: Hyperparameter and model tuning.
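The Day–5 topics (calculating classification metrics and selecting classification thresholds) can be illustrated with a short sketch. It is written in plain Python rather than with Amazon SageMaker, and the labels and predicted probabilities are made-up values chosen only for illustration; the helper names `confusion_counts` and `precision_recall` are likewise hypothetical.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Count TP, FP, FN, TN when scores >= threshold are predicted positive."""
    tp = fp = fn = tn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall(y_true, y_prob, threshold):
    """Return (precision, recall) at the given decision threshold."""
    tp, fp, fn, _ = confusion_counts(y_true, y_prob, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical true labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.4, 0.65, 0.3, 0.55, 0.1, 0.8, 0.2]

for threshold in (0.3, 0.5, 0.7):
    p, r = precision_recall(y_true, y_prob, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, lowering the threshold raises recall and lowers precision, while raising it does the opposite. Which threshold is appropriate depends on whether false positives or false negatives are more costly for the problem at hand.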
WEEKLY REPORT
WEEK –6 (From Date 22-05-2024 to Date 27-05-2024)
WEEK-7
WEEKLY REPORT
WEEK –7 (From Date 29-05-2024 to Date 03-06-2024)
WEEK-8
Day–1: Introduction to Computer Vision. Introduction to Computer Vision, Image and Video Analysis.
Day–2: Introduction to Computer Vision. Facial recognition and Video Analysis with Amazon Rekognition.
WEEKLY REPORT
WEEK –8 (From Date 05-06-2024 to Date 10-06-2024)
Detailed Report:
In this week, I have learned how to:
● Use datasets.
● Extract datasets.
● Process datasets.
WEEK-9
WEEKLY REPORT
WEEK –9 (From Date 12-06-2024 to Date 17-06-2024 )
Detailed Report:
In this week, I have learned how to:
● Identify language.
WEEK-10
Day–2: Introduction to Natural Language Processing. Introduction to Amazon Polly.
Day–3: Introduction to Natural Language Processing. Description about Amazon Polly.
WEEKLY REPORT
WEEK –10 (From Date 19-06-2024 to Date 24-06-2024)
5. PROJECT
From the learnings throughout this course, I have gained a deeper understanding of the tools and frameworks
used in AI-ML projects. Using these learnings, I have implemented a project that detects whether an email is
spam or ham.
An email spam detector can be created using a machine learning model trained on a dataset containing labeled
spam and non-spam emails. Below is a Python implementation using the Natural Language Toolkit (NLTK)
and Scikit-learn, two popular libraries for natural language processing and machine learning.
STEP-BY-STEP IMPLEMENTATION:
Ensure the following libraries are installed. Run the following commands if needed:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report
df = pd.DataFrame(data)
1. Data Preparation:
● Replace the sample data with a larger dataset such as the SpamAssassin Public Corpus or similar datasets
available online.
● The Label column uses 1 for spam and 0 for ham (non-spam).
2. Text Vectorization:
● Converts text into numerical data using CountVectorizer, which creates a bag-of-words representation.
3. Model Training:
● The MultinomialNB model is used, which is well-suited for text classification tasks like spam detection.
4. Evaluation:
● Reports the accuracy and a classification report (precision, recall, F1-score) on the test set.
5. Prediction:
● Tests the model on new email text to determine if it's spam (1) or not (0).
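The steps described above can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not the exact code used in the project: the ten sample emails, the `Email` column name, and the 70/30 split are assumptions made for illustration. Only the `Label` column (1 for spam, 0 for ham), `CountVectorizer`, and `MultinomialNB` come from the description above.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical sample data for illustration (Label: 1 = spam, 0 = ham).
data = {
    "Email": [
        "Congratulations, you have won a free prize. Claim it now!",
        "Win cash now, limited time offer, click here",
        "You are a lucky winner, claim your free gift card today",
        "Urgent: your account has won a special reward, act now",
        "Free entry to our prize draw, reply now to claim",
        "Please find the meeting agenda attached for tomorrow",
        "Can we reschedule our project review to Friday?",
        "Your invoice for last month is attached",
        "Thanks for your help with the report yesterday",
        "Lunch at noon? Let me know if that works",
    ],
    "Label": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
}
df = pd.DataFrame(data)

# Split the raw text first so the vectorizer only learns from training emails.
emails_train, emails_test, y_train, y_test = train_test_split(
    df["Email"], df["Label"], test_size=0.3, random_state=42, stratify=df["Label"]
)

# Text vectorization: bag-of-words counts.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(emails_train)
X_test = vectorizer.transform(emails_test)

# Model training: multinomial Naive Bayes.
model = MultinomialNB()
model.fit(X_train, y_train)

# Evaluation on the held-out emails.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print("Classification Report:")
print(classification_report(y_test, y_pred, zero_division=0))

# Prediction on a new email.
new_email = ["You have been selected as a lucky winner. Claim your prize now!"]
prediction = model.predict(vectorizer.transform(new_email))
print("Prediction (1=spam, 0=ham):", prediction)
```

With so few samples, the reported accuracy says little about real-world performance; a larger labeled corpus is needed for a meaningful evaluation.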
Improving the Model:
● Use TF-IDF (Term Frequency-Inverse Document Frequency) for better text representation.
● Experiment with other machine learning models (e.g., Logistic Regression, Support Vector Machines).
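The two suggestions above can be tried together with a short sketch that swaps in `TfidfVectorizer` and compares Logistic Regression with a linear Support Vector Machine using cross-validation. The eight sample emails, the two-fold split, and the variable names are assumptions for illustration; the scores on such a tiny sample are not meaningful in themselves.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical sample data for illustration (1 = spam, 0 = ham).
emails = [
    "Congratulations, you have won a free prize. Claim it now!",
    "Win cash now, limited time offer, click here",
    "You are a lucky winner, claim your free gift card today",
    "Free entry to our prize draw, reply now to claim",
    "Please find the meeting agenda attached for tomorrow",
    "Can we reschedule our project review to Friday?",
    "Your invoice for last month is attached",
    "Thanks for your help with the report yesterday",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Each candidate bundles TF-IDF vectorization with a classifier, so the
# vectorizer is refit on the training portion of every fold.
candidates = {
    "TF-IDF + Logistic Regression": make_pipeline(
        TfidfVectorizer(), LogisticRegression(max_iter=1000)
    ),
    "TF-IDF + Linear SVM": make_pipeline(TfidfVectorizer(), LinearSVC()),
}

scores = {}
for name, pipeline in candidates.items():
    fold_scores = cross_val_score(pipeline, emails, labels, cv=2)
    scores[name] = fold_scores.mean()
    print(f"{name}: mean accuracy = {scores[name]:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline keeps the comparison fair, because every model sees exactly the same preprocessing within each fold.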
TEST – 1:
Accuracy: 1.0
Classification Report:
              precision    recall  f1-score   support
           0       1.00      1.00      1.00         2
           1       1.00      1.00      1.00         1
Prediction (1=spam, 0=ham): [1]
TEST – 2 :
SAMPLE INPUT:
"Congratulations! You’ve been selected as a lucky winner for our special offer. Act now to
claim your prize!"
OUTPUT:
Accuracy: 1.0
Classification Report:
              precision    recall  f1-score   support
           0       1.00      1.00      1.00         2
           1       1.00      1.00      1.00         1
Prediction (1=spam, 0=ham): [1]
TEST – 3:
SAMPLE INPUT:
"Your invoice for last month’s services is attached. Let us know if you have any questions."
OUTPUT:
Accuracy: 1.0
Classification Report:
              precision    recall  f1-score   support
           0       1.00      1.00      1.00         2
           1       1.00      1.00      1.00         1
Prediction (1=spam, 0=ham): [0]
6. OUTCOMES DESCRIPTION
6.1 DESCRIBE THE WORK ENVIRONMENT YOU HAVE EXPERIENCED
My work environment is one where I'm able to work as part of a team and that allows everyone's talents
to grow. As I researched your company, I noticed its devotion to cultivating each employee's skills and
abilities. I've found that this type of environment is most conducive to my productivity, especially in a
position that requires me to constantly improve my design skills. It allows me to remain passionate
about my job and helps me express my creativity to the best of my ability.
6.2 DESCRIBE THE REAL TIME TECHNICAL SKILLS YOU HAVE ACQUIRED
Technical skills I have acquired:
• Data extraction
• Data cleaning
• Classifications
• Regression
• Numpy
• Pandas
• Sklearn
• Keras
6.3 DESCRIBE THE MANAGERIAL SKILLS YOU HAVE ACQUIRED
a. Technical Skill.
b. Conceptual Skill.
c. Interpersonal and Communication Skills.
d. Decision-Making Skill.
7. CONCLUSION
The AI-ML virtual internship was an enriching experience that provided a robust
understanding of artificial intelligence (AI) and machine learning (ML) concepts, particularly
their practical applications in solving real-world problems. During this program, I explored
core principles such as natural language processing (NLP), generative models, and the
integration of these techniques into projects. The focus on NLP expanded my comprehension
of how AI systems interpret, process, and generate human language effectively, which is vital
for tasks like text classification, sentiment analysis, and conversational AI.
One of the key projects undertaken was the implementation of an email spam detector.
This project involved preprocessing data, applying
machine learning models, and evaluating their performance. Leveraging tools like Scikit-learn
and Natural Language Toolkit (NLTK), I gained hands-on experience in creating a classifier
capable of distinguishing spam emails from legitimate ones. The use of
techniques like vectorization and supervised learning models (e.g., Naive Bayes) was crucial
in building a robust and efficient solution.
The spam detection project underscored the significance of AI in automating
tedious and repetitive tasks, such as filtering spam emails, which enhances
productivity and user experience. This hands-on exercise reinforced critical problem-solving
skills and emphasized the importance of data preprocessing, feature engineering, and model
evaluation for achieving accurate results.
Overall, the internship bridged theoretical knowledge with practical application, deepening
my understanding of AI's transformative capabilities. It also fostered critical thinking and
collaboration, equipping me with skills to innovate in AI-driven domains. With projects like
the spam detector, I now feel more confident in applying AI-ML techniques to build scalable
and impactful solutions in future endeavors.
7.1 BIBLIOGRAPHY
1. Bishop, Christopher M.
Pattern Recognition and Machine Learning.
This book provides an in-depth understanding of supervised learning models like Naive
Bayes, which are foundational for text classification tasks such as spam detection.
Publisher: Springer, 2006.
URL: [Springer Link](https://www.springer.com/gp/book/9780387310732)
2. Scikit-learn Documentation
Scikit-learn: Machine Learning in Python.
This official documentation explains the implementation and usage of machine learning
models, including Naive Bayes, vectorization techniques, and model evaluation metrics, used
in the spam detector.
URL: [Scikit-learn Documentation](https://scikit-learn.org/stable/)
3. NLTK Documentation
Natural Language Toolkit Documentation.
This source offers insights into processing text data, tokenization, and other natural
language processing techniques relevant to building the spam detection pipeline.
URL: [NLTK Documentation](https://www.nltk.org/)
4. Raschka, Sebastian
Python Machine Learning.
This book provides practical examples of implementing machine learning projects,
including text classification and data preprocessing, using Python libraries like Scikit-learn.
Publisher: Packt Publishing, 2015.
URL: [Python Machine Learning](https://www.packtpub.com/product/python-machine-learning-third-edition/9781789955750)
APPENDIX C: ABSTRACT
ABSTRACT
Batch
Title
Roll No Name
Table 1: Project/Internship correlation with appropriate POs/PSOs (Please specify
level of Correlation, H/M/L against POs/PSOs)
PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2  PSO3
M    L    L    H    H    L    M    H    M    H     H     H     H     H     M
Table 2: Nature of the Project/Internship work (Please tick Appropriate for your project)
Table 3: Domain of the Project/ Internship work (Please tick Appropriate for your project)
Video Game Sales Prediction using Linear Regression and Decision Trees
20