
A Report Submitted in Partial Fulfillment of The Requirement of The Award of Degree of

This internship report details Keerthi R M's experience at AK Infopark Private Limited, focusing on Python for Data Science over a week in June 2024. The report covers various topics including Python basics, operators, and libraries like NumPy and Pandas, emphasizing their application in real-time data science projects. It highlights the importance of adaptability and teamwork, concluding with recommendations for future marketing strategies based on insights gained during the internship.

Uploaded by

Sathiya Kumar


INTERNSHIP REPORT

A report submitted in partial fulfillment of the requirement of the award of degree of

MASTER OF SCIENCE IN MATHEMATICS

KEERTHI R M

Reg. No.: 23083076511012007

(Duration: 12th June to 18th June, 2024)

DEPARTMENT OF MATHEMATICS

GOVERNMENT ARTS AND SCIENCE COLLEGE

KANYAKUMARI – 629 401

DATA SCIENCE WITH PYTHON

Submitted by

KEERTHI R M

Reg. No.: 23083076511012007

OCTOBER 2024
ABSTRACT

This report presents an account of my internship experience at AK INFOPARK PRIVATE LIMITED, PARVATHIPURAM, a leading firm in its sector. The primary focus of the internship was to learn about various software packages through a hands-on approach. My work centred on Python for Data Science.

This internship provided an invaluable opportunity to apply theoretical knowledge acquired in academic studies to real-world scenarios. I took part in several learning modules, notably Python basics, Python operators, working with NumPy and Pandas, data science in real-time applications, data visualization and data science components. I also used various analytical tools that are valuable for future job opportunities.

The internship underscored the importance of adaptability, teamwork, and continuous learning in keeping up with the latest developments in Python for data science. This report details the project undertaken, the skills developed and the lessons learned throughout the internship, which helped me to build my skills in Python and data science. It concludes with recommendations for future marketing endeavors based on the insights gained.

TABLE OF CONTENTS

CHAPTER    TITLE

           INTRODUCTION
1          PYTHON
2          PYTHON OPERATORS
3          WORKING WITH NUMPY AND PANDAS
4          DATA SCIENCE IN REAL TIME APPLICATION
5          DATA VISUALIZATION
6          DATA SCIENCE COMPONENTS
           CONCLUSION

INTRODUCTION

A program is a sequence of instructions that specifies how to perform a computation. The computation might be mathematical, such as solving a system of equations or finding the roots of a polynomial, but it can also be symbolic, such as searching and replacing text in a document, or graphical, such as processing an image or playing a video. Python is a powerful and versatile programming language that has become increasingly popular in the field of data science. With its simple syntax and vast array of libraries and tools, Python has made it easier for data scientists to manipulate and analyze data, build predictive models and make data-driven decisions. In this report, we will explore how Python is used in data science, as well as some of the key libraries and tools that data scientists use to perform their work.

Python is favored by data scientists for its flexibility and ease of use. It is a high-level programming language that is both easy to learn and easy to read, making it ideal for data scientists who may not have a strong background in programming. Python also offers a wide range of libraries and tools that are specifically designed for data analysis and machine learning, such as NumPy, Pandas, Matplotlib and scikit-learn. These libraries allow data scientists to easily manipulate and visualize data, as well as build and evaluate predictive models.

NumPy is a fundamental package for scientific computing with Python, providing support for large, multi-dimensional arrays and matrices, as well as a variety of mathematical functions to operate on these arrays. Pandas is a powerful data manipulation library that offers data structures such as DataFrames and Series, which allow data scientists to easily work with structured data. Matplotlib is a plotting library that enables data scientists to create a wide variety of visualizations, such as line plots, scatter plots and histograms. Scikit-learn is a machine learning library that provides a wide range of algorithms for classification, regression, clustering and more.

As the field of data science continues to grow and evolve, Python will likely remain a key programming language for data scientists around the world.

CHAPTER 1

PYTHON

Python is a popular programming language. It was created by Guido van Rossum and released in 1991. It is used for web development (server-side), software development, mathematics and system scripting. Python’s popularity in data science is largely attributed to its readability, ease of learning, and the powerful libraries it provides. These libraries enable data manipulation, statistical analysis and machine learning, making Python an invaluable tool for data scientists.

Key Libraries and Tools

Pandas: A library providing high-performance data manipulation and analysis. It introduces data structures such as DataFrames that simplify data handling and preparation.

NumPy: This library offers support for arrays and matrices, along

with a collection of mathematical functions to operate on these arrays. It

forms the backbone for many scientific computations in Python.

In Python we have lists, which serve the purpose of arrays, but they are slow to process. NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The array object in NumPy is called ndarray, and it comes with many supporting functions that make working with it very easy. Arrays are used very frequently in data science, where speed and resources are very important.
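The difference between a list and an ndarray can be seen in a minimal sketch. The sample values below are assumptions chosen only for illustration; the point is that arithmetic on an ndarray applies to every element at once, while a list needs an explicit loop or comprehension.

```python
import numpy as np

# Plain Python list: arithmetic needs an explicit loop or comprehension.
values = [1.0, 2.0, 3.0, 4.0]          # illustrative sample data
doubled_list = [v * 2 for v in values]

# NumPy ndarray: the same operation is applied to every element at once.
arr = np.array(values)
doubled_arr = arr * 2

print(type(arr).__name__)     # ndarray
print(doubled_list)           # [2.0, 4.0, 6.0, 8.0]
print(doubled_arr.tolist())   # [2.0, 4.0, 6.0, 8.0]
print(arr.shape, arr.dtype)   # (4,) float64
```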

Matplotlib and Seaborn: These libraries are used for data

visualization. Matplotlib offers a wide range of plotting options, while

Seaborn provides a high-level interface for drawing attractive and

informative statistical graphics.

Scikit-learn: A machine learning library that includes simple and efficient tools for data mining and data analysis. It supports various algorithms for classification, regression, clustering and dimensionality reduction.

TensorFlow and PyTorch: These libraries are used for deep learning. TensorFlow, developed by Google, and PyTorch, developed by Facebook, are popular for building and training neural networks.

Workflow in Python

Data Collection: Data can be collected from various sources,

including databases, APIs and web scraping. Libraries like requests and

Beautiful Soup are commonly used for these tasks.

Data Cleaning and Preparation: Data often needs to be cleaned and transformed before analysis. Pandas is particularly useful for handling missing values, filtering data and merging datasets.

Exploratory Data Analysis (EDA): EDA involves summarizing

the main characteristics of a dataset. This step helps in understanding the

data distribution and uncovering patterns.

Model Building: Using libraries like Scikit-learn or TensorFlow, data scientists build and train models to make predictions or classify data. This involves selecting algorithms, training the model, and tuning hyperparameters.
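The workflow steps above can be sketched end to end on a tiny dataset. Everything here is an assumption made for illustration: the column names and values are invented, and the model is a straight-line fit with NumPy's polyfit used as a simple stand-in for a Scikit-learn or TensorFlow model.

```python
import numpy as np
import pandas as pd

# Data collection: a small made-up table stands in for a database or API.
raw = pd.DataFrame({
    "hours_studied": [1.0, 2.0, None, 4.0, 5.0],
    "exam_score": [52.0, 54.0, 57.0, 58.0, 60.0],
})

# Data cleaning: drop the row with a missing value.
clean = raw.dropna().reset_index(drop=True)

# Exploratory data analysis: summary statistics and correlation.
summary = clean.describe()
correlation = clean["hours_studied"].corr(clean["exam_score"])

# Model building: least-squares straight line (stand-in for a real ML model).
slope, intercept = np.polyfit(clean["hours_studied"], clean["exam_score"], deg=1)
predicted = slope * 3.0 + intercept   # predicted score for 3 hours of study

print(summary.loc["mean"])
print(round(correlation, 3), round(slope, 3), round(intercept, 3), round(predicted, 3))
```

In a real project the polyfit step would typically be replaced by a library model together with a train/test split.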

PYTHON BASICS

Python is an interpreted, high-level programming language known for its simplicity and readability. Python uses indentation to define code blocks, making it easy to read and understand.

VARIABLES AND DATA TYPES

Variables store data in memory and are assigned using the assignment operator “=”. Common data types in Python include integers, floats, strings, lists, tuples, dictionaries, and sets.

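A short sketch of the data types listed above follows. The variable names and values are hypothetical and chosen only to show one example of each type.

```python
count = 10                               # int
price = 19.99                            # float
name = "Python"                          # str
scores = [85, 90, 78]                    # list: ordered and mutable
point = (3, 4)                           # tuple: ordered and immutable
student = {"name": "Asha", "age": 21}    # dict: key-value pairs
unique_ids = {101, 102, 103}             # set: unique elements

# type() reports the data type of the value a variable refers to.
for value in (count, price, name, scores, point, student, unique_ids):
    print(type(value).__name__, value)
```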
CONTROL STRUCTURES

Conditional statements such as ‘if’, ‘elif’ and ‘else’ make it possible to take decisions based on conditions. Loops such as ‘for’ and ‘while’ are used for iteration and repetitive tasks.
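As a brief illustration, the sketch below combines a conditional with both kinds of loop. The marks and grade boundaries are assumptions invented for the example.

```python
marks = [92, 67, 35]          # hypothetical marks
results = []

# for loop with if / elif / else: classify each mark.
for mark in marks:
    if mark >= 80:
        results.append("distinction")
    elif mark >= 50:
        results.append("pass")
    else:
        results.append("fail")

# while loop: repeat until the condition becomes false.
countdown = 3
ticks = []
while countdown > 0:
    ticks.append(countdown)
    countdown -= 1

print(results)   # ['distinction', 'pass', 'fail']
print(ticks)     # [3, 2, 1]
```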

FUNCTIONS

Functions are blocks of reusable code that perform a specific task. Functions can take arguments as input and return values as output.
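A minimal sketch of defining and calling functions is given below. The function names mean and describe are hypothetical helpers written for this example.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def describe(values, label="data"):
    """Return a one-line summary; label is an optional argument with a default."""
    return f"{label}: n={len(values)}, mean={mean(values):.2f}"


print(mean([2, 4, 6]))                      # 4.0
print(describe([2, 4, 6], label="sample"))  # sample: n=3, mean=4.00
```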

MODULES AND PACKAGES

Python modules are files that contain Python code. Modules are loaded using the import statement. Packages are directories containing multiple modules and a special file called __init__.py.
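The sketch below shows two common import forms and then builds a throwaway package in a temporary directory to show the directory layout. The package name demo_pkg and its shapes module are hypothetical and exist only for this example.

```python
import importlib
import math                      # import a whole module
import sys
import tempfile
from pathlib import Path
from statistics import median    # import a single name from a module

print(math.sqrt(16))       # 4.0
print(median([3, 1, 2]))   # 2

# Build a tiny package: a directory holding __init__.py and one module.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"                      # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "shapes.py").write_text(
    "def square_area(side):\n    return side * side\n"
)

sys.path.insert(0, str(root))                    # make the package importable
importlib.invalidate_caches()
shapes = importlib.import_module("demo_pkg.shapes")
print(shapes.square_area(3))                     # 9
```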

FILE I/O

Python provides built-in functions for reading from and writing to files. Use ‘open()’ to open a file, and ‘read()’ or ‘write()’ to work with the file contents.

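A minimal sketch of writing a file and reading it back is shown here. The file name notes.txt is an assumption, and a temporary directory is used so that the example does not touch existing files.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# Write: the with-statement closes the file automatically.
with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")
    f.write("second line\n")

# Read the whole file back as one string.
with open(path, "r", encoding="utf-8") as f:
    content = f.read()

lines = content.splitlines()
print(lines)   # ['first line', 'second line']
```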
OBJECT-ORIENTED PROGRAMMING

Python supports OOP principles such as encapsulation, inheritance and polymorphism. Classes are blueprints for creating objects, while objects are instances of classes.

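The three principles can be sketched with a small class hierarchy. The Shape, Rectangle and Circle classes are hypothetical names invented for this illustration.

```python
import math


class Shape:
    """Base class: keeps its name internal (encapsulation by convention)."""

    def __init__(self, name):
        self._name = name

    def area(self):
        raise NotImplementedError("subclasses must implement area()")

    def describe(self):
        return f"{self._name} with area {self.area():.2f}"


class Rectangle(Shape):          # inheritance
    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Circle(Shape):             # inheritance
    def __init__(self, radius):
        super().__init__("circle")
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


# Polymorphism: the same describe() call works for every kind of shape.
shapes = [Rectangle(2, 3), Circle(1)]
for shape in shapes:
    print(shape.describe())
```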
CHAPTER 2

PYTHON OPERATORS

Operators are standard symbols used to perform logical and arithmetic operations on variables and values, for example +, -, * and /. The value on which an operator is applied is called an operand. Python groups its operators into arithmetic, assignment, comparison, logical, identity, membership and bitwise operators.

Python Arithmetic Operators

Arithmetic operators are used with numeric values to perform common

mathematical operations.

Operator    Name              Example

+           Addition          x + y
-           Subtraction       x - y
*           Multiplication    x * y
/           Division          x / y
%           Modulus           x % y
**          Exponentiation    x ** y
//          Floor division    x // y

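Each arithmetic operator can be tried on a pair of sample values. The values 17 and 5 are assumptions chosen so that division, modulus and floor division give visibly different results.

```python
x, y = 17, 5   # illustrative operands

arithmetic = {
    "x + y": x + y,     # 22
    "x - y": x - y,     # 12
    "x * y": x * y,     # 85
    "x / y": x / y,     # 3.4 (true division always returns a float)
    "x % y": x % y,     # 2   (remainder)
    "x ** y": x ** y,   # 1419857
    "x // y": x // y,   # 3   (quotient rounded down)
}

for expression, value in arithmetic.items():
    print(expression, "=", value)
```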
Python Assignment Operators

Assignment operators are used to assign values to variables.

Operator    Example    Same As

=           x = 5      x = 5
+=          x += 3     x = x + 3
-=          x -= 3     x = x - 3
*=          x *= 3     x = x * 3
/=          x /= 3     x = x / 3
%=          x %= 3     x = x % 3

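The compound assignment operators can be followed step by step in a short sketch. The starting value and the amounts applied at each step are assumptions for illustration.

```python
history = []

x = 5
history.append(x)   # 5
x += 3
history.append(x)   # 8
x -= 2
history.append(x)   # 6
x *= 4
history.append(x)   # 24
x /= 3
history.append(x)   # 8.0 (division produces a float)
x %= 5
history.append(x)   # 3.0

print(history)      # [5, 8, 6, 24, 8.0, 3.0]
```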
Python Comparison Operators

Comparison operators are used to compare two values.

Operator    Name                        Example

==          Equal                       x == y
!=          Not equal                   x != y
>           Greater than                x > y
<           Less than                   x < y
>=          Greater than or equal to    x >= y

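Every comparison evaluates to either True or False, as the following sketch shows. The values 7 and 10 are illustrative assumptions.

```python
x, y = 7, 10   # illustrative operands

comparisons = {
    "x == y": x == y,   # False
    "x != y": x != y,   # True
    "x > y": x > y,     # False
    "x < y": x < y,     # True
    "x >= y": x >= y,   # False
}

for expression, value in comparisons.items():
    print(expression, "->", value)
```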
Python Logical Operators

Logical operators are used to combine conditional statements.

Operator    Description                                                Example

and         Returns True if both statements are true                   x < 5 and x < 10
or          Returns True if one of the statements is true              x < 5 or x < 4
not         Reverses the result; returns False if the result is true   not(x < 5 and x < 10)

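The expressions from the table can be evaluated for a concrete value of x. The value 3 is an assumption chosen so that both comparisons in the table are true.

```python
x = 3   # illustrative value

both = x < 5 and x < 10             # True: both conditions hold
either = x < 5 or x < 4             # True: at least one condition holds
negated = not (x < 5 and x < 10)    # False: reverses the result of "both"

print(both, either, negated)        # True True False
```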
Python Identity Operators

Identity operators are used to compare objects, not to check whether they are equal, but whether they are actually the same object with the same memory location.

Operator    Description                                                Example

is          Returns True if both variables are the same object         x is y
is not      Returns True if both variables are not the same object     x is not y

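The difference between identity and equality is easiest to see with two lists that hold the same values. The variable names below are assumptions for this sketch.

```python
a = [1, 2, 3]
b = a            # b refers to the very same list object as a
c = [1, 2, 3]    # c is a different object with equal contents

print(a is b)       # True: same object
print(a is c)       # False: different objects
print(a == c)       # True: equal values
print(a is not c)   # True

b.append(4)         # a change made through b is visible through a
print(a)            # [1, 2, 3, 4]
```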
Python Membership Operators

Membership operators are used to test whether a value is present in a sequence or other container object.

Operator    Description                                                    Example

in          Returns True if the specified value is present in the object        x in y
not in      Returns True if the specified value is not present in the object    x not in y

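Membership tests work on lists, strings and dictionaries alike. The sample data below is hypothetical.

```python
fruits = ["apple", "banana"]
student = {"name": "Asha", "age": 21}

in_list = "apple" in fruits            # True
not_in_list = "cherry" not in fruits   # True
in_string = "py" in "python"           # True: substring test
in_dict = "name" in student            # True: dictionaries test their keys
value_in_dict = "Asha" in student      # False: values are not searched

print(in_list, not_in_list, in_string, in_dict, value_in_dict)
```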
Python Bitwise Operators

Bitwise operators are used to operate on the binary (bit-level) representation of integers.

Operator    Name                    Description                                                Example

&           AND                     Sets each bit to 1 if both bits are 1                      x & y
|           OR                      Sets each bit to 1 if at least one of the two bits is 1    x | y
^           XOR                     Sets each bit to 1 if only one of the two bits is 1        x ^ y
~           NOT                     Inverts all the bits                                       ~x
<<          Zero fill left shift    Shifts left by pushing zeros in from the right             x << 2

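Writing the operands in binary makes the effect of each bitwise operator visible. The values 12 and 10 are assumptions chosen because their bit patterns differ in two positions.

```python
x, y = 0b1100, 0b1010   # 12 and 10, written as binary literals

and_result = x & y      # 0b1000 -> 8
or_result = x | y       # 0b1110 -> 14
xor_result = x ^ y      # 0b0110 -> 6
not_result = ~x         # -13, because ~x equals -(x + 1) for Python integers
shift_result = x << 2   # 0b110000 -> 48

print(format(and_result, "04b"), format(or_result, "04b"), format(xor_result, "04b"))
print(not_result, shift_result)
```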
Examples

CHAPTER 3

WORKING WITH NUMPY AND PANDAS

NumPy and pandas are popular Python libraries that are commonly used for data manipulation and data analysis.

NumPy provides support for multidimensional arrays and mathematical functions. Basic operations include importing NumPy, creating arrays, array indexing and array slicing. Array operations include basic math operations, element-wise operations, matrix multiplication, array reshaping and array transposition.
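The NumPy operations named above can be sketched on one small array. The array contents are assumptions chosen so that the results are easy to check by hand.

```python
import numpy as np

a = np.arange(6)             # creating an array: [0 1 2 3 4 5]
first = a[0]                 # indexing: 0
last_two = a[-2:]            # slicing: [4 5]

m = a.reshape(2, 3)          # reshape into 2 rows and 3 columns
elementwise = m * 2          # element-wise arithmetic
transposed = m.T             # transpose: shape becomes (3, 2)
product = m @ transposed     # matrix multiplication: shape (2, 2)

print(m)
print(elementwise)
print(transposed.shape)      # (3, 2)
print(product)               # [[ 5 14] [14 50]]
```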

Pandas offers data structures such as DataFrames and Series that make it easy to work with structured data. Basic operations in Pandas include importing pandas, creating DataFrames, creating Series, data selection and data filtering. For data manipulation, Pandas supports merging, joining, pivoting, reshaping, grouping and statistics.
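A few of the Pandas operations named above are sketched here on a made-up table. The student names, departments and marks are assumptions invented for illustration.

```python
import pandas as pd

# Creating a Series and a DataFrame from hypothetical data.
marks = pd.Series([78, 91, 65], name="marks")
students = pd.DataFrame({
    "name": ["Anu", "Bala", "Chitra", "Devi"],
    "dept": ["Maths", "Maths", "Physics", "Physics"],
    "marks": [78, 91, 65, 88],
})

# Selection and filtering: keep rows where marks exceed 80.
high = students[students["marks"] > 80]

# Grouping and statistics: mean marks per department.
by_dept = students.groupby("dept")["marks"].mean()

# Merging: attach hostel details where a matching name exists.
hostel = pd.DataFrame({"name": ["Anu", "Devi"], "hostel": ["A", "B"]})
merged = students.merge(hostel, on="name", how="left")

print(marks.mean())
print(high)
print(by_dept)
print(merged)
```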

NumPy and pandas are indispensable tools for data analysis. Best practice includes using vectorized operations, optimizing data structures and visualizing data effectively.

Screenshots for NumPy & Pandas

CHAPTER 4

DATA SCIENCE IN REAL TIME APPLICATION

Data science is the in-depth study of large quantities of data, and involves extracting meaning from raw, structured and unstructured data. Extracting meaningful information from large amounts of data requires processing, which can be carried out using statistical techniques, algorithms, scientific methods and a range of technologies.

Data science is applied in finance (stock market prediction, credit scoring), healthcare (patient monitoring, disease diagnosis), marketing (customer segmentation) and IoT (sensor data analysis, predictive maintenance).

Among the Python libraries, NumPy and Pandas are used for data manipulation, Scikit-learn for machine learning, TensorFlow or PyTorch for deep learning, and Matplotlib and Seaborn for visualization.

In real-time applications, the data sources include streaming data (Twitter, sensor data), API calls (weather, stock prices) and web scraping. Data processing involves data cleaning and preprocessing, feature extraction and selection, and data transformation.

Example:

import pandas as pd

import numpy as np

from sklearn.model_selection import train_test_split

# from sklearn.linear_model import ...  (the imported name is cut off in the original)

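To illustrate how streaming sensor data might be processed as it arrives, here is a minimal sketch using only the standard library. The function name rolling_alerts, the window size, the threshold and the readings are all assumptions made for this example; a real system would read from a live source such as a sensor feed or an API.

```python
from collections import deque


def rolling_alerts(readings, window=3, threshold=5.0):
    """Yield (reading, baseline, is_alert) for each incoming reading.

    The baseline is the mean of the previous `window` readings; a reading
    is flagged when it differs from the baseline by more than `threshold`.
    """
    recent = deque(maxlen=window)
    for value in readings:
        if recent:
            baseline = sum(recent) / len(recent)
            is_alert = abs(value - baseline) > threshold
        else:
            baseline = value      # first reading: nothing to compare against
            is_alert = False
        yield value, baseline, is_alert
        recent.append(value)


# Simulated temperature stream (hypothetical values).
stream = [20.0, 20.5, 21.0, 30.0, 21.5]
results = list(rolling_alerts(stream))
alerts = [value for value, _, is_alert in results if is_alert]

print(alerts)   # [30.0]
```

Because the function is a generator, each reading is handled as soon as it is produced, which is the property that matters for real-time use.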
CHAPTER 5

DATA VISUALIZATION

Data visualization represents data graphically to make it easier to understand and to identify trends, patterns and correlations. Among the popular libraries, Matplotlib is used for 2D/3D plotting, Seaborn for statistical visualization, Plotly for interactive visualization, Bokeh for web-based visualization, and Pandas for data manipulation and visualization.

BASIC PLOTS

Line plots (plt.plot())

Scatter plots (plt.scatter())

Bar charts (plt.bar())

Histograms (plt.hist())

In the real world, data visualization is applied in business intelligence, scientific research, machine learning and web analytics.

df = pd.read_csv("data.csv") loads the data from a CSV file.

BASIC PLOT

plt.plot(df["column"])

plt.show()
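A self-contained version of the basic plot can be sketched as follows. Since data.csv is not available here, the example builds a small DataFrame in memory; the column names, the values and the output file name are assumptions. The Agg backend is selected so that the script runs without a display, and the figure is saved to a file in place of plt.show().

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")   # non-interactive backend: no display window needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the contents of data.csv.
df = pd.DataFrame({"day": [1, 2, 3, 4, 5], "sales": [10, 12, 9, 15, 14]})

fig, axes = plt.subplots(1, 2, figsize=(8, 3))

# Line plot of sales over time.
axes[0].plot(df["day"], df["sales"], marker="o")
axes[0].set_title("Line plot")
axes[0].set_xlabel("day")
axes[0].set_ylabel("sales")

# Bar chart of the same data.
axes[1].bar(df["day"], df["sales"])
axes[1].set_title("Bar chart")
axes[1].set_xlabel("day")

fig.tight_layout()

out_path = os.path.join(tempfile.mkdtemp(), "sales.png")
fig.savefig(out_path)
plt.close(fig)
print(out_path)
```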

Data visualization is a powerful tool for exploring and communicating insights from data. Python provides a rich set of libraries for creating a wide range of visualizations, from basic static plots to interactive plots.

INPUT

OUTPUT

CHAPTER 6

DATA SCIENCE COMPONENTS

Data components are essential elements used for storing, organizing and manipulating data. These components are crucial for development and programming tasks, as they allow developers to work with various types of data efficiently.

Data science is an interdisciplinary field that uses scientific techniques, procedures, algorithms and systems to extract knowledge and insights from structured and unstructured data.

VARIABLES

Variables are used to store data values in memory. Variables are

created simply by assigning a value to a name. For example, a=10 creates

a variable named ‘a’ with a value of 10. Variables can store different

types of data such as integers, floats, strings, lists and dictionaries.

SETS

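Sets are unordered collections of unique elements in Python. Sets do not allow duplicate values, and their elements are stored in no guaranteed order. Sets are defined using curly braces, for example {1, 2, 3}; an empty set is created with set(), because {} creates an empty dictionary.
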
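The behaviour of sets described above can be checked with a short sketch. The element values are assumptions for illustration.

```python
tags = {"numpy", "pandas", "numpy"}   # the duplicate "numpy" is dropped
empty_set = set()                     # {} would create an empty dict
empty_dict = {}

a = {1, 2, 3}
b = {3, 4}

union = a | b            # {1, 2, 3, 4}
intersection = a & b     # {3}
difference = a - b       # {1, 2}

print(len(tags))                         # 2
print(type(empty_set).__name__, type(empty_dict).__name__)   # set dict
print(sorted(union), sorted(intersection), sorted(difference))
```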
ARRAYS

Arrays in Python are data structures that store multiple values of the same type. Core Python has no built-in array type (lists are normally used instead, and the standard library offers a basic array module), but the NumPy library provides multidimensional arrays that are widely used in scientific computing and data analysis.

Examples:

DATA FRAMES

Data frames are data structures commonly used in data analysis and manipulation tasks. They are provided by the pandas library, which is built on top of NumPy, and allow developers to work with tabular data in a versatile and efficient manner.

PACKAGES

Packages are collections of Python modules that are organized in a directory hierarchy. Packages allow developers to structure their code in a more organized and maintainable way and provide a namespace for organizing related functionality.

Each data component has its own characteristics and advantages, allowing developers to choose the most suitable data structure.

ASSIGNMENTS

CONCLUSION

Python has emerged as the predominant language in the field of data science due to its flexibility, extensive libraries, and strong community support. Its simplicity and readability make it accessible to both beginners and experienced programmers, enabling data scientists to efficiently manipulate, analyze and visualize data. Python’s libraries such as NumPy, pandas and Matplotlib provide powerful tools for data manipulation and visualization, while frameworks like scikit-learn and TensorFlow enable the development of complex machine learning models. The language’s versatility allows data scientists to seamlessly integrate different tools and technologies, making it an invaluable asset in tackling diverse data science problems. With Python, data scientists can easily process large datasets, derive meaningful insights and build predictive models, making it an essential tool for anyone working in the field of data science.
