Session-1 DataFrame

Data Frame

Uploaded by

Ssk Kumar

EDA

Exploratory Data Analysis (EDA) is a crucial step in the data analysis process.

In [ ]: ====================== Data Analysis =======================

1.Pandas ------- Dataframe read and write operations
2.Numpy ------- Numerical python Math operations
3.Matplotlib ------- plots , graphs , visualization
4.Seaborn ------- plots
5.Plotly ------- plots
6.Bokeh ------- plots

====================== Machine Learning =====================

7.Scikit-learn (sklearn) ------ Model development
8.stats packages ------ Linear Regression

====================== Web scraping and Database connection ======

9.Sqlite ------ SQL Connection
10.Beautiful soup ------ scrape the data
11.websocket ------ scrape the data

====================== Deep Learning ==========================

12.Tensorflow ------ Deep learning model development (Google)
13.keras
14.pytorch ------ developed by
15.Opencv ------ computer vision(reading and writing images)
16.Pillow ------ reading images

====================== NLP ======================================

17.NLTK ----- Natural language tool kit
18.SpaCy ----- NLP Models
19.wordcloud -----

====================== Web development - API ======================

20.Flask
21.Django
22.FastAPI
23.Gradio

====================== Apps creation ==============================

24.Streamlit

====================== Transformers BERT (NLP models) ==============

25.Transformers ------ Hugging Face (BERT was released by Google)

====================== DL: Pretrained Models, Object Detection =======

26.vgg16
27.Mobilenet
28.Yolo ----- Ultralytics

====================== NLP pretrained Models ========================

29.Word2Vec ----- Google
30.GloVe ----- Stanford University

====================== Model save ==================================

31.Pickle
32.Joblib

====================== GenAI LLM ====================================

33.Azure openAI
34.Google Gemini
35.Amazon BedRock
36.LLAMA Meta
37.Langchain Framework
====================== Model Deployment ================================
38.MLFlow

====================== Cloud Services ==================================

39.Azure ML Related packages
40.GCP vertex ai packages
41.Amazon sagemaker packages

====================== Allen NLP ======================================

42. Allen NLP packages

====================== ML using Pyspark ================================

43.MLlib package

====================== Small packages ==================================

44.random
45.math
46.time
47.logging

Step-1 : Import Packages

In [1]: import pandas as pd

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

Step-2 : Create a DataFrame using List

In [7]: import pandas as pd

pd.DataFrame

Out[7]: pandas.core.frame.DataFrame

In [9]: import pandas as pd

pd.DataFrame()

Out[9]:

In [13]: import pandas as pd

data=pd.DataFrame()
data

# We created a DataFrame,
# but it has no data yet (no rows and no columns).
# We stored the DataFrame in a variable named 'data'.

Out[13]:

Step-3 : Provide The Data

In [16]: name=['Navya','Sneha','Yamu']
pd.DataFrame()
Out[16]:

In [18]: name=['Navya','Sneha','Yamu']
pd.DataFrame(name)

Out[18]: 0

0 Navya

1 Sneha

2 Yamu

In [20]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
pd.DataFrame(zip(name,age))

Out[20]: 0 1

0 Navya 20

1 Sneha 21

2 Yamu 22

In [22]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
pd.DataFrame(zip(name,age,city))

Out[22]: 0 1 2

0 Navya 20 Hyd

1 Sneha 21 Delhi

2 Yamu 22 Pune

In [24]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
data=[name,age,city]
pd.DataFrame(data)

Out[24]: 0 1 2

0 Navya Sneha Yamu

1 20 21 22

2 Hyd Delhi Pune

In [26]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
df=pd.DataFrame(zip(name,age,city))
df
Out[26]: 0 1 2

0 Navya 20 Hyd

1 Sneha 21 Delhi

2 Yamu 22 Pune
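
The cells above give two different shapes for the same three lists. As a minimal sketch (the variable names by_rows and by_cols are only illustrative): a list of lists turns each list into a row, while zip turns each list into a column.

```python
import pandas as pd

name = ['Navya', 'Sneha', 'Yamu']
age = [20, 21, 22]
city = ['Hyd', 'Delhi', 'Pune']

# A list of lists: each inner list becomes one ROW.
by_rows = pd.DataFrame([name, age, city])

# zip groups the i-th items together, so each tuple is one row
# and each original list ends up as one COLUMN.
by_cols = pd.DataFrame(zip(name, age, city))

print(by_rows.iloc[0].tolist())  # first row holds all the names
print(by_cols[1].tolist())       # column 1 holds all the ages
```

For one person per row, the zip form is usually the layout you want.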

Step-4 : Provide The Columns

Column names are provided as a list

The number of column names must exactly match the number of columns in the data

Here the data has 3 columns , so we need a list with 3 names

In [30]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
df=pd.DataFrame(zip(name,age,city),columns=cols)
df

Out[30]: Names Age City

0 Navya 20 Hyd

1 Sneha 21 Delhi

2 Yamu 22 Pune
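
To see why the number of names has to match, here is a small sketch of what happens when it does not. The variable mismatch_error is only an illustrative name, and the exact error text may differ between pandas versions.

```python
import pandas as pd

name = ['Navya', 'Sneha', 'Yamu']
age = [20, 21, 22]
city = ['Hyd', 'Delhi', 'Pune']

mismatch_error = None
try:
    # Three columns of data but only two names: pandas refuses.
    pd.DataFrame(zip(name, age, city), columns=['Names', 'Age'])
except ValueError as err:
    mismatch_error = err
    print('ValueError:', err)

# With three names the DataFrame is created normally.
df = pd.DataFrame(zip(name, age, city), columns=['Names', 'Age', 'City'])

# Renaming later follows the same rule: exactly one label per column.
df.columns = ['Name', 'Age', 'Town']
print(df.columns.tolist())
```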

Step-5 : Provide the Index

In [33]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=[1,2,3]
df=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df

Out[33]: Names Age City

1 Navya 20 Hyd

2 Sneha 21 Delhi

3 Yamu 22 Pune

In [35]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df
Out[35]: Names Age City

A Navya 20 Hyd

B Sneha 21 Delhi

C Yamu 22 Pune

Step-6 : How to add a New Column to an existing DataFrame

Here we already have a DataFrame named df

It has 3 columns

Now we want to add a new column , Marks

We need to create a new array or list

The length of the list must equal the number of rows

Here we have 3 rows , so the new list must also have 3 values

In [ ]: # df['<new column name>']=<list>

In [38]: marks=[100,200,300]
df['Marks']=marks
df

Out[38]: Names Age City Marks

A Navya 20 Hyd 100

B Sneha 21 Delhi 200

C Yamu 22 Pune 300
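
The length rule can be checked directly. This is a minimal sketch under the assumption of a small 3-row DataFrame like the one above; the column name Passed is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Names': ['Navya', 'Sneha', 'Yamu'], 'Age': [20, 21, 22]},
    index=['A', 'B', 'C'],
)

length_error = None
try:
    # Only 2 values for 3 rows: the assignment is rejected.
    df['Marks'] = [100, 200]
except ValueError as err:
    length_error = err
    print('ValueError:', err)

# A single scalar is the exception: it is repeated for every row.
df['Passed'] = True
print(df)
```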

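Step-7 : Create a DataFrame starting from an empty DataFrame

In the steps above we created lists

and built the DataFrame by passing those lists to pd.DataFrame

Here we start with an empty DataFrame and add the lists one column at a time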

In [41]: df1=pd.DataFrame()
df1

Out[41]:

In [43]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
df1['Name']=name
df1['Age']=age
df1['City']=city
df1
Out[43]: Name Age City

0 Navya 20 Hyd

1 Sneha 21 Delhi

2 Yamu 22 Pune

Step-8 : Create a DataFrame using Dictionary

In [50]: dict1={'Names':['Navya','Sneha','Yamu'],'Age':[20,21,22],'City':['Hyd','Delhi','Pune']}
dict1

Out[50]: {'Names': ['Navya', 'Sneha', 'Yamu'],

'Age': [20, 21, 22],
'City': ['Hyd', 'Delhi', 'Pune']}

In [52]: df2=pd.DataFrame(dict1)
df2

Out[52]: Names Age City

0 Navya 20 Hyd

1 Sneha 21 Delhi

2 Yamu 22 Pune

In [54]: df2=pd.DataFrame(dict1,index=['A','B','C'])
df2

Out[54]: Names Age City

A Navya 20 Hyd

B Sneha 21 Delhi

C Yamu 22 Pune

Keys behave as column names

Values behave as the rows of each column
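
A short sketch of the key/value rule, and of the opposite layout. It uses DataFrame.from_dict with orient='index', which treats the keys as row labels instead; the variable names are illustrative.

```python
import pandas as pd

d = {'Names': ['Navya', 'Sneha', 'Yamu'], 'Age': [20, 21, 22]}

# Default: keys become column names, each value list fills that column.
as_columns = pd.DataFrame(d)
print(as_columns.columns.tolist())
print(as_columns.shape)

# orient='index': keys become row labels instead.
as_rows = pd.DataFrame.from_dict(d, orient='index')
print(as_rows.index.tolist())
print(as_rows.shape)
```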

In [57]: dict2={'Name':'Navya','Age':20,'City':'Hyd'}
dict2

Out[57]: {'Name': 'Navya', 'Age': 20, 'City': 'Hyd'}

In [61]: df3=pd.DataFrame(dict2)
df3
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[61], line 1
----> 1 df3=pd.DataFrame(dict2)
2 df3

File ~\anaconda3\Lib\site-packages\pandas\core\frame.py:778, in DataFrame.__init__(self, data, index, columns, dtype, copy)
772 mgr = self._init_mgr(
773 data, axes={"index": index, "columns": columns}, dtype=dtype, copy=copy
774 )
776 elif isinstance(data, dict):
777 # GH#38939 de facto copy defaults to False only in non-dict cases
--> 778 mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
779 elif isinstance(data, ma.MaskedArray):
780 from numpy.ma import mrecords

File ~\anaconda3\Lib\site-packages\pandas\core\internals\construction.py:503, in dict_to_mgr(data, index, columns, dtype, typ, copy)
499 else:
500 # dtype check to exclude e.g. range objects, scalars
501 arrays = [x.copy() if hasattr(x, "dtype") else x for x in arrays]
--> 503 return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)

File ~\anaconda3\Lib\site-packages\pandas\core\internals\construction.py:114, in arrays_to_mgr(arrays, columns, index, dtype, verify_integrity, typ, consolidate)
111 if verify_integrity:
112 # figure out the index, if necessary
113 if index is None:
--> 114 index = _extract_index(arrays)
115 else:
116 index = ensure_index(index)

File ~\anaconda3\Lib\site-packages\pandas\core\internals\construction.py:667, in _extract_index(data)
664 raise ValueError("Per-column arrays must each be 1-dimensional")
666 if not indexes and not raw_lengths:
--> 667 raise ValueError("If using all scalar values, you must pass an index")
669 if have_series:
670 index = union_indexes(indexes)

ValueError: If using all scalar values, you must pass an index

In [63]: dict2={'Name':'Navya','Age':20,'City':'Hyd'}
pd.DataFrame(dict2,index=[1])

# If using all scalar values, you must pass an index

Out[63]: Name Age City

1 Navya 20 Hyd

In [65]: dict2={'Name':'Navya','Age':20,'City':'Hyd'}
pd.DataFrame(dict2,index=[1,2])
Out[65]: Name Age City

1 Navya 20 Hyd

2 Navya 20 Hyd
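
Passing an index is one way around the scalar-values error. Another option, sketched here as an alternative rather than as the notebook's own method, is to wrap the dictionary in a list so that each dictionary is read as one row. The second record is made up for illustration.

```python
import pandas as pd

record = {'Name': 'Navya', 'Age': 20, 'City': 'Hyd'}

# A list of dicts: each dict is one row, keys are still the columns.
one_row = pd.DataFrame([record])
print(one_row)

# The same pattern scales to several records.
second = {'Name': 'Sneha', 'Age': 21, 'City': 'Delhi'}  # illustrative record
two_rows = pd.DataFrame([record, second])
print(two_rows)
```

With this form pandas assigns the default index 0, 1, ... so no index argument is needed.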

Array-like data can be represented in 3 ways :

list : plain Python

numpy : NumPy package

tensor : TensorFlow

In [68]: l1=[1,2,3]
import numpy as np
np.array(l1)

Out[68]: array([1, 2, 3])

In [70]: l1=[1,2,3]
l2=[11,12,13]
l1+l2

Out[70]: [1, 2, 3, 11, 12, 13]

In [72]: import numpy as np

np.array(l1)
np.array(l2)
np.array(l1+l2)

Out[72]: array([ 1, 2, 3, 11, 12, 13])

In [74]: l1=[1,2,3]
a=np.array(l1)
l2=[11,12,13]
b=np.array(l2)
a+b

Out[74]: array([12, 14, 16])

In [76]: l1=[1,2,3]
a=np.array(l1)
l2=[11,12,13]
b=np.array(l2)
a*b

Out[76]: array([11, 24, 39])

In [78]: l1=[1,2,3]
a=np.array(l1)
l2=[11,12,13]
b=np.array(l2)
a+b,a*b

Out[78]: (array([12, 14, 16]), array([11, 24, 39]))

Step-9 : Drop the column

To drop a column we use the drop method

Methods are called on the DataFrame name , the same way string methods are called on a string name

It mainly takes 3 arguments

1.Label : the name of the column (or row) to drop

2.axis

axis = 1 represents columns

axis = 0 represents rows

3.inplace

By default drop does not change the original DataFrame ; it returns a modified copy

The modified DataFrame can be saved under the same name or a different name

If you want to keep it under the same name , pass inplace=True

In [ ]: # create a dataframe and drop any column

In [81]: df4=pd.DataFrame()
df4

Out[81]:

In [ ]: df4.drop()

In [87]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4

Out[87]: Names Age City

A Navya 20 Hyd

B Sneha 21 Delhi

C Yamu 22 Pune

In [97]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4.drop('City',axis=1)
Out[97]: Names Age

A Navya 20

B Sneha 21

C Yamu 22

In [103… name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4.drop('A',axis=0)

Out[103… Names Age City

B Sneha 21 Delhi

C Yamu 22 Pune

In [107… name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4.drop('A',axis=0,inplace=True)

In [109… df4

Out[109… Names Age City

B Sneha 21 Delhi

C Yamu 22 Pune

In [4]: name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df4=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df4.drop('A',axis=0,inplace=False)

Out[4]: Names Age City

B Sneha 21 Delhi

C Yamu 22 Pune
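
The effect of inplace can be checked with a short sketch. It also uses the columns= and index= keywords of drop, which are an alternative spelling of the label-plus-axis form used above; the variable names are illustrative.

```python
import pandas as pd

df4 = pd.DataFrame(
    {'Names': ['Navya', 'Sneha', 'Yamu'],
     'Age': [20, 21, 22],
     'City': ['Hyd', 'Delhi', 'Pune']},
    index=['A', 'B', 'C'],
)

no_city = df4.drop(columns='City')  # same as df4.drop('City', axis=1)
no_row_a = df4.drop(index='A')      # same as df4.drop('A', axis=0)
shape_before = df4.shape            # df4 itself is still 3 rows x 3 columns
print(shape_before)

# inplace=True changes df4 itself and returns None.
returned = df4.drop('A', axis=0, inplace=True)
print(returned)
print(df4.index.tolist())
```

Because inplace=True returns None, writing df4 = df4.drop(..., inplace=True) would replace the DataFrame with None.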

In [ ]: # create two dataframes df1 and df2

# add those dataframes

############# df1 ###########

Names Age City
Ramesh 20 Hyd
############ df2 ###########
Names Age City
Suresh 21 Blr

Names Age City

Ramesh 20 Hyd
Suresh 21 Blr

append
concat
join

In [34]: dict1={'Name':'Ramesh','Age':20,'City':'Hyd'}
df5=pd.DataFrame(dict1,index=[1])
dict2={'Name':'Suresh','Age':21,'City':'Blr'}
df6=pd.DataFrame(dict2,index=[2])
result=pd.concat([df5,df6],ignore_index=True)
print(result)

Name Age City

0 Ramesh 20 Hyd
1 Suresh 21 Blr
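
A small sketch of what ignore_index changes in the cell above. Note that DataFrame.append was removed in pandas 2.0, so pd.concat is the option that works across versions. The names kept and renumbered are illustrative.

```python
import pandas as pd

df5 = pd.DataFrame({'Name': 'Ramesh', 'Age': 20, 'City': 'Hyd'}, index=[1])
df6 = pd.DataFrame({'Name': 'Suresh', 'Age': 21, 'City': 'Blr'}, index=[2])

# Without ignore_index the original row labels (1 and 2) are kept.
kept = pd.concat([df5, df6])
print(kept.index.tolist())

# With ignore_index=True the rows are renumbered from 0.
renumbered = pd.concat([df5, df6], ignore_index=True)
print(renumbered.index.tolist())
```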

Step-10 : How to overwrite an existing column

We already have a DataFrame

Now we want to replace all the values of a specific column with new values

First create a list with the new values

Then assign it to the column , the same way a new column is created

df[new col]=data , to create a new column

df[old col]=new data , to overwrite the old column

In [8]: df4['Age']=[33,44,34]
df4

Out[8]: Names Age City

A Navya 33 Hyd

B Sneha 44 Delhi

C Yamu 34 Pune

In [48]: df4['Names']=['anshu','chinni','adya']
df4

Out[48]: Names Age City

A anshu 33 Hyd

B chinni 44 Delhi

C adya 34 Pune
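
Column labels are matched exactly, so a slightly different spelling does not overwrite anything; it creates a new column. A minimal sketch with a fresh DataFrame (an assumption, not the df4 from the cells above):

```python
import pandas as pd

df = pd.DataFrame(
    {'Names': ['Navya', 'Sneha', 'Yamu'], 'Age': [20, 21, 22]},
    index=['A', 'B', 'C'],
)

# Existing label: the old values are overwritten.
df['Names'] = ['anshu', 'chinni', 'adya']

# 'Name' (no s) is a different label, so a NEW column is added.
df['Name'] = ['anshu', 'chinni', 'adya']

print(df.columns.tolist())
```

Checking df.columns before assigning is a cheap way to catch this kind of typo.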

Step-11 : How to save the DataFrame

We can save the DataFrame in 2 formats

csv : comma separated values

excel

For csv : to_csv extension = .csv

For excel : to_excel extension = .xlsx

In [51]: # create a dataframe

name=['Navya','Sneha','Yamu']
age=[20,21,22]
city=['Hyd','Delhi','Pune']
cols=['Names','Age','City']
id=['A','B','C']
df=pd.DataFrame(zip(name,age,city),index=id,columns=cols)
df

Out[51]: Names Age City

A Navya 20 Hyd

B Sneha 21 Delhi

C Yamu 22 Pune

Csv Format

In [56]: # DataFramename.methodname
# where you want to save
# in what name you want to save

df.to_csv('data12.csv')

Excel sheet

In [61]: df.to_excel('data13.xlsx')

Step-12 : Read the data

read_csv

read_excel

Both are available in pandas

In [65]: pd.read_csv('data12.csv')
Out[65]: Unnamed: 0 Names Age City

0 A Navya 20 Hyd

1 B Sneha 21 Delhi

2 C Yamu 22 Pune

In [69]: pd.read_excel('data13.xlsx')

Out[69]: Unnamed: 0 Names Age City

0 A Navya 20 Hyd

1 B Sneha 21 Delhi

2 C Yamu 22 Pune

Step-13 : How to avoid extra column

While saving the data , to_csv and to_excel accept an argument named index

Pass index=False so the index is not written to the file as an extra 'Unnamed: 0' column

In [74]: # Save under a different file name and pass index=False

df.to_csv('data21.csv',index=False)
pd.read_csv('data21.csv')

Out[74]: Names Age City

0 Navya 20 Hyd

1 Sneha 21 Delhi

2 Yamu 22 Pune

In [76]: df.to_excel('data31.xlsx',index=False)
pd.read_excel('data31.xlsx')

Out[76]: Names Age City

0 Navya 20 Hyd

1 Sneha 21 Delhi

2 Yamu 22 Pune
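
index=False throws the A, B, C labels away. If the labels matter, an alternative sketch is to save the index as usual and tell read_csv which column holds it with index_col=0. The temporary folder and file name here are only illustrative.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame(
    {'Names': ['Navya', 'Sneha', 'Yamu'],
     'Age': [20, 21, 22],
     'City': ['Hyd', 'Delhi', 'Pune']},
    index=['A', 'B', 'C'],
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data_with_index.csv')  # illustrative file name
    df.to_csv(path)                            # index is written as the first column
    restored = pd.read_csv(path, index_col=0)  # first column is read back as the index

print(restored.index.tolist())
print(restored.columns.tolist())
```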

In [ ]:
