Pandas Introduction Notes

Uploaded by

Saraphina Kirika
Introduction to Pandas for Data Analysis

Demo Links:

https://github.com/Data-Science-East-AFrica/Data-Science-Boot-Camp-Notes/tree/main/pandas

- IPython notebook file with demo: https://github.com/Data-Science-East-AFrica/Data-Science-Boot-Camp-Notes/tree/main/pandas

• What Is Pandas

• Pandas Operation

- Slicing the data frame

- Merging & Joining

- Concatenation

- Changing the index

- Change Column headers

- Data munging

- Use-Case: Analyze youth unemployment data

What Is Pandas

Pandas is used for data manipulation, analysis and cleaning. Python pandas is well suited for different kinds of data, such as:

• Tabular data with heterogeneously-typed columns

• Ordered and unordered time series data

• Arbitrary matrix data with row & column labels

• Unlabelled data

• Any other form of observational or statistical data sets
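
As a rough illustration of the first two kinds of data in the list above, the sketch below builds a small table with mixed column types and a short date-indexed series. The column names and values are made up for the example and are not from the notes.

```python
import pandas as pd

# Tabular data with heterogeneously-typed columns (made-up sample values).
table = pd.DataFrame({
    "name": ["Asha", "Brian", "Chen"],
    "age": [23, 31, 27],
    "employed": [True, False, True],
})

# Time series data: values labelled by consecutive dates.
series = pd.Series(
    [10.5, 11.0, 9.8],
    index=pd.date_range("2024-01-01", periods=3, freq="D"),
)

print(table.dtypes)  # one dtype per column
print(series)
```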

Installing Pandas

To install Python Pandas, go to your command line/terminal and type “pip install pandas” or else, if you have Anaconda installed on your system, just type in “conda install pandas”.

Once the installation is completed, go to your IDE (Jupyter, PyCharm etc.) and simply import it by typing: “import pandas as pd”

If you are using Jupyter/Google Colab, run “!pip install pandas”.
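
One simple way to confirm that the installation worked is to import the library and print its version. This is only a quick sanity check, not a required step.

```python
import pandas as pd

# If the import succeeds, pandas is installed; the version string
# shows which release is active in this environment.
print(pd.__version__)
```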

Python Pandas Operations

Using Python pandas, you can perform a lot of operations with series, data frames, missing data, group by etc. Some of the common operations for data manipulation are listed below:

1). Slicing

2). Merging & Joining

3). Concatenation

4). Changing the index

5). Data Munging

Slicing the Data Frame

In order to perform slicing on data, you need a data frame. A data frame is a 2-dimensional data structure and the most common pandas object.

import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values
XYZ_web = {"Day": [1, 2, 3, 4, 5, 6],
           "Visitors": [1000, 700, 6000, 1000, 400, 350],
           "Bounce_Rate": [20, 20, 23, 15, 10, 34]}

df = pd.DataFrame(XYZ_web)
print(df)

The code above converts a dictionary into a pandas DataFrame, with the index shown on the left. Run the code in your environment to see the output, or refer to the GitHub notebook example linked above.

print(df.head(2))

This will provide the first two rows, and if you want the last two rows use the command below:

print(df.tail(2))
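
Besides head and tail, rows can also be selected by position, by index label, or by a condition. The sketch below shows one possible way to do each with the same XYZ_web data; it is an illustration, not part of the original demo notebook.

```python
import pandas as pd

XYZ_web = {"Day": [1, 2, 3, 4, 5, 6],
           "Visitors": [1000, 700, 6000, 1000, 400, 350],
           "Bounce_Rate": [20, 20, 23, 15, 10, 34]}
df = pd.DataFrame(XYZ_web)

# Position-based slice: rows at positions 1 and 2 (position 3 is excluded).
middle_rows = df.iloc[1:3]

# Label-based slice: index labels 1 through 3 (the end label is included),
# restricted to two columns.
labelled_rows = df.loc[1:3, ["Day", "Visitors"]]

# Boolean mask: only the rows where Visitors is above 900.
busy_days = df[df["Visitors"] > 900]

print(middle_rows)
print(labelled_rows)
print(busy_days)
```

Note the difference between iloc and loc: a positional slice stops before the end position, while a label slice includes the end label.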

Merging & Joining

In merging, you can merge two data frames to form a single data frame. You can also decide which columns you want to make common. Let's implement that practically: first create two data frames, which have some key-value pairs, and then merge the data frames together.

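import pandas as pd

# Two data frames with the same columns but different year indexes
df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2001, 2002, 2003, 2004])

df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2005, 2006, 2007, 2008])

# With no "on" argument, merge uses every column the two frames share
merged = pd.merge(df1, df2)
print(merged)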

As you can see above, the two data frames have merged into a single data frame. Now, you can also specify the column which you want to make common. Run the above code in your env and see the output, or refer to the GitHub link that I provided for the notebook example.
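
To make the behaviour of a plain pd.merge call more concrete, the sketch below inspects the result of merging the same two data frames. It assumes the default settings (an inner merge on all shared column names).

```python
import pandas as pd

df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2001, 2002, 2003, 2004])
df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2005, 2006, 2007, 2008])

merged = pd.merge(df1, df2)

# All three columns are shared, so all three are used as merge keys
# and each appears only once in the result.
print(list(merged.columns))

# The year indexes of the inputs are not carried over; the result
# gets a fresh 0..n-1 index.
print(list(merged.index))
```

If the year labels matter, one option is to keep them as an ordinary column before merging, since merge works on column values rather than on the index by default.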

Task 1: Make the “HPI” column common to both data frames and keep all other columns separate.

Solution:

import pandas as pd

df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2001, 2002, 2003, 2004])

df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2005, 2006, 2007, 2008])

# Only "HPI" is used as the key; the other columns from each frame are kept separately
merged = pd.merge(df1, df2, on="HPI")

print(merged)
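
When merging on “HPI” only, the remaining columns exist in both data frames, so pandas appends the suffixes _x and _y by default to tell them apart. The sketch below uses the suffixes parameter of pd.merge to choose more readable names; the suffix strings themselves are made up for this example.

```python
import pandas as pd

df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2001, 2002, 2003, 2004])
df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2005, 2006, 2007, 2008])

# The suffixes are illustrative labels for the two source frames.
merged = pd.merge(df1, df2, on="HPI", suffixes=("_2001_04", "_2005_08"))

print(list(merged.columns))
```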

Join is a convenient method to combine two differently indexed dataframes into a single result dataframe. This is quite similar to the “merge” operation, except the joining operation will be on the “index” instead of the “columns”.

import pandas as pd

df1 = pd.DataFrame({"Int_Rate": [2, 1, 2, 3], "IND_GDP": [50, 45, 45, 67]},
                   index=[2001, 2002, 2003, 2004])

df2 = pd.DataFrame({"Low_Tier_HPI": [50, 45, 67, 34], "Unemployment": [1, 3, 5, 6]},
                   index=[2001, 2003, 2004, 2004])

# Rows are matched on the index labels (the years), keeping every row of df1
joined = df1.join(df2)
print(joined)

Run the above code and study the output. As you may notice in your output, in year 2002 (index), there is no value attached to the columns “Low_Tier_HPI” and “Unemployment”, therefore it has printed NaN (Not a Number). For 2004, both values are available, therefore it has printed the respective values. Because 2004 appears twice in the index of df2, the joined result contains two rows for 2004.

Task 2: Make sure you can clearly differentiate merge and join in
pandas.
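
As a starting point for Task 2, the sketch below compares the two operations on the same data. It relies on the fact that join matches on the index and defaults to a left join, so a merge with left_index=True, right_index=True and how="left" should give the same result.

```python
import pandas as pd

df1 = pd.DataFrame({"Int_Rate": [2, 1, 2, 3], "IND_GDP": [50, 45, 45, 67]},
                   index=[2001, 2002, 2003, 2004])
df2 = pd.DataFrame({"Low_Tier_HPI": [50, 45, 67, 34], "Unemployment": [1, 3, 5, 6]},
                   index=[2001, 2003, 2004, 2004])

# join: matches rows on the index, keeps every row of df1 (left join).
joined = df1.join(df2)

# merge: matches on columns by default, but can be told to use the indexes.
via_merge = pd.merge(df1, df2, left_index=True, right_index=True, how="left")

# Check whether the two approaches produced identical frames.
print(joined.equals(via_merge))

# An inner join keeps only the years present in both frames,
# so the NaN row for 2002 disappears.
inner = df1.join(df2, how="inner")
print(inner)
```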
Concatenation

Concatenation basically glues the dataframes together. You can select the dimension on which you want to concatenate. For that, just use “pd.concat” and pass in the list of dataframes to concatenate together.

import pandas as pd

df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2001, 2002, 2003, 2004])

df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2005, 2006, 2007, 2008])

# Stack the rows of df2 underneath the rows of df1
concat = pd.concat([df1, df2])
print(concat)

Run the above code in your local env and study your output. As you might realize, the two dataframes are glued together in a single dataframe, where the index starts from 2001 all the way up to 2008.
You can also specify axis=1 in “pd.concat” in order to concatenate along the columns.

import pandas as pd

df1 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2001, 2002, 2003, 2004])

df2 = pd.DataFrame({"HPI": [80, 90, 70, 60], "Int_Rate": [2, 1, 2, 3],
                    "IND_GDP": [50, 45, 45, 67]}, index=[2005, 2006, 2007, 2008])

# Place the columns of df2 next to the columns of df1, aligned on the index
concat = pd.concat([df1, df2], axis=1)
print(concat)

Run the above code in your local env and study the output. As you might realize, there are a bunch of missing values. This happens because the dataframes didn’t have values for all the indexes you want to concatenate on. Therefore, you should make sure that you have all the information lining up correctly when you join or concatenate on the axis.
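
The sketch below illustrates the point about lining up indexes. The small data frames (rates, gdp, hpi) are hypothetical and reuse values from the examples above. When the indexes match, concatenating along the columns produces no missing values; when they only partly overlap, join="inner" is one option for keeping only the shared index labels.

```python
import pandas as pd

# Two frames that share exactly the same index.
rates = pd.DataFrame({"Int_Rate": [2, 1, 2, 3]}, index=[2001, 2002, 2003, 2004])
gdp = pd.DataFrame({"IND_GDP": [50, 45, 45, 67]}, index=[2001, 2002, 2003, 2004])

# Indexes line up, so every cell has a value.
aligned = pd.concat([rates, gdp], axis=1)
print(aligned)

# A frame that only covers some of the years.
hpi = pd.DataFrame({"HPI": [80, 90]}, index=[2003, 2004])

# join="inner" keeps only the index labels present in both frames.
overlap = pd.concat([rates, hpi], axis=1, join="inner")
print(overlap)
```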
Change the index

Now let's understand how to change the index values in a dataframe. For example, let's create a dataframe with some key-value pairs in a dictionary and change the index values.

import pandas as pd

df = pd.DataFrame({"Day": [1, 2, 3, 4], "Visitors": [200, 100, 230, 300],
                   "Bounce_Rate": [20, 45, 60, 10]})

# Use the "Day" column as the index; inplace=True modifies df directly
df.set_index("Day", inplace=True)

print(df)

Run the above code and study the output, and you will realize that the index value has been changed with respect to the “Day” column.
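
As a variation on the example above, the sketch below assigns the result of set_index to a new variable instead of using inplace=True, looks up a row by its new label, and then uses reset_index to turn the index back into an ordinary column.

```python
import pandas as pd

df = pd.DataFrame({"Day": [1, 2, 3, 4], "Visitors": [200, 100, 230, 300],
                   "Bounce_Rate": [20, 45, 60, 10]})

# Without inplace=True, set_index returns a new frame and leaves df unchanged.
indexed = df.set_index("Day")

# Rows can now be looked up by their "Day" label.
print(indexed.loc[3, "Visitors"])

# reset_index moves "Day" back into a regular column.
restored = indexed.reset_index()
print(restored)
```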

Change the Column Headers

Let's take the above example, where we will change the column header from “Visitors” to “Users”.

import pandas as pd

df = pd.DataFrame({"Day": [1, 2, 3, 4], "Visitors": [200, 100, 230, 300],
                   "Bounce_Rate": [20, 45, 60, 10]})

# Map the old column name to the new one; other columns are left unchanged
df = df.rename(columns={"Visitors": "Users"})
print(df)

Run the above code in your local env and you will notice that the column header “Visitors” has been changed to “Users”.
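
The same rename call can change several headers at once by adding more entries to the dictionary. In the sketch below, “Bounce_Pct” is a made-up name used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Day": [1, 2, 3, 4], "Visitors": [200, 100, 230, 300],
                   "Bounce_Rate": [20, 45, 60, 10]})

# Each key is an existing header, each value is its replacement.
# "Bounce_Pct" is a hypothetical new name.
renamed = df.rename(columns={"Visitors": "Users", "Bounce_Rate": "Bounce_Pct"})

print(list(renamed.columns))
print(list(df.columns))  # rename returns a new frame; df keeps its old headers
```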

Data Munging

In data munging, you can convert data from one format into a different format. For example, if you have a .csv file, you can convert it into .html or any other data format as well.

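import pandas as pd

# Read the CSV file, using its first column as the index
country = pd.read_csv("train.csv", index_col=0)
# Write the same data out as an HTML table
country.to_html("index.html")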

Once you run this code in your local env, an HTML file will be created named “index.html”. You can directly copy the path of the file and paste it in your browser, which displays the data in an HTML format.
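
If you do not have a “train.csv” file at hand, the same conversion can be tried entirely in memory. The sketch below reads CSV text from a string and converts it to HTML and JSON strings. The CSV content is made-up sample data, not the data set used in the notes.

```python
import io

import pandas as pd

# Made-up CSV content standing in for a real file.
csv_text = "country,year,rate\nA,2010,1.5\nB,2010,2.5\n"

# read_csv accepts any file-like object, so StringIO works in place of a path.
country = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Called without a path, to_html and to_json return the result as a string.
html = country.to_html()
json_text = country.to_json(orient="index")

print(html)
print(json_text)
```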
