DATAFRAME
A DataFrame is a two-dimensional object that represents data in the
form of rows and columns. It is similar to a spreadsheet or an SQL
table. This is the most commonly used pandas object. Once we store
data in a DataFrame, we can perform various operations that are
useful in analyzing and understanding the data.
DATAFRAME STRUCTURE
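         PLAYERNAME  IPLTEAM  BASEPRICEINCR    (COLUMNS)
0        ROHIT       MI       13
1        VIRAT       RCB      17
2        HARDIK      MI       14
(INDEX)  (DATA)
The values 0, 1, 2 on the left are the INDEX, the top row holds the
COLUMNS, and the remaining cells are the DATA.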
PROPERTIES OF DATAFRAME
A DataFrame has two axes (indices)-
Row index (axis=0)
Column index (axis=1)
It is similar to a spreadsheet, where the row index is called the index
and the column index is called the column name.
A DataFrame can contain heterogeneous data (different columns can hold
different data types).
The size of a DataFrame is mutable.
The data in a DataFrame is mutable.
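The properties listed above can be checked with a short sketch. It rebuilds the table from the DATAFRAME STRUCTURE section; the derived column PRICE_X2 is a hypothetical name used only for illustration.

```python
import pandas as pd

# Data taken from the DATAFRAME STRUCTURE table.
df = pd.DataFrame({
    "PLAYERNAME": ["ROHIT", "VIRAT", "HARDIK"],
    "IPLTEAM": ["MI", "RCB", "MI"],
    "BASEPRICEINCR": [13, 17, 14],
})

print(df.shape)    # (number of rows, number of columns)
print(df.dtypes)   # heterogeneous: text columns next to an integer column

# Data is mutable: change a single cell in place.
df.loc[0, "BASEPRICEINCR"] = 15

# Size is mutable: add a column (hypothetical PRICE_X2), then drop a row.
df["PRICE_X2"] = df["BASEPRICEINCR"] * 2
df = df.drop(index=2)

print(df.shape)
```

Here axis=0 runs down the rows and axis=1 runs across the columns, which is why drop(index=2) removes a row.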
A data frame can be created using any of the following-
1. Series
2. Lists
3. Dictionary
4. A numpy 2D array
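As a minimal sketch of options 2 and 4, the code below builds one DataFrame from a list of lists and another from a NumPy 2D array. The player rows reuse the table from the DATAFRAME STRUCTURE section; the array values and the column names A, B, C are hypothetical.

```python
import numpy as np
import pandas as pd

# From a list of lists: each inner list becomes one row.
rows = [["ROHIT", "MI", 13], ["VIRAT", "RCB", 17], ["HARDIK", "MI", 14]]
df_list = pd.DataFrame(rows, columns=["PLAYERNAME", "IPLTEAM", "BASEPRICEINCR"])

# From a NumPy 2D array: the array shape gives rows x columns.
arr = np.array([[1, 2, 3], [4, 5, 6]])   # hypothetical values
df_arr = pd.DataFrame(arr, columns=["A", "B", "C"])

print(df_list)
print(df_arr)
```

If the columns argument is omitted, pandas numbers the columns 0, 1, 2 and so on.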
How to create Dataframe From Series
Program-
import pandas as pd
s = pd.Series(['a','b','c','d'])
df=pd.DataFrame(s)
print(df)
Output-
   0
0  a
1  b
2  c
3  d
The default column name is 0.
DataFrame from Dictionary of Series
Example-
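A minimal sketch of this approach, using hypothetical data: each dictionary key becomes a column name, and each Series supplies that column's values, matched by index label.

```python
import pandas as pd

# Hypothetical data: keys become column names, Series become columns.
data = {
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"]),
}
df = pd.DataFrame(data)

# Row "d" exists only in the second Series, so column "one" holds NaN there.
print(df)
```

The row index of the result is the union of the index labels of all the Series.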
DataFrame from List of Dictionaries
Example-
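A minimal sketch of this approach, using hypothetical records: each dictionary becomes one row, and the dictionary keys become the column names.

```python
import pandas as pd

# Hypothetical records: one dictionary per row.
records = [
    {"name": "A", "marks": 80},
    {"name": "B", "marks": 75, "grade": "B+"},
]
df = pd.DataFrame(records)

# The first record has no "grade" key, so that cell is filled with NaN.
print(df)
```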
Iteration on Rows and Columns
If we want to access records or data from a data frame row wise or
column wise, then iteration is used. pandas provides 2 functions to
perform iterations-
1. iterrows()
2. iteritems() (named items() in pandas 2.0 and later)
iterrows()
It is used to access the data row wise. Example-
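A minimal sketch of row-wise iteration, assuming a small DataFrame built from the player table in the DATAFRAME STRUCTURE section. The list named seen is a hypothetical helper that records what the loop visits.

```python
import pandas as pd

df = pd.DataFrame({
    "PLAYERNAME": ["ROHIT", "VIRAT", "HARDIK"],
    "IPLTEAM": ["MI", "RCB", "MI"],
})

# iterrows() yields (index label, row as a Series) pairs.
seen = []
for idx, row in df.iterrows():
    print(idx, row["PLAYERNAME"], row["IPLTEAM"])
    seen.append((idx, row["PLAYERNAME"]))
```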
iteritems()
It is used to access the data column wise. In pandas 2.0 and later
iteritems() has been removed, and the same method is called items().
Example-
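A minimal sketch of column-wise iteration, assuming the same small player table. It uses items(), which also exists in older pandas versions; before pandas 2.0 the loop could be written with iteritems() instead. The list named column_names is a hypothetical helper.

```python
import pandas as pd

df = pd.DataFrame({
    "PLAYERNAME": ["ROHIT", "VIRAT", "HARDIK"],
    "IPLTEAM": ["MI", "RCB", "MI"],
})

# items() yields (column name, column as a Series) pairs.
column_names = []
for name, column in df.items():
    print(name, column.tolist())
    column_names.append(name)
```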
Select operation in data frame
To access the data in a column, we can give the column name (in
quotes) as a subscript.
e.g. - df['empid'] This can also be done by using df.empid.
To access multiple columns we can pass a list of column names, as
df[['col1', 'col2', ...]]
Example -
>>> df['empid']   (or df.empid)
0 101
1 102
2 103
3 104
4 105
5 106
Name: empid, dtype: int64
>>> df[['empid','ename']]
   empid     ename
0    101    Sachin
1    102     Vinod
2    103   Lakhbir
3    104      Anil
4    105  Devinder
5    106  UmaSelvi
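The selections above can be reproduced with a short sketch. The DataFrame is rebuilt from the values shown in the output; the variable names ids, same_ids and subset are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "empid": [101, 102, 103, 104, 105, 106],
    "ename": ["Sachin", "Vinod", "Lakhbir", "Anil", "Devinder", "UmaSelvi"],
})

ids = df["empid"]                 # one column name -> Series
same_ids = df.empid               # attribute access gives the same Series
subset = df[["empid", "ename"]]   # list of column names -> DataFrame

print(ids)
print(subset)
```

The subscript form works for any column name, while the attribute form only works when the name is a valid Python identifier and does not clash with an existing DataFrame attribute.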
To Add & Rename a column in data frame
Program-
import pandas as pd
s = pd.Series([10,15,18,22])
df=pd.DataFrame(s)
# To rename the default column of the DataFrame as List1
df.columns=['List1']
# To create a new column List2 with all values as 20
df['List2']=20
# Add List1 and List2 and store the result in a new column List3
df['List3']=df['List1']+df['List2']
print(df)
Output-
   List1  List2  List3
0     10     20     30
1     15     20     35
2     18     20     38
3     22     20     42