Analyzing Data Using Pandas
Pandas is a Python library for working with relational or labeled data. It
provides data structures for manipulating tabular data and time series, and
it is built on top of the NumPy library. The module is conventionally imported as:
import pandas as pd
Pandas provides two main data structures for manipulating data:
Series
DataFrame
A Pandas Series is a one-dimensional labeled array capable of holding data
of any type (integers, strings, floats, Python objects, etc.).
A Series is comparable to a single column in an Excel sheet.
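The column analogy can be checked directly. The following is a minimal sketch, assuming a small made-up table (the column names "name" and "marks" and their values are hypothetical): selecting one column of a DataFrame gives back a Series.

```python
import pandas as pd

# Hypothetical table: column names and values are made up for illustration.
table = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera"],
    "marks": [82, 75, 91],
})

# Selecting one column by name returns a Series.
marks = table["marks"]
print(type(marks).__name__)  # Series
print(marks.name)            # marks
```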
Pandas Series Examples
import pandas as pd
# a simple list of integers
data = [1, 2, 3, 4]
ser = pd.Series(data)
print(ser)
Output:
0    1
1    2
2    3
3    4
dtype: int64
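The left-hand column in the output is the index, which defaults to 0, 1, 2, and so on. As a small sketch of what "labeled" means, the same values can be given custom labels through the index argument (the labels "a" to "d" here are hypothetical):

```python
import pandas as pd

# Hypothetical labels chosen for illustration.
ser = pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"])

print(ser["c"])         # value stored under label "c"
print(list(ser.index))  # the labels themselves
```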
Creating a Pandas Series
A Pandas Series can be created from a list, a dictionary, a scalar value,
and other sources. The examples here cover some of the common ways to
create a series:
Creating a series from an array: To create a series from an array, import
the numpy module and build the array with its array() function.
import pandas as pd
import numpy as np
# a simple NumPy array of strings
data = np.array(['V','J','T','I'])
ser = pd.Series(data)
print(ser)
Output:
0    V
1    J
2    T
3    I
dtype: object
Creating a series from a list:
import pandas as pd
# a simple list (named so that it does not shadow the built-in list type)
letters = ['V', 'J', 'T', 'I']
# create a series from the list
ser = pd.Series(letters)
print(ser)
Output:
0    V
1    J
2    T
3    I
dtype: object
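A series can also be built from a dictionary or from a scalar value. A minimal sketch, with made-up keys and values: dictionary keys become the index labels, and a scalar is repeated once for every label passed in index.

```python
import pandas as pd

# From a dictionary: keys become the index labels.
from_dict = pd.Series({"V": 10, "J": 20, "T": 30})

# From a scalar: the value is repeated once per index label.
from_scalar = pd.Series(5, index=["a", "b", "c"])

print(from_dict)
print(from_scalar)
```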
Accessing Elements of a Series by Position
# import pandas and numpy
import pandas as pd
import numpy as np
# create a simple array
data = np.array(['V', 'J', 'T', 'I', 'D', 'A', 'T', 'A'])
ser = pd.Series(data)
# retrieve the first four elements
print(ser[:4])
Output:
0    V
1    J
2    T
3    I
dtype: object
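Single elements and slices counted from the end can be reached the same way. The sketch below uses .iloc[], which always works by position, on the same eight-letter array:

```python
import numpy as np
import pandas as pd

data = np.array(['V', 'J', 'T', 'I', 'D', 'A', 'T', 'A'])
ser = pd.Series(data)

first = ser.iloc[0]         # first element by position
last = ser.iloc[-1]         # last element by position
last_three = ser.iloc[-3:]  # slice of the last three elements

print(first, last)
print(list(last_three))
```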
# import pandas lib as pd
import pandas as pd
# read_excel reads the first sheet of an Excel file by default
df = pd.read_excel('VJTI_DEPT.xlsx')
print(df)
# take the NAME column as a Series
data = pd.Series(df['NAME'])
data
# using the indexing operator
data[1:4]
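Indexing a Series using .loc[]:
# .loc[] selects by label, and the end label is included
data.loc[1:4]
Indexing a Series using .iloc[]:
# .iloc[] selects by position, and the end position is excluded
data.iloc[1:4]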
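The two slices look alike but do not return the same rows. The following is a small sketch of the difference, assuming a stand-in Series with the default integer index (its values are made up and it does not come from the Excel file): .loc[] slices by label and includes the end label, while .iloc[] slices by position and excludes the end position.

```python
import pandas as pd

# Stand-in for a column read from a file; the values are made up.
data = pd.Series(["a", "b", "c", "d", "e", "f"])

by_label = data.loc[1:4]      # labels 1, 2, 3 and 4 (end included)
by_position = data.iloc[1:4]  # positions 1, 2 and 3 (end excluded)

print(list(by_label))     # ['b', 'c', 'd', 'e']
print(list(by_position))  # ['b', 'c', 'd']
```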