
Data Structures in Pandas Solution: Code

The document contains code snippets demonstrating various data structures and operations in Pandas:
1. It creates Series objects for student heights and weights, and combines them into a DataFrame.
2. It shows how to handle missing data by dropping rows with NaN values.
3. It demonstrates combining DataFrames by appending a row, concatenating DataFrames, and merging on a shared key column.
4. Additional code snippets showcase indexing DataFrames, filtering, grouping, accessing data, working with CSV files, and generating random data.

Uploaded by

Mayuri

1. Data Structures in Pandas Solution

Code:-

#Write your code here


import pandas as pd
import numpy as np
heights_A = pd.Series([176.2,158.4,167.6,156.2,161.4])
heights_A.index = ['s1','s2','s3','s4','s5']
print(heights_A.shape)

# TASK 2
weights_A = pd.Series([85.1,90.2,76.8,80.4,78.9])
weights_A.index = ['s1','s2','s3','s4','s5']
print(weights_A.dtype)

#TASK 3
df_A = pd.DataFrame()
df_A['Student_height'] = heights_A
df_A['Student_weight'] = weights_A
print(df_A.shape)
#TASK 4
my_mean = 170.0
my_std = 25.0
np.random.seed(100)
heights_B = pd.Series(np.random.normal(loc = my_mean, scale = my_std, size = 5))
heights_B.index = ['s1','s2','s3','s4','s5']

my_mean1 = 75.0
my_std1 = 12.0
weights_B = pd.Series(np.random.normal(loc = my_mean1,scale = my_std1,size = 5))
weights_B.index = ['s1','s2','s3','s4','s5']
print(heights_B.mean())

#TASK 5
df_B = pd.DataFrame()
df_B['Student_height'] = heights_B
df_B['Student_weight'] = weights_B
print(df_B.columns)

#TASK 6
data = {'ClassA' : df_A,'ClassB':df_B}
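# pd.Panel was removed in pandas 0.25; concatenating the dict of DataFrames
# yields a MultiIndex DataFrame (class, student) holding the same data.
# Note: p.shape is then (10, 2) rather than the Panel's (2, 5, 2).
p = pd.concat(data)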
print(p.shape)
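
Since pd.Panel is not available in recent pandas releases, the following is a minimal sketch of one possible stand-in: a DataFrame with a two-level index built by pd.concat. The class B values and the level names 'Class' and 'Student' are assumptions made for illustration; they are not the seeded random values used above.

```python
import pandas as pd

idx = ['s1', 's2', 's3', 's4', 's5']
df_A = pd.DataFrame({'Student_height': [176.2, 158.4, 167.6, 156.2, 161.4],
                     'Student_weight': [85.1, 90.2, 76.8, 80.4, 78.9]}, index=idx)
# Hypothetical class B values, chosen only for illustration
df_B = pd.DataFrame({'Student_height': [170.0, 165.5, 180.1, 159.9, 172.3],
                     'Student_weight': [75.0, 70.2, 88.4, 66.1, 79.5]}, index=idx)

# The dict keys become the outer index level, the student ids the inner one
stacked = pd.concat({'ClassA': df_A, 'ClassB': df_B}, names=['Class', 'Student'])

print(stacked.shape)                 # all rows of both classes
class_a = stacked.loc['ClassA']      # one "item" of the former panel
one_value = stacked.loc[('ClassB', 's3'), 'Student_height']
print(class_a)
print(one_value)
```

Selecting with the outer key returns the per-class DataFrame, which covers the most common use of a panel of DataFrames.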

2. Data Cleaning Solutions - Python Pandas 

Code:-
#Write your code here
import pandas as pd
import numpy as np
height_A = pd.Series([176.2,158.4,167.6,156.2,161.4])
height_A.index = ['s1','s2','s3','s4','s5']
weight_A = pd.Series([85.1,90.2,76.8,80.4,78.9])
weight_A.index = ['s1','s2','s3','s4','s5']
df_A = pd.DataFrame()
df_A['Student_height'] = height_A
df_A['Student_weight'] = weight_A

df_A.loc['s3'] = np.nan
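# Single .loc call; chained indexing (df_A.loc['s5'][1]) may assign to a temporary copy
df_A.loc['s5', 'Student_weight'] = np.nan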

df_A2 = df_A.dropna(how = 'any')


print(df_A2)
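
The difference between how='any' and how='all' is easy to miss, so here is a small sketch under the same missing-value pattern as above (s3 fully missing, s5 missing only the weight). The fillna line is an added alternative for comparison, not part of the original solution.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Student_height': [176.2, 158.4, np.nan, 156.2, 161.4],
                   'Student_weight': [85.1, 90.2, np.nan, 80.4, np.nan]},
                  index=['s1', 's2', 's3', 's4', 's5'])

any_dropped = df.dropna(how='any')   # drops rows with at least one NaN
all_dropped = df.dropna(how='all')   # drops rows only if every value is NaN
filled = df.fillna(df.mean())        # alternative: fill with the column mean

print(any_dropped.index.tolist())
print(all_dropped.index.tolist())
print(filled)
```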

3. Data Merging Hands-On(2) Solution: Python Pandas 

Code:- 

#Write your code here


import pandas as pd
import numpy as np

height_A = pd.Series([176.2,158.4,167.6,156.2,161.4])
height_A.index = ['s1','s2','s3','s4','s5']

weights_A = pd.Series([85.1,90.2,76.8,80.4,78.9])
weights_A.index = ['s1','s2','s3','s4','s5']

df_A = pd.DataFrame()
df_A['Student_height'] = height_A
df_A['Student_weight'] = weights_A
df_A['Gender'] = ['M','F','M','M','F']

s = pd.Series([165.4,82.7,'F'],index =
['Student_height','Student_weight','Gender'],name='s6')

# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df_AA = pd.concat([df_A, s.to_frame().T])
print(df_AA)

#TASK - 2
my_mean = 170.0
my_std = 25.0
np.random.seed(100)
heights_B = pd.Series(np.random.normal(loc = my_mean,scale=my_std,size = 5))
heights_B.index = ['s1','s2','s3','s4','s5']

my_mean1 = 75.0
my_std1 = 12.0
np.random.seed(100)
weights_B = pd.Series(np.random.normal(loc = my_mean1,scale=my_std1,size = 5))
weights_B.index = ['s1','s2','s3','s4','s5']

df_B = pd.DataFrame()
df_B['Student_height'] = heights_B
df_B['Student_weight'] = weights_B

df_B.index=['s7','s8','s9','s10','s11']
df_B['Gender'] = ['F','M','F','F','M']

df = pd.concat([df_AA,df_B])
print(df)
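
Adding a single row can also be done by wrapping the row in a one-row DataFrame before concatenating, which keeps the numeric columns as floats. This is a sketch of that variant, assuming the same 's6' values as above; the name new_row is introduced here for illustration.

```python
import pandas as pd

df_A = pd.DataFrame({'Student_height': [176.2, 158.4, 167.6, 156.2, 161.4],
                     'Student_weight': [85.1, 90.2, 76.8, 80.4, 78.9],
                     'Gender': ['M', 'F', 'M', 'M', 'F']},
                    index=['s1', 's2', 's3', 's4', 's5'])

# One-row DataFrame: each column keeps its own dtype
new_row = pd.DataFrame({'Student_height': [165.4],
                        'Student_weight': [82.7],
                        'Gender': ['F']}, index=['s6'])

df_AA = pd.concat([df_A, new_row])
print(df_AA)
print(df_AA.dtypes)
```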

4. Data Merging Hands-On(1) Solutions:- Python Pandas

Code:- 

#Write your code here


import pandas as pd
import numpy as np
nameid = pd.Series(range(101,111))
name = pd.Series(['person' + str(i) for i in range(1,11)])
master = pd.DataFrame()
master['nameid'] = nameid
master['name'] = name
transaction = pd.DataFrame({'nameid':[108,108,108,103],'product':
['iPhone','Nokia','Micromax','Vivo']})
mdf = pd.merge(master,transaction,on='nameid')
print(mdf)
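
pd.merge defaults to an inner join, so people without transactions disappear from mdf. The sketch below contrasts that with a left join; the how='left' and indicator=True arguments are additions for illustration and are not part of the original task.

```python
import pandas as pd

master = pd.DataFrame({'nameid': range(101, 111),
                       'name': ['person' + str(i) for i in range(1, 11)]})
transaction = pd.DataFrame({'nameid': [108, 108, 108, 103],
                            'product': ['iPhone', 'Nokia', 'Micromax', 'Vivo']})

# Inner join (default): only nameids present in both frames
inner = pd.merge(master, transaction, on='nameid')

# Left join: every master row is kept; '_merge' records where each row came from
left = pd.merge(master, transaction, on='nameid', how='left', indicator=True)

print(len(inner), len(left))
print(left[left['_merge'] == 'left_only'])
```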

5. Indexing Dataframe Hands-On Solutions - Python Pandas

Code:- 
import pandas as pd
import numpy as np

#TASK- 1
DatetimeIndex = pd.date_range(start = '09/01/2017',end='09/15/2017')
print(DatetimeIndex[2])

#TASK - 2
datelist = ['14-Sep-2017','09-Sep-2017']
date_to_be_searched = pd.to_datetime(datelist)
print(date_to_be_searched)

#TASK - 3
# Check the parsed dates against the date range created in TASK 1
print(date_to_be_searched.isin(DatetimeIndex))

#TASK - 4
arraylist = [['classA']*5 + ['classB']*5,['s1','s2','s3','s4','s5']* 2]
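# arraylist holds two parallel arrays of length 10, so from_arrays pairs them element-wise
# (from_product would build every combination of the two lists instead)
mi_index = pd.MultiIndex.from_arrays(arraylist,names=['First Level','Second Level'])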
print(mi_index.levels)
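
MultiIndex.from_arrays and MultiIndex.from_product expect different inputs, which the following sketch tries to make concrete: from_arrays takes parallel arrays of equal length, while from_product takes the unique values of each level and builds all combinations. The variable names classes and students are introduced here for illustration.

```python
import pandas as pd

classes = ['classA', 'classB']
students = ['s1', 's2', 's3', 's4', 's5']
names = ['First Level', 'Second Level']

# Unique values per level -> Cartesian product (2 * 5 = 10 entries)
via_product = pd.MultiIndex.from_product([classes, students], names=names)

# Parallel arrays, one entry per row (also 10 entries)
arrays = [['classA'] * 5 + ['classB'] * 5, students * 2]
via_arrays = pd.MultiIndex.from_arrays(arrays, names=names)

print(len(via_product), len(via_arrays))
print(via_product.equals(via_arrays))
```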

6. Data Aggregation - Python Pandas


Code:- 

#Write your code here


import pandas as pd
import numpy as np
heights_A = pd.Series([176.2,158.4,167.6,156.2,161.4])
heights_A.index = ['s1','s2','s3','s4','s5']
weights_A = pd.Series([85.1,90.2,76.8,80.4,78.9])
weights_A.index = ['s1','s2','s3','s4','s5']

df_A = pd.DataFrame()

df_A['Student_height'] = heights_A
df_A['Student_weight'] = weights_A

df_A_filter1 = df_A[(df_A.Student_weight < 80.0) & (df_A.Student_height > 160.0)]


print(df_A_filter1)

#TASK - 2
df_A_filter2 = df_A[df_A.index.isin(['s5'])]
print(df_A_filter2)

#TASK - 3
df_A['Gender'] = ['M','F','M','M','F']
df_groups = df_A.groupby('Gender')
print(df_groups.mean())
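
Beyond a single mean(), a grouped frame can compute several statistics at once. This is a sketch using named aggregation on the same data; the output column names mean_height, max_weight and n are made up for the example.

```python
import pandas as pd

df_A = pd.DataFrame({'Student_height': [176.2, 158.4, 167.6, 156.2, 161.4],
                     'Student_weight': [85.1, 90.2, 76.8, 80.4, 78.9],
                     'Gender': ['M', 'F', 'M', 'M', 'F']},
                    index=['s1', 's2', 's3', 's4', 's5'])

# Each keyword is an output column: (source column, aggregation)
summary = df_A.groupby('Gender').agg(
    mean_height=('Student_height', 'mean'),
    max_weight=('Student_weight', 'max'),
    n=('Student_height', 'size'),
)
print(summary)
```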

7. Accessing Pandas Data Structures - Python Pandas


 
Code:- 

#Write your code here


import pandas as pd
import numpy as np

heights_A = pd.Series([176.2,158.4,167.6,156.2,161.4])
heights_A.index = ['s1','s2','s3','s4','s5']
print(heights_A.iloc[1])  # positional access; heights_A[1] on a label index is deprecated

# TASK 2
print(heights_A.iloc[1:4])

# TASK 3
weights_A = pd.Series([85.1,90.2,76.8,80.4,78.9])
weights_A.index = ['s1','s2','s3','s4','s5']

df_A = pd.DataFrame()
df_A['Student_height'] = heights_A
df_A['Student_weight'] = weights_A

height = df_A['Student_height']
print(type(height))

# TASK 4
df_s1s2 = df_A[df_A.index.isin(['s1','s2'])]
print(df_s1s2)
# TASK 5
df_s2s5s1 = df_A[df_A.index.isin(['s1','s2','s5'])]
df_s2s5s1 = df_s2s5s1.reindex(['s2','s5','s1'])
print(df_s2s5s1)

#TASK 6
df_s1s4 = df_A[df_A.index.isin(['s1','s4'])]
print(df_s1s4)
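
The tasks above mix positional and label-based access, so a short side-by-side sketch of .iloc and .loc may help. It uses the same heights data; the variable names are chosen for the example.

```python
import pandas as pd

heights = pd.Series([176.2, 158.4, 167.6, 156.2, 161.4],
                    index=['s1', 's2', 's3', 's4', 's5'])

second_by_position = heights.iloc[1]      # second element by position
second_by_label = heights.loc['s2']       # same element by label

by_position = heights.iloc[1:4]           # positions 1, 2, 3 (end excluded)
by_label = heights.loc['s2':'s4']         # labels s2..s4 (end included)

# A list of labels selects rows in the order given
reordered = heights.loc[['s2', 's5', 's1']]

print(second_by_position, second_by_label)
print(by_position.equals(by_label))
print(reordered)
```

Passing the labels in the desired order to .loc gives the same result as filtering with isin and then calling reindex.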

8. Working With CSV Files 

Code:- 

#Write your code here


import pandas as pd
import numpy as np
heights_A = pd.Series([176.2,158.4,167.6,156.2,161.4])
heights_A.index = ['s1','s2','s3','s4','s5']
weights_A = pd.Series([85.1,90.2,76.8,80.4,78.9])
weights_A.index = ['s1','s2','s3','s4','s5']
df_A = pd.DataFrame()
df_A['Student_height'] = heights_A
df_A['Student_weight'] = weights_A
df_A.to_csv('classA.csv')

# TASK 2
df_A2 = pd.read_csv('classA.csv')
print(df_A2)

#TASK 3
df_A3 = pd.read_csv('classA.csv',index_col = 0)
print(df_A3)

#TASK 4
my_mean = 170.0
my_std = 25.0
np.random.seed(100)
heights_B = pd.Series(np.random.normal(loc = my_mean, scale = my_std, size = 5))
heights_B.index = ['s1','s2','s3','s4','s5']
my_mean1 = 75.0
my_std1 = 12.0
np.random.seed(100)
weights_B = pd.Series(np.random.normal(loc = my_mean1,scale = my_std1,size = 5))
weights_B.index = ['s1','s2','s3','s4','s5']
df_B = pd.DataFrame()
df_B['Student_height'] = heights_B
df_B['Student_weight'] = weights_B

df_B.to_csv('classB.csv',index = False)
print('classB.csv')

#TASK 5
df_B2 = pd.read_csv('classB.csv')
print(df_B2)

#TASK 6
df_B3 = pd.read_csv('classB.csv',header = None)
print(df_B3)

#TASK 7

df_B4 = pd.read_csv('classB.csv',header = None, skiprows = 2)


print(df_B4)
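
The effect of index_col and header on read_csv can be checked without touching the file system. This sketch writes the CSV text to an in-memory buffer (io.StringIO is used here only for illustration) and reads it back three ways.

```python
import io
import pandas as pd

idx = ['s1', 's2', 's3', 's4', 's5']
df = pd.DataFrame({'Student_height': [176.2, 158.4, 167.6, 156.2, 161.4],
                   'Student_weight': [85.1, 90.2, 76.8, 80.4, 78.9]}, index=idx)

buf = io.StringIO()
df.to_csv(buf)                 # the index is written as an unnamed first column
text = buf.getvalue()

# Default: the saved index comes back as an ordinary column
with_default = pd.read_csv(io.StringIO(text))
# index_col=0: the first column is restored as the index
with_index = pd.read_csv(io.StringIO(text), index_col=0)
# header=None: the header line is treated as a data row
no_header = pd.read_csv(io.StringIO(text), header=None)

print(with_default.columns.tolist())
print(with_index.index.tolist())
print(no_header.shape)
```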
