[go: up one dir, main page]

0% found this document useful (0 votes)
14 views12 pages

PYTHON PROGRAMMING: Data Handling

Uploaded by

samarsingh9551
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views12 pages

PYTHON PROGRAMMING: Data Handling

Uploaded by

samarsingh9551
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 12

PRACTICAL PROGRAM

PYTHON PROGRAMMING: Data Handling


Q1: Create a panda’s series from a dictionary of values and a ndarray

Creating a Pandas Series from a dictionary:

import pandas as pd

# Create a dictionary

data_dict = {'a': 10, 'b': 20, 'c': 30}

# Create a Pandas Series from the dictionary

series_from_dict = pd.Series(data_dict)

print("Series from dictionary:")

print(series_from_dict)

OUTPUT:

Creating a Pandas Series from a NumPy ndarray:

import numpy as np

import pandas as pd

# Create a NumPy ndarray

data_ndarray = np.array([10, 20, 30])

# Create a Pandas Series from the ndarray

series_from_ndarray = pd.Series(data_ndarray, index=['a', 'b', 'c'])

print("Series from ndarray:")


print(series_from_ndarray)

OUTPUT:

Q2: Given a Series, print all the elements that are above the 75th percentile

import pandas as pd

# Example Series

data = pd.Series([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

# Calculate the 75th percentile

percentile_75 = data.quantile(0.75)

# Filter the Series to include only elements above the 75th percentile

above_75th_percentile = data[data > percentile_75]

print("Elements above the 75th percentile:")

print(above_75th_percentile)

OTPUT:

Q3: Create a Data Frame quarterly sales where each row contains the item category, item name, and
expenditure. Group the rows by the category and print the total expenditure per category.

Coding
import pandas as pd

# Create a DataFrame with sample quarterly sales data

data = {

'Item Category': ['Electronics', 'Electronics', 'Furniture', 'Furniture', 'Clothing', 'Clothing'],

'Item Name': ['Laptop', 'Smartphone', 'Sofa', 'Desk', 'Jacket', 'Shoes'],

'Expenditure': [1200, 800, 600, 300, 150, 100]

df = pd.DataFrame(data)

# Group by 'Item Category' and calculate the total expenditure per category

total_expenditure_per_category = df.groupby('Item Category')['Expenditure'].sum()

# Print the results

print("Total expenditure per category:")

print(total_expenditure_per_category)

Output:

Q4: . Create a data frame for examination result and display row labels, column labels data types of
each column and the dimensions.

Coding:

import pandas as pd

# Create a DataFrame with sample examination results data

data = {

'Student Name': ['Alice', 'Bob', 'Charlie', 'David'],

'Math': [85, 90, 78, 88],


'Science': [92, 85, 80, 87],

'English': [88, 91, 85, 82]

df = pd.DataFrame(data)

# Display row labels (index)

print("Row labels (index):")

print(df.index)

# Display column labels

print("\nColumn labels:")

print(df.columns)

# Display data types of each column

print("\nData types of each column:")

print(df.dtypes)

# Display dimensions of the DataFrame (rows, columns)

print("\nDimensions of the DataFrame:")

print(df.shape)

Output:
Q5:

Let's start by creating an example DataFrame that contains some duplicate rows:

Create a sample DataFrame with duplicate rows

data = {

'Student Name': ['Alice', 'Bob', 'Charlie', 'David', 'Alice', 'Bob'],

'Math': [85, 90, 78, 88, 85, 90],

'Science': [92, 85, 80, 87, 92, 85],

'English': [88, 91, 85, 82, 88, 91]

df = pd.DataFrame(data)

print("Original DataFrame:")

print(df)

Removing Duplicate Rows

To remove duplicate rows based on all columns or specific columns, you can use the
drop_duplicates() method.

Remove duplicate rows based on all columns


df_no_duplicates_all = df.drop_duplicates()

print("\nDataFrame with duplicate rows removed (based on all columns):")

print(df_no_duplicates_all)

Removing Duplicate Rows Based on Specific Columns

# Remove duplicate rows based on specific columns, e.g., 'Student Name'


df_no_duplicates_specific = df.drop_duplicates(subset=['Student Name'])
print("\nDataFrame with duplicate rows removed (based on 'Student Name'):")
print(df_no_duplicates_specific)
Filtering Based on Conditions
# Filter rows where Math score is greater than 80
df_filtered_math = df[df['Math'] > 80]
print("\nDataFrame with rows where Math score > 80:")
print(df_filtered_math)

Filtering Rows Where Multiple Conditions Are Met


# Filter rows where Math score is greater than 80 and Science score is greater than 85
df_filtered_multiple = df[(df['Math'] > 80) & (df['Science'] > 85)]

print("\nDataFrame with rows where Math score > 80 and Science score > 85:")
print(df_filtered_multiple)
Removing Rows with Missing Values
# Assume some missing values for demonstration
df.loc[2, 'Math'] = None

# Remove rows with any missing values


df_no_missing = df.dropna()

print("\nDataFrame with rows containing missing values removed:")


print(df_no_missing)

Notes:
Summary of Filtering Methods
1. Removing Duplicate Rows:
o df.drop_duplicates(): Removes duplicate rows based on all columns.
o df.drop_duplicates(subset=['column_name']): Removes duplicates based on
specific columns.
2. Filtering Based on Conditions:
o df[df['column_name'] > value]: Filters rows based on a condition.
o df[(df['column1'] > value1) & (df['column2'] < value2)]: Filters rows based on
multiple conditions.
3. Removing Rows with Missing Values:
o df.dropna(): Removes rows with any missing values.
o df.dropna(subset=['column_name']): Removes rows with missing values in
specific columns.
Q6: Write python code to read ‘dept.csv’ file containing columns as (d_id, d_name and city )
and display all records separated by comma

Ans:

import csv
print('Details of csv file:')

with open('dept.csv') as csvfile:

data = csv.reader(csvfile, delimiter=' ')

for row in data:

print(', '.join(row))

Output is:

Details of csv file:

10,sales,kolkatta

20,marketing,delhi

Q7: Write a program to read the contents of “dept.csv” file using with open()

Ans:

import csv

with open('dept.csv','r') as csv_file:

reader=csv.reader(csv_file)

rows=[ ]

for rec in reader:

rows.append(rec)

print(rows)

Output is:

[['d_id d_name city '], ['10 sales kolkatta'], ['11 marketing delhi']]

Q8: Write a program to count the number of records present in “dept.csv” file

Ans:

import csv

f=open('dept.csv','r')

csv_reader=csv.reader(f)

columns=next(csv_reader)

c=0

for row in csv_reader:

c=c+1

print('No.of records are:',c)

Output is

No.of records are: 2

Use of next() = The function next() is used to directly point to this list of fields to read the
next line in the CSV file. .next() method returns the current row and advances the iterator to

the next row.

Q9: Write a program to search the record of a particular student from CSV file on the basis of

inputted name.

Ans:

import csv

f=open('student.csv','r')

csv_reader=csv.reader(f)

name=input('Enter name to be searched:')

for row in csv_reader:

if (row[1]==name):

print(row)

Output is:

Enter name to be searched:APARNA

['1', 'APARNA', 'XII']

Q10: Write a program to add (append) Employee records onto a csv file.

Ans:

import csv

with open('student.csv',mode='a') as csvfile:

mywriter=csv.writer(csvfile,delimiter=',')

ans='y'

while ans.lower()=='y':

rno=int(input('Enter roll no:'))

name=input('Enter student name:')

clas=input('Enter class:')

mywriter.writerow([rno,name,clas])

print('## Data Saved ##')

ans=input('Add More?:')

Output is:

Enter roll no:4

Enter student name:MANGESH

Enter class:XI
## Data Saved ##

Add More?:y

Enter roll no:5

Enter student name:PRIYANKA

Enter class:XII

## Data Saved ##

Add More?:n

Q11: Write a menu-driven program implementing user-defined functions to perform different

functions on a csv file “student” such as:

(a) Write a single record to csv

(b) Write all the records in one single go onto the csv.

(c) Display the contents of the csv file

Ans:

import csv

def crcsv1():

with open('student.csv','a') as csvfile:

fobj=csv.writer(csvfile)

while True:

rno=int(input('Enter Roll no:'))

name=input('Name:')

marks = float(input("Marks:"))

Line=[rno,name,marks]

fobj.writerow(Line)

ch=input('More (Y/N):')

if ch=='N':

break

def crcsv2():

with open('student.csv','w') as csvfile:

fobj=csv.writer(csvfile)

Lines=[]

while True:

rno=int(input('Enter Roll no:'))

name=input('Name:')
marks = float(input("Marks:"))

Lines.append([rno,name,marks])

fobj.writerow(Lines)

ch=input('More (Y/N):')

if ch=='N':

break

fobj.writerows(Lines)

def showall():

with open('student.csv') as csvfile:

csvobj=csv.reader(csvfile)

for line in csvobj:

print(','.join(line))

print('csv file handling:')

while True:

opt=input('1: Create 2: CreateCSVAll 3:Show CSV 4.Quit\n')

if opt=='1':

crcsv1()

elif opt=='2':

crcsv2()

elif opt=='3':

showall()

else:

break

Output is:

csv file handling:

1: Create 2: CreateCSVAll 3:Show CSV 4.Quit

Enter Roll no:1

Name:ARCHANA SAKAT

Marks:453

More (Y/N):Y

Enter Roll no:2

Name:SUNIL
Marks:433

More (Y/N):N

1: Create 2: CreateCSVAll 3:Show CSV 4.Quit

1,ARCHANA SAKAT,453.0

2,SUNIL,433.0

1: Create 2: CreateCSVAll 3:Show CSV 4.Quit

Enter Roll no:3

Name:AAKASH

Marks:455

More (Y/N):N

1: Create 2: CreateCSVAll 3:Show CSV 4.Quit

[3, 'AAKASH', 455.0]

1: Create 2: CreateCSVAll 3:Show CSV 4.Quit

>>>

Q12: Given the school result data, analyses the performance of the students on different
parameters, e.g subject wise or class wise.
coding
import pandas as pd
import matplotlib.pyplot as plt

# Load the CSV file into a DataFrame


df = pd.read_csv('school_results.csv')
print("DataFrame loaded:")
print(df)

# Subject-wise performance statistics


subject_stats = df[['Math', 'Science', 'English']].agg(['mean', 'median', 'std'])
print("\nSubject-wise performance statistics:")
print(subject_stats)

# Class-wise performance
class_performance = df.groupby('Class').agg({'Math': 'mean', 'Science': 'mean', 'English':
'mean'})
print("\nClass-wise performance:")
print(class_performance)

# Plot subject-wise performance


subject_stats.T.plot(kind='bar', figsize=(10, 6))
plt.title('Subject-wise Performance Statistics')
plt.xlabel('Subject')
plt.ylabel('Score')
plt.legend(title='Statistics')
plt.show()

# Plot class-wise performance


class_performance.plot(kind='bar', figsize=(10, 6))
plt.title('Class-wise Performance')
plt.xlabel('Class')
plt.ylabel('Average Score')
plt.legend(title='Subject')
plt.show()

You might also like