Pandas More

Uploaded by

gtecstudent795

Pandas more

Viewing Data
CSV File: marks.csv
Name,Maths,Science,English
Rahul,88,90,85
Priya,92,85,89
Anil,75,80,78
Sneha,95,91,92
Kiran,89,87,84
Meera,90,93,88

Python Code
import pandas as pd

# Step 1: Read the CSV file


df = pd.read_csv("marks.csv")

# Step 2: View the first 5 rows


print(" First 5 rows:")
print(df.head())

# Step 3: View the last 5 rows


print("\n Last 5 rows:")
print(df.tail())

# Step 4: View the number of rows and columns


print("\n Shape of the DataFrame (rows, columns):")
print(df.shape)

# Step 5: View column names


print("\n Column Names:")
print(df.columns)

# Step 6: Quick summary (data types, non-null count, memory usage)
# info() prints its report itself and returns None, so it is not wrapped in print()

print("\n DataFrame Info:")
df.info()
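
The viewing methods above accept a few options worth knowing. The following is a minimal sketch, not part of the original handout: it embeds the marks.csv rows as a string (the name CSV_TEXT is an assumption made for illustration) so it runs without the file on disk, passes an explicit row count to head() and tail(), and adds describe() for numeric summaries.

```python
import io

import pandas as pd

# Inline copy of marks.csv so the sketch runs without the file on disk
# (CSV_TEXT is an illustrative name, not something from the handout).
CSV_TEXT = """Name,Maths,Science,English
Rahul,88,90,85
Priya,92,85,89
Anil,75,80,78
Sneha,95,91,92
Kiran,89,87,84
Meera,90,93,88
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

first_three = df.head(3)         # head() takes an optional row count (default 5)
last_two = df.tail(2)            # tail() works the same way from the end
column_names = list(df.columns)  # plain Python list of column names
summary = df.describe()          # count, mean, std, min, quartiles, max per numeric column

print(first_three)
print(last_two)
print(column_names)
print(summary)
```

Because the file has only 6 rows, the default head() and tail() each show 5 of them, so passing a smaller count makes the difference between the two calls easier to see.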

Selecting Columns or Rows – Full Example


We'll use the same CSV file:
marks.csv
Name,Maths,Science,English
Rahul,88,90,85
Priya,92,85,89
Anil,75,80,78
Sneha,95,91,92
Kiran,89,87,84
Meera,90,93,88

Python Code – Column and Row Selection


import pandas as pd

df = pd.read_csv("marks.csv")

# Select a single column (Series)


print(" Only Maths column:")
print(df["Maths"])

# Select multiple columns


print("\n Maths and Science columns:")
print(df[["Maths", "Science"]])

# Select a row by index label using loc[] (with the default index, the labels are 0, 1, 2, ...)


print("\n First row using loc[0]:")
print(df.loc[0]) # Rahul's data

# Select a row by position using iloc[]


print("\n Second row using iloc[1]:")
print(df.iloc[1]) # Priya's data

# Select a specific student's data (the result is a DataFrame, not a single row Series)


print("\n Data for student named 'Sneha':")
print(df[df["Name"] == "Sneha"])
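
loc[] and iloc[] can also take a row and a column together, which the examples above do not show. The sketch below is an illustration under the assumption that the same marks data is embedded as a string (CSV_TEXT is a made-up name); it picks single cells, slices rows, and uses set_index() so that students can be looked up by name.

```python
import io

import pandas as pd

# Inline copy of marks.csv (illustrative; the handout reads the file from disk).
CSV_TEXT = """Name,Maths,Science,English
Rahul,88,90,85
Priya,92,85,89
Anil,75,80,78
Sneha,95,91,92
Kiran,89,87,84
Meera,90,93,88
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# iloc[row_position, column_position]: row 1 is Priya, column 1 is Maths
priya_maths = df.iloc[1, 1]

# iloc slices exclude the end position: rows at positions 0 and 1
first_two_rows = df.iloc[0:2]

# loc slices include the end label: rows labelled 0, 1 and 2
subset = df.loc[0:2, ["Name", "English"]]

# Using Name as the index allows lookups by student name
by_name = df.set_index("Name")
sneha_science = by_name.loc["Sneha", "Science"]

print(priya_maths)
print(first_two_rows)
print(subset)
print(sneha_science)
```

The key difference to remember is that iloc[] slices stop before the end position, while loc[] slices include the end label.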

Filtering Data (like “if” conditions) – Full Example


CSV File: marks.csv (same as before)
Name,Maths,Science,English
Rahul,88,90,85
Priya,92,85,89
Anil,75,80,78
Sneha,95,91,92
Kiran,89,87,84
Meera,90,93,88
Python Code – Filtering Examples
import pandas as pd

df = pd.read_csv("marks.csv")

# 1. Students who scored more than 90 in Maths


print(" Maths > 90:")
print(df[df["Maths"] > 90])

# 2. Students who scored more than 85 in all subjects


print("\n Scored >85 in Maths, Science, and English:")
high_all = df[(df["Maths"] > 85) & (df["Science"] > 85) & (df["English"] > 85)]
print(high_all)

# 3. Students who scored less than 80 in English


print("\n English < 80:")
print(df[df["English"] < 80])

# 4. Students whose names start with 'P' or 'R' (passing a tuple of prefixes needs pandas 1.5 or later)


print("\n Name starts with P or R:")
filtered = df[df["Name"].str.startswith(('P', 'R'))]
print(filtered)

# 5. Students who scored between 85 and 90 in Science


print("\n Science between 85 and 90:")
print(df[(df["Science"] >= 85) & (df["Science"] <= 90)])
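
The conditions above combine comparisons with &. A few other filtering tools cover common cases more compactly. This is a hedged sketch, not the handout's own code: it assumes the same marks data embedded as a string (CSV_TEXT is an illustrative name) and shows between(), isin(), the | (or) and ~ (not) operators, and query().

```python
import io

import pandas as pd

# Inline copy of marks.csv (illustrative; the handout reads the file from disk).
CSV_TEXT = """Name,Maths,Science,English
Rahul,88,90,85
Priya,92,85,89
Anil,75,80,78
Sneha,95,91,92
Kiran,89,87,84
Meera,90,93,88
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# between() includes both endpoints by default, same as (>= 85) & (<= 90)
science_mid = df[df["Science"].between(85, 90)]

# isin() keeps rows whose value appears in the given list
selected = df[df["Name"].isin(["Anil", "Meera"])]

# | means "or": above 90 in Maths or in English
either_high = df[(df["Maths"] > 90) | (df["English"] > 90)]

# ~ negates a condition: everyone who did NOT score above 90 in Maths
not_high_maths = df[~(df["Maths"] > 90)]

# query() expresses the condition as a string
strong = df.query("Maths > 90 and Science > 85")

print(science_mid)
print(selected)
print(either_high)
print(not_high_maths)
print(strong)
```

Each condition must be wrapped in parentheses when combined with & or |, because those operators bind more tightly than the comparisons.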
Adding New Columns – Full Example
CSV File: marks.csv
(We'll keep using the same data)
Name,Maths,Science,English
Rahul,88,90,85
Priya,92,85,89
Anil,75,80,78
Sneha,95,91,92
Kiran,89,87,84
Meera,90,93,88

Python Code – Adding Columns


import pandas as pd

df = pd.read_csv("marks.csv")

# 1. Add Total column


df["Total"] = df["Maths"] + df["Science"] + df["English"]
print(" With Total Marks:\n", df)

# 2. Add Average Marks column


df["Average"] = df["Total"] / 3
print("\n With Average Marks:\n", df)

# 3. Add Result column: Pass if all marks >= 80


df["Result"] = df.apply(
    lambda row: "Pass"
    if (row["Maths"] >= 80 and row["Science"] >= 80 and row["English"] >= 80)
    else "Fail",
    axis=1,
)
print("\n With Pass/Fail Result:\n", df)

# 4. Add Grade column based on Average


def get_grade(avg):
    if avg >= 90:
        return "A+"
    elif avg >= 80:
        return "A"
    elif avg >= 70:
        return "B"
    else:
        return "C"

df["Grade"] = df["Average"].apply(get_grade)
print("\n With Grade:\n", df)
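
The Result and Grade columns above are built row by row with apply(). As an alternative, the same columns can be computed with whole-column operations. The sketch below is one possible approach and not the handout's method: it assumes NumPy is installed, embeds the marks data as a string (CSV_TEXT and subjects are illustrative names), and uses np.where() and pd.cut() with the same thresholds as get_grade().

```python
import io

import numpy as np
import pandas as pd

# Inline copy of marks.csv (illustrative; the handout reads the file from disk).
CSV_TEXT = """Name,Maths,Science,English
Rahul,88,90,85
Priya,92,85,89
Anil,75,80,78
Sneha,95,91,92
Kiran,89,87,84
Meera,90,93,88
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))
subjects = ["Maths", "Science", "English"]

# Row-wise sum and mean over the subject columns
df["Total"] = df[subjects].sum(axis=1)
df["Average"] = df[subjects].mean(axis=1)

# Pass only if every subject is at least 80
df["Result"] = np.where((df[subjects] >= 80).all(axis=1), "Pass", "Fail")

# right=False makes each bin include its lower edge:
# below 70 -> C, 70 to <80 -> B, 80 to <90 -> A, 90 and above -> A+
df["Grade"] = pd.cut(
    df["Average"],
    bins=[-np.inf, 70, 80, 90, np.inf],
    labels=["C", "B", "A", "A+"],
    right=False,
)

print(df)
```

For six rows the speed difference is irrelevant; the vectorized form mainly pays off on large tables, and apply() with a named function can be easier to read when the rules are complicated.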

Sorting Data – Full Example


CSV File: marks.csv
(We are using the same file. Total and Average are not stored in the CSV, so the code below recomputes them.)

Python Code – Sorting Examples


import pandas as pd

df = pd.read_csv("marks.csv")
# Recompute Total and Average, since they are not stored in marks.csv
df["Total"] = df["Maths"] + df["Science"] + df["English"]
df["Average"] = df["Total"] / 3

# 1. Sort by Maths marks (highest to lowest)


print(" Students sorted by Maths score (descending):")
sorted_maths = df.sort_values(by="Maths", ascending=False)
print(sorted_maths)

# 2. Sort by Total marks (highest first)


print("\n Students ranked by Total marks:")
sorted_total = df.sort_values(by="Total", ascending=False)
print(sorted_total)

# 3. Sort by Name (A–Z)


print("\n Sort by student names (alphabetical):")
sorted_name = df.sort_values(by="Name")
print(sorted_name)

# 4. Sort by Average (lowest to highest)


print("\n Sort by Average marks (ascending):")
sorted_avg = df.sort_values(by="Average")
print(sorted_avg)
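
Sorting by one column is often followed by ranking or picking the top few rows. The following sketch is an illustration, not taken from the handout: it assumes the marks data embedded as a string (CSV_TEXT is a made-up name), sorts by two columns at once, adds a Rank column with rank(), and uses nlargest() as a shortcut for "sort descending, then take the first n".

```python
import io

import pandas as pd

# Inline copy of marks.csv (illustrative; the handout reads the file from disk).
CSV_TEXT = """Name,Maths,Science,English
Rahul,88,90,85
Priya,92,85,89
Anil,75,80,78
Sneha,95,91,92
Kiran,89,87,84
Meera,90,93,88
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))
df["Total"] = df["Maths"] + df["Science"] + df["English"]

# Sort by Total (highest first); ties would be broken by Name (A-Z).
# reset_index(drop=True) renumbers the rows 0, 1, 2, ... in the new order.
ranked = df.sort_values(by=["Total", "Name"], ascending=[False, True]).reset_index(drop=True)

# method="min" gives tied students the same (best) rank
ranked["Rank"] = ranked["Total"].rank(ascending=False, method="min").astype(int)

# The two highest Maths scores, without sorting the whole table by hand
top_two_maths = df.nlargest(2, "Maths")

print(ranked)
print(top_two_maths)
```

Note that sort_values() returns a new DataFrame and leaves df unchanged, which is why the results are stored in new variables.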

9. Handling Missing Data (Null or Empty Values) – Full Example


CSV File: marks_missing.csv
Name,Maths,Science,English
Rahul,88,,85
Priya,92,85,89
Anil,,80,78
Sneha,95,91,
Kiran,89,87,84
Meera,,93,88

Python Code – Handling Missing Values


import pandas as pd

df = pd.read_csv("marks_missing.csv")
# 1. Display rows with missing values
print(" Rows with missing data:")
print(df[df.isnull().any(axis=1)])

# 2. Fill missing values with 0 (assume absent)


df_fill_zero = df.fillna(0)
print("\n Missing values filled with 0:")
print(df_fill_zero)

# 3. Drop rows with *any* missing values


df_dropped = df.dropna()
print("\n Dropped rows with missing values:")
print(df_dropped)
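
Filling with 0 and dropping rows are not the only options. The sketch below shows two further possibilities; they are assumptions added for illustration and not steps from the handout: counting missing values per column, and filling each gap with that column's mean. The marks_missing.csv rows are embedded as a string (CSV_TEXT is an illustrative name).

```python
import io

import pandas as pd

# Inline copy of marks_missing.csv (illustrative; the handout reads the file from disk).
CSV_TEXT = """Name,Maths,Science,English
Rahul,88,,85
Priya,92,85,89
Anil,,80,78
Sneha,95,91,
Kiran,89,87,84
Meera,,93,88
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Number of missing values in each column
missing_counts = df.isnull().sum()

# Fill each numeric column's gaps with that column's mean
# (mean() skips missing values when computing the average)
filled = df.fillna(df.mean(numeric_only=True))

# Drop rows only when a specific column is missing
maths_known = df.dropna(subset=["Maths"])

print(missing_counts)
print(filled)
print(maths_known)
```

Whether 0, the mean, or dropping the row is appropriate depends on what a missing mark means in the data, for example an absent student versus a mark that was never entered.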

10. GroupBy – Group and Summarize Data – Full Example


CSV File: classmarks.csv
Name,Class,Maths,Science,English
Rahul,10A,88,90,85
Priya,10A,92,85,89
Anil,10B,75,80,78
Sneha,10B,95,91,92
Kiran,10A,89,87,84
Meera,10B,90,93,88

Python Code – GroupBy Examples


import pandas as pd

df = pd.read_csv("classmarks.csv")

# 1. Average Maths marks by Class


print(" Average Maths by Class:")
print(df.groupby("Class")["Maths"].mean())

# 2. Average of all subjects by Class


print("\n Subject-wise average by Class:")
print(df.groupby("Class")[["Maths", "Science", "English"]].mean())
# 3. Count of students in each Class
print("\n Student count per Class:")
print(df.groupby("Class")["Name"].count())

# 4. Maximum marks in English per Class


print("\n Highest English score per Class:")
print(df.groupby("Class")["English"].max())

# 5. Add Total marks and get class-wise average total


df["Total"] = df["Maths"] + df["Science"] + df["English"]
print("\n Average Total Marks per Class:")
print(df.groupby("Class")["Total"].mean())
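
The examples above compute one statistic per groupby() call. Several statistics can be collected in one table with agg(). This is a minimal sketch under stated assumptions, not the handout's code: the classmarks.csv rows are embedded as a string, and the output column names (students, maths_mean, english_max) are made up for illustration. It also uses idxmax() to find the top Maths scorer in each class.

```python
import io

import pandas as pd

# Inline copy of classmarks.csv (illustrative; the handout reads the file from disk).
CSV_TEXT = """Name,Class,Maths,Science,English
Rahul,10A,88,90,85
Priya,10A,92,85,89
Anil,10B,75,80,78
Sneha,10B,95,91,92
Kiran,10A,89,87,84
Meera,10B,90,93,88
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Named aggregation: new_column=(source_column, function)
summary = df.groupby("Class").agg(
    students=("Name", "count"),
    maths_mean=("Maths", "mean"),
    english_max=("English", "max"),
)

# idxmax() returns, per class, the row label of the highest Maths score;
# loc[] then pulls those rows out of the original DataFrame
top_rows = df.groupby("Class")["Maths"].idxmax()
top_maths = df.loc[top_rows, ["Class", "Name", "Maths"]]

print(summary)
print(top_maths)
```

The result of agg() is indexed by Class, so a single value can be read with, for example, summary.loc["10A", "maths_mean"].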

11. Date and Time Handling – Full Example


CSV File: students.csv
Name,JoinDate
Rahul,2020-06-12
Priya,2019-04-20
Anil,2021-08-01
Sneha,2022-11-15
Kiran,2020-01-10
Meera,2021-03-05

Python Code – DateTime Parsing & Extraction


import pandas as pd

# 1. Load the data


df = pd.read_csv("students.csv")

# 2. Convert 'JoinDate' to datetime format


df["JoinDate"] = pd.to_datetime(df["JoinDate"])

# 3. Extract Year, Month, Day


df["Year"] = df["JoinDate"].dt.year
df["Month"] = df["JoinDate"].dt.month
df["Day"] = df["JoinDate"].dt.day
# 4. Day name (Monday, Tuesday...)
df["DayName"] = df["JoinDate"].dt.day_name()

# 5. Full output
print(" Detailed Join Date Info:")
print(df)
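
Once JoinDate is a datetime column, it can be used for arithmetic and filtering as well as extraction. The sketch below is an illustration, not part of the handout: it embeds the students.csv rows as a string (CSV_TEXT is a made-up name) and measures enrolment against a fixed reference date of 2023-01-01, which is an arbitrary assumption chosen so the numbers do not change from day to day.

```python
import io

import pandas as pd

# Inline copy of students.csv (illustrative; the handout reads the file from disk).
CSV_TEXT = """Name,JoinDate
Rahul,2020-06-12
Priya,2019-04-20
Anil,2021-08-01
Sneha,2022-11-15
Kiran,2020-01-10
Meera,2021-03-05
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# Giving the format explicitly avoids any ambiguity about day/month order
df["JoinDate"] = pd.to_datetime(df["JoinDate"], format="%Y-%m-%d")

# Hypothetical reference date (an assumption for this sketch)
reference = pd.Timestamp("2023-01-01")

# Subtracting datetimes gives a timedelta; .dt.days extracts whole days
df["DaysEnrolled"] = (reference - df["JoinDate"]).dt.days

# Month as a name instead of a number
df["MonthName"] = df["JoinDate"].dt.month_name()

# Filter on a date component: students who joined in 2021
joined_2021 = df[df["JoinDate"].dt.year == 2021]

# Datetime columns sort chronologically
earliest = df.sort_values("JoinDate").iloc[0]["Name"]

print(df)
print(joined_2021)
print(earliest)
```

To measure against the current date instead, pd.Timestamp.today() could replace the fixed reference, at the cost of results that differ each time the code runs.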
