
Experiment No. 05

Aim: To develop Python code for basic and advanced data analysis on a given dataset.

Software: Python 3.13 as Interpreter and PyCharm as Integrated Development Environment.

Theory:

Data analysis is the process of transforming raw data into meaningful insights. It helps in understanding
patterns, drawing conclusions and supporting decision-making. In this experiment, data analysis is
performed on employee records using core Python constructs such as lists and dictionaries, without
relying on external libraries.

The dataset includes employee attributes such as name, age, salary and department. Such data can be
represented in Python as a list of dictionaries, where each dictionary corresponds to one record.
With this structure, the records can be iterated over and various data analysis techniques applied to them.
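
As a minimal sketch of the list-of-dictionaries layout described above (the key names are assumptions chosen for illustration; the three records are the first three entries of the program's dataset):

```python
# One dictionary per employee record; the key names are illustrative assumptions.
employees = [
    {"name": "Alice", "age": 24, "salary": 50000, "department": "HR"},
    {"name": "Bob", "age": 27, "salary": 60000, "department": "IT"},
    {"name": "Charlie", "age": 22, "salary": 48000, "department": "HR"},
]

# Each iteration yields one complete record as a dictionary.
for emp in employees:
    print(f"{emp['name']}\t{emp['age']}\t{emp['salary']}\t{emp['department']}")

# A single attribute can be pulled out as a plain list for further analysis.
salaries = [emp["salary"] for emp in employees]
print(salaries)
```

Keeping all attributes of a record in one dictionary avoids having to keep several separate lists aligned by index.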

Mean, also called the average, is the sum of all values divided by the total number of values. For
instance, the average salary of employees can reveal overall compensation trends and help HR plan
future salary structures. Median is the middle value in an ordered dataset. It is less affected by
extreme values and is therefore useful for understanding the central tendency when the data contains
outliers. Maximum and minimum values help identify the range of a dataset.
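
The definitions above can be sketched in plain Python as follows. The sample salaries are hypothetical, with one deliberately extreme value to show how the mean shifts while the median does not; for an even number of values the sketch takes the median as the average of the two middle values.

```python
def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data (average of the two middle values if the count is even)."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 0:
        return (ordered[mid - 1] + ordered[mid]) / 2
    return ordered[mid]


# Hypothetical salaries; the last value is deliberately extreme.
sample = [48000, 50000, 52000, 60000, 250000]

print(f"Mean:   {mean(sample):.2f}")   # pulled upward by the extreme value
print(f"Median: {median(sample)}")     # stays at the middle value
print(f"Range:  {min(sample)} to {max(sample)}")
```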

Outliers are data points that significantly deviate from other observations. Identifying outliers is
crucial in fraud detection, performance analysis or sensor error checking. Outliers can be detected by
calculating the interquartile range (IQR), which is the difference between the third quartile (Q3) and
the first quartile (Q1). The first quartile (Q1) is the median of the lower half of the dataset and the
third quartile (Q3) is the median of the upper half. Values below Q1 - 1.5 × IQR or above
Q3 + 1.5 × IQR are treated as outliers.
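
A minimal sketch of IQR-based outlier detection, using the median-of-halves quartiles defined above and the common 1.5 × IQR fences (the same multiplier the program uses). The function name `iqr_outliers`, the parameter `k` and the sample data are assumptions made for illustration.

```python
def median(ordered):
    """Median of an already sorted list."""
    mid = len(ordered) // 2
    if len(ordered) % 2 == 0:
        return (ordered[mid - 1] + ordered[mid]) / 2
    return ordered[mid]


def iqr_outliers(values, k=1.5):
    """Return (outliers, (lower, upper)) using fences at k * IQR beyond Q1 and Q3."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = median(ordered[:n // 2])          # median of the lower half
    q3 = median(ordered[(n + 1) // 2:])    # median of the upper half
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return outliers, (lower, upper)


# Hypothetical salaries with one extreme value.
sample = [48000, 50000, 52000, 59000, 60000, 62000, 68000, 250000]
outliers, bounds = iqr_outliers(sample)
print(f"Bounds: {bounds}")
print(f"Outliers: {outliers}")
```

For an odd number of values, the slicing skips the middle element so that it belongs to neither half; other quartile conventions exist and can give slightly different bounds.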

Data can also be grouped by categories such as department to compute department-wise averages,
which are useful in assessing salary parity and age demographics within functional teams. Sorting
data based on salary or age helps prioritize records.
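
Grouping and sorting can be sketched as follows, assuming the records are held as a list of dictionaries (the key names are illustrative; the five records are the first five entries of the program's dataset):

```python
employees = [
    {"name": "Alice", "age": 24, "salary": 50000, "department": "HR"},
    {"name": "Bob", "age": 27, "salary": 60000, "department": "IT"},
    {"name": "Charlie", "age": 22, "salary": 48000, "department": "HR"},
    {"name": "David", "age": 32, "salary": 75000, "department": "Finance"},
    {"name": "Eva", "age": 29, "salary": 62000, "department": "IT"},
]

# Collect the salaries of each department in a dictionary of lists.
by_dept = {}
for emp in employees:
    by_dept.setdefault(emp["department"], []).append(emp["salary"])

# Department-wise average salary.
dept_avg = {dept: sum(vals) / len(vals) for dept, vals in by_dept.items()}
for dept, avg in dept_avg.items():
    print(f"{dept}: {avg:.2f}")

# Sort records by salary, highest first.
ranked = sorted(employees, key=lambda emp: emp["salary"], reverse=True)
for emp in ranked:
    print(f"{emp['name']}: {emp['salary']}")
```

The same pattern works for age by replacing the `"salary"` key with `"age"`.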

Applications of these basic data analysis techniques include HR analytics (e.g., determining average
salary per department), finance (e.g., identifying highest-paid roles), operations (e.g., identifying
departments with younger employees) and general business intelligence.

1
Python Programming Lab Data Analysis

Program:

# ------------------------------
# Basic and Advanced Data Analysis in Python
# ------------------------------

# Data stored in parallel lists for Name, Age, Salary and Department
names = ['Alice', 'Bob', 'Charlie', 'David', 'Eva', 'Frank', 'Grace',
         'Hannah', 'Ian', 'Julia']
ages = [24, 27, 22, 32, 29, 24, 30, 28, 26, 31]
salaries = [50000, 60000, 48000, 75000, 62000, 52000, 70000, 68000,
            59000, 72000]
departments = ['HR', 'IT', 'HR', 'Finance', 'IT', 'HR', 'Finance', 'IT',
               'IT', 'Finance']

# ------------------------------
# BASIC DATA ANALYSIS
# ------------------------------
print("----- BASIC DATA ANALYSIS -----\n")

# Print table header

print("Name\tAge\tSalary\tDepartment")

# Display the first 5 employee records in tabular format

for i in range(5):
    print(f"{names[i]}\t{ages[i]}\t{salaries[i]}\t{departments[i]}")

# Calculate the average (mean) age and salary

mean_age = sum(ages) / len(ages)
mean_salary = sum(salaries) / len(salaries)

print(f"\nMean Age: {mean_age:.2f}")

print(f"Mean Salary: {mean_salary:.2f}")

Dr. D. K. Singh 2 National Fire Service College, Nagpur

# Find and display minimum and maximum values for age and salary
print(f"Minimum Age: {min(ages)}")
print(f"Maximum Age: {max(ages)}")
print(f"Minimum Salary: {min(salaries)}")
print(f"Maximum Salary: {max(salaries)}")

# Count number of employees in each department using a dictionary

dept_count = {}
for dept in departments:
    if dept in dept_count:
        dept_count[dept] += 1
    else:
        dept_count[dept] = 1

# Print department-wise employee counts

print("\nDepartment-wise Employee Count:")
for dept, count in dept_count.items():
    print(f"{dept}: {count}")

# ------------------------------
# ADVANCED DATA ANALYSIS
# ------------------------------
print("\n----- ADVANCED DATA ANALYSIS -----\n")

# Function to calculate correlation coefficient between two lists

def correlation(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Numerator: sum of product of deviations
    num = sum((x[i] - mean_x) * (y[i] - mean_y) for i in range(n))

    # Denominator: product of the square roots of the summed squared deviations
    # (the 1/n factors of the standard deviations cancel against the numerator)
    den_x = sum((x[i] - mean_x) ** 2 for i in range(n)) ** 0.5
    den_y = sum((y[i] - mean_y) ** 2 for i in range(n)) ** 0.5

    return num / (den_x * den_y)

# Compute correlation between age and salary

corr = correlation(ages, salaries)
print(f"Correlation between Age and Salary: {corr:.4f}")

# Calculate average salary for each department

dept_sums = {} # To store total salary per department
dept_counts = {} # To store employee count per department

for i in range(len(departments)):
    dept = departments[i]
    salary = salaries[i]

    if dept in dept_sums:
        dept_sums[dept] += salary
        dept_counts[dept] += 1
    else:
        dept_sums[dept] = salary
        dept_counts[dept] = 1

# Display department-wise average salary

print("\nAverage Salary by Department:")
for dept in dept_sums:
    avg_salary = dept_sums[dept] / dept_counts[dept]
    print(f"{dept}: {avg_salary:.2f}")

# ------------------------------
# Outlier Detection Using IQR
# ------------------------------
# Sort salary data to compute quartiles
sorted_salaries = sorted(salaries)
n = len(sorted_salaries)

# Function to compute median of a list

def median(data):
    mid = len(data) // 2
    if len(data) % 2 == 0:
        return (data[mid - 1] + data[mid]) / 2
    else:
        return data[mid]

# Calculate Q1 (lower quartile) and Q3 (upper quartile)

Q1 = median(sorted_salaries[:n // 2])
Q3 = median(sorted_salaries[(n + 1) // 2:])
IQR = Q3 - Q1 # Interquartile Range

# Calculate lower and upper bounds for detecting outliers

lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR

print(f"\nIQR = {IQR}, Lower Bound = {lower}, Upper Bound = {upper}")

print("Outliers in Salary:")

# Identify and display salaries outside the IQR range

for i in range(n):
    if salaries[i] < lower or salaries[i] > upper:
        print(f"{names[i]}: {salaries[i]}")

Program Output: Data Analysis

----- BASIC DATA ANALYSIS -----

Name    Age     Salary  Department
Alice   24      50000   HR
Bob     27      60000   IT
Charlie 22      48000   HR
David   32      75000   Finance
Eva     29      62000   IT

Mean Age: 27.30
Mean Salary: 61600.00
Minimum Age: 22
Maximum Age: 32
Minimum Salary: 48000
Maximum Salary: 75000

Department-wise Employee Count:
HR: 3
IT: 4
Finance: 3

----- ADVANCED DATA ANALYSIS -----

Correlation between Age and Salary: 0.9701

Average Salary by Department:
HR: 50000.00
IT: 62250.00
Finance: 72333.33

IQR = 18000, Lower Bound = 25000.0, Upper Bound = 97000.0
Outliers in Salary:
