EDA LAB ASSIGNMENT2
ASSIGNMENT 02:
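import pandas as pd
# Sample data; the values are taken from the output shown below (orders_data is an assumed name)
customers_data = {"customer_id": [1, 2, 3], "customer_name": ["Alice", "Bob", "Charlie"], "region": ["North", "South", "East"]}
orders_data = {"order_id": [101, 102, 103, 104], "customer_id": [1, 2, 1, 3], "order_amount": [250, 150, 300, 200], "order_date": ["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04"]}
customers_df = pd.DataFrame(customers_data)
orders_df = pd.DataFrame(orders_data)
print("Customers DataFrame:")
print(customers_df)
print("\nOrders DataFrame:")
print(orders_df)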
OUTPUT:
Customers DataFrame:
   customer_id  customer_name  region
0            1          Alice   North
1            2            Bob   South
2            3        Charlie    East

Orders DataFrame:
   order_id  customer_id  order_amount  order_date
0       101            1           250  2023-01-01
1       102            2           150  2023-01-02
2       103            1           300  2023-01-03
3       104            3           200  2023-01-04
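# Merge on the shared key; the original merge call is missing, so the default inner join is an assumption
merged_df = pd.merge(customers_df, orders_df, on="customer_id")
print("\nMerged DataFrame:")
print(merged_df)
OUTPUT: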
print(melted_df)
OUTPUT:
print(pivot_df)
OUTPUT:
product   Keyboard  Laptop  Monitor  Phone  Tablet
order_id
101            NaN  1000.0      NaN    NaN     NaN
102            NaN     NaN      NaN  500.0     NaN
103            NaN     NaN      NaN    NaN   300.0
104            NaN     NaN    200.0    NaN     NaN
105           50.0     NaN      NaN    NaN     NaN
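The pivot() function reshapes the data so that each product becomes a column, each order_id becomes a row,
and each cell holds the price for that order and product. Combinations that do not occur in the data are
filled with NaN, which is why the prices are displayed as floats.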
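As a minimal sketch of the reshaping described above, the following example rebuilds a long-format table from the values visible in the printed output and pivots it. The DataFrame name products_df and the column name price are assumptions, since the original input code is not shown.

```python
import pandas as pd

# Long-format data reconstructed from the pivot output above.
# `products_df` and the column name `price` are assumed names.
products_df = pd.DataFrame({
    "order_id": [101, 102, 103, 104, 105],
    "product": ["Laptop", "Phone", "Tablet", "Monitor", "Keyboard"],
    "price": [1000, 500, 300, 200, 50],
})

# One row per order_id, one column per product, cells filled with price
pivot_df = products_df.pivot(index="order_id", columns="product", values="price")
print(pivot_df)
```

Because every order contains a single product, each row has exactly one non-missing value. Note that pivot() raises an error if an index/column pair appears more than once; pivot_table() with an aggregation function is the usual alternative in that case.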
Aggregating Data
We will calculate total sales per region.
print(sales_per_region)
OUTPUT:
   region  total_sales
0    East         1400
1   North          300
2    West         1150
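The aggregation code itself is not shown, so the following is only a sketch of how such a result could be produced with groupby(). The DataFrame name sales_df, the column name sales, and the individual rows are hypothetical; the amounts were chosen only so that the regional totals agree with the output above.

```python
import pandas as pd

# Hypothetical input rows; only the per-region totals come from the report.
sales_df = pd.DataFrame({
    "region": ["East", "East", "North", "West", "West"],
    "sales": [900, 500, 300, 650, 500],
})

# Group rows by region, sum the sales in each group,
# and turn the result back into a two-column DataFrame.
sales_per_region = (
    sales_df.groupby("region")["sales"]
    .sum()
    .reset_index(name="total_sales")
)
print(sales_per_region)
```

groupby() sorts the group keys by default, which is why the regions appear in alphabetical order.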
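Line Plot
import numpy as np
import matplotlib.pyplot as plt
# Generate data (the original data lines are missing; this range is an assumption)
x = np.linspace(0, 10, 100)
y = np.sin(x)
# Draw the line; the label is what plt.legend() displays
plt.plot(x, y, label='sin(x)')
# Customization
plt.title("Line Plot of sin(x)", fontsize=14)
plt.xlabel("X values", fontsize=12)
plt.ylabel("Y values", fontsize=12)
plt.legend()
plt.grid(True)
# Show plot
plt.show()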
OUTPUT:
Scatter Plot
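import numpy as np
import matplotlib.pyplot as plt
# Generate random data (assumed; the original data lines are missing)
x = np.random.rand(50)
y = np.random.rand(50)
# Red circular markers, matching the output description below
plt.scatter(x, y, color='red', marker='o')
# Customization
plt.title("Scatter Plot of Random Data", fontsize=14)
plt.xlabel("X Axis", fontsize=12)
plt.ylabel("Y Axis", fontsize=12)
plt.grid(True)
# Show plot
plt.show()
OUTPUT: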
A scatter plot with red circles representing random data points.
Histogram
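A histogram shows the distribution of a dataset by counting how many values fall into each bin.
import numpy as np
import matplotlib.pyplot as plt
# Generate random data
data = np.random.randn(1000)  # 1000 points drawn from a standard normal distribution
# Plot the histogram (the original call is missing; the bin count is an assumption)
plt.hist(data, bins=30, edgecolor='black')
# Customization
plt.title("Histogram of Normally Distributed Data", fontsize=14)
plt.xlabel("Data Bins", fontsize=12)
plt.ylabel("Frequency", fontsize=12)
plt.grid(True)
# Show plot
plt.show()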
OUTPUT:
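Bar Chart
import matplotlib.pyplot as plt
# Illustrative data (hypothetical; the original values are missing)
categories = ['A', 'B', 'C', 'D']
values = [10, 20, 15, 25]
plt.bar(categories, values)
# Customization
plt.title("Bar Chart Example", fontsize=14)
plt.xlabel("Categories", fontsize=12)
plt.ylabel("Values", fontsize=12)
plt.grid(axis='y', linestyle='--')
# Show plot
plt.show()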
OUTPUT:
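import numpy as np
import matplotlib.pyplot as plt
# Generate data (the original data lines are missing; this range is an assumption)
x = np.linspace(0, 10, 100)
y = np.cos(x)
plt.figure(figsize=(8, 5))
plt.plot(x, y, label='cos(x)', color='purple')
# Customization
plt.title("Customized Line Plot", fontsize=14)
plt.xlabel("X values", fontsize=12)
plt.ylabel("Y values", fontsize=12)
plt.legend()
plt.grid(True)
# Show plot
plt.show()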
OUTPUT:
Subplots
We can create multiple plots in a single figure.
# Adjust layout
plt.tight_layout()
plt.show()
OUTPUT:
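The code that builds the subplots is not shown above, so the following is a sketch under assumptions rather than the original listing. It uses plt.subplots() to place two line plots side by side; the choice of sin(x) and cos(x), the sample range, and the 1 x 2 layout are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)  # assumed sample range

# One figure holding one row and two columns of axes
fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Left panel
axes[0].plot(x, np.sin(x), color="blue")
axes[0].set_title("sin(x)")
axes[0].grid(True)

# Right panel
axes[1].plot(x, np.cos(x), color="purple")
axes[1].set_title("cos(x)")
axes[1].grid(True)

# Adjust layout so titles and tick labels do not overlap
plt.tight_layout()
plt.show()
```

Each entry of axes is an independent Axes object, so the customizations used earlier are applied per panel through the set_title(), set_xlabel(), and set_ylabel() methods.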
3D Plotting
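Matplotlib supports 3D visualization through Axes3D, which is created by requesting the '3d' projection when adding an axes to a figure.
import numpy as np
import matplotlib.pyplot as plt
# Create a figure with a 3D axes (the original setup lines are missing; the figure size is an assumption)
fig = plt.figure(figsize=(8, 6))
ax = fig.add_subplot(111, projection='3d')
# Generate 3D data
x = np.linspace(-5, 5, 100)
y = np.linspace(-5, 5, 100)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2))
# Draw the surface (the colormap is an assumption)
ax.plot_surface(X, Y, Z, cmap='viridis')
# Customization
ax.set_title("3D Surface Plot")
ax.set_xlabel("X Axis")
ax.set_ylabel("Y Axis")
ax.set_zlabel("Z Axis")
# Show plot
plt.show()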
OUTPUT:
Key Customizations
✔ Titles (title())
✔ Axis labels (xlabel(), ylabel())
✔ Legends (legend())
✔ Annotations (annotate())
✔ Grid (grid(True))
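Of the customizations listed above, annotate() is the only one not demonstrated in the earlier examples. A minimal sketch follows; the curve, the label text, and the text offset are illustrative choices, not part of the original assignment.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(x, y, label="sin(x)")

# Index of the largest sampled value of the curve
peak = np.argmax(y)

# xy is the point being annotated; xytext is where the label is drawn.
# arrowprops draws an arrow from the label to the point.
annotation = ax.annotate(
    "Maximum",
    xy=(x[peak], y[peak]),
    xytext=(x[peak] + 1.0, y[peak] - 0.3),
    arrowprops=dict(arrowstyle="->"),
)

ax.set_title("Annotated Line Plot", fontsize=14)
ax.set_xlabel("X values", fontsize=12)
ax.set_ylabel("Y values", fontsize=12)
ax.legend()
ax.grid(True)
plt.show()
```

Placing the text away from the annotated point keeps the label from covering the curve.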
Matplotlib provides a flexible and powerful way to visualize data. By mastering these concepts, you
can create highly customized and informative plots for data analysis and presentation.