Nandini_matplotlib_ws

The document outlines two tasks involving data visualization using Python libraries. The first task involves creating line curves for three random series with specific styles and annotations, while the second task focuses on visualizing the Iris dataset through various plots including bar graphs, histograms, density plots, scatter plots, pair plots, and heat maps. Each visualization is accompanied by appropriate legends and axis labels to enhance clarity.

Uploaded by U Soni

Q1) Create three random series objects containing floating point values, named s1, s2 and s3 respectively. Perform the following operations on these three objects using appropriate libraries:
a. Draw a line curve for each series object and plot all the curves on the same subplot with a suitable legend.
b. Apply the following style properties to the line curves:
i. The line curve for s1 should be red, drawn with a dashed line, and should display the data points using diamond-shaped markers.
ii. The line curve for s2 should be green, drawn with a dotted line, and should display the data points using * shaped markers.
iii. The line curve for s3 should be blue, drawn with a dash-dotted line, and should display the data points using o-shaped (circle) markers.
c. Also add any one shape to the graph, with a suitable annotation.

Code:
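import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Three series of 10 random floats drawn uniformly from [0, 1)
s1 = pd.Series(np.random.rand(10))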
s2 = pd.Series(np.random.rand(10))
s3 = pd.Series(np.random.rand(10))
plt.figure(figsize=(10, 6))
plt.plot(s1, color='red', linestyle='--', marker='D', label='S1')
plt.plot(s2, color='green', linestyle=':', marker='*', label='S2')
plt.plot(s3, color='blue', linestyle='-.', marker='o', label='S3')
plt.gca().add_patch(plt.Rectangle((2, 0.2), 2, 0.5, color='gray', alpha=0.3))
plt.text(2.1, 0.6, 'Rectangular Zone', fontsize=10, color='black')
plt.title('Line Curves for Random Series and a shape')
plt.xlabel('Index')
plt.ylabel('Values')
plt.legend()
plt.tight_layout()
plt.show()
Output:
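
Part (c) asks for a shape with an annotation. As an alternative to the rectangle and plain text label used above, the following is a minimal sketch that marks the largest value of s1 with an ellipse and points an arrow at it. The function name plot_series_with_annotation, the seed parameter, the ellipse size and the label text are assumptions made for illustration; they are not part of the original worksheet.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse


def plot_series_with_annotation(seed=0):
    """Plot three random series and mark the peak of s1 (hypothetical helper)."""
    # Assumption: a fixed seed is used only so the figure is repeatable.
    rng = np.random.default_rng(seed)
    s1 = pd.Series(rng.random(10), name="s1")
    s2 = pd.Series(rng.random(10), name="s2")
    s3 = pd.Series(rng.random(10), name="s3")

    fig, ax = plt.subplots(figsize=(10, 6))
    # Styles required by part (b): colour, line style and marker per series.
    ax.plot(s1.index, s1.values, color="red", linestyle="--", marker="D", label="s1")
    ax.plot(s2.index, s2.values, color="green", linestyle=":", marker="*", label="s2")
    ax.plot(s3.index, s3.values, color="blue", linestyle="-.", marker="o", label="s3")

    # Shape: a translucent ellipse centred on the largest value of s1.
    peak_x, peak_y = int(s1.idxmax()), float(s1.max())
    ax.add_patch(Ellipse((peak_x, peak_y), width=1.0, height=0.15,
                         facecolor="gray", alpha=0.3))

    # Annotation: a label with an arrow pointing at the ellipse centre.
    ax.annotate("Peak of s1", xy=(peak_x, peak_y),
                xytext=(peak_x, peak_y + 0.25),
                arrowprops=dict(arrowstyle="->"), ha="center")

    ax.set_ylim(0, 1.4)  # headroom so the label stays inside the axes
    ax.set_xlabel("Index")
    ax.set_ylabel("Values")
    ax.set_title("Line Curves for Random Series with an Annotated Shape")
    ax.legend()
    fig.tight_layout()
    return fig, ax, (s1, s2, s3)
```

Calling plot_series_with_annotation() followed by plt.show() displays the figure. Compared with plt.text, ax.annotate ties the label to a data point, so the arrow keeps pointing at the peak when the random values change.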
Q2) Download the Iris data from https://archive.ics.uci.edu/ml/datasets/iris or import it from sklearn datasets. Using the Iris data, plot the following with proper legends and axis labels:
a. Plot a horizontal bar graph to show the frequency of each class label in the data.
b. Plot two vertical bar charts in the same figure object, comparing the sepal length and sepal width features (respectively) for all three species of Iris flower.
c. Plot separate histograms (within the same figure object) for each of the species to visualize the distribution of their petal length.
d. Plot density curves (within the same subplot) to compare the petal width of Setosa and Virginica flowers.
e. Draw two separate scatter plots (within the same figure object): petal length against petal width, and sepal length against sepal width, for all species (with hue as species). Use the results to infer which pair of features forms clearly separable clusters by species.
f. Draw a pair plot to show the pairwise bivariate distributions in the Iris dataset.
g. Draw a heat map for the four numeric attributes.
h. Compute the correlation coefficients between each pair of features and plot them as a heat map.

Code:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
import numpy as np
iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)
#A
plt.figure(figsize=(6, 4))
df['species'].value_counts().plot(kind='barh', color='skyblue')
plt.xlabel('Count')
plt.ylabel('Species')
plt.title('Frequency of Each Iris Species')
plt.tight_layout()
plt.show()
#B
plt.figure(figsize=(8, 5))
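# observed=True avoids the pandas FutureWarning raised when grouping by a categorical column
gpd = df.groupby('species', observed=True)[['sepal length (cm)', 'sepal width (cm)']].mean()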
x = np.arange(len(gpd.index))
width = 0.35
plt.bar(x - width/2, gpd['sepal length (cm)'], width, label='Sepal Length')
plt.bar(x + width/2, gpd['sepal width (cm)'], width, label='Sepal Width')
plt.xticks(x, gpd.index)
plt.ylabel('Mean (cm)')
plt.title('Sepal Length vs Width by Species')
plt.legend()
plt.tight_layout()
plt.show()
#C
setosa = df[df['species'] == 'setosa']
versicolor = df[df['species'] == 'versicolor']
virginica = df[df['species'] == 'virginica']
a=[setosa,versicolor,virginica]
plt.figure(figsize=(10, 4))
plt.subplot(1, 3, 1)
plt.hist(setosa['petal length (cm)'], bins=10, color='lightcoral', edgecolor='black')
plt.title('Setosa Petal Length')
plt.xlabel('Length (cm)')
plt.ylabel('Frequency')
plt.subplot(1, 3, 2)
plt.hist(versicolor['petal length (cm)'], bins=10, color='lightseagreen', edgecolor='black')
plt.title('Versicolor Petal Length')
plt.xlabel('Length (cm)')
plt.ylabel('Frequency')
plt.subplot(1, 3, 3)
plt.hist(virginica['petal length (cm)'], bins=10, color='cornflowerblue', edgecolor='black')
plt.title('Virginica Petal Length')
plt.xlabel('Length (cm)')
plt.ylabel('Frequency')
plt.tight_layout()
plt.show()
#D
plt.figure(figsize=(8, 5))
for species in ['setosa', 'virginica']:
    sns.kdeplot(df[df['species'] == species]['petal width (cm)'], label=species)
plt.title('Density Plot of Petal Width (Setosa vs Virginica)')
plt.xlabel('Petal Width (cm)')
plt.ylabel('Density')
plt.legend()
plt.tight_layout()
plt.show()
#E
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
sns.scatterplot(data=df, x='petal length (cm)', y='petal width (cm)', hue='species', ax=axes[0])
axes[0].set_title('Petal Length vs Width')
sns.scatterplot(data=df, x='sepal length (cm)', y='sepal width (cm)', hue='species', ax=axes[1])
axes[1].set_title('Sepal Length vs Width')
plt.tight_layout()
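plt.show()
# Inference for part (e): petal length vs petal width separates the species more clearly.
# Setosa forms its own cluster and versicolor and virginica overlap only slightly,
# whereas in the sepal plot versicolor and virginica overlap heavily.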
#F
sns.pairplot(df, hue='species', height=2)
plt.suptitle('Pair Plot of Iris Dataset', y=1.02)
plt.show()
#G
plt.figure(figsize=(6, 4))
sns.heatmap(df.iloc[:, :-1])
plt.title('Heatmap of Iris Features (Raw Values)')
plt.tight_layout()
plt.show()
#H
plt.figure(figsize=(6, 4))
correlation = df.iloc[:, :-1].corr()
sns.heatmap(correlation, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('Feature Correlation Heatmap')
plt.tight_layout()
plt.show()

Output:
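
Part (b) asks for two vertical bar charts in the same figure object, while the code above draws both features as grouped bars on a single axes. The following is a minimal sketch of the other reading, with one subplot per feature. The helper name plot_sepal_bars and its parameters are assumptions made for illustration; it expects a DataFrame shaped like the df built above, with a species column and the two sepal columns.

```python
import pandas as pd
import matplotlib.pyplot as plt


def plot_sepal_bars(df, species_col="species",
                    features=("sepal length (cm)", "sepal width (cm)")):
    """Draw one vertical bar chart per feature, side by side (hypothetical helper)."""
    # Mean of each feature per species; observed=True keeps only the
    # categories that actually occur when the column is categorical.
    means = df.groupby(species_col, observed=True)[list(features)].mean()

    # squeeze=False always returns a 2-D array of axes, even for one feature.
    fig, axes = plt.subplots(1, len(features), figsize=(10, 4),
                             sharey=True, squeeze=False)
    for ax, feature in zip(axes[0], features):
        ax.bar(means.index.astype(str), means[feature], label=f"Mean {feature}")
        ax.set_xlabel("Species")
        ax.set_ylabel("Mean (cm)")
        ax.set_title(feature)
        ax.legend()
    fig.tight_layout()
    return fig, axes[0], means
```

With the df from the listing above, plot_sepal_bars(df) followed by plt.show() displays the two charts. Sharing the y axis keeps the sepal length and sepal width bars on the same scale, so the two panels can be compared directly.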
