2.5. Introduction To Matplotlib - 2
PYTHON PROGRAMMING
24CAH-606
In the above image, the green region represents the figure and the white region is
the axes area.
Figure class
import matplotlib.pyplot as plt
# Create a figure
fig = plt.figure(figsize=(6,2), dpi=100, facecolor='red')
# Add a subplot (1 row, 1 column, 1st subplot)
ax = fig.add_subplot(111)
# Plot some data
ax.plot([1, 2, 3], [4, 5, 6])
ax.set_title("Example Plot")
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
# Display the figure
plt.show()
Creating a Figure with Grids of Subplots
import matplotlib.pyplot as plt
import numpy as np
# Create a 2x2 grid of subplots with various customization options
fig, axs = plt.subplots(2, 2, figsize=(7, 4), facecolor='lightgreen')
# Super title for the entire figure
fig.suptitle('2x2 Grid of Subplots', fontsize='x-large')
# Display the Figure
plt.show()
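The grid above is created empty. As a minimal sketch of how the four Axes might be filled, the example below plots a different curve in each cell and then calls tight_layout() to adjust the spacing. The curves and titles are illustrative choices, not part of the original slides.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, axs = plt.subplots(2, 2, figsize=(7, 4), facecolor='lightgreen')
fig.suptitle('2x2 Grid of Subplots', fontsize='x-large')

# axs is a 2x2 NumPy array of Axes objects, indexed as axs[row, column]
axs[0, 0].plot(x, np.sin(x))
axs[0, 0].set_title('sin(x)')
axs[0, 1].plot(x, np.cos(x))
axs[0, 1].set_title('cos(x)')
axs[1, 0].plot(x, x ** 2)
axs[1, 0].set_title('x squared')
axs[1, 1].plot(x, np.sqrt(x))
axs[1, 1].set_title('sqrt(x)')

# Adjust the padding so titles and tick labels do not overlap
fig.tight_layout()
```

Call plt.show() afterwards to display the figure in an interactive session.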
Matplotlib.pyplot.tight_layout() in Python
In axes([0.1, 0.1, 0.8, 0.8]), the four numbers are [left, bottom, width, height],
each given as a fraction of the figure size. The first 0.1 places the left edge of
the axes 10% of the figure width in from the left border of the figure window. The
second 0.1 places the bottom edge of the axes 10% of the figure height up from the
bottom border. The first 0.8 sets the axes width to 80% of the figure width, and
the second 0.8 sets the axes height to 80% of the figure height.
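To make the fractions concrete, the sketch below converts them to pixels for an assumed figure of 6 x 4 inches at 100 dpi (600 x 400 pixels). The figure size and dpi are illustrative values chosen for easy arithmetic.

```python
import matplotlib.pyplot as plt

# Assumed figure size for this illustration: 6 x 4 inches at 100 dpi
FIG_W_IN, FIG_H_IN, DPI = 6, 4, 100
fig_w_px = FIG_W_IN * DPI   # 600 pixels wide
fig_h_px = FIG_H_IN * DPI   # 400 pixels tall

fig = plt.figure(figsize=(FIG_W_IN, FIG_H_IN), dpi=DPI)
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])   # [left, bottom, width, height]

# Read the position back as fractions of the figure size
left, bottom, width, height = ax.get_position().bounds

print(f"left margin  : {left * fig_w_px:.0f} px")    # 60 px
print(f"bottom margin: {bottom * fig_h_px:.0f} px")  # 40 px
print(f"axes width   : {width * fig_w_px:.0f} px")   # 480 px
print(f"axes height  : {height * fig_h_px:.0f} px")  # 320 px
```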
Axes class
add_axes() function
Alternatively, you can also add the axes object to the figure by calling
the add_axes() method. It returns the axes object and adds axes at
position [left, bottom, width, height] where all quantities are in fractions
of figure width and height.
Syntax: fig.add_axes([left, bottom, width, height])
import matplotlib.pyplot as plt
fig = plt.figure()
#[left, bottom, width, height]
ax = fig.add_axes([0, 0, 1, 1])
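With [0, 0, 1, 1] the axes fill the whole figure, which leaves no margin for tick labels. As a hedged illustration of what the position list allows, the sketch below leaves a margin around the main axes and calls add_axes() a second time to place a smaller inset. The data and positions are arbitrary example values.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(6, 4))

# Main axes with a 10% margin on every side: [left, bottom, width, height]
main_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
main_ax.plot([1, 2, 3, 4], [1, 4, 9, 16])
main_ax.set_title("Main axes")

# Smaller inset axes placed in the upper-left region of the figure
inset_ax = fig.add_axes([0.2, 0.55, 0.3, 0.3])
inset_ax.plot([1, 2, 3, 4], [1, 2, 3, 4], color="red")
inset_ax.set_title("Inset")
```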
Box plot
• Box Plot: A box plot, also known as a box-and-whisker plot, displays a
summary of a set of data values through five properties: minimum, first
quartile, median, third quartile and maximum.
• In the box plot, a box is drawn from the first quartile to the third
quartile, and a line goes through the box at the median. In a horizontal
box plot this line is vertical and the x-axis shows the data values; in
Matplotlib's default vertical orientation the line is horizontal, the
y-axis shows the data values, and the x-axis identifies the groups being
compared.
Boxplot components
Median: This is the middle value of the data, represented by a line within
the box.
Boxes: These represent the data's Interquartile Range (IQR), which
represents the range between Q1 and Q3. The bottom and top edges
represent Q1 and Q3, respectively.
Whiskers: These are lines that extend from either end of the box to the
minimum and maximum values, excluding any outliers. By default Matplotlib
ends each whisker at the most extreme data point within 1.5 × IQR of the box.
Outliers: These are points outside the whiskers, considered unusual or
extreme compared to the rest of the data.
Caps: These are short lines drawn across the ends of the whiskers, marking
the minimum and maximum values, excluding any outliers.
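The components above can be computed by hand. The sketch below does so with NumPy for a small made-up data set, using the 1.5 × IQR rule that Matplotlib applies by default to decide where the whiskers end. The data values are invented for illustration.

```python
import numpy as np

# Made-up sample with one obviously extreme value
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Points beyond these fences are drawn as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = data[(data >= lower_fence) & (data <= upper_fence)]
outliers = data[(data < lower_fence) | (data > upper_fence)]

# Whiskers (and their caps) end at the most extreme points inside the fences
whisker_low, whisker_high = inside.min(), inside.max()

print("Q1, median, Q3:", q1, median, q3)       # 3.25 5.5 7.75
print("IQR:", iqr)                             # 4.5
print("Whiskers:", whisker_low, whisker_high)  # 1 9
print("Outliers:", outliers)                   # [30]
```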
Matplotlib
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(10) # Fix the seed so the random data is reproducible
# Generate three groups of 100 normal samples with standard deviations 1, 2 and 3
data = [np.random.normal(0, std, 100) for std in range(1, 4)]
fig = plt.figure(figsize=(8, 6)) # Create a figure for the box plot
# Add a box plot to the figure
plt.boxplot(data, patch_artist=True, boxprops=dict(facecolor='lightblue'))
plt.title('Box Plot Example', fontsize=16) # Set title and labels
plt.xlabel('Data Groups', fontsize=12)
plt.ylabel('Values', fontsize=12)
plt.show() # Display the plot
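boxplot() returns a dictionary that maps each component name to the list of artists drawn for it. As a small sketch of how the components relate to the drawn plot, the example below counts the artists for the same three groups and styles two of the components. The colors are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(10)
data = [np.random.normal(0, std, 100) for std in range(1, 4)]

fig, ax = plt.subplots(figsize=(8, 6))
bp = ax.boxplot(data, patch_artist=True)

# One box, one median line and one outlier series per group;
# two whiskers and two caps per group
for name in ("boxes", "medians", "whiskers", "caps", "fliers"):
    print(name, len(bp[name]))

# Style individual components through the returned artists
for box in bp["boxes"]:
    box.set_facecolor("lightblue")
for median in bp["medians"]:
    median.set_color("darkred")
```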
The seed() method initializes the random number generator. The box plot example
above calls NumPy's np.random.seed(); Python's built-in random module provides
random.seed(), described here, which works on the same principle.
A random number generator needs a starting number (a seed value) from which to
generate its sequence of random numbers.
By default the generator is seeded from the current system time, or from an
operating-system randomness source when one is available.
Use the seed() method to set the starting number yourself.
Note: If you use the same seed value twice, you will get the same sequence of random
numbers twice. See the example below.
Syntax
random.seed(a, version)
Parameter Values
Parameter   Description
a           Optional. The seed value used to start the random number generator.
            If it is an integer it is used directly; if not, it has to be
            converted into an integer. The default is None, in which case the
            generator uses the current system time.
version     Optional. An integer specifying how to convert the a parameter into
            an integer. The default is 2.
Demonstrate that if you use the same seed value twice, you will get the same random
number twice:
import random
random.seed(10)
print(random.random())
random.seed(10)
print(random.random())
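The same behaviour holds for NumPy's np.random.seed(), which the box plot example uses. The sketch below reseeds the generator and checks that the same samples come back. The sample size of 3 is an arbitrary choice.

```python
import numpy as np

np.random.seed(10)
first = np.random.normal(0, 1, 3)

np.random.seed(10)   # reseed with the same value
second = np.random.normal(0, 1, 3)

# Without reseeding, the generator continues its sequence
third = np.random.normal(0, 1, 3)

print(np.array_equal(first, second))  # True: same seed, same samples
print(np.array_equal(first, third))
```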
Bee swarm plot
Also known as a swarm plot, this type of graph displays data points
without overlap to create a "swarming" effect that looks like a swarm of
bees. It's a good way to show the distribution and density of data along a
numerical axis. Because every observation is drawn as its own point, a bee
swarm plot shows more detail than a histogram while staying visually simple.
Syntax: seaborn.swarmplot(x=None, y=None, hue=None, data=None, hue_order=None,
color=None, palette=None, size=5, edgecolor='gray', linewidth=0, ax=None)
(an abbreviated signature; defaults can vary between seaborn versions)
Parameters:
x, y, hue: Inputs for plotting long-form data.
data: Dataset for plotting.
color: Color for all of the elements.
size: Radius of the markers, in points.
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
data = {
'Category': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'],
'Values': [3, 5, 2, 6, 7, 8, 4, 9, 1, 3, 2, 5]
}
df = pd.DataFrame(data)
plt.figure(figsize=(8, 6)) # Set the figure size
sns.swarmplot(x='Category', y='Values', data=df)
plt.title('Bee Swarm Plot Example')
plt.xlabel('Category')
plt.ylabel('Values')
plt.show()
Bee swarm plot of the built-in 'tips' dataset:
import seaborn as sns
import matplotlib.pyplot as plt
# Load built-in dataset 'tips'
tips = sns.load_dataset('tips')
# Create a bee swarm plot
plt.figure(figsize=(8, 6)) # Set the figure size
sns.swarmplot(x='day', y='total_bill', data=tips)
plt.title('Bee Swarm Plot of Total Bill by Day')
plt.xlabel('Day of the Week')
plt.ylabel('Total Bill')
plt.show()
Grouping data points on the basis of category, here by region.
import seaborn
seaborn.set(style='whitegrid')
fmri = seaborn.load_dataset("fmri")
seaborn.swarmplot(x="timepoint",
y="signal",
hue="region",
data=fmri)
Violin graph
A violin plot plays a similar role to a box-and-whisker plot: it shows the
distribution of quantitative data across the levels of one or more
categorical variables. Unlike a box plot, it draws a kernel density
estimate of each distribution, so the shape of the data is visible as well
as its summary statistics.
It can be an effective and attractive way to show several distributions at
once.
With a “wide-form” DataFrame, each numeric column is plotted as its own
violin.
It is possible to pass NumPy arrays or plain Python objects, but pandas
objects are preferable because their associated names are used to annotate
the axes.
Syntax: seaborn.violinplot(x=None, y=None, hue=None, data=None,
order=None, scale='area', gridsize=100, width=0.8, linewidth=None, color=None,
palette=None)
Parameters:
x, y, hue: Inputs for plotting long-form data.
data: Dataset for plotting.
scale: The method used to scale the width of each violin ('area', 'count' or
'width'). Newer seaborn releases (0.13 and later) name this parameter
density_norm.
import seaborn as sns
import matplotlib.pyplot as plt
# Load the built-in 'tips' dataset
tips = sns.load_dataset('tips')
# Create a violin plot
plt.figure(figsize=(8, 6))
sns.violinplot(x='day', y='total_bill', data=tips)
# Add labels and title
plt.title('Violin Plot of Total Bill by Day')
plt.xlabel('Day of the Week')
plt.ylabel('Total Bill')
plt.show()
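For comparison, Matplotlib has its own Axes.violinplot() method, which needs neither seaborn nor a downloaded dataset. The sketch below is a swapped-in alternative to the seaborn call and not the method these slides use; the three groups of random data and their labels are invented for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Three made-up groups of 200 samples with means 0, 1 and 2
rng = np.random.default_rng(0)
groups = [rng.normal(loc=mean, scale=1.0, size=200) for mean in (0, 1, 2)]

fig, ax = plt.subplots(figsize=(8, 6))

# Draw one violin per group and mark each median with a line
parts = ax.violinplot(groups, showmedians=True)

# Violins are placed at x = 1, 2, 3 by default
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(['Group 1', 'Group 2', 'Group 3'])
ax.set_title('Violin Plot with Matplotlib')
ax.set_ylabel('Values')
```

Call plt.show() afterwards to display the figure in an interactive session.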
Basic visualization of “fmri” dataset using violinplot()
import seaborn
seaborn.set(style = 'whitegrid')
fmri = seaborn.load_dataset("fmri")
seaborn.violinplot(x ="timepoint",
y ="signal",
data = fmri)
Grouping data points on the basis of category, here by region.
import seaborn
seaborn.set(style = 'whitegrid')
fmri = seaborn.load_dataset("fmri")
# violinplot() has no style parameter, so only hue is used for grouping
seaborn.violinplot(x ="timepoint",
                   y ="signal",
                   hue ="region",
                   data = fmri)
word cloud
A word cloud in Python is a graphical representation of text data, where
words from a text document are displayed in varying sizes, with the
most frequently occurring words appearing larger. Python libraries like
matplotlib and wordcloud can be used to create word clouds. It’s often
used for visualizing and gaining insights from text data, such as
identifying key terms in a document, website, or social media content.
A Python word cloud, or tag cloud, is a visualization technique
commonly used to display tags or keywords from websites. These single
words reflect the webpage’s context and are clustered together in the
word cloud. Words in the cloud vary in font size and color, indicating
their prominence: a larger font size implies higher importance relative
to other words. Word clouds can take various shapes and sizes based on
the creator’s vision. The number of words is crucial, however; too many
can make the cloud cluttered and hard to read.
There are different ways to create a word cloud in Python, but the most
widely used is based on the frequency of words in the corpus. We will
therefore create our word cloud from word frequencies.
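To illustrate what "frequency of words" means here, the sketch below counts words in a short made-up text using only the standard library. It shows the counting idea and is not how the wordcloud library is implemented; the sample text and the simple letters-only tokenizer are assumptions.

```python
import re
from collections import Counter

# Made-up sample text for illustration
text = ("Python makes data analysis simple. "
        "Python makes data visual. Python wins.")

# Lowercase the text and keep runs of letters as words
words = re.findall(r"[a-z]+", text.lower())

# Count how often each word occurs
counts = Counter(words)

# The most frequent words would be drawn largest in a word cloud
for word, n in counts.most_common(3):
    print(word, n)
```

The wordcloud library also provides a generate_from_frequencies() method that accepts a word-to-count mapping such as this one.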
When to Use Python Word Cloud?
Word clouds are best used in specific scenarios where visualizing word
frequency or prominence is essential. Here are some situations when
using a word cloud is appropriate:
Word clouds give a quick overview of the most frequently occurring words in a
text corpus, helping researchers identify patterns and key themes.
When summarizing a large amount of text, a python word cloud can
effectively highlight the most relevant and important terms, making the
information more accessible to the audience.
Word clouds are valuable for analyzing sentiments, hashtags, or
trending topics on social media platforms, offering a concise
representation of popular themes.
They add visual appeal and engagement to reports, presentations, or
dashboards, making it easier for viewers to grasp important insights
from the data.
Word clouds can help identify similarities or differences in word
frequencies when comparing multiple text sources or documents.
word cloud
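from wordcloud import WordCloud
import matplotlib.pyplot as plt
wc = WordCloud().generate(meta_text)  # meta_text: a string holding the text to visualize
plt.imshow(wc)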
How To Create Word Cloud in Python?
Step 1: Import Necessary Libraries
Import the following libraries which are required to create a Python
Word Cloud:
import pandas as pd
import matplotlib.pyplot as plt
from wordcloud import WordCloud
Step 2: Selecting the Dataset
Download the dataset and save it in your current working directory so
that the code below runs without path changes.
Import the dataset into a variable of your choice. Here the data is
imported into the variable df.
The text for a word cloud does not need to come from a dataset. We use a
dataset in this example only to obtain meaningful text with little
effort.
df = pd.read_csv("android-games.csv")
Step 3: Selecting the Text and Amount of Text for Word Cloud
Selecting the text for a Python word cloud is an important task.
Check the following before choosing the text:
Do we have a problem statement?
Does the selected text carry meaning?
Can we draw a conclusion from the resulting word cloud?
Do we have an adequate amount of text?
A word cloud requires an adequate amount of text. Too many words
hinder the visual appearance of the word cloud, and too few make it
meaningless.
We can use the .head() method of the DataFrame to check the columns and
the type of data present in them. In our example, we use the column
category as the text.
Step 4: Check for NULL values
Check the dataset for null values, because the word cloud will not
accept text containing NaN values.
df.isna().sum()
If the dataset has any NaN values, treat the missing values
accordingly. Fortunately, this dataset has no NaN values, so we can
move to the next step.
If there are only a few NaN values, it is usually advisable to remove
those rows, as doing so will not affect the word cloud to any great extent.
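As a minimal sketch of the check and the removal of rows with missing text, the example below uses a small hypothetical DataFrame in place of the dataset file, which is not reproduced here. The column names and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the dataset, with one missing category
df = pd.DataFrame({
    "title": ["Game A", "Game B", "Game C", "Game D"],
    "category": ["Action", np.nan, "Puzzle", "Action"],
})

# Count the missing values in each column
print(df.isna().sum())

# Drop the rows where the text column is missing
clean = df.dropna(subset=["category"])
print(len(df), "rows before,", len(clean), "rows after")
```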
Step 5. Adding Text to a Variable
Based on the parameters from Step 3, add the Text Data to a variable of
your choice. Here, we are adding the data into variable text.
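One way this step might look is sketched below: the values of the category column are joined into a single space-separated string. The DataFrame here is a small hypothetical stand-in for the dataset, and its values are invented.

```python
import pandas as pd

# Hypothetical stand-in for the dataset's category column
df = pd.DataFrame({"category": ["Action", "Puzzle", "Action", "Racing"]})

# Join every entry of the column into one string, separated by spaces
text = " ".join(df["category"].astype(str))

print(text)  # Action Puzzle Action Racing
```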
Step 6: Creating the Word Cloud
Create an object of class WordCloud with a name of your choice and
call the generate() method. Here we have created the object with the
name word_cloud.
WordCloud() takes several arguments as needed. Here we add two:
collocations = False, which excludes collocations (pairs of words that
frequently occur together) from the text
background_color = 'white', which makes the words look clearer
The .generate() method takes one argument of the text we created. In
our case, we will give the text variable as an argument to .generate().
import pandas as pd
from wordcloud import WordCloud
import matplotlib.pyplot as plt
# Step 1: Create a custom dataset
data = {
    'text': [
        "Python is a versatile programming language.",
        "Python is widely used for data science and machine learning.",
        "Word clouds are great for visualizing text data.",
        "Data visualization is an essential part of data analysis.",
        "Machine learning and artificial intelligence are hot topics.",
        "Python libraries like Pandas and Matplotlib are very useful.",
        "Visualization helps in understanding complex data.",
        "Data science combines statistics, programming, domain expertise.",
        "Natural Language Processing (NLP) is a key area in data science.",
        "Python is popular among data scientists for its simplicity."
    ]
}
# Step 2: Create a DataFrame
df = pd.DataFrame(data)
# Step 3: Combine all text into a single string
text = ' '.join(df['text'])
# Step 4: Generate the word cloud
wordcloud = WordCloud(width=400, height=200, background_color='white',
max_words=50).generate(text)
# Step 5: Display the word cloud
plt.figure(figsize=(10, 5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off') # Turn off the axis
plt.title('Word Cloud from Custom Dataset', fontsize=20)
plt.show()