Matplotlib Seaborn Fundamentals (1)

DATA VISUALIZATION fundamentals for EDA

By: Prashanth Kannadaguli, Senior Corporate Data Science Trainer

MATPLOTLIB & SEABORN for data visualization

Matplotlib is a comprehensive library for creating static, animated, and interactive
visualizations in Python. This course covers the basic usage patterns and best practices to get
started with Matplotlib.

Installing Matplotlib

The Anaconda distribution ships with Matplotlib pre-installed, so it is ready to use.

We can update Matplotlib using:

conda update matplotlib


In a new environment, we can install Matplotlib by running the following command in the
Anaconda Prompt:

conda install matplotlib

We import Matplotlib and check the version:

In [1]: import matplotlib
        matplotlib.__version__

Out[1]: '3.5.1'

In Colab as well, Matplotlib comes built in, and we can check its version after importing.

The convention in the Python world is to import Matplotlib under the alias mpl:

In [2]: import matplotlib as mpl
        mpl

Out[2]: <module 'matplotlib' from 'C:\\Users\\prash\\anaconda3\\lib\\site-packages\\matplotlib\\__init__.py'>

To display all the contents of the matplotlib namespace, we can use this:

mpl.< TAB >


To display Matplotlib's built-in documentation, we use this:

In [3]: mpl?

The Matplotlib documentation is available online:

https://matplotlib.org/stable/index.html

The Matplotlib tutorials can be accessed using:

https://matplotlib.org/stable/tutorials/index.html

The Matplotlib examples can be accessed using:

https://matplotlib.org/stable/gallery/index.html

Matplotlib graphs data on Figures, each of which can contain one or more Axes: an area where
points can be specified in terms of x-y coordinates.

The simplest way of creating a Figure with an Axes is using pyplot. matplotlib.pyplot is the
interface of Matplotlib most widely used for plotting; its functions create figures and make changes
to them as required. We start by importing it, together with NumPy, under the usual shorthands:

In [4]: import matplotlib.pyplot as plt
        import numpy as np

Setting styles: We use plt.style to choose appropriate aesthetic styles for our figures.
We will set the classic style, which ensures that the plots we create use the classic Matplotlib
style:

In [5]: plt.style.use('classic')

Plotting from scripts: plt.plot() draws figures inline when called in Jupyter notebooks.

In [6]: x = np.linspace(0, 10, 100)
        plt.plot(x, np.sin(x))
        plt.plot(x, np.cos(x))

Out[6]: [<matplotlib.lines.Line2D at 0x19de6315af0>]

When we run a script, or work in a plain IPython shell without the %matplotlib magic, we need to
call plt.show() to view the figure in a new window.
Interactive plotting: To enter interactive mode in the IPython shell, we use the magic command
%matplotlib

After that, any plt.plot() command will cause a figure window to open, and further commands can be
run to update the plot. Some changes, such as modifying properties of lines that are already drawn,
will not be redrawn automatically; to force an update we use plt.draw().

In the IPython notebook, we have the option of embedding graphics directly in the notebook using
%matplotlib inline, which embeds static images of the plots in the notebook.

In [7]: %matplotlib inline

After we run this command (it is needed only once per kernel/session), any cell within the notebook
that creates a plot will embed a PNG image of the resulting graphic:

In [8]: fig = plt.figure()
        plt.plot(x, np.sin(x), '--')
        plt.plot(x, np.cos(x), '*')

Out[8]: [<matplotlib.lines.Line2D at 0x19de6340cd0>]

Saving figures to file: Matplotlib can save figures in a variety of image formats. The formats
supported by the current backend can be listed with the following command.

In [9]: fig.canvas.get_supported_filetypes()

Out[9]: {'eps': 'Encapsulated Postscript',
         'jpg': 'Joint Photographic Experts Group',
         'jpeg': 'Joint Photographic Experts Group',
         'pdf': 'Portable Document Format',
         'pgf': 'PGF code for LaTeX',
         'png': 'Portable Network Graphics',
         'ps': 'Postscript',
         'raw': 'Raw RGBA bitmap',
         'rgba': 'Raw RGBA bitmap',
         'svg': 'Scalable Vector Graphics',
         'svgz': 'Scalable Vector Graphics',
         'tif': 'Tagged Image File Format',
         'tiff': 'Tagged Image File Format'}

We can save a figure using the savefig() command.

Example: to save the previous figure as a JPEG file:

In [10]: fig.savefig('firstFig.jpg')

We now have a file called firstFig.jpg in the current working directory. Let's check.
To confirm that it contains what we saved, let's use the IPython Image object to display the
contents of firstFig.jpg:

In [11]: from IPython.display import Image
         Image('firstFig.jpg')

Out[11]:
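As a rough sketch of the saving step outside a notebook, the snippet below writes a PNG instead of a JPEG and checks that the file really exists. The temporary directory, the file name and the dpi value are illustrative assumptions, and the Agg backend is selected only so that the code runs without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), '--')
ax.plot(x, np.cos(x), '*')

# Hypothetical output location: a temporary directory instead of the working directory.
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "firstFig.png")

# savefig() infers the file format from the extension; dpi and bbox_inches are optional.
fig.savefig(png_path, dpi=150, bbox_inches="tight")

# Check that a non-empty file was written.
saved_ok = os.path.exists(png_path) and os.path.getsize(png_path) > 0
print(saved_ok)
```

Changing the extension to one of the keys listed by fig.canvas.get_supported_filetypes() should be enough to switch formats.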

Subplots - MATLAB-like interface: Matplotlib was originally written as a Python alternative to
MATLAB, and much of its syntax reflects that. The MATLAB-style tools are contained in the
pyplot interface.

Example: the following code creates MATLAB-like subplots.

In [12]: plt.figure()
         plt.subplot(2, 2, 1)
         plt.plot(x, np.exp(x))
         plt.subplot(2, 2, 2)
         plt.plot(x, np.cos(x))
         plt.subplot(2, 2, 3)
         plt.plot(x, np.exp(-x))
         plt.subplot(2, 2, 4)
         plt.plot(x, np.sin(x))

Out[12]: [<matplotlib.lines.Line2D at 0x19de6471940>]

Object-oriented interface: The object-oriented interface is available for more complicated situations,
when we want more control over the figure.

In the object-oriented interface the plotting functions are methods of explicit Figure and Axes
objects.

Example: To re-create the previous plot using this style of plotting:

In [13]: fig, ax = plt.subplots(2, 2)  # ax is a 2 by 2 array of axes objects
         # Call the plot() method on the appropriate object
         ax[0,0].plot(x, np.exp(x))
         ax[1,0].plot(x, np.exp(-x))
         ax[0,1].plot(x, np.cos(x))
         ax[1,1].plot(x, np.sin(x))

Out[13]: [<matplotlib.lines.Line2D at 0x19de660fe50>]
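The same 2 by 2 grid can also be filled in a loop, which may be easier to maintain when there are many panels. The sketch below is one possible way to do it; the curves dictionary and the panel titles are illustrative assumptions, not part of the course material.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Hypothetical mapping from panel title to the curve drawn in that panel.
curves = {
    "exp(x)": np.exp(x),
    "cos(x)": np.cos(x),
    "exp(-x)": np.exp(-x),
    "sin(x)": np.sin(x),
}

fig, ax = plt.subplots(2, 2)

# ax.flat walks the 2 by 2 array row by row, the same order as plt.subplot(2, 2, i).
for axis, (name, y) in zip(ax.flat, curves.items()):
    axis.plot(x, y)
    axis.set_title(name)

titles = [axis.get_title() for axis in ax.flat]
print(titles)
```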

Simple line plots: The simplest of all plots is the visualization of a function y = f(x). We start
by creating a figure and an axes:

In [14]: fig = plt.figure()
         ax = plt.axes()

In [15]: plt.grid()

The figure is an instance of the class plt.Figure; it can be thought of as a single container that
holds all the objects representing axes, graphics, text, and labels.
The axes is an instance of the class plt.Axes; it is a bounding box with ticks and labels, which
will eventually contain the plot elements that make up our visualization.

By convention, the variable name fig is used to refer to a figure instance and ax is used to
refer to an axes instance or a group of axes instances.

Once we have created an axes, we use the ax.plot() method to plot the data.

In [16]: plt.style.use('seaborn-whitegrid')  # renamed 'seaborn-v0_8-whitegrid' in Matplotlib 3.6
         fig = plt.figure()
         ax = plt.axes()
         ax.plot(x, np.sin(x))

Out[16]: [<matplotlib.lines.Line2D at 0x19de77fdc70>]

Alternatively, we can use the pyplot interface and let the figure and axes be created in the
background:

In [17]: plt.plot(x, np.sin(x))

Out[17]: [<matplotlib.lines.Line2D at 0x19de785d3a0>]

When we want to create a single figure with multiple line plots, we call the plot function
multiple times:

In [18]: plt.plot(x, np.sin(x))
         plt.plot(x, np.cos(x))

Out[18]: [<matplotlib.lines.Line2D at 0x19de7796d00>]

Line colours & styles: The plt.plot() function takes additional arguments that can be used to
specify colours and styles.

To adjust the color we use the color keyword, which accepts a string argument representing
virtually any color.

In [19]: plt.plot(x, np.cos(x), color='blue')                # color by name
         plt.plot(x, np.cos(x + 2.5), color='g')             # short color code (rgbcmyk)
         plt.plot(x, np.cos(x + 4.5), color='0.75')          # grayscale level between 0 and 1
         plt.plot(x, np.cos(x + 6.5), color='#FFDD44')       # hex code (RRGGBB)
         plt.plot(x, np.cos(x + 8.5), color=(1.0,0.2,0.3))   # RGB tuple, values between 0 and 1
         plt.plot(x, np.cos(x + 10.5), color='chartreuse');  # HTML color names are supported

We can adjust the line style using the linestyle keyword:

In [20]: plt.plot(x, 2*x, linestyle='solid')
         plt.plot(x, 2*x + 10, linestyle='dashed')
         plt.plot(x, 2*x + 20, linestyle='dashdot')
         plt.plot(x, 2*x + 30, linestyle='dotted');

         # For short, we can use the following codes:
         plt.plot(x, -2*x - 40, linestyle='-')   # solid
         plt.plot(x, -2*x - 50, linestyle='--')  # dashed
         plt.plot(x, -2*x - 60, linestyle='-.')  # dashdot
         plt.plot(x, -2*x - 70, linestyle=':');  # dotted

These linestyle and color codes can be combined into a single non-keyword argument to the
plt.plot() function:

In [21]: plt.plot(x, -3*x, '-g')        # solid green
         plt.plot(x, -3*x - 10, '--c')  # dashed cyan
         plt.plot(x, -3*x - 20, '-.k')  # dashdot black
         plt.plot(x, -3*x - 30, ':r');  # dotted red
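To see that the combined format string is only a shorthand for the separate keywords, we can inspect the Line2D objects that plot() returns. This is a small sketch under that assumption; the variable names short_form and long_form are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()

# plot() returns a list of Line2D objects, one per curve drawn by the call.
(short_form,) = ax.plot(x, -3 * x - 10, '--c')
(long_form,) = ax.plot(x, -3 * x - 20, linestyle='--', color='c')

# Both lines should report the same style and color.
print(short_form.get_linestyle(), short_form.get_color())
print(long_form.get_linestyle(), long_form.get_color())
```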

Axes limits: The axes limits can be adjusted by using the plt.xlim() and plt.ylim() functions.
In [22]: plt.plot(x, np.cos(x))
         plt.xlim(-2, 12)
         plt.ylim(-2.75, 2.5);

plt.axis() lets us set the x and y limits with a single call by passing a list that
specifies [xmin, xmax, ymin, ymax].

In [23]: plt.plot(x, np.cos(x))
         plt.axis([-1.5, 15, -2.5, 3.5]);

plt.axis('tight') automatically tightens the bounds around the current plot:
In [24]: plt.plot(x, np.sin(2.5*x))
         plt.axis('tight');

plt.axis('equal') enforces an equal aspect ratio, so that on the screen one unit in x is equal to
one unit in y:

In [25]: plt.plot(x, np.cos(3*x))
         plt.axis('equal');

Labelling plots: Titles and axis labels can be produced by using the functions plt.title(),
plt.xlabel() and plt.ylabel().
In [26]: plt.plot(x, np.tan(x))
         plt.title("This is tan function")
         plt.xlabel("x")
         plt.ylabel("tan(x)")

Out[26]: Text(0, 0.5, 'tan(x)')

When multiple lines are plotted on a single axes, it can be useful to create a plot legend that
labels each line type. Matplotlib has a built-in way of quickly creating such a legend: the
plt.legend() function. We specify the label of each line by using the label keyword.

In [27]: plt.plot(x, np.cos(2*x), 'r.', label='cos(2x)')
         plt.plot(x, np.tanh(x), '--b', label='tanh(x)')
         plt.axis('equal')
         plt.legend();

Most MATLAB-style pyplot functions translate directly into methods of the object-oriented interface:

plt.xlabel() → ax.set_xlabel()
plt.ylabel() → ax.set_ylabel()
plt.xlim() → ax.set_xlim()
plt.ylim() → ax.set_ylim()
plt.title() → ax.set_title()

In the object-oriented interface to plotting we can use the ax.set() method to set all the required
properties at once:

In [28]: ax = plt.axes()
         ax.plot(x, np.cos(2.5*x))
         ax.set(xlim=(-2, 12), ylim=(-2.5, 4),
                xlabel='x', ylabel='cos(2.5x)',
                title='Plot of sinusoid');
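As a quick check that ax.set() is equivalent to calling the individual setter methods, the sketch below configures two axes both ways and compares the results. The names ax_single and ax_combined are illustrative, and the Agg backend is an assumption made so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Variant 1: one setter call per property.
fig1, ax_single = plt.subplots()
ax_single.plot(x, np.cos(2.5 * x))
ax_single.set_xlim(-2, 12)
ax_single.set_ylim(-2.5, 4)
ax_single.set_xlabel('x')
ax_single.set_ylabel('cos(2.5x)')
ax_single.set_title('Plot of sinusoid')

# Variant 2: all properties in a single ax.set() call.
fig2, ax_combined = plt.subplots()
ax_combined.plot(x, np.cos(2.5 * x))
ax_combined.set(xlim=(-2, 12), ylim=(-2.5, 4),
                xlabel='x', ylabel='cos(2.5x)',
                title='Plot of sinusoid')

print(ax_single.get_xlim(), ax_combined.get_xlim())
```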

Scatter plots: Scatter plots can be created by using the plt.scatter() function:

In [29]: x = np.linspace(0, 10, 25)
         y = np.cos(x)
         plt.scatter(x, y, marker= 'd')

Out[29]: <matplotlib.collections.PathCollection at 0x19de7c836d0>

The advantage of plt.scatter() is that it can create scatter plots where properties
such as the size, face color, and edge color of each point are individually controlled. We can also
use the alpha keyword to adjust the transparency level of the data points.

Example: creating a random scatter plot with points of many colors and sizes.
In [30]: x = np.random.randn(50)
         y = np.random.randn(50)
         colors = np.random.rand(50)
         sizes = 500 * np.random.rand(50)
         plt.scatter(x, y, c=colors, s=sizes, alpha=0.4, cmap='viridis')
         plt.colorbar();

C:\Users\prash\AppData\Local\Temp\ipykernel_1112\123576764.py:6: MatplotlibDeprecationWarning: Auto-removal of grids by pcolor() and pcolormesh() is deprecated since 3.5 and will be removed two minor releases later; please call grid(False) first.
  plt.colorbar();

To view the list of supported colormaps we can use plt.cm.< TAB >

Errorbars: Data or results with errors can be visualized by using the plt.errorbar() function:

In [31]: x = np.linspace(1, 119, 15)
         err = 0.5
         y = np.cos(x) + err * np.random.randn(15)
         plt.errorbar(x, y, yerr = err, fmt = 'rd')

         # Here fmt is a format code controlling the appearance of lines and points

Out[31]: <ErrorbarContainer object of 3 artists>

Using some additional options we can easily customize the aesthetics of our errorbar plot:

In [32]: plt.errorbar(x, y, yerr = err, fmt ='s', color = 'red',
                      ecolor = 'lightgray', elinewidth = 6, capsize=8);
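The examples above use one symmetric error value for every point. As a hedged sketch of a common variation, yerr can also be given as a pair of arrays holding separate lower and upper errors per point. The data, the error sizes and the random seed below are made up purely for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
x = np.linspace(1, 10, 15)
y = np.cos(x)

# Hypothetical asymmetric errors: a different lower and upper error for each point.
lower = 0.1 + 0.2 * rng.random(15)
upper = 0.3 + 0.2 * rng.random(15)

fig, ax = plt.subplots()
# A sequence of two arrays is read as (lower errors, upper errors).
container = ax.errorbar(x, y, yerr=[lower, upper], fmt='rd',
                        ecolor='lightgray', capsize=4)

print(container.has_yerr, container.has_xerr)
```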

Density and contour plots: It is often useful to display 3D data in two dimensions using contours or
color-coded regions. There are three Matplotlib functions that can be helpful for this task:
plt.contour() for contour plots, plt.contourf() for filled contour plots, and plt.imshow() for showing
images.

A contour plot can be created with the plt.contour() function. It takes three arguments: a grid of x
values, a grid of y values, and a grid of z values. The x and y values represent positions on the
plot, and the z values are represented by the contour levels.

The straightforward way to prepare such data is to use the np.meshgrid() function, which builds
2D grids from 1D arrays:

In [33]: def myFunction(x: float, y: float) -> float:
             return np.cos(x) ** 10 - np.sin(20 + x * y) * np.sin(x)

         x = np.linspace(0, 5, 20)
         y = np.linspace(0, 4, 30)
         X, Y = np.meshgrid(x, y)
         Z = myFunction(X, Y)
         plt.contour(X, Y, Z, colors = 'black')

Out[33]: <matplotlib.contour.QuadContourSet at 0x19de7a2ac10>
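It can help to look at what np.meshgrid() actually returns before passing the grids to a contour function. The short sketch below reuses the same x and y ranges as the cell above and only inspects array shapes.

```python
import numpy as np

x = np.linspace(0, 5, 20)   # 20 positions along x
y = np.linspace(0, 4, 30)   # 30 positions along y

X, Y = np.meshgrid(x, y)

# With the default indexing, rows follow y and columns follow x.
print(X.shape, Y.shape)

# Every row of X repeats x, and every column of Y repeats y.
rows_repeat_x = np.array_equal(X[0], x) and np.array_equal(X[-1], x)
cols_repeat_y = np.array_equal(Y[:, 0], y) and np.array_equal(Y[:, -1], y)
```

Any function evaluated on X and Y therefore yields a Z array of the same 30 by 20 shape, which is what plt.contour() expects.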

Notice that by default, when a single color is used, negative values are represented by dashed
lines and positive values by solid lines. Alternatively, we can color-code the lines by specifying a
colormap with the cmap argument.
In [34]: plt.contour(X, Y, Z, 20, cmap='RdGy')

Out[34]: <matplotlib.contour.QuadContourSet at 0x19de79422e0>

We can also request more levels and switch to a filled contour plot with plt.contourf(), adding the
color information in the form of a colorbar with the plt.colorbar() command.

Example: 15 equally spaced intervals within the data range.

In [35]: plt.contourf(X, Y, Z, 15, cmap='autumn')
         plt.colorbar();

C:\Users\prash\AppData\Local\Temp\ipykernel_1112\513212699.py:2: MatplotlibDeprecationWarning: Auto-removal of grids by pcolor() and pcolormesh() is deprecated since 3.5 and will be removed two minor releases later; please call grid(False) first.
  plt.colorbar();

The splotchiness of the graph can be reduced by using the plt.imshow() function:

plt.imshow() doesn't accept an x and y grid, so we must manually specify the extent [xmin, xmax,
ymin, ymax] of the image on the plot.

plt.imshow() by default follows the standard image array convention, where the origin is in the
upper left, not in the lower left. This must be changed when showing gridded data.

plt.imshow() will automatically adjust the axis aspect ratio to match the input data; we can change
this by calling plt.axis('image') to make x and y units match.
In [36]: plt.imshow(Z, extent=[0, 5, 0, 4],
                    origin='lower', cmap='autumn')
         plt.colorbar()
         plt.axis('image')

C:\Users\prash\AppData\Local\Temp\ipykernel_1112\3509406218.py:3: MatplotlibDeprecationWarning: Auto-removal of grids by pcolor() and pcolormesh() is deprecated since 3.5 and will be removed two minor releases later; please call grid(False) first.
  plt.colorbar()

Out[36]: (0.0, 5.0, 0.0, 4.0)

We can show more information by combining contour plots and image plots.

To achieve this, we use a partially transparent background image and over-plot contours, with
labels on the contours added using the plt.clabel() function:
In [37]: contours = plt.contour(X, Y, Z, 5, colors = 'black')
         plt.clabel(contours, inline = True, fontsize = 10)
         plt.imshow(Z, extent = [0, 5, 0, 4], origin = 'lower',
                    cmap = 'Dark2', alpha = 0.4)
         plt.colorbar()

C:\Users\prash\AppData\Local\Temp\ipykernel_1112\1712393327.py:5: MatplotlibDeprecationWarning: Auto-removal of grids by pcolor() and pcolormesh() is deprecated since 3.5 and will be removed two minor releases later; please call grid(False) first.
  plt.colorbar()

Out[37]: <matplotlib.colorbar.Colorbar at 0x19de7e51eb0>

Histograms: The plt.hist() function creates a histogram plot.

In [38]: myData = np.random.randn(1000)
         plt.hist(myData)

Out[38]: (array([ 12.,  15.,  64., 162., 212., 216., 201.,  84.,  23.,  11.]),
          array([-3.02858801, -2.42470463, -1.82082126, -1.21693789, -0.61305452,
                 -0.00917114,  0.59471223,  1.1985956 ,  1.80247897,  2.40636235,
                  3.01024572]),
          <BarContainer object of 10 artists>)

The plt.hist() function has many options to tune both the calculation and the display:
In [39]: plt.hist(myData, bins = 40,
                  alpha = 0.7,
                  histtype = 'bar',
                  color = 'steelblue',
                  edgecolor = 'red');

We can use the combination of histtype='stepfilled' and a transparency alpha to compare
histograms of several distributions:

In [40]: x1 = np.random.normal(0, 0.75, 500)
         x2 = np.random.normal(-2.5, 1.5, 600)
         x3 = np.random.normal(4, 1, 1000)
         kwargs = dict(histtype='stepfilled', alpha=0.2, bins=40)
         plt.hist(x1, **kwargs)
         plt.hist(x2, **kwargs)
         plt.hist(x3, **kwargs);

If we would like to simply count the number of points in each bin without displaying anything,
the np.histogram() function is available:
In [41]: counts, bin_edges = np.histogram(myData, bins = 4)
         print(counts)

[ 54 411 466  69]
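The two arrays returned by np.histogram() are related in a simple way, which the following sketch checks on seeded random data. The seed and the variable names are illustrative, so the counts will differ from the output shown above.

```python
import numpy as np

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible
data = rng.standard_normal(1000)

counts, bin_edges = np.histogram(data, bins=4)

# Every sample lands in exactly one bin, so the counts add up to the sample size.
total = counts.sum()

# n bins are delimited by n + 1 edges; by default the outer edges are the data range.
n_edges = len(bin_edges)
print(total, n_edges)
```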

2D histograms: We can create histograms in 2D by dividing points among 2D bins. First we
define x and y arrays drawn from a Gaussian distribution:

In [42]: mean = [0, 0]
         cov = [[1, 1], [1, 2]]
         x, y = np.random.multivariate_normal(mean, cov, 6000).T

To plot a 2D histogram we use the plt.hist2d() function:

In [43]: plt.hist2d(x, y, bins=25, cmap='BuGn')
         cb = plt.colorbar()
         cb.set_label('counts in bin')

C:\Users\prash\AppData\Local\Temp\ipykernel_1112\4248966252.py:1: MatplotlibDeprecationWarning: Auto-removal of grids by pcolor() and pcolormesh() is deprecated since 3.5 and will be removed two minor releases later; please call grid(False) first.
  plt.hist2d(x, y, bins=25, cmap='BuGn')
C:\Users\prash\AppData\Local\Temp\ipykernel_1112\4248966252.py:2: MatplotlibDeprecationWarning: Auto-removal of grids by pcolor() and pcolormesh() is deprecated since 3.5 and will be removed two minor releases later; please call grid(False) first.
  cb = plt.colorbar()

To compute the 2D binning without plotting it, np.histogram2d() can be used as follows:

In [44]: counts, xedges, yedges = np.histogram2d(x, y, bins = 25)

Hexagonal binnings: Matplotlib provides the plt.hexbin() routine, which represents a 2D
dataset binned within a grid of hexagons.
In [45]: plt.hexbin(x, y, gridsize=30, cmap='RdYlGn_r')
         cb = plt.colorbar(label='count in bin')

C:\Users\prash\AppData\Local\Temp\ipykernel_1112\3618383650.py:2: MatplotlibDeprecationWarning: Auto-removal of grids by pcolor() and pcolormesh() is deprecated since 3.5 and will be removed two minor releases later; please call grid(False) first.
  cb = plt.colorbar(label='count in bin')

Plot legends: Plot legends assign labels to the plot elements. A legend can be created with the
plt.legend() command.

In [46]: x = np.linspace(0, 12, 555)
         fig, ax = plt.subplots()
         ax.plot(x, np.sin(x), 'b--', label = 'Sine')
         ax.plot(x, np.cos(x), 'g:', label = 'Cosine')
         ax.axis('tight')
         leg = ax.legend();

We can specify the location and turn off the frame:

In [47]: ax.legend(loc = 'lower left', frameon = False)
         fig

Out[47]:

We can use the ncol keyword to specify the number of columns in the legend:
In [48]: ax.legend(frameon = True, loc = 'center', ncol = 2)
         fig

Out[48]:

We can use a rounded box with the fancybox keyword, add a shadow with
the shadow keyword, change the transparency of the frame with the framealpha keyword, and
change the padding around the text with the borderpad keyword:

In [49]: plt.style.use('classic')
         ax.legend(fancybox = True, framealpha = 0.5,
                   shadow = True, borderpad = 1)
         fig

Out[49]:

Multiple subplots: Subplots are groups of smaller axes that can exist together within a single
figure. These subplots might be insets, grids of plots, or other more complicated layouts.

The most basic way to create an axes is to use the plt.axes() function. By default this function
creates a standard axes object that fills the entire figure.

plt.axes() also takes an optional argument, which should be a list of four numbers
[left, bottom, width, height] in the figure coordinate system, which ranges from 0 at the bottom
left of the figure to 1 at the top right of the figure.
In [50]: plt.style.use('seaborn-white')  # renamed 'seaborn-v0_8-white' in Matplotlib 3.6

Example: To create an inset axes inside another axes, we set the x position to
0.5 (starting at 50% of the width of the figure) and the y position to 0.5 (50% of the height of the
figure), with an x extent of 0.2 and a y extent of 0.2 (the size of the axes is 20% of the width and
20% of the height of the figure):

In [91]: ax1 = plt.axes()  # standard axes
         ax2 = plt.axes([0.5, 0.5, 0.2, 0.2])

Equivalently, we can use the object-oriented interface via fig.add_axes().

Example: To create two vertically stacked axes:

In [52]: fig = plt.figure()
         ax1 = fig.add_axes([0.2, 0.5, 0.9, 0.4],
                            xticklabels = [], ylim = (-1.2, 1.2))
         ax2 = fig.add_axes([0.2, 0.1, 0.9, 0.4],
                            ylim = (-1.2, 1.2))
         x = np.linspace(0, 10)
         ax1.plot(np.sin(2*x))
         ax2.plot(np.cos(3*x));
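To make the order of the four numbers concrete, the sketch below places an inset with fig.add_axes() and reads its position back from the axes object. The rectangle values and the names main_ax and inset_ax are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure()

# Rectangles are given as [left, bottom, width, height] in figure coordinates (0 to 1).
main_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
inset_ax = fig.add_axes([0.65, 0.65, 0.2, 0.2])

# get_position() returns the bounding box; .bounds unpacks it in the same order.
left, bottom, width, height = inset_ax.get_position().bounds
print(round(left, 2), round(bottom, 2), round(width, 2), round(height, 2))
```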

Grids of subplots: Aligned columns or rows of subplots can be created in Matplotlib by using
plt.subplot(), which creates a single subplot within a grid.

This command takes three integer arguments: the number of rows, the number of columns, and
the index of the plot to be created in this scheme, which runs from the upper left to the bottom right.
In [53]: for i in range(1, 9):
             plt.subplot(4, 2, i)
             plt.text(0.5, 0.5, str((4, 2, i)),
                      fontsize = 18, ha = 'center')

The command plt.subplots_adjust() can be used to adjust the spacing between these plots.

Example: Specifying the spacing by using the object-oriented commands:

In [54]: fig = plt.figure()
         fig.subplots_adjust(hspace = 0.5, wspace = 0.5)
         for i in range(1, 9):
             ax = fig.add_subplot(4, 2, i)
             ax.text(0.5, 0.5, str((4, 2, i)),
                     fontsize = 18, ha = 'center')

Complete grid at once: The function plt.subplots() is the easier tool to use; it creates a full
grid of subplots in a single line and returns them in a NumPy array.
The arguments are the number of rows and the number of columns, plus the optional keywords sharex
and sharey, which let us specify the relationships between different axes.

Example: To create a 5 by 4 grid of subplots where all axes in the same row share their y-axis
scale and all axes in the same column share their x-axis scale.

In [55]: fig, ax = plt.subplots(5, 4, sharex='col', sharey='row')

The resulting grid of axes instances is returned within a NumPy array, allowing convenient
specification of the desired axes using standard array indexing notation.

The axes are in a two-dimensional array, indexed by [row, col]:

In [56]: for i in range(5):
             for j in range(4):
                 ax[i, j].text(0.5, 0.5, str((i, j)),
                               fontsize=18, ha='center')
         fig

Out[56]:
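One way to see what sharex='col' and sharey='row' do is to change the limits on a single axes and read them back from its neighbours. The sketch below does this on a smaller 2 by 3 grid; the grid size and limit values are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots(2, 3, sharex='col', sharey='row')

# Changing the limits on one axes...
ax[0, 0].set_xlim(0, 5)    # x is shared down column 0
ax[0, 0].set_ylim(-1, 1)   # y is shared along row 0

# ...is reflected on the axes that share that axis with it.
same_column_xlim = ax[1, 0].get_xlim()
same_row_ylim = ax[0, 2].get_ylim()

# An axes in another row and column keeps its own default limits.
other_xlim = ax[1, 1].get_xlim()
print(same_column_xlim, same_row_ylim, other_xlim)
```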

Complex arrangements: To draw subplots that span multiple rows and columns, the
function plt.GridSpec() is used.

Example: a gridspec for a grid of 4 rows and 4 columns with some specified width and height
space looks like this:
In [57]: grid = plt.GridSpec(4, 4, wspace = 0.4, hspace = 0.3)
         plt.subplot(grid[0, :3])
         plt.subplot(grid[:, 3])
         plt.subplot(grid[1:, 0])
         plt.subplot(grid[1:3, 1:3])
         plt.subplot(grid[3, 1:3])

Out[57]: <AxesSubplot:>
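The same layout can be written in the object-oriented style. The following is a sketch that assumes a Matplotlib version providing fig.add_gridspec() (3.1 or later); the variable names for the five panels are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure()
grid = fig.add_gridspec(4, 4, wspace=0.4, hspace=0.3)

# Slices of the grid select the cells that each axes spans.
top = fig.add_subplot(grid[0, :3])        # first row, first three columns
right = fig.add_subplot(grid[:, 3])       # full-height last column
left = fig.add_subplot(grid[1:, 0])       # rows 1 to 3 of the first column
centre = fig.add_subplot(grid[1:3, 1:3])  # 2 by 2 block in the middle
bottom = fig.add_subplot(grid[3, 1:3])    # last row, middle two columns

n_axes = len(fig.axes)
print(n_axes)
```

Keeping a name for each axes makes it straightforward to plot into a specific panel later, for example top.plot(...).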

Stylesheets: As already discussed, we can check the available styles by using
plt.style.available and switch to a selected one using plt.style.use().

However, this changes the plot style for the rest of the session. To avoid that, we use
plt.style.context(), which sets a style temporarily:

In [58]: def myPlots():
             fig, ax = plt.subplots(1, 2, figsize = (10, 5))
             ax[0].hist(np.random.randn(500))
             for i in range(3):
                 ax[1].plot(np.random.rand(5))
             ax[1].legend(['AA', 'BB', 'CC'])

Matplotlib default style: Reset the runtime configuration first:

In [59]: plt.rcdefaults()  # restore Matplotlib's built-in default rcParams
         myPlots()

Set the plot styles temporarily:

In [60]: with plt.style.context('fivethirtyeight'):
             myPlots()

In [61]: with plt.style.context('ggplot'):
             myPlots()

In [62]: with plt.style.context('bmh'):
             myPlots()

In [63]: with plt.style.context('dark_background'):
             myPlots()

In [64]: with plt.style.context('grayscale'):
             myPlots()
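To confirm that a style context really is temporary, we can watch a single rcParams entry before, inside and after the with block. The choice of 'axes.facecolor' as the entry to inspect is an arbitrary assumption made for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

key = "axes.facecolor"  # any setting that the chosen style overrides would do

before = plt.rcParams[key]
with plt.style.context("ggplot"):
    inside = plt.rcParams[key]   # value while the ggplot style is active
after = plt.rcParams[key]        # value once the block has ended

print(before, inside, after)
```

In contrast, calling plt.style.use('ggplot') would leave the changed value in place for the rest of the session.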

3D plots: We enable 3D plotting by importing the mplot3d toolkit, which is included with the main
Matplotlib installation.
We can then create a 3D axes by passing the keyword projection='3d' to any of the normal axes
creation routines.

In [65]: from mpl_toolkits import mplot3d
         fig = plt.figure()
         ax = plt.axes(projection='3d')

A 3D line or scatter plot can be created from sets of (x, y, z) triples by
using the ax.plot3D() and ax.scatter3D() functions.
In [66]: ax = plt.axes(projection='3d')

         # Data for a three-dimensional line
         zline = np.linspace(0, 10, 100)
         xline = np.sin(zline)
         yline = np.cos(zline)

         ax.plot3D(xline, yline, zline, 'gray')

         # Data for three-dimensional scattered points
         zdata = 10 * np.random.random(50)
         xdata = np.sin(zdata) + 0.1 * np.random.randn(50)
         ydata = np.cos(zdata) + 0.1 * np.random.randn(50)

         ax.scatter3D(xdata, ydata, zdata, c=zdata, cmap='Blues');

3D contour plot: A 3D contour plot can be created by using the ax.contour3D() function.

In [67]: def myFunction(x, y):
             return np.sin(np.sqrt(x ** 2 + y ** 2))

         x = np.linspace(-5, 5, 50)
         y = np.linspace(-5, 5, 50)
         X, Y = np.meshgrid(x, y)
         Z = myFunction(X, Y)

         fig = plt.figure()
         ax = plt.axes(projection='3d')
         ax.contour3D(X, Y, Z, 50, cmap='binary')
         ax.set_xlabel('x')
         ax.set_ylabel('y')
         ax.set_zlabel('z');

The viewing angle can be set by using ax.view_init(), which sets the elevation above the xy plane
and the azimuth about the z axis, both in degrees.
In [68]: ax.view_init(65, 25)
         fig

Out[68]:
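The angles passed to ax.view_init() can also be given by keyword and read back from the axes afterwards. The sketch below assumes a Matplotlib version in which projection='3d' is available without an explicit mplot3d import (3.2 or later).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(projection='3d')

# elev is the angle above the xy plane, azim the rotation about the z axis (degrees).
ax.view_init(elev=65, azim=25)

# The current viewing angles are stored on the axes.
print(ax.elev, ax.azim)
```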

Wireframes: A wireframe plot can be created by using the ax.plot_wireframe() function.

In [69]: fig = plt.figure()
         ax = plt.axes(projection='3d')
         ax.plot_wireframe(X, Y, Z, color='red')

Out[69]: <mpl_toolkits.mplot3d.art3d.Line3DCollection at 0x19dea207fd0>

Surface plot: A surface plot can be created by using the ax.plot_surface() function.

In [70]: ax = plt.axes(projection='3d')
         ax.plot_surface(X, Y, Z, rstride=1, cstride=1,
                         cmap='magma', edgecolor='none')
         ax.set_title('surface');

Seaborn: Advanced data visualization tool

With Matplotlib alone, visualizing data from a Pandas DataFrame means extracting each Series and
often concatenating them into the right format. The seaborn plotting library can instead use
the DataFrame labels directly in a plot.

Matplotlib vs Seaborn:
In [71]: x = np.linspace(0, 10, 500)
         y = np.cumsum(np.random.randn(500, 4), 0)
         plt.plot(x, y)
         plt.legend('WXYZ', ncol=2, loc='upper left');

Let's bring in seaborn and set the styling using the sb.set() method.
In [72]: import seaborn as sb
         sb.set()
         plt.plot(x, y)
         plt.legend('WXYZ', ncol=2, loc='upper left');

Histograms:
In [73]: import pandas as pd
         data = np.random.multivariate_normal([0, 0], [[5, 2], [2, 2]], size=2000)
         data = pd.DataFrame(data, columns=['x', 'y'])
         for col in 'xy':
             plt.hist(data[col], alpha=0.4)

We can get a smooth estimate of the distribution using kernel density estimation with
sb.kdeplot():
In [74]: for col in 'xy':
             sb.kdeplot(data[col], fill = False)

Histograms and KDE can be combined using sb.histplot() with kde=True. The older sb.distplot() is
deprecated; seaborn recommends displot (a figure-level function) or histplot (an axes-level
function) instead:

In [75]: sb.histplot(data['x'], kde = True, stat = 'density')
         sb.histplot(data['y'], kde = True, stat = 'density')

Out[75]: <AxesSubplot:xlabel='y', ylabel='Density'>


If we pass both columns of the 2D dataset to kdeplot, we get a 2D visualization of the data:

In [76]: sb.kdeplot(x = data.x, y = data.y)

Out[76]: <AxesSubplot:xlabel='x', ylabel='y'>

We can see the joint distribution and the marginal distributions together using sb.jointplot().
In [77]: with sb.axes_style('white'):
             sb.jointplot(x = "x", y = "y", data = data)

There are other parameters that can be passed to jointplot. For example, we can use a
hexagonally based histogram:
In [78]: with sb.axes_style('white'):
             sb.jointplot(x = "x", y = "y", data = data, kind = 'hex')

Pairplots: When we generalize joint plots to datasets of larger dimensions, we end up with pair
plots.

This is very useful for exploring correlations in multidimensional data, when we would like to plot
all pairs of values against each other.
In [79]: irisData = sb.load_dataset("iris")
         irisData.head(10)

Out[79]:
   sepal_length  sepal_width  petal_length  petal_width species
0           5.1          3.5           1.4          0.2  setosa
1           4.9          3.0           1.4          0.2  setosa
2           4.7          3.2           1.3          0.2  setosa
3           4.6          3.1           1.5          0.2  setosa
4           5.0          3.6           1.4          0.2  setosa
5           5.4          3.9           1.7          0.4  setosa
6           4.6          3.4           1.4          0.3  setosa
7           5.0          3.4           1.5          0.2  setosa
8           4.4          2.9           1.4          0.2  setosa
9           4.9          3.1           1.5          0.1  setosa

In [80]: sb.pairplot(irisData, hue = 'species', height = 2);

Faceted histograms: Sometimes the best way to view data is via histograms of subsets.
Seaborn's FacetGrid makes this extremely simple.
In [81]: tipsData = sb.load_dataset('tips')
         tipsData.head(10)

Out[81]:
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
5       25.29  4.71    Male     No  Sun  Dinner     4
6        8.77  2.00    Male     No  Sun  Dinner     2
7       26.88  3.12    Male     No  Sun  Dinner     4
8       15.04  1.96    Male     No  Sun  Dinner     2
9       14.78  3.23    Male     No  Sun  Dinner     2

In [82]: tipsData['tip_pct'] = 100 * tipsData['tip'] / tipsData['total_bill']

         grid = sb.FacetGrid(tipsData, row = "sex", col = "time",
                             margin_titles = True)
         grid.map(plt.hist, "tip_pct", bins = np.linspace(0, 45, 20));

Factor plots: Factor plots allow us to view the distribution of a parameter within bins defined by
any other parameter.
In [83]: with sb.axes_style(style = 'ticks'):
             g = sb.catplot(x = "day", y = "total_bill", hue = "sex",
                            data = tipsData, kind="box")
             g.set_axis_labels("Day", "Total Bill")

Joint distributions: We can use sb.jointplot() to show the joint distribution between different
variables, along with the associated marginal distributions:
In [84]: with sb.axes_style('white'):
             sb.jointplot(x = "total_bill", y = "tip",
                          data = tipsData, kind='hex')

The joint plot can even do automatic kernel density estimation and regression:
In [85]: sb.jointplot(x = "total_bill", y = "tip",
                      data = tipsData, kind = 'reg')

Out[85]: <seaborn.axisgrid.JointGrid at 0x19def2956a0>


Course instructor: Prashanth Kannadaguli
Corporate Technical Trainer & Freelancer
