Data Visualisation

The document discusses the importance of data visualization in data analysis, highlighting Python's matplotlib library for creating various types of plots. It covers how to create figures, subplots, bar plots, histograms, density plots, and scatter plots using matplotlib and pandas. Additionally, it introduces seaborn for enhanced visualizations, including pair plots for exploratory data analysis.

Uploaded by

hacksbank.net

DATA VISUALIZATION

Presenting information in clear visualizations is one of the most important tasks in data analysis. It
may be a part of the exploratory process, for example, to help identify outliers or needed data
transformations, or a way of generating ideas for models.

Python has many libraries for making static or dynamic visualizations. matplotlib is a desktop
plotting package designed for creating publication-quality plots. matplotlib supports various GUI
backends on all operating systems and can export visualizations to all the common
graphics formats (PDF, SVG, JPG, PNG, BMP, GIF, etc.). matplotlib has several add-on toolkits
for data visualization that use matplotlib for their underlying plotting. One of these is seaborn.

The matplotlib API in action

import matplotlib.pyplot as plt

import numpy as np
data = np.arange(10)
data
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
plt.plot(data)
[<matplotlib.lines.Line2D object at 0x000001F6BD392F10>]
plt.show()

Simple Line Figure

Figures and Subplots

Plots in matplotlib reside within a Figure object. You can create a new figure with plt.figure:
fig = plt.figure()

plt.figure has a number of options; notably, figsize sets the size (width and height in inches), and
therefore the aspect ratio, that the figure will have when saved to disk.
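As a minimal sketch of how figsize behaves, the snippet below creates an 8 x 4 inch figure and saves it at 100 dots per inch. The sizes, the plotted values, and the in-memory buffer are illustrative assumptions, not part of the original notes; the Agg backend is selected only so that the sketch runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

# figsize is (width, height) in inches, so 8 x 4 gives a 2:1 aspect ratio
fig = plt.figure(figsize=(8, 4))
ax = fig.add_subplot(1, 1, 1)
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

width, height = fig.get_size_inches()
print(width, height)

# Saving at dpi=100 writes the 8 x 4 inch figure as an 800 x 400 pixel image.
# A file path such as 'figure.png' can be passed instead of the buffer.
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100)
```

The same figsize and dpi arguments apply when saving to a file path rather than a buffer.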

You can’t make a plot with a blank figure. You must create one or more subplots using
add_subplot:

fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax1.hist(np.random.randn(100), bins=20, color='k', alpha=0.3)
ax2.scatter(np.arange(30), np.arange(30) + 3 * np.random.randn(30))
ax3.plot(np.random.randn(50).cumsum(), 'k--')  # plot a random walk as a dashed black line
fig.savefig('figpath.svg')  # save the figure
plt.show()  # display on the screen
Figure with a 2 x 2 subplot grid, three cells of which are drawn
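The three arguments to add_subplot are the number of rows, the number of columns, and the position of the subplot in that grid. A small sketch of the numbering, using made-up titles and the Agg backend so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

fig = plt.figure()

# add_subplot(nrows, ncols, index): index starts at 1 and runs
# left to right along each row, then down to the next row
positions = (1, 2, 3)
axes = [fig.add_subplot(2, 2, i) for i in positions]
for i, ax in zip(positions, axes):
    ax.set_title(f"subplot {i}")

# Only three of the four grid cells were requested,
# so the bottom-right cell of the figure stays empty
print(len(fig.axes))
```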

plt.plot([1.5, 3.5, -2, 1.6])

You can use matplotlib with pandas and numpy

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10, 4).cumsum(0),
                  columns=['A', 'B', 'C', 'D'], index=np.arange(0, 100, 10))
df.plot()
plt.show()

Bar Plots

The plot.bar() and plot.barh() methods create vertical and horizontal bar plots, respectively. In this case,
the Series or DataFrame index is used as the x (bar) or y (barh) ticks:

fig, axes = plt.subplots(2, 1)

data = pd.Series(np.random.rand(16), index=list('abcdefghijklmnop'))
data.plot.bar(ax=axes[0], color='k', alpha=0.7)
data.plot.barh(ax=axes[1], color='k', alpha=0.7)
plt.show()
The options color='k' and alpha=0.7 set the bar color to black and the opacity to 0.7 (partially
transparent), respectively.

With a DataFrame, bar plots group the values in each row together, drawing one bar per column
side by side within each group.

df = pd.DataFrame(np.random.rand(6, 4),
                  index=['one', 'two', 'three', 'four', 'five', 'six'],
                  columns=pd.Index(['A', 'B', 'C', 'D'], name='Genus'))
df
Genus A B C D
one 0.036514 0.306275 0.020181 0.385821
two 0.800127 0.268534 0.056382 0.447547
three 0.878521 0.375050 0.025206 0.566525
four 0.235946 0.230388 0.730691 0.734806
five 0.914254 0.701028 0.626693 0.878269
six 0.076666 0.765472 0.211579 0.176230
df.plot.bar()
plt.show()
DataFrame bar plot

Note that the name “Genus” on the DataFrame’s columns is used to title the legend.

You can create stacked bar plots from a DataFrame by passing stacked=True, so that the
values in each row are stacked together in a single bar.

df.plot.barh(stacked=True, alpha=0.5)
plt.show()

Stacked bar plot created from a DataFrame of random data
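Stacked bars are often easier to compare when every bar has the same total length. One possible approach, sketched below, is to divide each row by its sum before plotting. The seeded random data and the name shares are illustrative assumptions rather than part of the original notes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import numpy as np
import pandas as pd

# Seeded random data with the same shape and labels as the DataFrame used above
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((6, 4)),
                  index=['one', 'two', 'three', 'four', 'five', 'six'],
                  columns=pd.Index(['A', 'B', 'C', 'D'], name='Genus'))

# Divide each row by its own total so the segments of every bar sum to 1
shares = df.div(df.sum(axis=1), axis=0)

ax = shares.plot.barh(stacked=True, alpha=0.5)
ax.set_xlabel('share of row total')
```

Every bar then spans the same length, so differences in the relative contribution of each column are easier to see.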

Histograms and Density Plots

A histogram is a kind of bar plot that gives a discretized display of value frequency. The data
points are split into discrete, evenly spaced bins, and the number of data points in each bin is
plotted. Using the current CGPA of students at a given level in SAZU, we can make a histogram of
CGPA grouped into bins of width 0.5 using the plot.hist method on the Series:

import pandas as pd
import numpy as np
import matplotlib.pyplot as ppl

# Load the student records and keep the two columns of interest
students = pd.read_csv('C:/Users/adamn/Downloads/STATISTICAL_DATA/main.csv')
data = students[['cgpa', 'score']]
# Bin edges run from 0.5 to 5.0 in steps of 0.5
cgpa_hist = data['cgpa'].plot.hist(bins=[0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0])
ppl.show()

Histogram showing students’ CGPA distribution
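To see what the histogram is counting, the binning step can be reproduced with np.histogram. This is only a sketch: the twelve CGPA values below are made up for illustration and are not taken from the students file.

```python
import numpy as np
import pandas as pd

# Hypothetical CGPA values standing in for the 'cgpa' column
cgpa = pd.Series([0.9, 1.2, 1.7, 2.1, 2.4, 2.4, 2.8, 3.1, 3.3, 3.6, 4.2, 4.8])

# Same bin edges as in the plot above: 0.5, 1.0, ..., 5.0
bins = np.arange(0.5, 5.5, 0.5)

# counts[i] is the number of values between edges[i] and edges[i + 1];
# every bin excludes its right edge except the last one, which includes it
counts, edges = np.histogram(cgpa, bins=bins)
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {n}")

print(counts.tolist())  # [1, 1, 1, 3, 1, 2, 1, 1, 1]
```

The bar heights drawn by plot.hist with the same bins correspond to these counts. Values outside the range of the bin edges are not counted.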

A related plot type is a density plot, which is formed by computing an estimate of a continuous
probability distribution that might have generated the observed data. The usual procedure is to
approximate this distribution as a mixture of “kernels”—that is, simpler distributions like the
normal distribution. Thus, density plots are also known as kernel density estimate (KDE) plots.
Calling plot.density (an alias for plot.kde) makes a density plot using the conventional
mixture-of-normals estimate:
cgpa_density = data['cgpa'].plot.density()
ppl.show()
Density Plot for the same students’ CGPA scores
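The mixture-of-normals idea can be written out by hand: place one normal density on each observation and average them. The sketch below does this with NumPy. The function name gaussian_kde_sketch, the sample values, and the fixed bandwidth of 0.4 are assumptions made for illustration; pandas delegates the estimate to SciPy, which chooses the bandwidth automatically.

```python
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Average one normal density (a kernel) centered on each sample."""
    samples = np.asarray(samples, dtype=float)
    # z[i, j] is the standardized distance from grid point i to sample j
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)

# Hypothetical CGPA values, not taken from the students file
cgpa = np.array([0.9, 1.2, 1.7, 2.1, 2.4, 2.4, 2.8, 3.1, 3.3, 3.6, 4.2, 4.8])
grid = np.linspace(-2, 8, 1001)
density = gaussian_kde_sketch(cgpa, grid, bandwidth=0.4)

# A probability density integrates to 1; check with a simple Riemann sum
area = density.sum() * (grid[1] - grid[0])
print(round(area, 3))
```

A smaller bandwidth gives a bumpier curve that follows individual points, while a larger one gives a smoother curve.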

Scatter or Point Plots

Point plots or scatter plots can be a useful way of examining the relationship between two
one-dimensional data series. For example, here we load a dataset containing, among other variables,
students’ JAMB scores and current CGPA, select the two variables, then plot a scatter diagram of
CGPA against JAMB score:

import pandas as pd
import numpy as np
import matplotlib.pyplot as ppl
import seaborn as sns

students = pd.read_csv('C:/Users/adamn/Downloads/STATISTICAL_DATA/main_data.csv')
data = students[['cgpa', 'score']]
data
cgpa score
0 4.76 87
1 3.36 47
2 1.89 13
3 2.92 12
4 3.29 53
.. ... ...
734 3.16 26
735 1.49 36
736 1.87 14
737 1.68 17
738 2.69 22
We can then use seaborn’s regplot function, which draws the scatter plot and at the same time fits a
linear regression line:
sns.regplot(x='score', y='cgpa', data=data)
ppl.title('JAMB Score vs CGPA')
ppl.xlabel('JAMB Score')
ppl.ylabel('CGPA')
ppl.show()  # displays the plot

[Figure panels: University-wide; Bauchi State Students; Other States]
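The straight line that a linear regression draws through a scatter plot can also be computed directly as a least-squares fit. The sketch below uses np.polyfit on synthetic data. The generating slope of 0.03, the intercept of 1.5, and the noise level are assumptions chosen for illustration and say nothing about the real student data.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: exam scores, and a CGPA that rises with score plus noise
score = rng.uniform(0, 100, size=200)
cgpa = 1.5 + 0.03 * score + rng.normal(0, 0.3, size=200)

# A degree-1 polynomial fit is an ordinary least-squares straight line;
# polyfit returns the coefficients from highest degree to lowest
slope, intercept = np.polyfit(score, cgpa, deg=1)
print(f"cgpa ~ {intercept:.2f} + {slope:.3f} * score")

# Predicted CGPA for a score of 50, read off the fitted line
predicted = intercept + slope * 50
```

The fitted coefficients land close to the values used to generate the data, and would tighten further with more points or less noise.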

In exploratory data analysis it’s helpful to be able to look at all the scatter plots among a group of
variables; this is known as a pairs plot or scatter plot matrix. Making such a plot from scratch is a
bit of work, so seaborn has a convenient pairplot function, which supports placing histograms or
density estimates of each variable along the diagonal:

sns.pairplot(data, diag_kind='kde', plot_kws={'alpha': 0.2})
ppl.show()

Check out the seaborn.pairplot docstring for more granular configuration options.
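For a sense of what pairplot automates, here is a rough from-scratch sketch of a scatter plot matrix in plain matplotlib, with histograms on the diagonal. The synthetic score and cgpa columns are assumptions standing in for the student data, and this is not how seaborn implements the function.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for the two student columns
score = rng.uniform(0, 100, size=150)
columns = {
    "score": score,
    "cgpa": 1.5 + 0.03 * score + rng.normal(0, 0.3, size=150),
}
names = list(columns)
n = len(names)

# One row and one column of axes per variable
fig, axes = plt.subplots(n, n, figsize=(6, 6), squeeze=False)
for i, row_name in enumerate(names):
    for j, col_name in enumerate(names):
        ax = axes[i, j]
        if i == j:
            # Diagonal: distribution of a single variable
            ax.hist(columns[col_name], bins=20, color='k', alpha=0.3)
        else:
            # Off-diagonal: column variable on x, row variable on y
            ax.scatter(columns[col_name], columns[row_name], alpha=0.2)
        if i == n - 1:
            ax.set_xlabel(col_name)
        if j == 0:
            ax.set_ylabel(row_name)
```

With more variables the grid grows as n by n, which is where a ready-made function saves the most effort.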
