Machine Learning Mastery

Deep Learning with Python
14-Day Mini-Course

Authors: Jason Brownlee, Adrian Tam, Zhe Ming Chng
Disclaimer
The information contained within this eBook is strictly for educational purposes. If you wish to
apply ideas contained in this eBook, you are taking full responsibility for your actions.
The author has made every effort to ensure that the information within this book was correct at
the time of publication. The author does not assume and hereby disclaims any liability to any
party for any loss, damage, or disruption caused by errors or omissions, whether such errors or
omissions result from accident, negligence, or any other cause.
No part of this eBook may be reproduced or transmitted in any form or by any means, electronic or
mechanical, recording or by any information storage and retrieval system, without written
permission from the author.
Credits
Authors: Jason Brownlee, Adrian Tam, Zhe Ming Chng
Technical Reviewers: Darci Heikkinen, Jerry Yiu, Amy Lam
Copyright
Deep Learning with Python, Second Edition
© 2016–2022 MachineLearningMastery.com. All Rights Reserved.
Edition: v2.00
Deep learning is a fascinating field of study and the techniques are achieving world-class results
in a range of challenging machine learning problems. It can be hard to get started in deep
learning. Which library should you use and which techniques should you focus on?
In this 14-part crash course you will discover applied deep learning in Python with the
easy-to-use and powerful Keras library. This mini-course is intended for Python machine
learning practitioners who are already comfortable with scikit-learn on the SciPy ecosystem
for machine learning. Let's get started.
This is a long and useful guide. You might want to print it out. You can complete the lessons
as quickly or as slowly as you like. A comfortable schedule may be to complete one lesson per
day over a two-week period. Highly recommended. The topics you will cover over the next 14
lessons are as follows:
- Lesson 1: Introduction to TensorFlow.
- Lesson 2: Introduction to Keras.
- Lesson 3: Crash Course in Multilayer Perceptrons.
- Lesson 4: Develop Your First Neural Network in Keras.
- Lesson 5: Use Keras Models with scikit-learn.
- Lesson 6: Plot Model Training History.
- Lesson 7: Save Your Best Model During Training with Checkpointing.
- Lesson 8: Convergence and Activation Functions.
- Lesson 9: Reduce Overfitting with Dropout Regularization.
- Lesson 10: Lift Performance with Learning Rate Schedules.
- Lesson 11: Crash Course in Convolutional Neural Networks.
- Lesson 12: Handwritten Digit Recognition.
- Lesson 13: Object Recognition in Small Photographs.
- Lesson 14: Improve Generalization with Data Augmentation.
This is going to be a lot of fun. You're going to have to do some work though: a little
reading, a little research, and a little programming. You want to learn deep learning, right?
Here's a tip: all of the answers to these lessons can be found on this blog:
http://MachineLearningMastery.com. Use the search feature.
If you would like me to step you through each lesson in great detail (and much more), take a
look at my book: Deep Learning with Python, Second Edition:
[Book cover: Deep Learning with Python: Develop Deep Learning Models with TensorFlow and Keras, by Jason Brownlee, Adrian Tam, and Zhe Ming Chng]
Lesson 01: Introduction to TensorFlow

A small example of a TensorFlow program that you can use as a starting point is listed below:
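As a minimal sketch, assuming TensorFlow 2.x (where operations execute eagerly), such a program might look like this:

import tensorflow as tf

# Define two constant tensors
a = tf.constant(10.0)
b = tf.constant(32.0)

# Operations execute immediately in TensorFlow 2.x, so the sum is available right away
c = a + b
print(c.numpy())  # prints 42.0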
Lesson 02: Introduction to Keras

Start to familiarize yourself with the Keras library in preparation for the upcoming lessons,
where we will implement our first model. You can learn more about the Keras library on the
Keras homepage: http://keras.io/
Lesson 03: Crash Course in Multilayer Perceptrons
Artificial neural networks are a fascinating area of study, although they can be intimidating
when just getting started. The field of artificial neural networks is often just called neural
networks or Multilayer Perceptrons, after perhaps the most useful type of neural network. The
building blocks of neural networks are artificial neurons. These are simple computational units
that have weighted input signals and produce an output signal using an activation function.
Neurons are arranged into networks of neurons. A row of neurons is called a layer and
one network can have multiple layers. The architecture of the neurons in the network is often
called the network topology. Once configured, the neural network needs to be trained on
your dataset. The classical and still preferred training algorithm for neural networks is called
stochastic gradient descent.
[Figure 03.1: Model of a simple neuron, showing inputs, weights, an activation function, and outputs]
Your goal for this lesson is to become familiar with neural network terminology. Dig a
little deeper into terms like neuron, weights, activation function, learning rate and more.
Lesson 04: First Neural Net in Keras
Keras allows you to develop and evaluate deep learning models in very few lines of code. In this
lesson your goal is to develop your first neural network using the Keras library. Use a standard
binary (two-class) classification dataset from the UCI Machine Learning Repository, like the
Pima Indians (https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv)
or the ionosphere (https://archive.ics.uci.edu/ml/datasets/Ionosphere) datasets. Piece together
code to achieve the following:
1. Load your dataset using NumPy or Pandas.
2. Define your neural network model and compile it.
3. Fit your model to the dataset.
4. Estimate the performance of your model on unseen data.
To give you a massive kick start, below is a complete working example that you can use
as a starting point. It assumes that you have downloaded the Pima Indians dataset to your
current working directory with the filename pima-indians-diabetes.csv.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# Load the dataset
dataset = np.loadtxt("pima-indians-diabetes.csv", delimiter=",")
X = dataset[:,0:8]
Y = dataset[:,8]
# Define and Compile
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, epochs=150, batch_size=10)
# Evaluate the model
scores = model.evaluate(X, Y)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
Now develop your own model on a different dataset, or adapt this example.
Lesson 05: Use Keras Models with scikit-learn
The scikit-learn library is a general purpose machine learning framework in Python built on
top of SciPy. Scikit-learn excels at tasks such as evaluating model performance and optimizing
model hyperparameters in just a few lines of code. The package SciKeras provides a wrapper
class that allows you to use your deep learning models with scikit-learn. For example, an
instance of the KerasClassifier class in SciKeras can wrap your deep learning model and be used
as an Estimator in scikit-learn.
When using the KerasClassifier class, you must provide a function that the
class can use to create and compile your model. You can also pass additional parameters to
the constructor of the KerasClassifier class that will be passed to the model.fit() call later,
like the number of epochs and batch size. In this lesson your goal is to develop a deep learning
model and evaluate it using k-fold cross-validation. For example, you can define an instance
of the KerasClassifier and the custom function to create your model as follows:
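A minimal sketch, assuming the 8-input Pima Indians data (X, Y) from Lesson 4; the layer sizes, epochs, and batch size here are illustrative:

from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Function that creates and compiles the model; SciKeras calls it to build the network
def create_model():
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# Wrap the model as a scikit-learn estimator; epochs and batch_size are
# passed through to model.fit()
model = KerasClassifier(model=create_model, epochs=150, batch_size=10, verbose=0)

# Evaluate with 10-fold cross-validation using scikit-learn
kfold = StratifiedKFold(n_splits=10, shuffle=True)
results = cross_val_score(model, X, Y, cv=kfold)
print(results.mean())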
Learn more about using your Keras deep learning models with scikit-learn on the SciKeras
webpage: https://www.adriangb.com/scikeras/stable/
Lesson 06: Plot Model Training History
You can learn a lot about neural networks and deep learning models by observing their
performance over time during training. Keras provides the capability to register callbacks
when training a deep learning model. One of the default callbacks that is registered when
training all deep learning models is the History callback. It records training metrics for each
epoch. This includes the loss and the accuracy (for classification problems) as well as the loss
and accuracy for the validation dataset, if one is set.
The history object is returned from calls to the fit() function used to train the model.
Metrics are stored in a dictionary in the history member of the object returned. Your goal for
this lesson is to investigate the history object and create plots of model performance during
training. For example, you can print the list of metrics collected by your history object as
follows:
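A sketch, assuming a compiled model and the (X, Y) data from Lesson 4; the validation_split value is illustrative:

# Train the model and keep the returned History object
history = model.fit(X, Y, validation_split=0.33, epochs=150, batch_size=10, verbose=0)

# List the metrics recorded at each epoch
print(history.history.keys())
# e.g. dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])

# Plot training and validation accuracy over epochs
import matplotlib.pyplot as plt
plt.plot(history.history['accuracy'], label='train')
plt.plot(history.history['val_accuracy'], label='validation')
plt.legend()
plt.show()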
You can learn more about the History object and the callback API in Keras:
https://keras.io/api/callbacks/ and https://keras.io/guides/writing_your_own_callbacks/
Lesson 07: Save Your Best Model During Training with Checkpointing
Application checkpointing is a fault tolerance technique for long running processes. The Keras
library provides a checkpointing capability via a callback API. The ModelCheckpoint callback
class allows you to define where to checkpoint the model weights, how the file should be
named and under what circumstances to make a checkpoint of the model. Checkpointing can
be useful to keep track of the model weights in case your training run is stopped prematurely.
It is also useful to keep track of the best model observed during training.
In this lesson, your goal is to use the ModelCheckpoint callback in Keras to keep track
of the best model observed during training. You could define a ModelCheckpoint that saves
network weights to the same file each time an improvement is observed. For example:
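A sketch; the filename and the monitored metric are illustrative, and a compiled model and data as in Lesson 4 are assumed:

from tensorflow.keras.callbacks import ModelCheckpoint

# Overwrite a single weights file each time the validation accuracy improves
checkpoint = ModelCheckpoint('weights.best.h5', monitor='val_accuracy',
                             save_best_only=True, save_weights_only=True,
                             mode='max', verbose=1)
model.fit(X, Y, validation_split=0.33, epochs=150, batch_size=10,
          callbacks=[checkpoint], verbose=0)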
You can learn more about the ModelCheckpoint callback in Keras:
https://keras.io/api/callbacks/model_checkpoint/
Lesson 08: Convergence and Activation Functions
A neural network is just like any other machine learning model: it needs data to train on, and
it is useful only after it is trained. We use the gradient descent algorithm for training. The
gradient depends on the activation function we use, so changing the activation function may
cause training to go slower or faster. In the worst case, you may encounter the problem of
vanishing gradients or exploding gradients. The former is when the gradient is virtually zero,
so training cannot make any progress; the latter is when the gradient is so large that training
becomes unstable and the model cannot converge.
Often, the rectified linear unit (ReLU) is a good choice of activation function, but
historically the sigmoid and hyperbolic tangent (tanh) functions were used as well. You can
switch the activation function in each layer by specifying the activation argument:
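For example (a sketch; the layer size is illustrative):

from tensorflow.keras.layers import Dense

# The same Dense layer with three different activation functions
Dense(12, input_dim=8, activation='relu')     # rectified linear unit
Dense(12, input_dim=8, activation='sigmoid')  # sigmoid
Dense(12, input_dim=8, activation='tanh')     # hyperbolic tangent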
Try replacing the activation functions in your model and observe the differences in the
time required to train and the model accuracy.
You can learn more about activation functions in Keras:
https://keras.io/api/layers/activations/
Lesson 09: Reduce Overfitting with Dropout Regularization
A big problem with neural networks is that they can overlearn your training dataset. Dropout
is a simple yet very effective technique for reducing overfitting and has proven useful in large
deep learning models. Dropout is a technique where randomly selected neurons are ignored
during training. They are dropped out randomly. This means that their contribution to the
activation of downstream neurons is temporarily removed on the forward pass and any weight
updates are not applied to the neuron on the backward pass.
You can add a dropout layer to your deep learning model using the Dropout layer class.
In this lesson your goal is to experiment with adding dropout at different points in your neural
network and with different dropout probabilities. For example, you can create a dropout layer
with a probability of 20% and add it to your model as follows:
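A sketch, reusing the layer sizes from Lesson 4 (the placement of the Dropout layer is illustrative):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dropout(0.2))  # randomly drop 20% of this layer's outputs during training
model.add(Dense(1, activation='sigmoid'))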
You can learn more about the Dropout layer in Keras:
https://keras.io/api/layers/regularization_layers/dropout/
Lesson 10: Lift Performance with Learning Rate Schedules
You can often get a boost in the performance of your model by using a learning rate schedule.
Often called an adaptive learning rate or an annealed learning rate, this is a technique where
the learning rate used by stochastic gradient descent changes while training your model. Keras
has a time-based learning rate schedule built into the implementation of the stochastic gradient
descent algorithm in the SGD class.
When constructing the class, you can specify the decay argument, which is the amount
that your learning rate (also specified) will decrease each epoch. When using learning rate
decay you should bump up your initial learning rate and consider adding a large momentum
value such as 0.8 or 0.9. Your goal in this lesson is to experiment with the time-based learning
rate schedule built into Keras. For example, you can specify a learning rate schedule that
starts at 0.1 and drops by 0.0001 each epoch as follows:
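A sketch; note that the decay argument is available in the legacy tf.keras SGD implementation, while newer Keras releases replace it with learning rate schedule objects:

from tensorflow.keras.optimizers import SGD

# Time-based schedule: start at 0.1, decay by 0.0001, with a large momentum value
sgd = SGD(learning_rate=0.1, momentum=0.9, decay=0.0001)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])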
You can learn more about the SGD class in Keras: https://keras.io/api/optimizers/sgd/
Lesson 11: Crash Course in Convolutional Neural Networks
Convolutional Neural Networks are a powerful artificial neural network technique. They expect
and preserve the spatial relationship between pixels in images by learning internal feature
representations using small squares of input data. Features are learned and used across the
whole image, allowing the objects in your images to be shifted or translated in the scene
and still be detectable by the network. It is for this reason that this type of network is so
useful for object recognition in photographs, picking out digits, faces, objects, and so on with
varying orientation. There are three types of layers in a Convolutional Neural Network:
- Convolutional Layers comprised of filters and feature maps.
- Pooling Layers that downsample the activations from feature maps.
- Fully-Connected Layers that plug on the end of the model and can be used to make predictions.
In this lesson you are to familiarize yourself with the terminology used when describing
convolutional neural networks. This may require a little research on your behalf. Don’t worry
too much about how they work just yet, just learn the terminology and configuration of the
various layers used in this type of network.
Lesson 12: Handwritten Digit Recognition
Handwritten digit recognition is a difficult computer vision classification problem. The MNIST
dataset is a standard problem for evaluating algorithms on handwritten digit recognition. It
contains 60,000 images of digits that can be used to train a model, and 10,000 images that can
be used to evaluate its performance.
State-of-the-art results can be achieved on the MNIST problem using convolutional neural
networks. Keras makes loading the MNIST dataset dead easy. In this lesson your goal is
to develop a very simple convolutional neural network for the MNIST problem comprised of
one convolutional layer, one max pooling layer and one dense layer to make predictions. For
example, you can load the MNIST dataset in Keras as follows:
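A minimal sketch using the built-in dataset loader:

from tensorflow.keras.datasets import mnist

# Download the dataset (on first use) and load the 60,000 training and 10,000 test images
(X_train, y_train), (X_test, y_test) = mnist.load_data()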
It may take a moment to download the files to your computer. As a tip, the Keras Conv2D
layer that you will use as your first hidden layer expects image data in the format width ×
height × channels. The MNIST data has 1 channel because the images are grayscale, with a
width and height of 28 pixels. You can easily reshape the MNIST dataset as follows:
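A sketch; the conversion to floats is an assumption, added so the pixel values can be used directly in training:

# Add the channels dimension: (samples, 28, 28) -> (samples, 28, 28, 1)
X_train = X_train.reshape((X_train.shape[0], 28, 28, 1)).astype('float32')
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1)).astype('float32')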
You will also need to one hot encode the output class values, which Keras provides a
handy helper function to achieve:
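A sketch using the to_categorical() helper; num_classes is derived here for use in the model definition below:

from tensorflow.keras.utils import to_categorical

# One hot encode the digit labels 0-9 into 10-element binary vectors
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
num_classes = y_test.shape[1]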
As a final tip, here is a model definition that you can use as a starting point:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='valid', input_shape=(28, 28, 1), activation='relu'))
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
You can learn more about the convolutional neural network layers API on the Keras
webpage: https://keras.io/api/layers/convolution_layers/
Lesson 13: Object Recognition in Small Photographs
Object recognition is a problem where your model must indicate what is in a photograph.
Deep learning models achieve state-of-the-art results in this problem using deep convolutional
neural networks. A popular standard dataset for evaluating models on this type of problem
is called CIFAR-10. It contains 60,000 small photographs, each showing one of 10 objects, like a
cat, ship or airplane.
As with the MNIST dataset, Keras provides a convenient function that you can use to
load the dataset, and it will download it to your computer the first time you try to load
it. The dataset is 163 MB, so it may take a few minutes to download. Your goal in this
lesson is to develop a deep convolutional neural network for the CIFAR-10 dataset. I would
recommend a repeated pattern of convolution and pooling layers. Consider experimenting
with dropout and long training times. For example, you can load the CIFAR-10 dataset in
Keras and prepare it for use with a convolutional neural network as follows:
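A minimal sketch; scaling the pixel values is an assumption, included as a common preparation step:

from tensorflow.keras.datasets import cifar10

# Download the dataset (on first use) and load the CIFAR-10 images
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Scale the 0-255 pixel values to the range 0-1
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0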
Lesson 14: Improve Generalization with Data Augmentation

Data preparation is required when working with neural networks and deep learning models.
Increasingly, data augmentation is also required for more complex object recognition tasks.
This is where images in your dataset are modified with random flips and shifts. This in
essence makes your training dataset larger and helps your model to generalize the position
and orientation of objects in images.
Keras provides an image augmentation API that will create modified versions of images
in your dataset just-in-time. The ImageDataGenerator class can be used to define the image
augmentation operations to perform, which can be fit to a dataset and then used in place of
your dataset when training your model. Your goal with this lesson is to experiment with the
Keras image augmentation API using a dataset you are already familiar with from a previous
lesson, like MNIST or CIFAR-10. For example, the listing below creates random rotations
of up to 90 degrees of images in the MNIST dataset.
# Random Rotations
from tensorflow.keras.datasets import mnist
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# reshape to be [samples][width][height][channels]
X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))
# convert from int to float
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
# define data preparation
datagen = ImageDataGenerator(rotation_range=90)
# fit parameters from data
datagen.fit(X_train)
# configure batch size and retrieve one batch of images
for X_batch, y_batch in datagen.flow(X_train, y_train, batch_size=9):
    # create a grid of 3x3 images
    for i in range(0, 9):
        plt.subplot(330 + 1 + i)
        plt.imshow(X_batch[i].reshape(28, 28), cmap=plt.get_cmap('gray'))
    # show the plot, then stop after one batch (the generator loops forever)
    plt.show()
    break
Listing 14.1: Example using the Keras image augmentation API to rotate MNIST images.
You can learn more about the Keras image augmentation API at
https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator
Final Word Before You Go...
You made it. Well done! Take a moment and look back at how far you have come:
- You discovered deep learning libraries in Python, including the powerful numerical library TensorFlow and the easy-to-use Keras library for applied deep learning.
- You built your first neural network using Keras, and learned how to use your deep learning models with scikit-learn and how to retrieve and plot the training history for your models.
- You learned about more advanced techniques such as dropout regularization and learning rate schedules, and how you can use these techniques in Keras.
- Finally, you took the next step and learned about and developed convolutional neural networks for complex computer vision tasks, and learned about augmentation of image data.
Don’t make light of this, you have come a long way in a short amount of time. This is just
the beginning of your machine learning journey with Python. Keep practicing and developing
your skills.
If you would like me to step you through each lesson in great detail (and much more), take a
look at my book: Deep Learning with Python, Second Edition:
[Book cover: Deep Learning with Python: Develop Deep Learning Models with TensorFlow and Keras, by Jason Brownlee, Adrian Tam, and Zhe Ming Chng]