Neural Network Regression with TensorFlow
Where can you get help?
“If in doubt, run the code”
• Follow along with the code
• Try it for yourself
• Press SHIFT + CMD + SPACE to read the docstring
• Search for it
• Try again
• Ask (don’t forget the Discord chat!)
“What is a regression problem?”
Example regression problems
[Image: predicting the corner coordinates of a box, e.g. (13, 90), (75, 90), (13, 210), (75, 210). Source: PyTorch blog]
• “How much will this house sell for?”
• “How many people will buy this app?”
• “How much will my health insurance be?”
• “How much should I save each week for fuel?”
What we’re going to cover
(broadly)
• Architecture of a neural network regression model
• Input shapes and output shapes of a regression model (features and labels)
• Creating custom data to view and fit
• Steps in modelling
• Creating a model, compiling a model, fitting a model, evaluating a model
• Different regression evaluation methods
• Saving and loading models
How: 👩‍🍳 👩‍🔬
(we’ll be cooking up lots of code!)
Regression inputs and outputs
🛏 x4, 🛁 x2, 🚗 x2 → $940,000 (actual output)
(these are often called “input features”)
Numerical encoding (often already exists, if not, you can build one):
[[0, 0, 0, 1],
 [0, 1, 0, 0],
 [0, 1, 0, 0],
 …] → $939,700 (predicted output, comes from looking at lots of these)
Input and output shapes
🛏 x4, 🛁 x2, 🚗 x2 → [[0, 0, 0, 1],
                     [0, 1, 0, 0],
                     [0, 1, 0, 0],
                     …] (represented as a tensor) → $939,700
[bedroom, bathroom, garage] → Shape = [3]
[939700] → Shape = [1]
These will vary depending on the problem you’re working on.
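The shapes above can be checked with a quick sketch (plain Python lists standing in for tensors, using the feature values from the slide):

```python
# One example: [bedroom, bathroom, garage] counts in, sale price out.
house_features = [4, 2, 2]   # 🛏 x4, 🛁 x2, 🚗 x2
house_price = [939700]       # the label we want the model to predict

input_shape = [len(house_features)]
output_shape = [len(house_price)]
print(input_shape, output_shape)  # [3] [1]
```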
Anatomy of Neural Networks
Input layer (data goes in here), # units/neurons = 2
Hidden layer(s) (learns patterns in data), # units/neurons = 3
Output layer (outputs learned representation or prediction probabilities), # units/neurons = 1
Note: “patterns” is an arbitrary term, you’ll often hear “embedding”, “weights”, “feature representation”,
“feature vectors” all referring to similar things.
Architecture of a regression model (typical)
(Output units are also called “output neurons”)
Source: Adapted from page 293 of Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow by Aurélien Géron
🛏 x4, 🛁 x2, 🚗 x2 → $940,000
Steps in modelling with TensorFlow
1. Construct or import a pretrained model relevant to your problem
2. Compile the model (prepare it to be used with data)
• Loss — how wrong your model’s predictions are compared to the
truth labels (you want to minimise this).
• Optimizer — how your model should update its internal patterns
to better its predictions.
• Metrics — human interpretable values for how well your model is
doing.
3. Fit the model to the training data so it can discover patterns
• Epochs — how many times the model will go through all of the
training examples.
4. Evaluate the model on the test data (how reliable are our model’s
predictions?)
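The four steps above, as a minimal tf.keras sketch (the tiny model and the random toy data here are assumptions for illustration, not the course’s actual example):

```python
import numpy as np
import tensorflow as tf

# Toy data: 100 examples, 3 numeric features each (e.g. bedrooms,
# bathrooms, garages), with a made-up numeric target.
X = np.random.rand(100, 3).astype("float32")
y = X.sum(axis=1).astype("float32")

# 1. Construct a model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(3, activation="relu"),  # hidden layer
    tf.keras.layers.Dense(1)                      # 1 output unit for regression
])

# 2. Compile the model: loss + optimizer + metrics
model.compile(loss="mae",
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              metrics=["mae"])

# 3. Fit the model (epochs = passes over all the training examples)
model.fit(X, y, epochs=5, verbose=0)

# 4. Evaluate the model (here on a held-out slice of the toy data)
loss, mae = model.evaluate(X[:10], y[:10], verbose=0)
preds = model.predict(X[:10], verbose=0)
print(preds.shape)  # (10, 1)
```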
Improving a model (from a model’s perspective)
[Image: a smaller model vs. a larger model]
Common ways to improve a deep model:
• Adding layers
• Increase the number of hidden units
• Change the activation functions
• Change the optimization function
• Change the learning rate
• Fitting on more data
• Fitting for longer
(because you can alter each of these, they’re hyperparameters)
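As a sketch, several of these knobs turned at once (the layer count, hidden-unit count, activation, and learning rate below are arbitrary illustrative choices, not recommendations):

```python
import tensorflow as tf

# A hypothetical "larger model": an extra layer, more hidden units,
# and a changed learning rate compared to a minimal baseline.
larger_model = tf.keras.Sequential([
    tf.keras.layers.Dense(100, activation="relu"),  # more hidden units
    tf.keras.layers.Dense(100, activation="relu"),  # an extra layer
    tf.keras.layers.Dense(1)
])
larger_model.compile(loss="mae",
                     optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # changed learning rate
                     metrics=["mae"])
print(len(larger_model.layers))  # 3
```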
Three datasets (possibly the most important concept in machine learning…)
• Course materials (training set)
• Practice exam (validation set)
• Final exam (test set)
Generalization: the ability for a machine learning model to perform well on data it hasn’t seen before.
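A minimal sketch of the three-way split (the 80/10/10 ratios are a common convention, assumed here rather than taken from the slides):

```python
data = list(range(100))   # stand-in for 100 examples

train_split = data[:80]   # course materials -> training set
val_split = data[80:90]   # practice exam    -> validation set
test_split = data[90:]    # final exam       -> test set

print(len(train_split), len(val_split), len(test_split))  # 80 10 10
```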
(some common)
Regression evaluation metrics

• Mean absolute error (MAE)
  Formula: MAE = (1/n) ∑ᵢ₌₁ⁿ |yᵢ − xᵢ|
  TensorFlow code: tf.keras.losses.MAE( ) or tf.metrics.mean_absolute_error( )
  When to use: as a great starter metric for any regression problem.

• Mean square error (MSE)
  Formula: MSE = (1/n) ∑ᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)²
  TensorFlow code: tf.keras.losses.MSE( ) or tf.metrics.mean_squared_error( )
  When to use: when larger errors are more significant than smaller errors.

• Huber
  Formula: Lδ(y, f(x)) = ½(y − f(x))² for |y − f(x)| ≤ δ, otherwise δ|y − f(x)| − ½δ²
  TensorFlow code: tf.keras.losses.Huber( )
  When to use: combination of MSE and MAE. Less sensitive to outliers than MSE.
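The three formulas, written out in plain Python as a hand-rolled sketch to make the maths concrete (in practice you’d call the TensorFlow functions listed above):

```python
def mae(y_true, y_pred):
    # Mean absolute error: average of |y_i - x_i|
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    # Mean square error: average of (Y_i - Y_hat_i)^2
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def huber(y_true, y_pred, delta=1.0):
    # Quadratic for errors <= delta, linear beyond (less outlier-sensitive)
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * err - 0.5 * delta ** 2
    return total / len(y_true)

y_true = [3.0, 5.0, 2.0]
y_pred = [2.5, 5.0, 4.0]
print(mae(y_true, y_pred))  # 0.8333...
print(mse(y_true, y_pred))  # 1.4166...
```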
The machine learning explorer’s
motto
“Visualize, visualize, visualize”
• Data
• Model
• Training
• Predictions
It’s a good idea to visualize these as often as possible.
The machine learning practitioner’s
motto
“Experiment, experiment, experiment”
👩‍🍳 👩‍🔬
(try lots of things and see what tastes good)
Steps in modelling with TensorFlow
1. Turn all data into numbers (neural networks can’t handle strings)
2. Make sure all of your tensors are the right shape
3. Scale features (normalize or standardize, neural networks tend to prefer normalization)
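Step 1 by hand, as a sketch (the colour feature below is hypothetical, purely to show one-hot encoding turning strings into numbers):

```python
colours = ["red", "green", "blue", "green"]   # a categorical (string) feature
vocab = sorted(set(colours))                  # ['blue', 'green', 'red']
one_hot = [[1 if c == v else 0 for v in vocab] for c in colours]
print(one_hot[0])  # 'red' -> [0, 0, 1]
```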
Feature scaling

• Scale (also referred to as normalisation)
  What it does: converts all values to between 0 and 1 whilst preserving the original distribution.
  Scikit-Learn function: MinMaxScaler
  When to use: use as default scaler with neural networks.

• Standardization
  What it does: removes the mean and divides each value by the standard deviation.
  Scikit-Learn function: StandardScaler
  When to use: transform a feature to have close to normal distribution (caution: this reduces the effect of outliers).

Source: Adapted from Jeff Hale’s Scale, Standardize, or Normalize with Scikit-Learn article.
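Hand-rolled sketches of what the two scalers do to a single feature column (use the Scikit-Learn classes above in real code; the bedroom counts here are made up):

```python
def min_max_scale(values):
    # What MinMaxScaler does: squash values into [0, 1], keeping the
    # shape of the original distribution.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    # What StandardScaler does: remove the mean, divide by the standard
    # deviation (result has mean 0 and standard deviation 1).
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / std for v in values]

bedrooms = [1, 2, 3, 4, 5]
print(min_max_scale(bedrooms))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```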