COMMON ACTIVATION FUNCTIONS
In Deep Learning - an Activation Function is
essentially some logic that determines whether
a neuron will be "activated" ...
...or if it will not.
In more mathematical terms - it determines
what value a neuron will pass onward to
neurons in a subsequent layer.
There are many different Activation Functions
that can be used - let's take a look!
SIGMOID
The Sigmoid Activation Function squashes an input (z)
to an output value somewhere between 0 and 1
This function is often used in the output layer of Binary
Classification and Multi-Label Classification problems
STEP
The Heaviside Step Function provides only two possible
output values, 1 or 0. If the input value is >= 0 then the
output is 1, otherwise the output is 0
Used in early perceptrons. This function is non-differentiable at z = 0 and its derivative is 0 everywhere else, meaning Gradient Descent can't make any progress in optimising the weight values of the network
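A minimal sketch (again assuming NumPy, and using the convention above where z >= 0 maps to 1):

```python
import numpy as np

def heaviside_step(z):
    # output 1 where the input is >= 0, otherwise 0
    return np.where(z >= 0, 1.0, 0.0)

print(heaviside_step(np.array([-2.0, 0.0, 3.0])))  # [0. 1. 1.]
```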
TANH
The Hyperbolic Tangent (Tanh) Function has a similar
"S shape" to the Sigmoid Function...
...however Tanh instead squashes an input (z) to an
output value somewhere between -1 and +1 rather than
0 and 1
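A minimal sketch (assuming NumPy, which provides tanh directly):

```python
import numpy as np

def tanh(z):
    # squash any real-valued input to the range (-1, 1)
    return np.tanh(z)

print(tanh(np.array([-5.0, 0.0, 5.0])))  # approx [-0.9999, 0.0, 0.9999]
```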
RELU
The Rectified Linear Unit (ReLU) Function is very popular for hidden layer neurons due to its computational speed and strong performance
Negative input values are output as 0 while positive
input values remain as they are.
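A minimal sketch (assuming NumPy):

```python
import numpy as np

def relu(z):
    # negative inputs become 0, positive inputs pass through unchanged
    return np.maximum(0.0, z)

print(relu(np.array([-3.0, 0.0, 4.0])))  # [0. 0. 4.]
```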
LEAKY RELU
Leaky ReLU is very similar to ReLU but allows a small, non-zero slope for negative input values
Leaky ReLU was created to deal with the "dying ReLU problem" - standard ReLU outputs 0 for all negative inputs, so the gradient for those neurons is also 0 and they can stop learning altogether
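A minimal sketch (assuming NumPy; the slope value alpha = 0.01 is just a common choice, not fixed by the definition):

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    # positive inputs pass through; negative inputs keep a small slope of alpha
    return np.where(z > 0, z, alpha * z)

print(leaky_relu(np.array([-3.0, 0.0, 4.0])))  # [-0.03  0.    4.  ]
```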
LINEAR
With a Linear Activation Function the output equals the
input - there is no capping or transforming
This is often applied in the output layer of a Regression
task where we require an unbounded numeric value
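A minimal sketch (assuming NumPy for consistency, although no library is really needed here):

```python
import numpy as np

def linear(z):
    # identity: the output equals the input, with no capping or transforming
    return z

print(linear(np.array([-3.0, 0.0, 4.0])))  # [-3.  0.  4.]
```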
SOFTMAX
The Softmax Function turns a vector of numbers into a
vector of probabilities (that add up to a total of 1)
This is often applied in the output layer of a Multi-Class Classification task where we would like a probability of an observation belonging to each class
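A minimal sketch (assuming NumPy; subtracting the maximum before exponentiating is a standard numerical-stability trick rather than part of the definition):

```python
import numpy as np

def softmax(z):
    # exponentiate (shifted for stability) and normalise so the outputs sum to 1
    exp_z = np.exp(z - np.max(z))
    return exp_z / np.sum(exp_z)

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)        # approx [0.659, 0.242, 0.099]
print(probs.sum())  # 1.0
```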
COMMON USAGE
Below is a guide to commonly utilised Activation Functions for different problem types:
Hidden Layers - ReLU (or Leaky ReLU)
Binary Classification (output layer) - Sigmoid
Multi-Label Classification (output layer) - Sigmoid
Multi-Class Classification (output layer) - Softmax
Regression (output layer) - Linear
Important Note: These are simply commonly used approaches - others can, and will, be applied!
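As a rough sketch of how these choices might look in practice (assuming a TensorFlow / Keras model; the layer sizes, 10-feature input shape, and binary-classification output are illustrative assumptions):

```python
import tensorflow as tf

# Hidden layers use ReLU; the output layer activation depends on the problem type:
#   sigmoid -> binary / multi-label classification
#   softmax -> multi-class classification
#   linear  -> regression
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary classification output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```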
Want to land an incredible role in the exciting, future-proof, and lucrative field of Data Science?
"I landed my new role at Amazon totally thanks to
DSI. It's been the best academic choice of my life -
giving better results than two degrees"
- Andrea
"DSI is incredible - everything is taught in such a
clear and simple way, even the more complex
concepts!"
- Arianna
"The best program I've been a part of, hands down"
- Christian
"I had over 40 interviews without an offer. After DSI I
quickly got 7 offers including one at KPMG and my
amazing new role at Deloitte!"
- Ritesh
>> https://data-science-infinity.teachable.com