Cost Function:
In deep learning, the cost function (also known as the loss function or objective function) is a critical
component that measures how well a model's predictions match the actual target values. The
purpose of the cost function is to quantify the error in predictions, guiding the optimization process
to improve the model.
An important aspect of the design of a deep neural network is the choice of the cost function.
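To make this concrete, here is a minimal sketch (with hypothetical predictions and targets) that scores a small regression batch with mean squared error, one common cost function:

    import numpy as np

    # Hypothetical model predictions and ground-truth targets.
    predictions = np.array([2.5, 0.0, 2.1, 7.8])
    targets = np.array([3.0, -0.5, 2.0, 7.5])

    # Mean squared error: the average squared gap between prediction and target.
    # Smaller values mean the predictions match the targets more closely.
    mse = np.mean((predictions - targets) ** 2)
    print(f"MSE cost: {mse:.4f}")  # 0.1500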
Parametric Models: In many deep learning applications, models define a distribution p(y∣x;θ), where
y is the target variable, x is the input, and θ represents the model parameters.
Maximum Likelihood Estimation (MLE): The principle of MLE involves finding the
parameters θ that maximize the likelihood of the observed data. In practical terms, this often
translates to minimizing the negative log-likelihood, which for classification problems is
typically expressed using the cross-entropy loss.
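As a minimal sketch of this equivalence (the logits and labels are hypothetical), the cross-entropy loss of a classifier is exactly the mean negative log-likelihood of the correct classes under the model's softmax distribution p(y∣x;θ):

    import numpy as np

    def softmax(logits):
        # Subtract the row-wise max for numerical stability.
        z = logits - logits.max(axis=1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=1, keepdims=True)

    # Hypothetical logits for 3 examples over 4 classes, with integer labels.
    logits = np.array([[2.0, 0.5, -1.0, 0.1],
                       [0.2, 1.5, 0.3, -0.7],
                       [-0.5, 0.0, 2.2, 0.4]])
    labels = np.array([0, 1, 2])

    probs = softmax(logits)  # the model distribution p(y|x; theta)
    # Negative log-likelihood of the observed labels = cross-entropy loss.
    nll = -np.log(probs[np.arange(len(labels)), labels])
    print("mean NLL (cross-entropy):", nll.mean())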
Cross-Entropy Loss: for a target distribution p and a predicted distribution q over classes, the cross-entropy is

H(p, q) = −Σ_i p_i log q_i

which, for a one-hot target, reduces to the negative log-probability of the correct class. The design of the cost function can be approached in two ways: learning the full conditional distribution p(y∣x;θ), or learning only a conditional statistic of y given x (such as its mean).
1. Learning Conditional Distributions with Maximum Likelihood
Most modern neural networks are trained using maximum likelihood. This means that the cost function is simply the negative log-likelihood, equivalently described as the cross-entropy between the training data and the model distribution. This cost function is given by

J(θ) = −E_{x,y∼p̂_data} log p_model(y ∣ x)
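A minimal sketch of maximum likelihood training (the data, learning rate, and step count are hypothetical): logistic regression fit by gradient descent on J(θ), the empirical negative log-likelihood:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical binary data: the label is 1 when x1 + x2 > 0.
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)

    w, b, lr = np.zeros(2), 0.0, 0.1

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for step in range(500):
        p = sigmoid(X @ w + b)  # p_model(y = 1 | x; theta)
        # J(theta): the empirical negative log-likelihood of the labels.
        J = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
        # Analytic gradient of the NLL for logistic regression.
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)

    print(f"final NLL: {J:.4f}")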
2. Output Units
The choice of cost function is tightly coupled with the choice of output unit: how we choose to represent the output determines the form that the cross-entropy function takes.
Linear Units for Gaussian Output Distributions
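With a linear output unit ŷ = Wᵀh + b, the model can be read as predicting the mean of a conditional Gaussian, p(y∣x) = N(y; ŷ, I). Maximizing the log-likelihood of this Gaussian is equivalent to minimizing mean squared error, since −log N(y; ŷ, I) = ½‖y − ŷ‖² + constant. A minimal numeric check of that equivalence (the vectors are hypothetical):

    import numpy as np

    y_true = np.array([1.0, -2.0, 0.5])  # hypothetical targets
    y_hat = np.array([0.8, -1.5, 0.0])   # hypothetical linear-unit outputs
    d = len(y_true)

    # Negative log-likelihood of y_true under N(y; y_hat, I).
    gaussian_nll = 0.5 * np.sum((y_true - y_hat) ** 2) + 0.5 * d * np.log(2 * np.pi)

    # The sum-of-squares error differs from the Gaussian NLL only by a
    # constant that does not depend on y_hat, so both have the same minimizer.
    sse = 0.5 * np.sum((y_true - y_hat) ** 2)
    print(gaussian_nll - sse)  # equals 0.5 * d * log(2*pi)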