Function Approximation
In machine learning, function approximation refers to the technique of using a simpler model
to estimate the behavior of a more complex, unknown target function. These models may be
linear functions, polynomials, neural networks, or other forms, and they learn to approximate
the target function from training data.
1. Target function: This is the unknown function we want to learn about. It could represent
a real-world phenomenon, a physical system, or any relationship between input and
output data.
2. Model: This is a simpler function that we choose to represent the target function. It has
parameters that can be adjusted during the learning process.
3. Training data: This is a set of input-output pairs that the model uses to learn the
relationship between the input and the output of the target function.
4. Learning process: During this process, the model's parameters are adjusted to
minimize the difference between the model's predictions and the actual values in
the training data. This difference is often measured using a loss function; a minimal
sketch of this step follows the list.
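To make the learning process concrete, here is a minimal sketch: gradient descent nudging a linear model's two parameters to shrink a squared-error loss. The noisy linear target, the learning rate, and the iteration count are all assumptions chosen for illustration.

```python
# Gradient descent fitting y = w*x + b to noisy samples of an assumed target.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5 + rng.normal(0, 0.1, size=100)  # noisy "target function"

w, b = 0.0, 0.0   # model parameters, initially arbitrary
lr = 0.1          # learning rate (assumed value)
for _ in range(200):
    pred = w * x + b
    error = pred - y
    loss = np.mean(error ** 2)         # the loss function (mean squared error)
    w -= lr * np.mean(2 * error * x)   # gradient step on w
    b -= lr * np.mean(2 * error)       # gradient step on b

print(f"learned w={w:.2f}, b={b:.2f}, loss={loss:.4f}")  # w ~ 3.0, b ~ 0.5
```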
Essentially, function approximation tries to "fit" a simpler model to the complex real-world
phenomenon represented by the target function. This is useful because:
● It's computationally efficient: Working with simpler models is often faster and requires
less memory than directly dealing with complex functions.
● It can generalize to unseen data: By learning the underlying patterns in the training
data, the model can make predictions for new, unseen data points.
● It helps us understand the problem: By analyzing the learned model, we can gain
insights into the relationships between the input and output variables.
Here are some common types of function approximation models used in machine learning (a short comparison of the first two follows the list):
● Linear regression: This uses a straight line to approximate the target function.
● Polynomial regression: This uses polynomials of different degrees to capture more
complex relationships.
● Decision trees: These use a tree-like structure to make predictions based on a series of
yes/no questions.
● Neural networks: These are powerful models inspired by the human brain that can
learn complex non-linear relationships.
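As a small, hedged comparison of the first two model types, the sketch below fits the same data with a straight line and with a degree-3 polynomial using NumPy's polyfit; the sine target function is an assumed stand-in for a "complex relationship".

```python
# Linear vs. polynomial approximation of the same (assumed) target function.
import numpy as np

x = np.linspace(0, np.pi, 50)
y = np.sin(x)                       # the target we pretend is unknown

linear = np.polyfit(x, y, deg=1)    # straight-line approximation
cubic = np.polyfit(x, y, deg=3)     # degree-3 polynomial approximation

for name, coeffs in [("linear", linear), ("cubic", cubic)]:
    residual = y - np.polyval(coeffs, x)
    print(name, "mean squared error:", np.mean(residual ** 2))
# The cubic fit captures the curve far better than the straight line.
```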
1. Definition of "Model":
In ML, a model refers to a representation of a system or phenomenon that allows us to make
predictions or understand its behavior. This representation can be based on data, assumptions,
or a combination of both.
2. Approximation in ML:
When using ML techniques like linear regression, decision trees, or neural networks, we
approximate the underlying relationship between input and output variables. This
approximation is the learned function that emerges from the training process.
3. Is the Approximation a Model?
● Arguments for "Yes":
○ The learned function clearly represents the relationship you're trying to model,
just in a simplified form.
○ This function can be used for predictions and insights, fulfilling the role of a
model.
○ We can even apply standard model metrics to evaluate its performance (see the
sketch after this list).
● Arguments for "No":
○ The learned function may not be explicitly expressed as an equation or formula,
unlike traditional models.
○ It's often an internal representation within the ML algorithm and not directly
interpretable.
○ The focus is on capturing patterns, not building an exact replica of the true
relationship.
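As a concrete instance of the "yes" side's last point, the sketch below treats a learned function as a model and scores it with a standard metric (R²). The synthetic data and the choice of scikit-learn are assumptions for illustration.

```python
# Scoring a learned function with a standard model metric.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

X = np.linspace(0, 10, 40).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0 + np.random.default_rng(1).normal(0, 0.5, 40)

learned = LinearRegression().fit(X, y)   # the learned approximation
print("R^2 of the learned function:", r2_score(y, learned.predict(X)))
```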
If a model is not represented as an equation or formula, what might it be represented as?
While traditionally models in mathematics and science are often represented as equations or
formulas, in machine learning (ML), models can come in various forms beyond just equations.
Here are some common ways models are represented in ML:
1. Set of parameters: In many models like linear regression, neural networks, or decision trees,
the model is characterized by a set of parameters. These parameters adjust the model's
internal structure and behavior, and during training, they are tuned to best fit the data. For
example, a linear regression model has parameters for the slope and intercept of the line, while
a neural network has weights and biases associated with each connection between neurons.
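A minimal sketch of this "set of parameters" view, assuming scikit-learn and a tiny synthetic dataset: after fitting, the linear regression model is fully described by its slope and intercept.

```python
# After training, the model *is* its parameters: slope(s) and intercept.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 4.0, 6.2, 7.9])      # roughly y = 2x

model = LinearRegression().fit(X, y)
print("slope:", model.coef_)            # the learned parameter(s)
print("intercept:", model.intercept_)
```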
2. Decision rules: Some models, like decision trees and rule-based systems, represent their
knowledge as a set of decision rules. These rules specify conditions based on input features,
and depending on whether those conditions are met, the model follows different paths to reach
a prediction. For example, a decision tree might have rules like "if the temperature is high and
the humidity is low, predict sunny weather."
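As a hedged illustration, scikit-learn can print a fitted decision tree as exactly this kind of if/then rule set; the tiny weather-style dataset below is invented for the example.

```python
# Printing a fitted decision tree as human-readable decision rules.
from sklearn.tree import DecisionTreeClassifier, export_text

# features: [temperature, humidity]; labels: 1 = sunny, 0 = not sunny
X = [[30, 20], [28, 25], [18, 80], [15, 90], [25, 30], [12, 85]]
y = [1, 1, 0, 0, 1, 0]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["temperature", "humidity"]))
```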
3. Probabilistic distributions: Certain models, particularly probabilistic models, represent their
knowledge as probability distributions. These distributions express the likelihood of different
outcomes given specific input conditions. For instance, a Bayesian classifier might represent the
probability of an email being spam or not spam based on its features.
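A small sketch of this idea, assuming scikit-learn and a toy feature encoding: a naive Bayes classifier returns a probability for each class (spam vs. not spam) rather than a single hard answer.

```python
# A probabilistic model outputs a distribution over classes, not one label.
from sklearn.naive_bayes import GaussianNB

# toy features: [number of links, number of "free" mentions] per email
X = [[0, 0], [1, 0], [5, 4], [7, 6], [0, 1], [6, 5]]
y = [0, 0, 1, 1, 0, 1]   # 1 = spam, 0 = not spam

clf = GaussianNB().fit(X, y)
print(clf.predict_proba([[4, 3]]))  # [P(not spam), P(spam)] for a new email
```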
4. Black box functions: In some complex models like deep neural networks, the internal
workings can be so intricate that it's not feasible to express them as simple equations or rules.
These models act as black boxes, taking inputs and producing outputs based on their learned
internal representation. While the internal calculations aren't transparent, their behavior can still
be studied and analyzed.
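A minimal sketch of this black-box view: we only ever call the model as a function and probe its behavior, here with a finite-difference sensitivity check. The "model" is a stand-in closure, assumed purely for illustration.

```python
# Studying a black-box model by probing inputs and observing outputs.
import numpy as np

def black_box(x):
    # stands in for any opaque learned model, e.g. a deep network
    return np.tanh(2.0 * x) + 0.1 * x ** 2

x0, eps = 0.5, 1e-4
sensitivity = (black_box(x0 + eps) - black_box(x0 - eps)) / (2 * eps)
print("local sensitivity at x=0.5:", sensitivity)  # finite-difference probe
```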
5. Graphical representations: For certain models like decision trees or graphical models, their
structure can be visually represented as graphs or diagrams. These diagrams show how the
input features are processed and transformed to reach the final prediction, offering some degree
of interpretability.
6. Combinations: Often, models combine several of these representations. For
instance, a neural network is defined by a set of parameters (its weights and biases), but its
behavior is also shaped by its activation functions and, in probabilistic variants, by the output
distributions it produces.
Models Representable as Functions:
● Many common machine learning models can be viewed as functions that map inputs to
outputs.
○ Linear regression: Represents a linear function y = mx + b for predicting
continuous values.
○ Neural networks: Composed of layers of functions transforming inputs through
various activation functions (written out in the sketch after this list).
○ Decision trees: Represent a series of conditional functions leading to a final
output.
○ Support Vector Machines (SVMs): Utilize decision functions to classify data
points based on margins.
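To ground the function view, here is a tiny neural network written out explicitly as a composition of functions; the weights are fixed by hand for illustration, whereas a trained network would learn them.

```python
# A two-layer neural network as an explicit composition of functions.
import numpy as np

W1 = np.array([[1.0, -1.0], [0.5, 2.0]])   # first-layer weights (assumed)
b1 = np.array([0.0, 0.1])
W2 = np.array([1.5, -0.5])                 # second-layer weights (assumed)
b2 = 0.2

def network(x):
    hidden = np.maximum(0.0, W1 @ x + b1)  # affine map, then ReLU activation
    return W2 @ hidden + b2                # affine map to a single output

print(network(np.array([0.3, -0.7])))      # the network is just f(x)
```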
Models Beyond Simple Functions:
● Some models don't map directly to a single function, but still represent relationships
between inputs and outputs.
○ Probabilistic models: Use probability distributions to express the likelihood of
different outcomes.
○ Bayesian networks: Represent relationships between variables as a directed
acyclic graph with conditional probabilities.
○ Ensemble methods: Combine multiple models (functions) to create a more
robust prediction (see the sketch after this list).
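A minimal sketch of the ensemble idea: averaging the outputs of several component functions to form one combined prediction. The three component "models" are trivial stand-ins, assumed for illustration.

```python
# An ensemble as an average of several component models (functions).
def model_a(x): return 2.0 * x
def model_b(x): return 2.2 * x - 0.1
def model_c(x): return 1.9 * x + 0.2

def ensemble(x):
    return (model_a(x) + model_b(x) + model_c(x)) / 3.0

print(ensemble(1.0))  # average of the three component predictions
```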
Key Distinction:
● The crucial difference lies in whether the model can be written as a single
mathematical function that maps an input to one deterministic output.
● Models built from conditional rules, probability distributions, or ensembles of
sub-models don't fit that definition strictly, even when their end-to-end behavior can still
be queried like a function; the closing sketch below contrasts the two cases.
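To close, a small sketch of the distinction: a deterministic model returns the same output for the same input, while a probabilistic model returns a draw from a distribution. Both toy models are assumptions for illustration.

```python
# Deterministic output vs. a draw from a predicted distribution.
import numpy as np

def deterministic_model(x):
    return 2.0 * x + 1.0              # always the same output for the same x

def probabilistic_model(x, rng):
    mean = 2.0 * x + 1.0
    return rng.normal(mean, 0.5)      # a sample from a distribution over outputs

rng = np.random.default_rng(0)
print(deterministic_model(1.0))       # 3.0, every time
print(probabilistic_model(1.0, rng))  # varies around 3.0 from call to call
```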