Unit 1 Neural Network Basics
What is a Neural Network?
Understanding Neural Networks: A Simple Explanation
Deep learning often involves training neural networks, which are computational models inspired by the
human brain. But what exactly is a neural network? Let’s break it down with an example.
Housing Price Prediction: A Starting Point
Imagine you have data about six houses:
• The size of each house in square feet or square meters.
• The price of each house.
Your goal is to predict the price of a house based on its size.
1. Linear Regression Approach
You might fit a straight line to the data, where the line represents the relationship between size
and price. But prices can’t be negative, so a more realistic approach would be to adjust the curve
so it flattens at zero for very small sizes.
This curve becomes the function for predicting the price of a house based on its size.
2. The Neural Network Perspective
This simple function can be seen as a very basic neural network with:
o Input: The size of the house (x).
o Output: The predicted price (y).
The Simplest Neural Network
A neural network is made up of nodes, also called neurons. For the housing example:
• A single neuron takes the size (x) as input.
• It applies a mathematical operation to compute the price (y).
• To ensure prices are realistic (e.g., non-negative), the neuron uses a specific function called ReLU
(Rectified Linear Unit).
o The ReLU function outputs 0 if the input is negative and outputs the input value
otherwise. This ensures the curve remains realistic, resembling the function we fitted
earlier.
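Below is a minimal sketch of the ReLU function in Python with NumPy (the library choice is an assumption; the notes themselves do not prescribe one):

import numpy as np

def relu(z):
    # ReLU: 0 for negative inputs, the input itself otherwise
    return np.maximum(0, z)

# Negative values are clipped to zero, positive values pass through unchanged
print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # [0.  0.  0.  1.5 3. ]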
Building Larger Neural Networks
Now, what if the price of a house depends on more features, like:
• Number of bedrooms (#bedrooms).
• Family size (can the house fit your family?).
• Zip code or postal code (which might indicate walkability or school quality).
Here’s how we handle this:
1. Adding More Inputs
Instead of just size, we now include other features as inputs:
x = [size, bedrooms, zip code, wealth of the neighborhood].
2. Hidden Layers
o Between the input (x) and output (y), we add hidden units (neurons).
o Each hidden unit processes all input features and computes something useful, like family
size, walkability, or school quality.
3. Connections
o Every input is connected to every neuron in the hidden layer. This is called a dense
connection.
o The neural network doesn’t need you to define specific roles (e.g., "this neuron
calculates family size"). It learns what’s important from the data.
4. Output
o The network combines all the information from the hidden layer to predict the price of a
house.
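As a rough illustration only (the parameter values below are made up, not learned), this sketch runs a forward pass through one dense hidden layer of three ReLU units and a single output neuron:

import numpy as np

def relu(z):
    return np.maximum(0, z)

# 4 input features: [size, #bedrooms, zip code, neighborhood wealth]
x = np.array([2100.0, 3.0, 94301.0, 7.0])

# Hypothetical parameters: 3 hidden units, each connected to all 4 inputs (dense)
W1 = np.random.randn(3, 4) * 0.01   # hidden-layer weights
b1 = np.zeros(3)                    # hidden-layer biases
W2 = np.random.randn(1, 3) * 0.01   # output-layer weights
b2 = np.zeros(1)                    # output-layer bias

a1 = relu(W1 @ x + b1)   # every input feeds every hidden neuron
y_hat = W2 @ a1 + b2     # hidden activations are combined into the predicted price
print(y_hat)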
Why Neural Networks Are Powerful
• Neural networks can find complex patterns in data.
• They are most useful in supervised learning, where you map input (x) to output (y), as in our
housing price example.
• With enough data, a neural network can learn very accurate functions for predictions.
Key Takeaways for Students
1. Basic Unit: A single neuron takes inputs, applies a function (like ReLU), and produces an output.
2. Bigger Networks: Combine multiple neurons to handle more complex data with multiple
features.
3. Learning: The network figures out the best way to combine inputs to predict outputs using
training data.
By stacking these neurons together, you can solve increasingly complex problems—just like stacking
LEGO bricks to build something amazing!
Supervised Learning with Neural Networks
The Hype Around Neural Networks
Neural networks have garnered a lot of attention recently, and much of this hype is justified given their
impressive performance across various domains. However, the majority of the economic value created
by neural networks to date stems from one specific type of machine learning: supervised learning.
What is Supervised Learning?
In supervised learning, the goal is to map an input x to an output y. For instance, in a housing price
prediction task, the input might include features like the size of a house, the number of bedrooms, and
its location. The output, y, is the estimated price of the house.
Here are some practical examples of where neural networks excel:
1. Online Advertising:
Perhaps the most lucrative application of deep learning today is online advertising. Neural
networks predict whether a user will click on an ad based on details about the ad and user
behavior. This application has significantly improved the revenue of major advertising companies
by personalizing ads for users.
2. Computer Vision:
Neural networks, particularly deep learning models, have revolutionized computer vision. For
example, an input image can be processed to output a label or index representing one of many
possible objects (e.g., identifying objects in a photo for tagging).
3. Speech Recognition:
Deep learning has enabled neural networks to convert audio clips into accurate text transcripts.
4. Machine Translation:
Neural networks can translate sentences from one language (e.g., English) to another (e.g.,
Chinese) with remarkable accuracy.
5. Autonomous Driving:
Neural networks process images from cameras and radar data to identify the positions of
vehicles and obstacles, serving as a key component in autonomous driving systems.
Selecting x and y for Applications
The effectiveness of supervised learning often depends on selecting appropriate inputs x and outputs y
for the problem at hand. Once identified, these components can fit into larger systems, such as
autonomous vehicles.
Different types of neural networks are suited for specific applications:
• Standard Neural Networks:
Suitable for structured data tasks, such as predicting housing prices or online ad performance.
• Convolutional Neural Networks (CNNs):
Ideal for image-related tasks like photo tagging or object recognition.
• Recurrent Neural Networks (RNNs):
Effective for sequence data, such as audio, language, or temporal sequences. For example,
processing speech or translating text often requires advanced RNN variants.
In more complex scenarios like autonomous driving, a hybrid network architecture combining CNNs and
other components might be necessary.
Structured vs. Unstructured Data
Neural networks handle two broad categories of data:
1. Structured Data:
Examples include databases containing well-defined features like the size of a house, the number
of bedrooms, or a user’s age. Supervised learning models use these features to make
predictions.
2. Unstructured Data:
This includes raw audio, images, or text. Historically, analyzing unstructured data was challenging
for computers, but deep learning has changed this. Neural networks now excel at recognizing
patterns in audio, identifying objects in images, and processing natural language.
While the media often highlights neural networks’ success with unstructured data (e.g., recognizing a cat
in a picture), their economic value in structured data applications—such as improving advertising
systems and processing large databases—cannot be overstated.
Why Are Neural Networks Thriving Now?
The core technical concepts behind neural networks have existed for decades. However, only recently
have they become powerful tools, thanks to advances in computational power, data availability, and
algorithmic innovations.
Why is Deep Learning taking off?
There are several key reasons behind the recent rise of deep learning and its ongoing progress. Here is a
summary of the main points:
1. Availability of Data:
• Historical limitation: Traditional algorithms like SVMs and logistic regression plateaued in
performance with limited data.
• Modern abundance: The digitization of society has resulted in massive amounts of data from
digital activities, mobile apps, IoT sensors, cameras, etc.
• Deep learning thrives in this "big data" regime, where performance scales with data volume.
2. Scale of Neural Networks:
• Performance improves significantly with larger neural networks (more parameters and hidden
units).
• However, this requires a substantial amount of data and computational resources.
3. Computation Advances:
• Specialized hardware: GPUs and other hardware innovations have accelerated the training of
large networks.
• Faster computation enables rapid experimentation, shortening the feedback loop for developing
and refining neural network architectures.
4. Algorithmic Innovations:
• Techniques like replacing sigmoid activation functions with ReLU (Rectified Linear Unit) have
sped up gradient descent and training (the sketch after this list illustrates why).
• Such innovations make training faster and more efficient, allowing researchers to build larger
and better-performing networks.
5. Iterative Development:
• Faster training cycles empower researchers to test and refine ideas quickly, fostering rapid
innovation in deep learning.
6. Optimism for the Future:
• Data growth: Society continues to generate more digital data.
• Improved hardware: Faster and more specialized computational resources are being developed.
• Ongoing research: The deep learning research community consistently delivers new algorithms,
ensuring continued progress.
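To see why swapping sigmoid for ReLU speeds up gradient descent, here is a small sketch (an added illustration, not part of the original lecture) comparing their gradients: the sigmoid's gradient shrinks toward zero for large inputs, while ReLU's gradient stays at 1 wherever the unit is active.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1 - s)              # nearly vanishes for large |z|, slowing learning

def relu_grad(z):
    return (z > 0).astype(float)    # constant gradient of 1 for positive inputs

z = np.array([0.0, 2.0, 5.0, 10.0])
print(sigmoid_grad(z))  # ≈ [0.25, 0.105, 0.0066, 0.000045]
print(relu_grad(z))     # [0. 1. 1. 1.]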
These factors—data scale, computational power, and algorithmic advancements—are synergistically
driving the rise of deep learning and will likely sustain its growth in the foreseeable future.
1. Data Processing in Neural Networks
When working with training datasets, you might think of using a loop to process each example
individually. However, this is computationally expensive for large datasets. Neural networks handle this
by performing operations on the entire dataset at once using matrix operations. This approach leverages
the efficiency of linear algebra libraries, making it faster and more scalable.
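A small sketch of the difference (the shapes and data below are placeholders): computing z = w^T x + b for all m examples with one matrix product instead of a Python loop.

import numpy as np

n_x, m = 12288, 1000
X = np.random.randn(n_x, m)    # each column is one training example
w = np.random.randn(n_x, 1)
b = 0.0

# Loop version: one example at a time
z_loop = np.zeros((1, m))
for i in range(m):
    z_loop[0, i] = np.dot(w[:, 0], X[:, i]) + b

# Vectorized version: the whole dataset in one matrix operation
z_vec = w.T @ X + b

print(np.allclose(z_loop, z_vec))  # True: same result, computed far faster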
2. Forward and Backward Propagation
• Forward Propagation: This step calculates predictions based on the current parameters of the
model.
• Backward Propagation: This step updates the model's parameters by minimizing the error in
predictions using techniques like gradient descent.
These steps are the building blocks of how a neural network learns from data.
3. Logistic Regression
Logistic regression is used as an introduction to neural networks because it shares similarities in
structure but is simpler. It predicts binary outcomes, such as:
• 1: Cat (True)
• 0: Not-cat (False)
4. How Images Are Represented
Images are stored as matrices corresponding to three color channels: Red, Green, and Blue (RGB). For a
64×64 image:
• Each channel is a 64×64 matrix.
• Flatten these into a single feature vector: x = [All red values, All green values, All blue values]
• The length of x is 64×64×3=12,288.
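A minimal sketch of this flattening step (a random array stands in for a real image, with the channels stored red, green, blue as in the notes):

import numpy as np

# Three 64x64 channels, stored channels-first: red, green, blue
image = np.random.randint(0, 256, size=(3, 64, 64))

# Flatten into a single feature vector: all red values, then green, then blue
x = image.reshape(-1, 1)
print(x.shape)  # (12288, 1), i.e. 64 * 64 * 3 = 12,288 values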
5. Notation and Representation
• Single Training Example: (x, y)
o x: Input feature vector (e.g., 12,288 pixel values).
o y: Output label (binary: 1 or 0).
• Training Set:
o Contains m examples: (x^{(1)}, y^{(1)}), …, (x^{(m)}, y^{(m)}).
o m is the number of training samples.
• Matrix Representation:
o X: Input feature matrix of shape n_x × m, where n_x is the length of x.
▪ Each column is one training example.
o Y: Output label matrix of shape 1 × m.
▪ Each column is the label for the corresponding training example in X.
This stacking in columns simplifies operations like forward and backward propagation.
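A short sketch of this column-wise stacking (random placeholder data, and a small m so the shapes are easy to read):

import numpy as np

n_x, m = 12288, 5                     # feature size and number of training examples

X = np.random.randn(n_x, m)           # shape (n_x, m): column i is x^{(i)}
Y = np.random.randint(0, 2, (1, m))   # shape (1, m): column i is y^{(i)}

print(X.shape, Y.shape)   # (12288, 5) (1, 5)
print(X[:, 0].shape)      # (12288,) -> the first training example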
Logistic Regression
Logistic regression is a supervised learning algorithm used for binary classification problems. The goal is
to predict an output Y (0 or 1) given an input X (e.g., an image) and determine the probability that Y is 1,
denoted as 𝑌̂.
Steps and Key Concepts
1. Input Representation:
o X: Feature vector representing input data.
o Y: Output label (0 or 1).
2. Parameters:
o W: Weight vector (same dimension as X).
o b: Bias term (a single scalar).
3. Linear Combination:
o Z = W^T X + b: A linear combination of input features and parameters.
o This step is similar to linear regression but isn't sufficient for probabilities because Z can
take any value, including values outside [0,1].
4. Sigmoid Function:
o Used to map Z to a probability value between 0 and 1:
σ(Z) = 1 / (1 + e^{−Z})
o Properties:
▪ σ(Z)→1 as Z→∞.
▪ σ(Z)→0 as Z→−∞.
▪ σ(0)=0.5.
5. Output:
o 𝑌̂ = 𝜎(𝑍) : The predicted probability that Y = 1.
6. Learning Parameters:
o Adjust W and b to minimize the error in predictions.
o The method to achieve this (gradient descent) and the cost function will be discussed
later.
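Putting steps 1–5 together, a minimal sketch of the forward computation (W, b, and x are placeholders; with all-zero parameters the prediction is exactly 0.5):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

n_x = 12288
x = np.random.randn(n_x, 1)   # one input feature vector
W = np.zeros((n_x, 1))        # weight vector, same dimension as x
b = 0.0                       # bias term, a single scalar

Z = W.T @ x + b               # linear combination: can be any real number
Y_hat = sigmoid(Z)            # squashed into [0, 1]: the predicted P(Y = 1 | x)
print(Y_hat)                  # [[0.5]]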
Logistic Regression Loss and Cost Functions
This section explains the process of defining the loss function and cost function for logistic regression.
These are essential for training the parameters W and b of the logistic regression model.
Key Points:
1. Prediction Setup:
o Logistic regression outputs ŷ = σ(W^T x + b), where:
▪ σ(z) = 1 / (1 + e^{−z}) is the sigmoid function.
▪ x^{(i)} and y^{(i)} refer to the input features and label for the i-th training example.
▪ z^{(i)} = W^T x^{(i)} + b is the linear combination of weights and inputs.
2. Loss Function:
o Measures how well the model's prediction 𝑦̂ matches the actual label 𝑦 for a single
training example.
o Defined as: ℒ(𝑦̂, 𝑦) = −𝑦 log(𝑦̂) − (1 − 𝑦) log(1 − 𝑦̂)
o Interpretation:
▪ If 𝑦 = 1: The function minimizes − log(𝑦̂), pushing 𝑦̂ towards 1.
▪ If 𝑦 = 0: The function minimizes − log(1 − 𝑦̂), pushing 𝑦̂ towards 0.
3. Cost Function:
o Measures the average performance across the entire training set:
J(W, b) = (1/m) ∑_{i=1}^{m} ℒ(ŷ^{(i)}, y^{(i)})
o Expanded:
J(W, b) = −(1/m) ∑_{i=1}^{m} [ y^{(i)} log(ŷ^{(i)}) + (1 − y^{(i)}) log(1 − ŷ^{(i)}) ]
4. Optimization Goal:
o Train the logistic regression model by finding 𝑊 and 𝑏 that minimize the cost function
𝐽(𝑊, 𝑏).
Why Use This Loss Function?
• The chosen loss function ensures convex optimization, making it easier to find the global
minimum.
• Using alternatives, such as squared error, can lead to non-convex optimization problems with
multiple local minima.
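A minimal sketch of computing this cost over a small training set (placeholder data and parameters; the small epsilon inside the logs is an implementation detail to avoid log(0), not something from the notes):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

n_x, m = 4, 6
X = np.random.randn(n_x, m)             # columns are training examples
Y = np.random.randint(0, 2, (1, m))     # binary labels
W = np.zeros((n_x, 1))
b = 0.0

Y_hat = sigmoid(W.T @ X + b)            # predictions for all m examples, shape (1, m)

eps = 1e-12                             # numerical safety: avoid log(0)
losses = -(Y * np.log(Y_hat + eps) + (1 - Y) * np.log(1 - Y_hat + eps))
J = losses.mean()                       # cost: average loss over the m examples
print(J)                                # log(2) ≈ 0.693 when all parameters are zero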