Department of Electronics and Communication Engineering: Seminar On Subject: Artificial Neural Network (21EC641)
Seminar on
“Gradient Descent, SGD, Adam, RMSProp”
Gradient Descent
Definition: Neural network optimization refers to techniques that minimize the error (loss) function during training by iteratively adjusting the model's weights. Gradient descent does this by stepping each weight in the direction of the negative gradient of the loss, scaled by a learning rate (see the sketch below).
Visual Aid: Illustration showing how Gradient Descent progresses towards minimizing
loss.
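As an illustration of the update rule, the following is a minimal NumPy sketch of full-batch gradient descent on a toy least-squares problem; the synthetic data, learning rate, and iteration count are illustrative assumptions.

```python
import numpy as np

# Toy least-squares problem: L(w) = mean squared error of X @ w against y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(3)   # parameters to optimize
lr = 0.1          # learning rate (step size)
for step in range(200):
    grad = X.T @ (X @ w - y) / len(y)  # full-batch gradient of the loss
    w -= lr * grad                     # gradient descent update: w <- w - lr * grad
print(w)  # should end up close to true_w
```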
Stochastic Gradient Descent (SGD)
Overview: SGD updates the parameters using gradients computed from a single randomly chosen training example or a small subset (mini-batch) of the data, rather than the full training set.
Advantages: Faster convergence and scalability to large datasets.
Mini-batch SGD: Balances computational efficiency and gradient accuracy by processing the data in small batches (see the sketch below).
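A minimal sketch of mini-batch SGD on the same kind of toy least-squares problem; the batch size, learning rate, and epoch count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.normal(size=1000)

w = np.zeros(3)
lr, batch_size = 0.05, 32
for epoch in range(20):
    perm = rng.permutation(len(y))              # shuffle the data once per epoch
    for start in range(0, len(y), batch_size):
        idx = perm[start:start + batch_size]    # indices of the current mini-batch
        Xb, yb = X[idx], y[idx]
        grad = Xb.T @ (Xb @ w - yb) / len(yb)   # gradient estimated from the batch only
        w -= lr * grad                          # same update rule, noisier gradient
print(w)  # should end up close to [1.5, -2.0, 0.5]
```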
Momentum and Nesterov Accelerated Gradient (NAG)
Concept: Momentum enhances SGD by adding a fraction of the previous update vector
to the current update.
NAG: Improves upon Momentum by evaluating the gradient at the look-ahead position (the current parameters plus the momentum step) rather than at the current parameters; both update rules are sketched below.
Benefits: Faster convergence, especially in the presence of noise or high curvature.
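The following sketch contrasts classical Momentum with the Nesterov (look-ahead) update for a generic gradient function; the quadratic grad_fn and all hyperparameter values are illustrative assumptions.

```python
import numpy as np

def grad_fn(w):
    return 2.0 * (w - 3.0)               # gradient of the toy objective (w - 3)^2

def momentum(w, lr=0.1, beta=0.9, steps=100):
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v - lr * grad_fn(w)   # keep a fraction of the previous update
        w = w + v
    return w

def nesterov(w, lr=0.1, beta=0.9, steps=100):
    v = np.zeros_like(w)
    for _ in range(steps):
        lookahead = w + beta * v         # evaluate the gradient at the anticipated position
        v = beta * v - lr * grad_fn(lookahead)
        w = w + v
    return w

print(momentum(np.array([0.0])), nesterov(np.array([0.0])))  # both approach 3.0
```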
Adam Optimization
Overview: Adam adapts the learning rate for each parameter by maintaining exponentially decaying estimates of the first moment (mean) and second moment (uncentred variance) of the gradients.
Advantages: Works well with sparse gradients and widely varying gradient magnitudes, and often converges faster than plain SGD in practice.
Considerations: Sensitivity to hyperparameter choices and extra memory and computation for the per-parameter moment estimates (the update is sketched below).
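A minimal sketch of the Adam update (first- and second-moment estimates with bias correction); the toy quadratic objective and the hyperparameter values are illustrative defaults, not tuned settings.

```python
import numpy as np

def grad_fn(w):
    return 2.0 * (w - 3.0)                   # gradient of the toy objective (w - 3)^2

w = np.array([0.0])
m = np.zeros_like(w)                          # first-moment (mean) estimate
v = np.zeros_like(w)                          # second-moment (uncentred variance) estimate
lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8

for t in range(1, 1001):
    g = grad_fn(w)
    m = beta1 * m + (1 - beta1) * g          # update biased first-moment estimate
    v = beta2 * v + (1 - beta2) * g * g      # update biased second-moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # per-parameter adaptive step
print(w)  # should end up close to 3.0
```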
RMSProp
Concept: RMSProp scales the learning rate for each parameter by dividing by the square root of an exponentially decaying average of recent squared gradients, so parameters with consistently large gradients take smaller steps.
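A minimal sketch of the RMSProp update, dividing each step by a decaying average of squared gradients; the toy objective and the hyperparameter values are illustrative assumptions.

```python
import numpy as np

def grad_fn(w):
    return 2.0 * (w - 3.0)                     # gradient of the toy objective (w - 3)^2

w = np.array([0.0])
s = np.zeros_like(w)                           # running average of squared gradients
lr, rho, eps = 0.01, 0.9, 1e-8

for _ in range(2000):
    g = grad_fn(w)
    s = rho * s + (1 - rho) * g * g            # exponentially decaying average of g^2
    w = w - lr * g / (np.sqrt(s) + eps)        # per-parameter scaled step
print(w)  # should end up close to 3.0
```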