See how different optimizers move across a loss landscape. Adjust learning rate, momentum, and other hyperparameters and observe their effect on convergence.
Gradient Descent is an iterative method to minimize a loss function by moving parameters in the direction of steepest descent (negative gradient). Each update nudges the parameters toward lower loss.
Update rule (conceptual): θ ← θ − η · ∇L(θ), where η is the learning rate and ∇L(θ) is the gradient of the loss at θ.
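A minimal sketch of this update rule in code (the quadratic loss, learning rate, and step count below are illustrative choices, not values from the playground):

```python
def loss(theta):
    # Illustrative convex loss: L(theta) = (theta - 3)^2, minimized at theta = 3
    return (theta - 3.0) ** 2

def grad(theta):
    # Analytic gradient of the loss above: dL/dtheta = 2 * (theta - 3)
    return 2.0 * (theta - 3.0)

eta = 0.1      # learning rate
theta = 0.0    # initial parameter value
for _ in range(50):
    theta = theta - eta * grad(theta)   # theta <- theta - eta * grad L(theta)

print(theta)   # approaches 3.0, the minimizer of the quadratic
```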
In convex problems, gradient descent (with a suitably small learning rate) converges to the global minimum; in non-convex problems it typically settles into a local minimum or flat region that is often good enough in practice. Choosing the right learning rate and optimizer is crucial.
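To make the learning-rate point concrete, here is a toy comparison on the same quadratic as above (the specific rates 0.1 and 1.1 are arbitrary values chosen for illustration):

```python
# Same quadratic as above: L(theta) = (theta - 3)^2, gradient 2 * (theta - 3).
for eta in (0.1, 1.1):               # a stable rate vs. one that is too large
    theta = 0.0
    for _ in range(20):
        theta -= eta * 2.0 * (theta - 3.0)
    print(eta, theta)                # eta = 0.1 converges toward 3; eta = 1.1 diverges
```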
This playground shows how optimization algorithms (SGD, Momentum, RMSProp, Adam) follow the gradient to minimize a loss function. The contour plot is a map of the loss: darker/bluer regions are lower values, brighter/yellow regions are higher values.
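A sketch of the four update rules the playground animates, applied to an illustrative 2D loss surface (the elongated bowl, hyperparameter values, and step count here are assumptions for demonstration, not the playground's defaults):

```python
import numpy as np

def grad(p):
    # Gradient of an illustrative elongated bowl: L(x, y) = x^2 + 10*y^2
    x, y = p
    return np.array([2.0 * x, 20.0 * y])

def sgd(p, state, lr=0.05):
    return p - lr * grad(p), state

def momentum(p, state, lr=0.05, beta=0.9):
    v = beta * state.get("v", 0.0) - lr * grad(p)        # velocity accumulates past gradients
    return p + v, {"v": v}

def rmsprop(p, state, lr=0.05, beta=0.9, eps=1e-8):
    g = grad(p)
    s = beta * state.get("s", 0.0) + (1 - beta) * g**2   # running mean of squared gradients
    return p - lr * g / (np.sqrt(s) + eps), {"s": s}

def adam(p, state, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    g = grad(p)
    t = state.get("t", 0) + 1
    m = b1 * state.get("m", 0.0) + (1 - b1) * g          # first moment (mean of gradients)
    v = b2 * state.get("v", 0.0) + (1 - b2) * g**2       # second moment (mean of squared gradients)
    m_hat, v_hat = m / (1 - b1**t), v / (1 - b2**t)      # bias correction for early steps
    return p - lr * m_hat / (np.sqrt(v_hat) + eps), {"m": m, "v": v, "t": t}

start = np.array([-4.0, 2.0])
for name, step_fn in [("SGD", sgd), ("Momentum", momentum), ("RMSProp", rmsprop), ("Adam", adam)]:
    p, state = start.copy(), {}
    for _ in range(100):
        p, state = step_fn(p, state)
    print(name, p)   # each trajectory should end near the minimum at (0, 0)
```

Running the same starting point through each optimizer like this is essentially what the contour animation shows: the paths differ in how they handle the steep direction versus the shallow one.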
This visualizer simulates gradient descent (batch and stochastic variants) on configurable cost surfaces. It computes gradients analytically or numerically and updates parameters with a user‑selected learning rate and momentum. Plots are rendered in your browser.
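The numerical option typically means a finite-difference approximation of the gradient. A sketch of how that might look (the central-difference scheme, step size h, and example loss are assumptions about the approach, not the playground's exact code):

```python
import numpy as np

def numerical_gradient(loss, theta, h=1e-5):
    """Central-difference approximation of the gradient of `loss` at `theta`."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = h
        grad[i] = (loss(theta + step) - loss(theta - step)) / (2.0 * h)
    return grad

# Example: compare against the analytic gradient of L(x, y) = x^2 + 3*y^2
loss = lambda p: p[0] ** 2 + 3.0 * p[1] ** 2
theta = np.array([1.0, -2.0])
print(numerical_gradient(loss, theta))   # ~ [2.0, -12.0], matching the analytic [2x, 6y]
```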