SGD Classifier and Regressor
Gradient descent is an algorithm that numerically estimates where a function reaches its lowest value, which means finding a local minimum. Rather than solving for the minimum symbolically, gradient descent approximates it through an iterative numerical procedure.
If we had a simple function like f(x) = x^2 - 4x, we could easily solve for the minimum symbolically by setting the gradient to zero: ∇f = 2x - 4 = 0 gives x = 2, which minimizes f(x). As an alternative, we could use gradient descent to get a numerical approximation of the minimum, such as x ≈ 1.99999967. Both strategies arrive at roughly the same answer.
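To make this concrete, here is a minimal Python sketch of gradient descent applied to f(x) = x^2 - 4x. The learning rate, starting point, and step count are illustrative choices, not values from the text.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=100):
    """Repeatedly step opposite the gradient to approach a local minimum."""
    x = start
    for _ in range(n_steps):
        x -= learning_rate * grad(x)  # move downhill along the slope
    return x

# f(x) = x^2 - 4x has gradient f'(x) = 2x - 4
x_min = gradient_descent(grad=lambda x: 2 * x - 4, start=0.0)
print(x_min)  # ~1.9999999996, a numerical approximation of the exact answer x = 2
```

Each step moves x a small distance opposite the gradient, so the iterates converge toward x = 2 without ever solving the equation symbolically.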
But when our function involves hundreds, thousands, or millions of observations across thousands of features, manipulating symbols is no longer feasible. That is where gradient descent becomes valuable: it can produce numerical estimates no matter how complex the function is.
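As a hedged sketch of this idea at scale, scikit-learn's SGDRegressor fits a linear model with stochastic gradient descent; the synthetic dataset and parameter values below are illustrative assumptions, not details from the text.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor

# 10,000 observations across 1,000 features: a problem size where
# solving for the minimum symbolically is impractical, but gradient
# descent handles it routinely.
X, y = make_regression(n_samples=10_000, n_features=1_000, noise=0.1,
                       random_state=0)

model = SGDRegressor(max_iter=1000, tol=1e-3)
model.fit(X, y)  # iteratively descends the loss surface, one sample at a time
print(model.predict(X[:5]))
```

The same estimator works whether the data has ten features or ten thousand, which is exactly the scalability the symbolic approach lacks.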