SGD Classifier and Regressor
With SGDRegressor and SGDClassifier, the gradient of the loss is estimated one observation at a time, and the model is updated along the way with a decreasing step size (the learning rate schedule). Both estimators support minibatch and out-of-core learning via the partial_fit method. Because SGD is sensitive to feature scaling, the data should be standardized to zero mean and unit variance. SGD scales easily to problems with more than 10^5 training examples and more than 10^5 features.
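The workflow above (standardize, then train in minibatches with partial_fit) can be sketched as follows; the two-blob synthetic dataset and the batch size are illustrative assumptions, not part of the original notes:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data: two Gaussian blobs, one per class
rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
perm = rng.permutation(len(X))          # shuffle so each minibatch mixes classes
X, y = X[perm], y[perm]

# SGD is sensitive to feature scale: standardize to zero mean, unit variance
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# Minibatch learning via partial_fit; the full set of classes must be
# supplied on the first call, since later batches may miss some labels
clf = SGDClassifier(random_state=0)
batch_size = 50
for start in range(0, len(X_scaled), batch_size):
    batch = slice(start, start + batch_size)
    clf.partial_fit(X_scaled[batch], y[batch], classes=np.array([0, 1]))

accuracy = clf.score(X_scaled, y)
```

In a true out-of-core setting, each batch would instead be read from disk or a stream, and the scaler would also be fitted incrementally (e.g., with StandardScaler's own partial_fit).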
In-class Scenario: Predicting Exam Performance