How to do various types of gradient descent?

I'm new to PyTorch. I know the different types of gradient descent: batch GD, minibatch GD, and stochastic GD. How can we do each of them with PyTorch?


The torch.optim module gives you access to different optimizers, which decide the step to take.
The gradient computation is entirely up to your code. optimizer.zero_grad() resets the gradients stored for every parameter. Calls to .backward() accumulate gradients, and optimizer.step() applies one step of the given optimizer.
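Here is a minimal sketch of how those three calls interact. The single parameter `w`, the learning rate, and the two toy losses are made up for illustration; only the `zero_grad` / `backward` / `step` calls come from the description above.

```python
import torch

# Hypothetical one-parameter "model", used only to show the three calls.
w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

optimizer.zero_grad()             # clear any stored gradient
(2.0 * w).sum().backward()        # gradient 2 is stored in w.grad
(3.0 * w).sum().backward()        # gradient 3 is added to the stored value
grad_after_two_backwards = w.grad.clone()   # tensor([5.])

optimizer.step()                  # w <- w - lr * grad = 1.0 - 0.1 * 5.0
w_after_step = w.detach().clone()           # tensor([0.5])

optimizer.zero_grad()             # reset before the next accumulation
# Depending on the PyTorch version, w.grad is now None or a zero tensor.
grad_after_reset = w.grad
```

The point is that nothing is applied to `w` until `step()` is called, so how many `backward()` calls you make between `zero_grad()` and `step()` is up to you.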
To do batch GD, call zero_grad at the beginning of your epoch, then forward-backward all your samples, and do one step at the end.
For the minibatch version, zero_grad, forward-backward a subset of your samples, then step, and repeat until the end of the epoch.
Finally, stochastic GD is the same, except that you backward a single sample each time.
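The three recipes above could look roughly like this. This is only a sketch: the toy data `X`, `y`, the linear model, the learning rate, and the function names are assumptions for illustration, not something PyTorch provides.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy regression data: 8 samples, 3 features each.
X = torch.randn(8, 3)
y = torch.randn(8, 1)
loss_fn = torch.nn.MSELoss(reduction="sum")


def make_model_and_optimizer():
    model = torch.nn.Linear(3, 1)
    return model, torch.optim.SGD(model.parameters(), lr=0.01)


def batch_gd_epoch(model, optimizer):
    """Accumulate gradients over all samples, then take a single step."""
    optimizer.zero_grad()
    for i in range(len(X)):
        # Divide by the dataset size so the accumulated gradient is the mean.
        loss = loss_fn(model(X[i:i + 1]), y[i:i + 1]) / len(X)
        loss.backward()               # adds to the stored gradients
    optimizer.step()
    return 1                          # number of steps taken this epoch


def minibatch_gd_epoch(model, optimizer, batch_size=4):
    """Take one step per subset of batch_size samples."""
    steps = 0
    for start in range(0, len(X), batch_size):
        xb = X[start:start + batch_size]
        yb = y[start:start + batch_size]
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb) / len(xb)
        loss.backward()
        optimizer.step()
        steps += 1
    return steps


def stochastic_gd_epoch(model, optimizer):
    """Same loop as minibatch, with a single sample per step."""
    return minibatch_gd_epoch(model, optimizer, batch_size=1)


steps_batch = batch_gd_epoch(*make_model_and_optimizer())
steps_mini = minibatch_gd_epoch(*make_model_and_optimizer(), batch_size=4)
steps_sgd = stochastic_gd_epoch(*make_model_and_optimizer())
```

With 8 samples, one epoch takes 1 step for batch GD, 2 steps for minibatch GD with `batch_size=4`, and 8 steps for stochastic GD. If the whole dataset fits in memory, batch GD can also be done with a single forward-backward over all of `X` instead of accumulating sample by sample.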

So can we use torch.optim.SGD as the optimizer? And the type of gradient descent depends on how much of the training data we use for each step.
If we use the entire dataset, we are doing batch GD; if we use a small group of the data, we are doing minibatch GD; if we use one sample, we are doing stochastic gradient descent. Am I right?

Yes, this is exactly the idea!
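One possible way to see it, assuming you feed the data through a `DataLoader` (the random dataset, model, and `steps_per_epoch` helper below are made up for illustration): the optimizer stays torch.optim.SGD in all three cases, and only `batch_size` changes.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical dataset of 10 random samples.
dataset = TensorDataset(torch.randn(10, 3), torch.randn(10, 1))


def steps_per_epoch(batch_size):
    """Train for one epoch and count how many optimizer steps were taken."""
    model = torch.nn.Linear(3, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    steps = 0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss_fn(model(xb), yb).backward()
        optimizer.step()
        steps += 1
    return steps


full_batch_steps = steps_per_epoch(len(dataset))  # batch GD: 1 step
minibatch_steps = steps_per_epoch(4)              # minibatch GD: 3 steps (4 + 4 + 2)
single_sample_steps = steps_per_epoch(1)          # stochastic GD: 10 steps
```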

Oh okay thx for the explanation :slight_smile: