What is batch GD in PyTorch?

It seems SGD is mini-batch GD. Is there an actual batch GD in PyTorch, where the full set of training samples is used to update the parameters every epoch?


There is a slight abuse of naming when talking about stochastic gradient descent here, because `torch.optim.SGD` doesn't decide how the gradients are computed; it's the user's responsibility to compute them via the forward/backward pass. The update rule is the same for batch GD and mini-batch GD; the only difference is whether the gradients come from the whole dataset or from a mini-batch.
So if you forward/backward the whole dataset, then `SGD` will actually perform batch GD.
If you forward/backward only one mini-batch, then `SGD` will perform mini-batch GD.
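For concreteness, here is a minimal sketch of full-batch GD using `torch.optim.SGD` on a toy regression problem (the data, model size, and learning rate are arbitrary choices for illustration):

```python
import torch

# Toy regression data: 100 samples, 3 features (hypothetical example)
torch.manual_seed(0)
X = torch.randn(100, 3)
y = X @ torch.tensor([1.0, -2.0, 0.5]) + 0.1

model = torch.nn.Linear(3, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

# Full-batch GD: one forward/backward over the *entire* dataset per step.
# Had we looped over DataLoader batches instead, the exact same optimizer
# would be doing mini-batch GD.
losses = []
for epoch in range(5):
    opt.zero_grad()
    loss = loss_fn(model(X).squeeze(-1), y)
    loss.backward()          # gradients of the full-dataset loss
    opt.step()               # same SGD update rule either way
    losses.append(loss.item())
```

Since the whole dataset fits in one forward pass here, each `opt.step()` is one step of classic batch gradient descent.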
