SGD and batch_size in data.DataLoader()

When we use the SGD algorithm to update the parameters of a model, we feed the model only one sample at a time, is that right?
But in PyTorch, even when we use SGD, we have to set batch_size, and its value cannot be 1.
What puzzles me is: when we use SGD and batch_size is not 1, what exactly is the algorithm the program runs? Is it still SGD?

The statement that batch_size cannot be 1 is not true: for example, 1 is the default batch_size of DataLoader. So to get “true” SGD, you are free to use 1 as batch_size.

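When batch_size is greater than 1, the algorithm is called “mini-batch” gradient descent, and when batch_size is equal to len(dataset) we are talking about “batch” gradient descent. In all of these cases torch.optim.SGD applies the same update rule; batch_size only decides how many samples contribute to the gradient at each step.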
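To make the three cases concrete, here is a minimal sketch that trains a tiny linear model for one epoch with each batch_size and counts the parameter updates. The toy data, the `updates_per_epoch` helper, and the learning rate are made up for illustration; only `DataLoader`, `TensorDataset`, and `torch.optim.SGD` are actual PyTorch APIs.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Toy regression data: 8 samples with 3 features each (illustrative only).
X = torch.randn(8, 3)
y = torch.randn(8, 1)
dataset = TensorDataset(X, y)


def updates_per_epoch(batch_size):
    """Hypothetical helper: run one epoch and count optimizer steps."""
    model = torch.nn.Linear(3, 1)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    steps = 0
    for xb, yb in loader:
        optimizer.zero_grad()
        # The loss is averaged over however many samples are in the batch.
        loss = torch.nn.functional.mse_loss(model(xb), yb)
        loss.backward()
        optimizer.step()  # same update rule for every batch size
        steps += 1
    return steps


sgd_steps = updates_per_epoch(1)                   # "true" SGD
minibatch_steps = updates_per_epoch(4)             # mini-batch gradient descent
batch_steps = updates_per_epoch(len(dataset))      # batch gradient descent
print(sgd_steps, minibatch_steps, batch_steps)     # 8 2 1
```

With 8 samples, batch_size=1 gives 8 updates per epoch, batch_size=4 gives 2, and batch_size=len(dataset) gives a single update computed from the whole dataset.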


I see. :thinking: :thinking:

Thanks for the detailed explanation. :smiley: :smiley: