SGD vs Batch size 1

Hello everyone,
I am currently training my network with mini-batches using the Adam optimizer.
However, I've read some posts saying that if I choose the right lr and lr_schedule parameters,
SGD can perform better than Adam, so I am planning to try SGD.
From my understanding, SGD updates the parameters on each individual sample, i.e. with a batch size of 1, since a batch size greater than 1 would just be ordinary GD.
But I see lots of PyTorch code using SGD with a batch size greater than 1 (e.g. 16).
How does PyTorch's SGD then work and update the parameters?
Also, can I still use batch normalization when I use SGD? If SGD means batch size 1, surely batch normalization will either not work or perform really badly.

First of all, a batch size greater than 1 is mini-batch GD, not ordinary (full-batch) GD.
The batch size determines the flavour of GD: SGD with batch size 1, mini-batch GD with a batch size greater than 1.
In addition, I personally recommend you use Adam with mini-batches, since that combination usually gives much steadier and more robust training.
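To make that recommendation concrete, here is a minimal sketch of setting up both optimizers in PyTorch. The model and all hyperparameter values (lr, momentum, the StepLR schedule) are illustrative placeholders, not tuned settings:

```python
import torch

model = torch.nn.Linear(4, 1)  # stand-in for your network

# Adam adapts a per-parameter step size, so it is usually forgiving
# about the initial lr choice.
adam = torch.optim.Adam(model.parameters(), lr=1e-3)

# SGD typically needs a tuned lr plus momentum and an lr schedule
# before it matches (or beats) Adam.
sgd = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(sgd, step_size=30, gamma=0.1)
```

Both optimizers are used the same way in the training loop (`zero_grad()`, `backward()`, `step()`); with the scheduler you also call `scheduler.step()` once per epoch.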
Finally, you can't use batch normalization with SGD at batch size 1, because batch normalization computes statistics over the batch; with a batch size of 1 there is no meaningful average to compute over a single value.
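You can see this directly in PyTorch: in training mode, `BatchNorm1d` refuses a single-sample batch, while in eval mode it falls back to its running statistics and works fine. A small sketch (the feature size 8 is arbitrary):

```python
import torch

# BatchNorm computes mean/variance over the batch dimension, so in
# training mode a batch of one sample has nothing to normalize over.
bn = torch.nn.BatchNorm1d(8)
bn.train()

x = torch.randn(1, 8)  # batch size 1
try:
    bn(x)
except ValueError as err:
    print("training-mode failure:", err)

# In eval mode the layer uses its running statistics instead, so a
# single-sample batch is fine (e.g. at inference time).
bn.eval()
out = bn(x)
print(out.shape)  # torch.Size([1, 8])
```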
I hope that makes it clear.
If you have any more questions, please let me know.
Have a nice day : )

Thank you for the reply!
However, my question is that I see some PyTorch code using a batch size greater than 1 with SGD, which means it isn't really SGD anymore. So how does PyTorch compute SGD when the batch size is greater than 1?

Sorry, after searching the wiki, it turns out I was wrong that SGD means batch size = 1.
Really sorry!
SGD randomly picks an example (or a mini-batch of examples) and uses its gradient.
That randomly sampled gradient stands in for the gradient computed over the whole dataset.
So it's perfectly reasonable to use a batch size greater than 1.
However, from my experience, SGD can be quite unstable and make your loss fluctuate dramatically.
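In PyTorch terms, `torch.optim.SGD` just applies the plain update rule `p <- p - lr * p.grad` to whatever gradient `backward()` produced; the batch size only affects how that gradient is computed (for a mean-reduced loss, it is the average over the mini-batch). A minimal sketch verifying this, with an arbitrary linear model and batch of 16:

```python
import torch

# torch.optim.SGD applies the same update, p <- p - lr * p.grad, no
# matter what the batch size is. The "stochastic" part comes entirely
# from the loss being computed on a randomly drawn mini-batch, whose
# averaged gradient stands in for the full-dataset gradient.
torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(16, 4)  # a mini-batch of 16 samples
y = torch.randn(16, 1)

loss = torch.nn.functional.mse_loss(model(x), y)  # mean over the batch
opt.zero_grad()
loss.backward()

# Record what the plain update rule would give, then compare to opt.step().
expected = {name: p.detach().clone() - 0.1 * p.grad
            for name, p in model.named_parameters()}
opt.step()
for name, p in model.named_parameters():
    assert torch.allclose(p, expected[name])
```

So calling it "SGD" in PyTorch is just a naming convention for the vanilla update rule; with a batch size of 16 it is mini-batch SGD.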
If you want to know more, please click here and check it out.

Hope I didn’t make you confused : )
