How SGD works in PyTorch

You are right. The SGD optimizer in PyTorch is effectively mini-batch gradient descent with optional momentum: `torch.optim.SGD` just applies the update rule to whatever gradient the backward pass left in `.grad`, so the "mini-batch" part comes from feeding the model one batch at a time, and momentum is off by default (`momentum=0`).
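
Here is a minimal sketch of what that looks like in practice, using a toy linear model and synthetic data (the model, data, and batch size here are made up purely for illustration):

```python
import torch
import torch.nn as nn

# Toy model and loss (hypothetical example, not from the original question).
model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()

# momentum defaults to 0; setting it > 0 turns this into SGD with momentum.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Synthetic dataset split into mini-batches of 32 samples.
X = torch.randn(320, 10)
y = torch.randn(320, 1)

for epoch in range(5):
    for i in range(0, len(X), 32):
        xb, yb = X[i:i + 32], y[i:i + 32]
        optimizer.zero_grad()          # clear gradients from the previous step
        loss = loss_fn(model(xb), yb)  # loss over this mini-batch only
        loss.backward()                # gradient is averaged over the batch
        optimizer.step()               # one "SGD" step = one mini-batch GD update
```

Note that PyTorch's momentum variant updates the velocity as `v = momentum * v + grad` and then steps with `p = p - lr * v`, which differs slightly from some textbook formulations; the `torch.optim.SGD` documentation spells out the exact rule.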
