How to degenerate SGD to primitive GD

How can I reduce SGD to plain GD? Or is SGD(momentum=0, dampening=0, weight_decay=0, nesterov=False, maximize=False, foreach=None) already GD, with no stochasticity?

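The “stochastic” part refers to the use of mini-batches: each step uses the gradient of a randomly sampled subset of the data. The optimizer itself does no sampling, so if you pass the entire dataset to the model at every step, you should get plain GD.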
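
As a rough check, here is a minimal sketch that compares torch.optim.SGD, left at its default arguments and fed the full dataset each step, against a hand-written gradient descent update. The toy data, the names X, y, full_batch_loss, w_sgd and w_gd, and the learning rate are all made up for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy regression data: 8 samples, 3 features.
X = torch.randn(8, 3)
y = torch.randn(8, 1)

def full_batch_loss(w):
    # Mean squared error over the ENTIRE dataset (no mini-batching).
    return ((X @ w - y) ** 2).mean()

lr = 0.1
w_sgd = torch.zeros(3, 1, requires_grad=True)  # updated by torch.optim.SGD
w_gd = torch.zeros(3, 1, requires_grad=True)   # updated by hand

# Defaults: momentum=0, dampening=0, weight_decay=0, nesterov=False.
opt = torch.optim.SGD([w_sgd], lr=lr)

for _ in range(5):
    # Optimizer step on the full-batch gradient.
    opt.zero_grad()
    full_batch_loss(w_sgd).backward()
    opt.step()

    # Plain gradient descent: w <- w - lr * grad.
    (grad,) = torch.autograd.grad(full_batch_loss(w_gd), w_gd)
    with torch.no_grad():
        w_gd -= lr * grad

# Both parameter tensors should agree up to floating-point tolerance.
print(torch.allclose(w_sgd, w_gd))
```

If you use a DataLoader instead, setting batch_size to the dataset length should have the same effect, since every iteration then sees all samples.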