Is adam.step() a single step or multiple steps?

I wonder if adam.step() performs a single step or multiple steps towards convergence, i.e. minimization of the loss? If it's the latter, how is convergence judged?

It’s a single weight update step as seen here and here.
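For reference, here is a minimal self-contained sketch (the model, data, and learning rate below are made-up placeholders, not anything from this thread) of where a single call fits: optimizer.step() applies exactly one in-place parameter update based on the gradients computed by the preceding backward() call.

import torch
from torch import nn

# Tiny placeholder setup, assumed for illustration only.
model = nn.Linear(4, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(8, 4), torch.randn(8, 1)

optimizer.zero_grad()                          # clear any old gradients
loss = nn.functional.mse_loss(model(x), y)
loss.backward()                                # compute gradients for this batch
optimizer.step()                               # a single Adam update of the parameters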


Thank you! If so, why is adam.step() called only once for each sample point in mini-batch training? Don't we need convergence/minimization of the loss, making full use of each sample point?

I’m not sure I understand the “each sample point” reference, but mini-batch training is a common approach in machine learning. While some ML approaches are able to use the full training dataset for a single update, that's usually not feasible in deep learning due to the size of the model, the data, etc.

If we talk about gradient descent where the mini-batch equals the full sample (i.e. the whole dataset), there is one iteration per epoch, and the number of epochs determines convergence (minimization of the loss), like this:

for epoch in range(num_epochs):
    adam.step()    # one full-batch update per epoch

Then in mini-batch training, the epochs again determine convergence (minimization of the loss):

for epoch in range(num_epochs):
    for i in range(iterations_per_epoch):
        adam.step()    # one update per mini-batch

Correct?

Assuming the full sample represents the entire dataset, then yes, the second approach is the common one and represents mini-batch training.
Take a look at the Optimization chapter of the Deep Learning Book, in particular Section 8.1.3, Batch and Minibatch Algorithms, for more information.
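As a rough illustration (the model, data, and hyperparameters below are placeholders, not anything from this thread), the second pattern typically looks like this in PyTorch, with one optimizer step per mini-batch:

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and random data, just to show the loop structure.
model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loader = DataLoader(TensorDataset(torch.randn(256, 10), torch.randn(256, 1)),
                    batch_size=32, shuffle=True)

num_epochs = 5
for epoch in range(num_epochs):            # outer loop over epochs
    for data, target in loader:            # inner loop over mini-batches
        optimizer.zero_grad()
        loss = criterion(model(data), target)
        loss.backward()
        optimizer.step()                   # one parameter update per mini-batch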