It seems like the optimizer states are created lazily on the first call to step().
I have written multi-threaded code that shares an optimizer, and on the first call to step() some threads jump ahead while the state dictionary of a parameter is still being created, try to use keys that don't exist yet, and raise a KeyError.
Currently I have worked around this by creating the threads with larger time delays between them, so they don't catch up with the thread that is creating the states.
How can I manually force the states to be created for an optimizer like RMSprop?
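Here is a minimal single-threaded sketch of what I mean (not my actual multi-threaded code; it just shows the lazy state and the kind of lookup that fails):

```python
import torch

param = torch.nn.Parameter(torch.randn(3))
opt = torch.optim.RMSprop([param], lr=0.01)

print(len(opt.state))  # 0 -- no per-parameter state exists yet

# Reading a state key before the first real step() fails,
# which is what the faster threads run into:
opt.state[param]["square_avg"]  # KeyError
```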
I've already tried running step() once, but it seems step() doesn't do anything if there are no gradients.
Everything seems to be based on laziness in PyTorch.
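For reference, this is what I tried; step() just skips every parameter whose .grad is None, so no state gets created:

```python
import torch

param = torch.nn.Parameter(torch.randn(3))
opt = torch.optim.RMSprop([param], lr=0.01)

opt.step()             # every .grad is None, so each parameter is skipped
print(len(opt.state))  # still 0 -- nothing was initialized
```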
Yeah, I meant running one iteration, sorry for not being clear. Alternatively, you could set the .grad of all parameters to zeros and then call step() once. Let me know if that works.
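Roughly like this (a sketch; it assumes weight_decay=0, since with weight_decay > 0 even a zero-gradient step would nudge the weights slightly):

```python
import torch

model = torch.nn.Linear(4, 2)
opt = torch.optim.RMSprop(model.parameters(), lr=0.01)

# Give every parameter an all-zero gradient, then step once so the
# optimizer eagerly creates its per-parameter state. With zero grads
# (and weight_decay=0) the RMSprop update is lr * 0 / (sqrt(0) + eps),
# so the weights stay unchanged.
for p in model.parameters():
    p.grad = torch.zeros_like(p)
opt.step()
opt.zero_grad()

print(len(opt.state))  # one entry per parameter -- safe to start the threads
```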