Loading optimizer in a distributed setting

Konstantin_Solomatov · April 1, 2019, 3:29pm

I want to load a snapshot from a file on one of the machines in a distributed job. From what I can see, optimizer state isn’t broadcast to the other machines in that case. Is there an easy way to do this?

pietern · June 24, 2019, 12:03pm

AFAIK there is no common way to expose optimizer state. If you know how to access the state of your optimizer, you can synchronize it yourself by using torch.distributed collectives directly, e.g. broadcast.
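
As a rough sketch of that idea: for the built-in torch.optim optimizers, the state can be read with state_dict() and restored with load_state_dict(), so one option is to broadcast that dict from the rank that loaded the snapshot. The helper name broadcast_optimizer_state is made up for illustration. The sketch also assumes the process group is already initialized and that your PyTorch version provides torch.distributed.broadcast_object_list (it is not available in older releases).

```python
import torch
import torch.distributed as dist


def broadcast_optimizer_state(optimizer, src=0):
    """Hypothetical helper: copy optimizer state from rank `src` to all other ranks.

    Assumes dist.init_process_group() has already been called and that the
    optimizer was constructed over the same parameters on every rank.
    """
    is_src = dist.get_rank() == src
    # Only the source rank supplies a real payload; other ranks pass a placeholder
    # that broadcast_object_list overwrites in place.
    payload = [optimizer.state_dict() if is_src else None]
    dist.broadcast_object_list(payload, src=src)
    if not is_src:
        optimizer.load_state_dict(payload[0])
```

On the source rank you would first restore the optimizer from the snapshot (for example with torch.load followed by optimizer.load_state_dict) and then call broadcast_optimizer_state(optimizer, src=0) on every rank. The state dict is pickled for transfer, so for large states broadcasting the individual tensors with dist.broadcast may be cheaper. With the NCCL backend you will probably also need to set the current CUDA device on each rank before calling it.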