Loading optimizer in a distributed setting

I want to load a snapshot from a file on one of the machines in a distributed setting. From what I can tell, optimizer state isn't broadcast between machines in this case. Is there an easy way to do it?

There's no universal way to expose optimizer state AFAIK. If you know how to access your optimizer's state, you can synchronize it yourself by using torch.distributed collectives directly, e.g. broadcast.
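A minimal sketch of that idea, assuming your optimizer implements `state_dict()` / `load_state_dict()` (the built-in PyTorch optimizers do) and that a process group is already initialized. The helper name `broadcast_optimizer_state` is mine, not a library function; the demo at the bottom runs single-process with `world_size=1` just so the snippet is self-contained:

```python
import os
import torch
import torch.distributed as dist

def broadcast_optimizer_state(optimizer, src=0):
    # Serialize the optimizer state on the source rank, ship it as a
    # Python object, and load it in place on every other rank.
    # broadcast_object_list pickles under the hood, so only use it
    # between trusted peers.
    payload = [optimizer.state_dict() if dist.get_rank() == src else None]
    dist.broadcast_object_list(payload, src=src)
    if dist.get_rank() != src:
        optimizer.load_state_dict(payload[0])

# Single-process demo (gloo backend, world_size=1) so this runs
# standalone; in a real job the rank/world_size/master address come
# from your launcher.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29501")
dist.init_process_group("gloo", rank=0, world_size=1)

model = torch.nn.Linear(4, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# Take one step so the optimizer actually has per-parameter state
# (momentum buffers) worth broadcasting.
model(torch.randn(8, 4)).sum().backward()
opt.step()

broadcast_optimizer_state(opt, src=0)
dist.destroy_process_group()
```

In a real multi-process job you would load the snapshot into the optimizer on the source rank only (e.g. via `torch.load` + `load_state_dict`), then call the helper on every rank; the other ranks overwrite their local state with the broadcast copy.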