I am wondering if it is possible to share an optimizer between different threads. To be specific, when optimizer.step() is applied, the modified optimizer state should be visible to all processes.
I can't think of a way to do this, because the optimizer state also contains Python scalars.
What are you trying to do?
I am trying to benchmark the A3C algorithm with a shared RMSprop optimizer vs. a separate one for each thread, as described in this paper, p. 11.
No, I think most of the optimizer state is kept in Tensors, so it should be possible to share it. I'm not 100% sure right now; I'll need to take a look at RMSprop and can confirm it tomorrow.
Thanks. I will also look at the code and see if I can find some hack.
Glad to know that you were looking into this.
Is the current optimizer thread safe?
It's not. You need to guard it with a mutex yourself.
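A minimal sketch of what that guarding could look like with a plain `threading.Lock`. The model, sizes, and loss here are made up for illustration; since the gradients live on the shared model, the whole zero_grad/backward/step sequence goes inside the lock:

```python
import threading

import torch

# Hypothetical setup: one model and one optimizer shared by several threads.
model = torch.nn.Linear(4, 1)
opt = torch.optim.RMSprop(model.parameters(), lr=1e-2)
lock = threading.Lock()  # serializes access to shared grads and optimizer state

def train_step(x, y):
    # The gradients on `model` are shared state too, so the entire
    # update sequence is guarded, not just opt.step().
    with lock:
        opt.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()

threads = [
    threading.Thread(target=train_step,
                     args=(torch.randn(8, 4), torch.randn(8, 1)))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note this serializes the updates, which is exactly what A3C's Hogwild!-style training avoids; see the follow-up below for the lock-free alternative.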
Just a quick follow-up for those wanting to implement A3C.
A3C uses a Hogwild! update style, which means updates are made lock-free. Workers may overwrite each other's updates at times, but that's OK.
The only thing needed for Adam or RMSprop to be shared across processes is to subclass them and call the share_memory_() method on the important state tensors. The one catch is the step counter: it is a plain Python int rather than a tensor, so it can't be placed in shared memory directly. One can work around this by storing it as a tensor instead. But other than that, it is doable, and not too hard.
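Here is a hypothetical sketch of that recipe for RMSprop. The class name and the eager state initialization are my own; the idea is to create the state up front (before workers fork) and move every state tensor, including a tensor-valued step counter, into shared memory:

```python
import torch

class SharedRMSprop(torch.optim.RMSprop):
    """RMSprop whose per-parameter state lives in shared memory (sketch)."""

    def __init__(self, params, lr=1e-2, alpha=0.99, eps=1e-8):
        super().__init__(params, lr=lr, alpha=alpha, eps=eps)
        # Initialize the state eagerly, before any worker process starts,
        # so that every process operates on the same shared tensors.
        for group in self.param_groups:
            for p in group['params']:
                state = self.state[p]
                # Keep the step counter as a tensor: a plain Python int
                # cannot live in shared memory, but a tensor can.
                state['step'] = torch.zeros(1)
                state['square_avg'] = torch.zeros_like(p.data)
                state['step'].share_memory_()
                state['square_avg'].share_memory_()
```

In use, you would also call model.share_memory() on the shared model and pass both the model and this optimizer to each torch.multiprocessing worker. RMSprop never reads the step counter in its update rule, so sharing it is enough; for Adam, step() additionally uses the counter for bias correction, so a shared-Adam variant would also need its step() to read the tensor value.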