Average each weight of two models

Hello

I run this averaging after each optimizer step. I agree that Python's multiprocessing utilities could be a good idea. In general, however, this kind of utility creates a job pool on the CPU and then executes the jobs there, and I am not sure this is the best way to do it. I am wondering whether it is possible to create the job pool on the GPU directly. I suspect it won't make a significant difference, but I am not sure.
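
For reference, here is a minimal sketch of the kind of averaging I mean, assuming both models are PyTorch `nn.Module`s with identical architectures living on the same device. The function name `average_weights_` and the two `nn.Linear` models are only illustrative, and writing the mean back into both models is an assumption. Since the tensor operations run on whatever device the parameters live on, the averaging itself happens on the GPU without any CPU job pool:

```python
import torch
import torch.nn as nn


@torch.no_grad()  # parameter updates must not be tracked by autograd
def average_weights_(model_a: nn.Module, model_b: nn.Module) -> None:
    """Overwrite each parameter of both models with the element-wise mean of the pair.

    Assumes (hypothetically) that both models have the same architecture
    and are on the same device.
    """
    for p_a, p_b in zip(model_a.parameters(), model_b.parameters()):
        # In-place ops execute on the device where the parameters live.
        p_a.add_(p_b).mul_(0.5)
        p_b.copy_(p_a)


# Illustrative usage with two small placeholder models.
device = "cuda" if torch.cuda.is_available() else "cpu"
model_a = nn.Linear(4, 2).to(device)
model_b = nn.Linear(4, 2).to(device)

# Keep copies of the expected means so the result can be checked afterwards.
expected_weight = (model_a.weight.detach().clone() + model_b.weight.detach().clone()) / 2
expected_bias = (model_a.bias.detach().clone() + model_b.bias.detach().clone()) / 2

average_weights_(model_a, model_b)
```

Calling this right after `optimizer.step()` would match the setup described above. Whether it is fast enough compared with a multiprocessing approach probably depends on the number and size of the parameters, so it is worth timing on the actual models.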