Copying nn.Modules without shared memory

lenny · January 21, 2017, 11:54pm

You can use the share_memory() function on an nn.Module so that the same parameters can be accessed from multiple processes (using the multiprocessing module). If I do not want these parameters to be shared, and I want each subprocess to get an independent copy of the parameters to work with, will simply not calling share_memory() provide this behavior?

smth · January 22, 2017, 12:52am

No, what you need to do is send the model to the new process, and then do:

import copy
model = copy.deepcopy(model)

share_memory() only shares the memory ahead of time (in case you want to reuse the shared memory, for example).
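
As a minimal sketch of what the deepcopy step gives you, the snippet below checks within a single process that the copy owns its own parameter storage. The nn.Linear model is a hypothetical stand-in, not something from this thread; in the multiprocessing case the same copy.deepcopy call would run inside the worker after it receives the model.

```python
import copy

import torch
import torch.nn as nn

# Hypothetical small model, used only for illustration.
model = nn.Linear(4, 2)

# deepcopy allocates new tensors for every parameter and buffer.
model_copy = copy.deepcopy(model)

# Modify the copy in place; the original should be unaffected.
with torch.no_grad():
    model_copy.weight.add_(1.0)

# True only if both modules point at the same underlying memory.
same_storage = model.weight.data_ptr() == model_copy.weight.data_ptr()
# True only if the in-place update leaked into the original.
weights_equal = torch.equal(model.weight, model_copy.weight)

print("same storage:", same_storage)
print("weights still equal:", weights_equal)
```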

apaszke · January 22, 2017, 9:52am

Or, assuming your model is a module called MyModel, you can create a separate instance in each process, send a state_dict from a single one to everyone else, and have each of them call load_state_dict. The parameters in the state_dict will be shared among the processes, but load_state_dict only copies the content, without reassigning tensors, so your model won’t be shared.
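
A rough single-process sketch of the "copies the content, without reassigning tensors" behavior described above. The MyModel definition here is a hypothetical stand-in (the thread never shows one), and the check relies on the default behavior of load_state_dict.

```python
import torch
import torch.nn as nn


class MyModel(nn.Module):
    """Hypothetical stand-in for the MyModel mentioned in the post."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


source = MyModel()  # plays the role of the process that owns the weights
target = MyModel()  # plays the role of a separate instance in a worker

ptr_before = target.fc.weight.data_ptr()
target.load_state_dict(source.state_dict())
ptr_after = target.fc.weight.data_ptr()

# The values now match the source...
values_match = torch.equal(source.fc.weight, target.fc.weight)
# ...but the target kept its own tensor rather than adopting the source's.
storage_kept = ptr_before == ptr_after
shares_with_source = ptr_after == source.fc.weight.data_ptr()

print(values_match, storage_kept, shares_with_source)
```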

However, if you want to duplicate models within a single process, I definitely recommend the way proposed by @smth.

Juna · October 14, 2018, 1:04pm

Then could you give me an example of using load_state_dict() to copy a module?

JosueCom · July 21, 2021, 2:19pm

He is referring to saving the state_dict (for instance to a file), sending it over to another process, then loading it there with load_state_dict.
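
Something along these lines, as a sketch only: an in-memory buffer stands in for the file, and the nn.Linear modules and variable names are made up for illustration. In a real setup the bytes would be written to disk or passed to the other process, which would then build its own model instance and load them.

```python
import io

import torch
import torch.nn as nn

# "Sender" side: serialize only the state_dict, not the module itself.
sender = nn.Linear(4, 2)  # hypothetical model
buffer = io.BytesIO()
torch.save(sender.state_dict(), buffer)
payload = buffer.getvalue()  # bytes that could go to a file or another process

# "Receiver" side: build a fresh instance, then load the saved values.
receiver = nn.Linear(4, 2)
state = torch.load(io.BytesIO(payload))
receiver.load_state_dict(state)

copied_ok = torch.equal(sender.weight, receiver.weight)

# Changing the receiver afterwards does not touch the sender.
with torch.no_grad():
    receiver.weight.add_(1.0)
still_equal = torch.equal(sender.weight, receiver.weight)

print("copied:", copied_ok, "still equal after update:", still_equal)
```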