Are there any recommended methods to clone a model?

Royi · May 1, 2018, 6:13pm

What happens if I do this:

hNetModel = Model()
for trainBatch, trainLabels in hTrainLoader:
    <Train the Model by a Function>
    modelEvaluationMetric = hNetModel(Validation)
    if modelEvaluationMetric < bestModelMetric:
        hBestModel = hNetModel

Namely, I run the model through the optimization, and if its performance is the best so far, I set hBestModel = hNetModel.
At the end I save the state dictionary of hBestModel.
Does this make sense, or is it always just another reference to the same net?

albanD · May 2, 2018, 11:34am

It’s just a reference to the same net, so it will be changed when you keep optimizing.
You’ll need to use deepcopy as suggested.
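A minimal sketch of the difference, assuming a toy `nn.Linear` model and an in-place weight update standing in for an optimizer step (neither comes from the original post): plain assignment only creates a second name for the same network, while `copy.deepcopy` takes an independent snapshot.

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)          # toy stand-in for hNetModel (assumption)

alias = model                    # second name for the same object
snapshot = copy.deepcopy(model)  # independent copy of all parameters

# Simulate further training with an in-place parameter update.
with torch.no_grad():
    for p in model.parameters():
        p.add_(1.0)

# The alias follows the update; the snapshot keeps the old values.
alias_follows = torch.equal(alias.weight, model.weight)
snapshot_frozen = not torch.equal(snapshot.weight, model.weight)
print(alias is model, alias_follows, snapshot_frozen)
```

In the training loop from the question, this would mean writing `hBestModel = copy.deepcopy(hNetModel)` at the point where a new best metric is found.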

Royi · May 2, 2018, 4:20pm

Even if the training happens in a different function?
Something like:

for hNetModel in NetList:
    hNet = TrainNet(hNetModel)
    modelEvaluationMetric = hNetModel(Validation)
    if modelEvaluationMetric < bestModelMetric:
        hBestModel = hNet

I thought that when something is returned from a function, it is at least a different copy of it (yeah, I’m not so experienced with Python).
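For reference, Python passes and returns references to objects, so a function that trains its argument and returns it hands back the very same object. A small sketch in plain Python, where `Net` and `train_net` are hypothetical stand-ins for the model class and `TrainNet`:

```python
import copy


class Net:
    """Hypothetical stand-in for a model; holds a list of weights."""

    def __init__(self):
        self.weights = [0.0, 0.0]


def train_net(net):
    # Modifies the argument and returns the very same object (no copy).
    net.weights = [w + 1.0 for w in net.weights]
    return net


original = Net()
returned = train_net(original)
same_object = returned is original  # the "result" is the input itself

# To keep the input untouched, train an explicit deep copy instead.
fresh = Net()
trained_copy = train_net(copy.deepcopy(fresh))
fresh_untouched = fresh.weights == [0.0, 0.0]
```

The same holds for `torch.nn.Module` instances: unless the function makes a copy itself, the caller gets back the object it passed in.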

Brando_Miranda · May 2, 2018, 8:14pm

Just to make your answer clear, you mean:

new_mdl = copy.deepcopy(model)

right?

Brando_Miranda · May 2, 2018, 8:21pm

Why is deep copy not working for you? In what way does it behave differently from what you expected?

Brando_Miranda · May 2, 2018, 8:43pm

Does something inspired from:

or

not work for you?

X-czh · July 27, 2018, 6:47pm

Hi, copy.deepcopy(model) worked fine for me in previous PyTorch versions, but now that I’m migrating to version 0.4.0, it seems to break. It seems to have something to do with torch.device. How should I do cloning properly in version 0.4.0?

The traceback is as follows:
(Previously I ran
device = torch.device('cuda')
generator = Generator(args.vocab_size, g_embed_dim, g_hidden_dim, device).to(device)
and when I replace device with the string 'cuda', it works.)

Traceback (most recent call last):
  File "main.py", line 304, in <module>
    rollout = Rollout(generator, args.update_rate)
  File "/home/x-czh/SeqGAN-PyTorch/rollout.py", line 14, in __init__
    self.own_model = copy.deepcopy(model)
  File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib/python3.5/copy.py", line 297, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.5/copy.py", line 155, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.5/copy.py", line 243, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib/python3.5/copy.py", line 292, in _reconstruct
    y = callable(*args)
  File "/usr/lib/python3.5/copyreg.py", line 88, in __newobj__
    return cls.__new__(cls, *args)
TypeError: Device() received an invalid combination of arguments - got (), but expected one of:

(torch.device device)
(str type, int index)

kuzand · January 11, 2019, 2:08pm

Deepcopy is not working for me.

I have a function train(model) that returns the trained model: model_trained = train(model_untrained). As a result, however, both end up trained, and I want model_untrained to stay unchanged. So I tried to deep-copy model_untrained inside the function before the training loop, but it is not working: the model is not trained correctly. Any idea why this is happening?

ptrblck · January 12, 2019, 12:46am

Are you trying to train the copied or the original model?
In the first case I assume the optimizer doesn’t have the references to the appropriate parameters, thus probably no model is trained.
Could you check it?

kuzand · January 12, 2019, 9:05am

Yes I am training the copied model. You are right about the optimizer, I was passing the original model parameters to it. Thanks for spotting it!
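The fix described above can be sketched as follows. This is a hedged illustration rather than the poster's code: the `train` body, the random data, the SGD optimizer and the learning rate are all assumptions. The key point is that the optimizer is constructed from the parameters of the copy.

```python
import copy

import torch
import torch.nn as nn


def train(model, steps=5):
    """Train a deep copy of `model` and leave the argument untouched."""
    model = copy.deepcopy(model)
    # Build the optimizer from the copy's parameters, not the original's.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.randn(8, 3)  # random placeholder inputs (assumption)
    y = torch.randn(8, 1)  # random placeholder targets (assumption)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
    return model


torch.manual_seed(0)
model_untrained = nn.Linear(3, 1)
weight_before = model_untrained.weight.detach().clone()

model_trained = train(model_untrained)

original_unchanged = torch.equal(model_untrained.weight, weight_before)
copy_was_updated = not torch.equal(model_trained.weight, weight_before)
```

If the optimizer were instead given `model_untrained.parameters()`, its steps would act on the original's parameters, whose gradients are never populated by the copy's backward pass, so neither model would be trained.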

shivamsaboo17 · January 31, 2019, 6:35am

I was trying to copy a model whose forward function uses @torch.jit.script_method, so that I can load it later in C++.
But when I use deepcopy, it gives the error:
can't pickle BaseModel objects
where BaseModel is the class name of my model. The same code works correctly without the jit decorator. This could be something trivial, but I am unable to find a workaround.

52153d3b7bd2d9066500 · June 9, 2019, 11:23am

import pickle
copied_model = pickle.loads(pickle.dumps(model))

disreputableDog · August 9, 2019, 1:15pm

Can confirm, deepcopy does not work (changes to original still reflected in copy) but pickle does work.

yong_xu · February 1, 2020, 7:41am

classifier = pickle.loads(pickle.dumps(self.classifier))
TypeError: can't pickle module objects

Kamil_Wojcicki · February 21, 2020, 10:55pm

Using Adam’s suggestion:

threw:

TypeError: can't pickle dict_keys objects

for the model I am working with.

I am using python 3.7 and the model was trained on multiple GPUs.

Has anyone run into this issue with their models? Any ideas how to fix it?

Searching online, I found a similar issue with deepcopy (but not in the context of PyTorch):

Apparently, in Python 3 you have to wrap dict.keys() in list(); otherwise the deepcopy issue appears.

github.com/neuropycon/ephypype

deepcopy error when running workflow in python3

opened 03:27PM - 29 Dec 17 UTC

closed 12:08PM - 31 Dec 17 UTC

dmalt

Can't make nipype work under python3. I've created a simple pipeline to disentangle command-line code from nipype and ephypype. When I run the workflow I get the following error:

```
Traceback (most recent call last):
  File "test_cli.py", line 91, in <module>
    workflow.run(plugin='Linear')
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/site-packages/nipype/pipeline/engine/workflows.py", line 570, in run
    flatgraph = self._create_flat_graph()
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/site-packages/nipype/pipeline/engine/workflows.py", line 830, in _create_flat_graph
    workflowcopy = deepcopy(self)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 180, in deepcopy
    y = _reconstruct(x, memo, *rv)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 280, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 240, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 215, in _deepcopy_list
    append(deepcopy(a, memo))
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 150, in deepcopy
    y = copier(x, memo)
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 220, in _deepcopy_tuple
    y = [deepcopy(a, memo) for a in x]
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 220, in <listcomp>
    y = [deepcopy(a, memo) for a in x]
  File "/home/dmalt/Code/python/neuropycon/npyc3/lib/python3.6/copy.py", line 169, in deepcopy
    rv = reductor(4)
TypeError: can't pickle dict_keys objects
```

It seems to me that something goes wrong with looped links inside the workflow object. The error appears when nipype tries to deepcopy the workflow instance. In python2.7 the same works fine. I tried to google for similar problems but I got only one google groups issue like that without responses. Have you guys seen a problem like this?

Kamil_Wojcicki · February 24, 2020, 5:21pm

The answer turned out to be pretty simple. The instance attributes of your model have to be picklable. In my particular case, storing a dict_keys object caused the issue. Converting it to a list resolved the issue:

model.attribute = list(model.attribute)  # where attribute was dict_keys
model_clone = copy.deepcopy(model)
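The failure and the fix can be reproduced without PyTorch. A minimal sketch, where the `Model` class and its `attribute` contents are hypothetical stand-ins for the real model:

```python
import copy


class Model:
    """Hypothetical stand-in for a model that stores a dict view."""

    def __init__(self):
        table = {"conv": 1, "fc": 2}
        self.attribute = table.keys()  # dict_keys view, not picklable


model = Model()

# deepcopy falls back to the pickle protocol for dict_keys and fails.
try:
    copy.deepcopy(model)
    deepcopy_failed = False
except TypeError:
    deepcopy_failed = True

# Converting the view to a plain list makes the attribute copyable.
model.attribute = list(model.attribute)
model_clone = copy.deepcopy(model)
```

The same check applies to any other attribute stored on the module: if `copy.deepcopy` or `pickle` cannot handle its type, cloning the whole model fails.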

syomantak · July 13, 2020, 8:18am

If I just want to copy the state dict then would temp = model.state_dict() work or do I need deep copy for state_dict as well? I later keep training so would the temp variable change?

albanD · July 13, 2020, 1:44pm

Hi,

Yes, you need to deepcopy it if you want a deep copy.
If you just assign temp = model.state_dict(), the tensors in temp share memory with the model's parameters, so temp will change when you update the model.
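A short sketch of this behaviour, assuming a toy `nn.Linear` model and an in-place update standing in for further training (both are illustrative assumptions):

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)  # toy model (assumption)

shallow = model.state_dict()                 # tensors share memory with the model
frozen = copy.deepcopy(model.state_dict())   # independent tensors

# Simulate further training with an in-place parameter update.
with torch.no_grad():
    for p in model.parameters():
        p.add_(1.0)

# The plain state_dict tracks the update; the deep copy does not.
shallow_changed = torch.equal(shallow["weight"], model.weight)
frozen_kept = not torch.equal(frozen["weight"], model.weight)
```

The saved values can later be restored with `model.load_state_dict(frozen)`.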

Lin_Jia · August 29, 2020, 5:53pm

I use the PyTorch C++ interface and need to deep-copy modules. I think I am going to go this route: 1) dump one module to disk using torch save, 2) load the dumped file into a new module instance.

fulltopic · September 11, 2020, 2:58am

What are the corresponding methods in the C++ API?