Are there any recommended methods to clone a model?

I'm interested in cloning a model for various reasons (it makes it easy to stack untied versions of the same model, for instance). Are there any recommended methods for doing so?

7 Likes

How about copy.deepcopy(model)
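
For the stacking use case from the original question, a minimal sketch (nn.Linear is just a stand-in for whatever module you want to replicate):

import copy
import torch.nn as nn

base = nn.Linear(10, 10)  # stand-in for the model you want to clone
# Three independent, untied copies of the same architecture and weights:
stack = nn.Sequential(*[copy.deepcopy(base) for _ in range(3)])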

31 Likes

Thanks! Seems like a reasonable approach.

1 Like

Ended up going with something like:

import copy

model_clone = model_cls(**kwargs)
model_clone.load_state_dict(copy.deepcopy(original_model.state_dict()))

Works well for me!
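
For instance, with a hypothetical two-layer model class (the class and its arguments are illustrative, not from the thread):

import copy
import torch.nn as nn

class TwoLayerNet(nn.Module):  # hypothetical model class
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        return self.fc2(self.fc1(x).relu())

original_model = TwoLayerNet(8, 16, 2)
model_clone = TwoLayerNet(8, 16, 2)  # same constructor arguments
model_clone.load_state_dict(copy.deepcopy(original_model.state_dict()))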

4 Likes

No need to deepcopy the state dict: load_state_dict copies the values into the clone's own tensors rather than assigning references to the originals.
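
A quick sketch demonstrating that behavior (since values are copied, mutating the source afterwards leaves the clone untouched):

import torch
import torch.nn as nn

a = nn.Linear(2, 2)
b = nn.Linear(2, 2)
b.load_state_dict(a.state_dict())  # copies values into b's own tensors

with torch.no_grad():
    a.weight.zero_()  # mutate the source after loading
print(torch.equal(a.weight, b.weight))  # False: b kept the old values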

8 Likes

I have a situation where I am copying weights back and forth between two instances of the same model. Unfortunately, copy.deepcopy is not working for me. I am having to do:

# Manually copy parameter values from model_copy into model, in order:
mp = list(model.parameters())
mcp = list(model_copy.parameters())
n = len(mp)
for i in range(n):
    mp[i].data[:] = mcp[i].data[:]

While this is fine, I wonder why the deepcopy function is not working.
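
For reference, the same copy written more idiomatically (a sketch assuming both models share an identical architecture, so parameters pair up one-to-one):

import torch

with torch.no_grad():
    for p, p_src in zip(model.parameters(), model_copy.parameters()):
        p.copy_(p_src)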

3 Likes

Did you figure it out? I am having the same problem

I did not find a different way when I had this problem. But more recently, I used the copy_ operation to copy weights layer by layer from one model to another:

mydict = mymodel.state_dict()
layer_names = list(mydict)

# Now, to copy values for a particular layer using the name or index of it:

mydict[layer_names[index_of_layer]].copy_(some_data_with_matching_shape)

If there is a better way, I would be happy to learn.
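
One way to extend that idea to the whole model is to loop over all the state_dict keys (a sketch; source_model is an assumed second model with matching shapes):

import torch

src = source_model.state_dict()
dst = mymodel.state_dict()
with torch.no_grad():
    for name in dst:
        dst[name].copy_(src[name])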

2 Likes

What happens if I do this:

hNetModel = Model()
for trainBatch, trainLabels in hTrainLoader:
    <Train the Model by a Function>
    modelEvaluationMetric = hNetModel(Validation)
    if modelEvaluationMetric < bestModelMetric:
        hBestModel = hNetModel

Namely, I run the model through the optimization, and if its performance is the best so far I use hBestModel = hNetModel.
At the end I save the state dictionary of hBestModel.
Does it make sense, or is it just another reference to the same net?

It’s just a reference to the same net, so it will be changed when you keep optimizing.
You’ll need to use deepcopy as suggested.
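
A minimal sketch of that fix in a loop like the one above (train and evaluate are stand-ins for the training and validation steps):

import copy

bestModelMetric = float('inf')
hBestModel = None
for trainBatch, trainLabels in hTrainLoader:
    train(hNetModel)                             # stand-in training step
    modelEvaluationMetric = evaluate(hNetModel)  # stand-in validation step
    if modelEvaluationMetric < bestModelMetric:
        bestModelMetric = modelEvaluationMetric
        hBestModel = copy.deepcopy(hNetModel)    # snapshot, not a reference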

Even if the training happens in a different function?
Something like:

for hNet in NetList:
    hNet = TrainNet(hNetModel)
    modelEvaluationMetric = hNetModel(Validation)
    if modelEvaluationMetric < bestModelMetric:
        hBestModel = hNet

I thought that, at least when something comes back from a function, it is a different copy (yeah, I'm not so experienced with Python).

Just to make your answer clear, you mean:

new_mdl = copy.deepcopy(model)

right?

Why is deepcopy not working for you? In what way is it not working compared to what you expected?

Does something inspired by the approaches above not work for you?

Hi, copy.deepcopy(model) worked fine for me in previous PyTorch versions, but as I'm migrating to version 0.4.0 it seems to break. It seems to have something to do with torch.device. How should I do cloning properly in version 0.4.0?

The traceback is as follows:
(I run

device = torch.device('cuda')
generator = Generator(args.vocab_size, g_embed_dim, g_hidden_dim, device).to(device)

previously, and when I replace device with the string 'cuda', it works.)

Traceback (most recent call last):
  File "main.py", line 304, in <module>
    rollout = Rollout(generator, args.update_rate)
  File "/home/x-czh/SeqGAN-PyTorch/rollout.py", line 14, in __init__
    self.own_model = copy.deepcopy(model)
  File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib/python3.5/copy.py", line 297, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.5/copy.py", line 155, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.5/copy.py", line 243, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib/python3.5/copy.py", line 292, in _reconstruct
    y = callable(*args)
  File "/usr/lib/python3.5/copyreg.py", line 88, in __newobj__
    return cls.__new__(cls, *args)
TypeError: Device() received an invalid combination of arguments - got (), but expected one of:

  • (torch.device device)
  • (str type, int index)
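
For reference, the workaround described above as a sketch (Generator and args are from the post; the point is that the module only ever stores a plain string rather than a torch.device attribute, which 0.4.0's deepcopy cannot reconstruct):

import copy
import torch

device = torch.device('cuda')
# Pass the device as a string so the module stays deep-copyable in 0.4.0:
generator = Generator(args.vocab_size, g_embed_dim, g_hidden_dim, 'cuda').to(device)
generator_copy = copy.deepcopy(generator)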

Deepcopy is not working for me.

I have a function train(model) which returns the trained model, as in model_trained = train(model_untrained). However, both end up trained, and I want model_untrained to remain unchanged. So I tried to deep-copy model_untrained inside the function, before the training loop, but it is not working: the model is not trained correctly. Any idea why this is happening?

Are you trying to train the copied or the original model?
In the first case, I assume the optimizer doesn't hold references to the appropriate parameters, so probably no model is trained at all.
Could you check it?
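
A minimal sketch of the correct wiring (optim.SGD and the learning rate are placeholders):

import copy
import torch.optim as optim

model_trained = copy.deepcopy(model_untrained)
# The optimizer must get the *copy's* parameters; handing it
# model_untrained.parameters() would keep updating the original.
optimizer = optim.SGD(model_trained.parameters(), lr=0.01)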

2 Likes

Yes I am training the copied model. You are right about the optimizer, I was passing the original model parameters to it. Thanks for spotting it!

1 Like

I was trying to copy a model whose forward function uses @torch.jit.script_method so that I can load it later in C++.
But when I use deepcopy I get the error:
can't pickle BaseModel objects
where BaseModel is the class name of my model. The same code works correctly without the jit decorator. This could be something trivial, but I am unable to find a workaround.

import pickle
copied_model = pickle.loads(pickle.dumps(model))
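
A quick sanity check that the round-trip produced an independent copy (a sketch assuming model is an ordinary nn.Module):

import torch

for p, q in zip(model.parameters(), copied_model.parameters()):
    assert p is not q and torch.equal(p, q)  # same values, separate tensors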

:grinning:

7 Likes