Are there any recommended methods to clone a model?


(Andrew Drozdov) #1

I’m interested in cloning a model for various reasons (it makes it easy to stack untied versions of the same model, for instance). Are there any recommended methods for doing so?


(Adam Paszke) #2

How about copy.deepcopy(model)?
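
A minimal sketch of what that looks like (nn.Linear here is just a stand-in for any module):

import copy

import torch
import torch.nn as nn

model = nn.Linear(10, 5)        # stand-in for any nn.Module
clone = copy.deepcopy(model)    # parameters and buffers are duplicated

# The clone is fully independent: mutating its weights leaves the original intact
with torch.no_grad():
    clone.weight.zero_()
print(torch.equal(model.weight, clone.weight))  # False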


(Andrew Drozdov) #3

Thanks! Seems like a reasonable approach.


(Andrew Drozdov) #4

Ended up going with something like:

import copy

# Re-instantiate the model, then copy the weights over
model_clone = model_cls(**kwargs)
model_clone.load_state_dict(copy.deepcopy(original_model.state_dict()))

Works well for me!


(Adam Paszke) #5

No need to deepcopy the state dict. load_state_dict doesn’t assign its contents; it copies them into the clone’s own tensors.
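
In other words, this is enough (a sketch, reusing model_cls and kwargs from the post above):

model_clone = model_cls(**kwargs)
model_clone.load_state_dict(original_model.state_dict())  # values are copied, not aliased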


(RK) #6

I have a situation where I am copying weights back and forth between two instances of the same model. Unfortunately, copy.deepcopy is not working for me. I am having to do:

# Manually copy each parameter's values from model_copy into model
mp = list(model.parameters())
mcp = list(model_copy.parameters())
n = len(mp)
for i in range(n):
    mp[i].data[:] = mcp[i].data[:]

While this is fine, I wonder why the deepcopy function is not working.
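
For what it’s worth, a slightly more idiomatic version of that loop (a sketch, assuming both models share the same architecture):

import torch

# Copy each parameter of model_copy into the corresponding parameter of model, in place
with torch.no_grad():
    for p, p_copy in zip(model.parameters(), model_copy.parameters()):
        p.copy_(p_copy)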


(David Martinez) #7

Did you figure it out? I am having the same problem.


(RK) #8

I did not find a different way when I had this problem. But more recently, I used the copy_ operation to copy weights layer by layer from one model to another:

mydict = mymodel.state_dict()
layer_names = list(mydict)

# Now, to copy values for a particular layer using the name or index of it:

mydict[layer_names[index_of_layer]].copy_(some_data_with_matching_shape)
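
To copy every entry at once, the built-in equivalent should be (a sketch; model_a and model_b are hypothetical source and destination models with the same architecture):

model_b.load_state_dict(model_a.state_dict())  # copies all matching tensors in one call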

If there is a better way, I would be happy to learn.


(Royi) #9

What happens if I do this:

hNetModel = Model()
for trainBatch, trainLabels in hTrainLoader:
    # <Train the Model by a Function>
    modelEvaluationMetric = hNetModel(Validation)
    if modelEvaluationMetric < bestModelMetric:
        hBestModel = hNetModel

Namely, I run the model through the optimization, and if its performance is the best so far I use hBestModel = hNetModel.
At the end I save the state dict of hBestModel.
Does this make sense, or is hBestModel always just another reference to the same net?


(Alban D) #10

It’s just a reference to the same net, so it will be changed when you keep optimizing.
You’ll need to use deepcopy as suggested.
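
Concretely, the snapshot line in the snippet above would become something like:

import copy

if modelEvaluationMetric < bestModelMetric:
    hBestModel = copy.deepcopy(hNetModel)  # independent snapshot, unaffected by further training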


(Royi) #11

Even if the training happens in a different function?
Something like:

for hNet in NetList:
    hNet = TrainNet(hNet)
    modelEvaluationMetric = hNet(Validation)
    if modelEvaluationMetric < bestModelMetric:
        hBestModel = hNet

I thought that at least when something is returned from a function, it comes back as a different copy (yeah, I’m not so experienced with Python).
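
A quick way to check this (a toy sketch; Net and TrainNet here are stand-ins):

class Net:
    pass

def TrainNet(net):
    net.trained = True   # mutates the object in place
    return net           # returns a reference to the very same object

hNetModel = Net()
hNet = TrainNet(hNetModel)
print(hNet is hNetModel)  # True: the same object comes back, no copy is made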


(MirandaAgent) #12

Just to make your answer clear, you mean:

new_mdl = copy.deepcopy(model)

right?


(MirandaAgent) #13

Why is deep copy not working for you? In what way is it not working, compared to what you expected?


(MirandaAgent) #14

Does something inspired by one of the linked threads not work for you?


(Zhanghao Chen) #15

Hi, copy.deepcopy(model) works fine for me in previous PyTorch versions, but as I’m migrating to version 0.4.0, it seems to break. It seems to have something to do with torch.device. How should I do cloning properly in version 0.4.0?

The traceback is as follows:
(I previously ran

device = torch.device('cuda')
generator = Generator(args.vocab_size, g_embed_dim, g_hidden_dim, device).to(device)

and when I replace device with the string 'cuda', it works.)

Traceback (most recent call last):
  File "main.py", line 304, in <module>
    rollout = Rollout(generator, args.update_rate)
  File "/home/x-czh/SeqGAN-PyTorch/rollout.py", line 14, in __init__
    self.own_model = copy.deepcopy(model)
  File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib/python3.5/copy.py", line 297, in _reconstruct
    state = deepcopy(state, memo)
  File "/usr/lib/python3.5/copy.py", line 155, in deepcopy
    y = copier(x, memo)
  File "/usr/lib/python3.5/copy.py", line 243, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/usr/lib/python3.5/copy.py", line 182, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/usr/lib/python3.5/copy.py", line 292, in _reconstruct
    y = callable(*args)
  File "/usr/lib/python3.5/copyreg.py", line 88, in __newobj__
    return cls.__new__(cls, *args)
TypeError: Device() received an invalid combination of arguments - got (), but expected one of:

  • (torch.device device)
  • (str type, int index)
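
For reference, the variant that does work for me is passing the device as a plain string:

generator = Generator(args.vocab_size, g_embed_dim, g_hidden_dim, 'cuda').to('cuda')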

(Andreas) #16

Deepcopy is not working for me.

I have a function train(model) which returns the trained model: model_trained = train(model_untrained). However, as a result both are trained at the end, but I want model_untrained to stay unchanged. So I tried to deep-copy model_untrained inside the function before the training loop, but it is not working: the model is not trained correctly. Any idea why this is happening?


#17

Are you trying to train the copied or the original model?
In the first case, I assume the optimizer doesn’t hold references to the appropriate (copied) parameters, thus probably no model is trained.
Could you check that?
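
For comparison, a pattern that should work looks something like this (a minimal sketch; the SGD optimizer, MSE loss, and toy loop are just stand-ins):

import copy

import torch
import torch.nn.functional as F

def train(model_untrained, data, target):
    model = copy.deepcopy(model_untrained)  # train a copy, leave the original untouched
    # The optimizer must be built from the *copy's* parameters, not the original's
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(10):                     # toy training loop
        optimizer.zero_grad()
        loss = F.mse_loss(model(data), target)
        loss.backward()
        optimizer.step()
    return model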


(Andreas) #18

Yes, I am training the copied model. You are right about the optimizer: I was passing the original model’s parameters to it. Thanks for spotting it!


(Shivam Saboo) #19

I was trying to copy a model whose forward function uses @torch.jit.script_method so that I can load it later in C++.
But when I use deepcopy it gives the error:
can't pickle BaseModel objects
where BaseModel is the class name of my model. The same code works correctly without the JIT decorator. This could be something trivial, but I am unable to find a workaround.