Hi,
How can I correctly create multiple copies of a model and backpropagate the loss through each of them? I have a model
and simply did the following:
model1 = model
model2 = model
model3 = model
Firstly, is this correct or should I use deepcopy (or how is deepcopy different from the above)?
Secondly, after I create/copy model1, model2, and model3, when I execute the line:
model.to(device)
I notice that all four models are moved to device. So if I move model to the GPU, the other 3 models are also moved to the GPU, and the same happens for the CPU. Why is that?
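A minimal sketch that reproduces this behavior, assuming a small nn.Linear as a hypothetical stand-in for the actual model (the layer sizes are placeholders). It compares plain assignment, which only binds another name to the same module object, with copy.deepcopy, which builds a separate module with its own parameters:

```python
import copy

import torch
import torch.nn as nn

# Hypothetical stand-in for the real model; sizes are placeholders.
model = nn.Linear(4, 2)

# Plain assignment binds a second name to the same module object.
model1 = model
same_object = model1 is model

# copy.deepcopy builds a separate module with its own parameter tensors.
model2 = copy.deepcopy(model)
separate_object = model2 is not model

# An in-place change made through one name is visible through every alias,
# which is consistent with model.to(device) appearing to move model1 too.
with torch.no_grad():
    model.weight.fill_(1.0)
alias_changed = bool(torch.all(model1.weight == 1.0))
copy_changed = bool(torch.all(model2.weight == 1.0))

print(same_object, separate_object, alias_changed, copy_changed)
```

In this sketch only the alias follows the change; the deep copy keeps the weights it had when it was copied.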
Thirdly, I try to delete model (or the copied models) by calling del model, but it is still not deleted and 'model' in locals() returns True (the same True is returned for the other models even when I delete them via del). Any explanation for this behavior?
Lastly, and importantly, I want to compute the loss of each model independently. Something like:
loss1 = criterion(outputs1, labels)
loss2 = criterion(outputs2, labels)
loss3 = criterion(outputs3, labels)
How can I aggregate the loss (is loss = loss1 + loss2 + loss3 okay?) such that this total loss is backpropagated to all the models? Also, how should I call the _.backward() and _.step() functions in this case?
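For concreteness, here is a sketch of the kind of training step I have in mind. It is not a confirmed answer, and it rests on several assumptions: the copies are made with copy.deepcopy so they are independent, a single SGD optimizer holds the parameters of all three copies, and nn.Linear plus the random inputs and labels are placeholders for the real model and data:

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the real model; sizes are placeholders.
base = nn.Linear(4, 3)
model1, model2, model3 = (copy.deepcopy(base) for _ in range(3))

criterion = nn.CrossEntropyLoss()

# Assumption: one optimizer over the parameters of all three copies.
params = [p for m in (model1, model2, model3) for p in m.parameters()]
optimizer = torch.optim.SGD(params, lr=0.1)

# Placeholder batch: 8 samples, 4 features, 3 classes.
inputs = torch.randn(8, 4)
labels = torch.randint(0, 3, (8,))

optimizer.zero_grad()
loss1 = criterion(model1(inputs), labels)
loss2 = criterion(model2(inputs), labels)
loss3 = criterion(model3(inputs), labels)

# Each loss depends only on its own model's parameters, so backward() on
# the sum gives each model the gradient of its own loss.
loss = loss1 + loss2 + loss3
loss.backward()
optimizer.step()
```

With separate optimizers per model, the same sum could presumably be backpropagated once and each optimizer's step() called afterwards.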
Any help on this will be much appreciated.