I’m trying to dynamically keep adding new modules to my network after every N steps, as in Progressive Growing of GANs.
But I came across this info about PyTorch from @Carl here:
Dynamically adding/removing modules is relatively easy in Tensorflow/Keras, since a graph of the model is available on the python side. In PyTorch, you cannot traverse the graph of your model to insert modules.
Dynamically adding new params to the optimizer seems to be harder than I thought in PyTorch. I did find a solution to do it here, but the author of that post himself doesn’t recommend using it:
However, here’s the reason why you should probably never update your optimizer like this, but should instead re-initialize from scratch, and just accept the loss of state information
So - to sum it up: I’d really recommend to try to keep it simple, and to only change a parameter as conservatively as possible, and not to touch the optimizer.
I tried re-initializing the optimizer from scratch whenever a new module is added to the network, and I saw a huge drop in validation performance which I can’t sacrifice. Is there a better way to solve this problem?
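For reference, this is roughly what I do at the moment (a rough sketch with placeholder names, not my exact code); re-creating Adam like this throws away all of its running averages for the parameters that were already being trained:

model = MyModel()                                          # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# ... train for N steps, then a new block gets attached to the model ...
model.convblock1 = new_block                               # placeholder new block
# re-initialize the optimizer from scratch over all current parameters
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# -> the Adam state (exp_avg, exp_avg_sq, step count) for the old parameters is gone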
You can use model.add_module, but you would also need to change the forward method or use this new module manually.
optimizer.add_param_group can be used to add new parameters.
This might work, but I would recommend double checking that this layer is really used (e.g. via forward hooks) and making sure the parameters are added to the optimizer.
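Something like this rough sketch shows what I mean (assuming you already have a model and optimizer plus the usual import torch / import torch.nn as nn; the block, names, and shapes below are just placeholders):

new_block = nn.Conv2d(64, 64, kernel_size=3, padding=1)   # placeholder block
model.add_module("convblock1", new_block)                  # registers it on the model

# check via a forward hook that the new layer is actually executed
handle = model.convblock1.register_forward_hook(
    lambda mod, inp, out: print("convblock1 used, output shape:", out.shape))
_ = model(torch.randn(1, 64, 32, 32))                      # dummy forward pass
handle.remove()

# add the new parameters to the existing optimizer
optimizer.add_param_group({"params": model.convblock1.parameters()})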
Thank you. This is how I want to add new blocks to the network dynamically. Can I use add_module as shown below?
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # plain Python list of pre-built blocks; named self.blocks so it doesn't
        # shadow nn.Module.modules()
        self.blocks = [convblock1, convblock2, convblock3]  # some list of modules
        self.convblock0 = self.conv(...)
        self.add_module("convblock1", self.blocks[0])  # this doesn't look like the recommended way to use it
        self.add_module("convblock2", self.blocks[1])
        self.add_module("convblock3", self.blocks[2])

    def forward(self, x, epoch_num):
        x = self.convblock0(x)
        if epoch_num >= 2:
            x = self.convblock1(x)
        if epoch_num >= 4:
            x = self.convblock2(x)
        if epoch_num >= 6:
            x = self.convblock3(x)
        return x
OR
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.blocks = [convblock1, convblock2, convblock3]  # some list of modules
        self.convblock0 = self.conv(...)

    def forward(self, x, epoch_num):
        x = self.convblock0(x)
        if epoch_num == 2:  # can I use add_module inside forward?
            self.add_module("convblock1", self.blocks[0])
        if epoch_num >= 2:
            x = self.convblock1(x)
        if epoch_num == 4:
            self.add_module("convblock2", self.blocks[1])
        if epoch_num >= 4:
            x = self.convblock2(x)
        if epoch_num == 6:
            self.add_module("convblock3", self.blocks[2])
        if epoch_num >= 6:
            x = self.convblock3(x)
        return x
Is this how I should update the optimizer after adding a new module to the model?
model = MyModel()
optimizer = torch.optim.Adam(model.convblock0.parameters(), learning_rate)

for epoch in range(10):
    if epoch == 2:
        new_par_dict = dict()
        new_par = model.blocks[0].parameters()
        new_par_dict["params"] = new_par
        optimizer.add_param_group(new_par_dict)
    if epoch == 4:
        new_par_dict = dict()
        new_par = model.blocks[1].parameters()
        new_par_dict["params"] = new_par
        optimizer.add_param_group(new_par_dict)
    if epoch == 6:
        new_par_dict = dict()
        new_par = model.blocks[2].parameters()
        new_par_dict["params"] = new_par
        optimizer.add_param_group(new_par_dict)

    output = model(input, epoch)
    loss = loss_fun(output)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
But add_param_group expects a dict. Is there a faster way to get the param group dict of a specific layer? And also, does this preserve the previous state of the optimizer?