Allow subset of layer weights to be updated

I have a network that, after a few conv layers, splits into 3 heads: self.fc_pi, self.fc_q and self.fc_beta. I have frozen all of the conv layer weights and am allowing updates only to self.fc_q and self.fc_beta by adding only their parameters to the optimizer:

self.optimizer = config.optimizer_fn(self.network.fc_q.parameters())
self.optimizer.add_param_group({'params': self.network.fc_beta.parameters()})

self.fc_pi is a linear layer connecting 200 neurons to 16 neurons, so print(list(self.network.fc_pi.parameters())[0].shape) gives [16, 200] and print(list(self.network.fc_pi.parameters())[1].shape) gives [16], which are the layer's weights and biases respectively.

I would like to add a subset of these weights and biases to the optimizer so that only the weights connected to the first 4 (out of the 16 neurons), and their biases, are updated. I have tried this:

self.optimizer.add_param_group({'params': list(self.network.fc_pi.parameters())[0][0:4]})
self.optimizer.add_param_group({'params': list(self.network.fc_pi.parameters())[1][0:4]})

but this raises the error:

ValueError: can't optimize a non-leaf Tensor

How can I correctly add a subset of the layer's weights and biases to the optimizer so that only that subset is updated, and not all the weights in the layer?

After the backward call, and before the optimizer update, you can access the gradient tensor (param.grad) and assign zero to the parts that you want to freeze/not update. This way those weights are left unchanged after the optim.step call.
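A minimal sketch of that idea, assuming the full fc_pi weight and bias (which are leaf tensors) are added to the optimizer, and that loss stands for whatever scalar loss you compute in your training loop:

self.optimizer.add_param_group({'params': self.network.fc_pi.parameters()})

loss.backward()

# Zero the gradients for everything except the first 4 output neurons,
# so optimizer.step() leaves rows 4..15 of the [16, 200] weight and
# entries 4..15 of the [16] bias untouched.
self.network.fc_pi.weight.grad[4:] = 0
self.network.fc_pi.bias.grad[4:] = 0

self.optimizer.step()

Note that with plain SGD and no weight decay the zero-gradient rows stay exactly the same; optimizers that apply weight decay or keep momentum state (e.g. Adam) may still move them slightly, so you may prefer to copy the frozen rows back after the step in that case.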
