Hi,
I am trying to speed up the learning process using federated reinforcement learning.
In this case, I want to share the average gradient with all users and update the weights using this average value.
I update the weights manually with the following code:
```python
with torch.no_grad():
    data_input = model.input_layer.weight.data + learning_rate * average_gradient.data
    model.input_layer.weight = torch.nn.Parameter(data_input)
```
I assumed that wrapping this in torch.no_grad() would ensure that the operation has no influence on the weight update.
However, it seems that the gradient (.grad) is set to None after reassigning model.input_layer.weight = torch.nn.Parameter(data_input). Why does this happen, and how can I avoid it?
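For context, a minimal sketch of an in-place variant that modifies the existing Parameter's data instead of replacing the Parameter object, so .grad (and any optimizer references to the tensor) stay intact. The model layout and names here are assumptions made to mirror the snippet above:

```python
import torch

# Hypothetical minimal model mirroring the snippet above (names assumed).
model = torch.nn.Sequential()
model.input_layer = torch.nn.Linear(4, 2)

learning_rate = 0.01
# Stand-in for the averaged federated gradient (same shape as the weight).
average_gradient = torch.ones_like(model.input_layer.weight)

# Populate .grad so we can check whether it survives the update.
model.input_layer.weight.grad = average_gradient.clone()

with torch.no_grad():
    # In-place add_ keeps the same Parameter object, so .grad is preserved
    # and an optimizer holding a reference to this tensor still tracks it.
    model.input_layer.weight.add_(learning_rate * average_gradient)

print(model.input_layer.weight.grad is None)  # False: .grad survives
```

Reassigning the attribute with `torch.nn.Parameter(data_input)` creates a brand-new tensor whose `.grad` is None, which would explain the behavior described above; the in-place update avoids creating a new Parameter.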