Consider the following code:
import torch
import torch.nn as nn

n_inputs = 2
n_outputs = 2
model = nn.Sequential(
    nn.Linear(n_inputs, 32),
    nn.ReLU(inplace=True),
    nn.Linear(32, 64),
    nn.ReLU(inplace=True),
    nn.Linear(64, 32),
    nn.ReLU(inplace=True),
    nn.Linear(32, n_outputs),
    nn.Tanh(),
)
x = torch.randn(1, n_inputs)  # dummy input
output = model(x)
print(output.grad_fn)

This rightly prints: <TanhBackward0 at 0x...>
But now when I use some other function F (which could itself be a neural network) to generate weights for the model and assign them with load_state_dict, or any other in-place weight assignment, the gradient flow is broken. For example:
import numpy as np
from collections import OrderedDict

new_weights = F(some_input)  # flat tensor holding one value per model parameter

new_state_dict = OrderedDict()
start = end = 0
for name, layer in model.state_dict().items():
    end += np.prod(layer.shape)
    new_state_dict[name] = new_weights[start:end].view(layer.shape)
    start = end
model.load_state_dict(new_state_dict)
output = model(x)
print(output.grad_fn)

This now prints: None
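(For reference, F above could be any differentiable weight generator. Here is a minimal sketch of one, assuming a small hypernetwork whose flat output has one entry per parameter of the model; the input size, hidden size, and some_input are placeholders of my choosing, not anything specific:)

# Hypothetical weight generator: maps a conditioning vector to one flat
# parameter vector for the 2-32-64-32-2 model above.
n_params = sum(p.numel() for p in model.parameters())  # 4354 for the model above

F = nn.Sequential(
    nn.Linear(8, 128),
    nn.ReLU(inplace=True),
    nn.Linear(128, n_params),
)

some_input = torch.randn(1, 8)            # placeholder conditioning input
new_weights = F(some_input).squeeze(0)    # flat tensor of length n_params, still in the graph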
Maybe load_state_dict is not part of autograd, which is understandable, but I cannot think of an equivalent way to set the weights manually through another gradient-requiring process. I can do it for a simple neural network by explicitly defining the layers, but that still doesn't scale.
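(What I mean by explicitly defining the layers is something along these lines; a minimal sketch using torch.nn.functional, assuming the flat new_weights from the sketch above, with the slice boundaries hard-coded for the 2-32-64-32-2 model, which is exactly the bookkeeping that does not scale:)

import torch.nn.functional as Fn  # aliased to avoid clashing with the weight generator F

def forward_with_generated_weights(x, w):
    # Slice the flat weight vector by hand, mirroring the model's state_dict
    # order (0.weight, 0.bias, 2.weight, 2.bias, ...).
    h = Fn.relu(Fn.linear(x, w[0:64].view(32, 2), w[64:96]))
    h = Fn.relu(Fn.linear(h, w[96:2144].view(64, 32), w[2144:2208]))
    h = Fn.relu(Fn.linear(h, w[2208:4256].view(32, 64), w[4256:4288]))
    return torch.tanh(Fn.linear(h, w[4288:4352].view(2, 32), w[4352:4354]))

output = forward_with_generated_weights(x, new_weights)
print(output.grad_fn)  # <TanhBackward0 ...> -- gradients flow back into F

This keeps the graph intact because the generated tensors are used directly in the forward pass instead of being copied into the module's parameters, but the slicing has to be maintained by hand for every architecture.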
Any help? Thanks