Consider the following code:
```python
import torch
import torch.nn as nn

n_inputs = 2
n_outputs = 2
model = nn.Sequential(
    nn.Linear(n_inputs, 32),
    nn.ReLU(inplace=True),
    nn.Linear(32, 64),
    nn.ReLU(inplace=True),
    nn.Linear(64, 32),
    nn.ReLU(inplace=True),
    nn.Linear(32, n_outputs),
    nn.Tanh(),
)

x = torch.randn(1, n_inputs)
output = model(x)
print(output.grad_fn)
```
<TanhBackward0 at 0x...>
But now, when I use some other function F (which could itself be a neural network) to generate the weights for the model and assign them with load_state_dict (or any other in-place weight assignment), the gradient flow is broken. For example:
```python
import numpy as np
from collections import OrderedDict

new_weights = F(some_input)  # flat, grad-requiring tensor of generated weights

new_state_dict = OrderedDict()
start = end = 0
for name, layer in model.state_dict().items():
    end += np.prod(layer.shape)
    new_state_dict[name] = new_weights[start:end].view(layer.shape)
    start = end

model.load_state_dict(new_state_dict)
output = model(x)
output.grad_fn  # backward() from here no longer reaches F's parameters
```
Maybe load_state_dict is not part of autograd, which is understandable, but I cannot think of an equivalent way to set the weights manually through another gradient-requiring process. I can do it for a simple neural network by explicitly defining the layers (see the sketch below), but that doesn't scale.
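For concreteness, this is the kind of explicit version I mean, as a minimal sketch for a hypothetical 2 -> 4 -> 2 network (the shapes, slice offsets, and the manual_forward name are just for illustration):

```python
import torch
import torch.nn.functional as nnF  # aliased to avoid clashing with the weight-generating F

def manual_forward(x, new_weights):
    # Slice the flat, grad-requiring weight vector by hand for a 2 -> 4 -> 2 net.
    w1 = new_weights[0:8].view(4, 2)    # first Linear: 4*2 weights
    b1 = new_weights[8:12]              # first Linear: 4 biases
    w2 = new_weights[12:20].view(2, 4)  # second Linear: 2*4 weights
    b2 = new_weights[20:22]             # second Linear: 2 biases
    h = torch.relu(nnF.linear(x, w1, b1))
    return torch.tanh(nnF.linear(h, w2, b2))

# output = manual_forward(x, F(some_input)) keeps the graph intact,
# since the generated weights are used directly instead of copied in.
```

This keeps grad_fn connected back through F, but writing out the slicing and functional calls for every layer of a larger model is exactly what doesn't scale.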
Any help? Thanks