I have a similar doubt. What I have found so far is that nn.Parameter tensors are LEAF nodes with no history, and the gradients you typically read from them (via loss.backward() and param.grad) are plain tensors that are detached from any graph. So the gradient descent (GD) update you perform in the inner loop (adapted_params[key] = val - meta_step_size * grad) carries no differentiable history through grad, and the outer (meta) loss cannot backpropagate through the inner-loop update.
The following may help you understand in more detail why updating model parameters the way you have won’t work in PyTorch:
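Here is a minimal sketch (toy single-weight "model", hypothetical names like `meta_step_size`) contrasting the naive update built from `param.grad` with a differentiable update obtained via `torch.autograd.grad(..., create_graph=True)`; only the latter records the inner GD step in the graph:

```python
import torch

torch.manual_seed(0)
w = torch.nn.Parameter(torch.randn(3))  # leaf node, no history
x = torch.randn(3)
meta_step_size = 0.1  # hypothetical inner-loop learning rate

# Naive approach: backward() fills w.grad, but w.grad is a detached
# tensor -- autograd treats it as a constant, so no second-order
# gradient can flow through the inner update built from it.
loss = ((w * x).sum()) ** 2
loss.backward()
adapted_naive = w - meta_step_size * w.grad
assert w.grad.grad_fn is None  # grad has no history

# Differentiable approach: request the gradient explicitly with
# create_graph=True, so the gradient itself has a grad_fn and the
# update (w - lr * g) is recorded in the computation graph.
w.grad = None
loss = ((w * x).sum()) ** 2
(g,) = torch.autograd.grad(loss, w, create_graph=True)
adapted = w - meta_step_size * g  # this op IS in the graph

# The outer (meta) loss evaluated at the adapted weights can now
# backpropagate all the way to the original parameter w.
outer_loss = ((adapted * x).sum()) ** 2
outer_loss.backward()
assert g.grad_fn is not None
assert w.grad is not None
```

Note that `adapted` is an ordinary tensor, not an nn.Parameter, which is exactly why workarounds such as MetaModule or the `higher` package re-route the forward pass through such non-leaf "fast weights" instead of assigning them back into the module.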
In fact, you may want to use the `higher` package or refer to the following code for a workaround (look for MetaModule in the repositories):
- AdrienLE (Adrien Ecoffet) · GitHub
- https://github.com/xjtushujun/meta-weight-net/blob/bd1fd3e297c59df7c5264fb636c67d8ee03bcf0d/resnet.py#L240
I hope that helps you understand the problem with your method.