Hi, I am working on a problem where I need to cache the model parameters (weights) from the last k iterations. In the next iteration, the model needs to use parameters randomly picked from this cache to compute the gradients.

I tried the following.

```
import copy
from collections import deque

import torch

model = torch.nn.Sequential(
    torch.nn.Linear(1000, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 10),
)

queue = deque(maxlen=5)  # keep the last k=5 snapshots
# Deep-copy the state_dict: the parameters themselves are live
# references, so storing them directly would not freeze their values.
queue.append(copy.deepcopy(model.state_dict()))
delayed_params = queue.popleft()
```

However, I am unable to make the `model` use `delayed_params` for computing the gradients. Is there any way to solve this?
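For concreteness, here is a minimal sketch of the behavior I am after. This is just my own attempt, assuming that copying a cached `state_dict` back into the model via `load_state_dict` is an acceptable way to make the forward pass use the delayed weights (the cache size `k` and the dummy data `x`/`target` are placeholders):

```python
import copy
import random
from collections import deque

import torch

model = torch.nn.Sequential(
    torch.nn.Linear(1000, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 10),
)

k = 5
cache = deque(maxlen=k)  # holds snapshots from the last k iterations

x = torch.randn(8, 1000)       # dummy input
target = torch.randn(8, 10)    # dummy target
loss_fn = torch.nn.MSELoss()

for step in range(10):
    # Snapshot the current weights (deep copy, so later updates do not alias).
    cache.append(copy.deepcopy(model.state_dict()))

    # Load a randomly picked cached snapshot back into the model,
    # then compute gradients with those delayed weights.
    model.load_state_dict(random.choice(cache))
    loss = loss_fn(model(x), target)
    model.zero_grad()
    loss.backward()
    # ... optimizer step would go here ...
```

I am not sure whether swapping weights with `load_state_dict` on every step is the intended mechanism, or whether there is a cleaner way to evaluate the model under a different set of parameters.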