I am just a little unclear on the specific details of how autograd works here: specifically, whether the gradients with respect to the model weights are stored within the model's parameters or within the input/output Variable objects themselves.
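To make the question concrete, here is a minimal snippet of what I mean (the `Linear` model and the `sum()` "loss" are just placeholders):

```python
import torch
from torch.autograd import Variable

model = torch.nn.Linear(4, 2)      # stand-in model, just for illustration
x = Variable(torch.randn(1, 4))
out = model(x)
l = out.sum()                      # stand-in scalar "loss"
l.backward()

print(model.weight.grad)  # does the gradient live here, on the parameter...
print(out.grad)           # ...or here, on the output Variable? (this is what I'm unsure about)
```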
I am trying to train a model for use in a reinforcement learning task, and to do so I need to run the forward pass many times for different inputs before I get the reward values needed for the backward pass. I am wondering whether I can just do something like the following:
```python
# collect outputs from many forward passes
Y = Variable(torch.zeros(N))
for i in range(N):
    Y[i] = model(X[i])

# later, once the rewards are known, do the backward passes
for i in range(N):
    optim.zero_grad()
    l = loss(Y[i], r)
    l.backward()
    optim.step()
```
or whether I need to do something more sophisticated.