I am, for whatever reason, attempting to convert a simple
pytorch-1.0.1 training script to pytorch version 0.3.0.
As far as I can tell,
torch.no_grad() doesn’t exist in 0.3.0.
My use case is that (in the training loop) I run the model
(predict, based on my input data), and then do some sanity
checks and collect some statistics. I wrap these latter two
in a with torch.no_grad(): block. I don’t really understand
what I am doing (but it works …), but I suppose I avoid
cluttering up my graph with cruft (efficiency) and/or changing
the results of calling backward().
What should I do in 0.3.0? Can I just leave out the
with torch.no_grad(): block (because my additional calculations
come after calculating the loss, so the gradients won’t be
affected)? Should I clone / detach any tensors I use in my
sanity / statistics calculations? Is there something in 0.3.0
analogous to no_grad() that I should be using (but with a
different name or semantics)?
Thanks for any help.
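For context, a minimal sketch of the 1.0.1 pattern being described, with a hypothetical model, loss, and statistics (all of these names are placeholders, not the actual script):

```python
import torch
import torch.nn as nn

# hypothetical stand-ins for the real model, loss, and data
model = nn.Linear(4, 2)
criterion = nn.MSELoss()
inputs = torch.randn(8, 4)
targets = torch.randn(8, 2)

preds = model(inputs)              # forward pass: the graph is built here
loss = criterion(preds, targets)
loss.backward()                    # gradients are computed from the graph

# sanity checks / statistics: no_grad() keeps these extra ops out of
# the autograd graph, so they add no graph bookkeeping and cannot
# affect the gradients
with torch.no_grad():
    mean_pred = preds.mean().item()
    max_err = (preds - targets).abs().max().item()
```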
In 0.3 you would use
Variable(torch.Tensor(...), volatile=True) to avoid creating the computation graph.
Let me know if that works for you.
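A minimal 0.3.0-style sketch of that suggestion (the tensor contents and shapes here are placeholders):

```python
import torch
from torch.autograd import Variable

# In 0.3.0, tensors and Variables are separate types; wrapping a tensor
# with volatile=True tells autograd not to build a graph for anything
# computed from it (volatile "infects" all downstream results).
x = Variable(torch.randn(8, 4), volatile=True)

# these operations are excluded from graph construction
y = x * 2 + 1
```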
Thank you for your reply.
To clarify in my concrete case:
The output of running the forward pass, for training, of my model
is a torch.autograd.variable.Variable. Should I retrieve its
FloatTensor by using its
data property, wrap that tensor in a
Variable, a la:
torch.autograd.variable.Variable(preds.data, volatile=True)
and then use this volatile
Variable for any additional calculations?
This workflow would detach the output from the model, so all
following operations are kept out of the computation graph.
preds would still hold on to the computation graph, while
preds.data is detached and Autograd does not track any operations on it.
Is this the use case?
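A short sketch of that workflow, assuming a hypothetical model (the model, shapes, and statistic are placeholders):

```python
import torch
import torch.nn as nn
from torch.autograd import Variable

# hypothetical stand-in for the real model
model = nn.Linear(4, 2)
inputs = Variable(torch.randn(8, 4))

preds = model(inputs)  # preds holds on to the computation graph

# preds.data is the raw tensor with no graph history; re-wrapping it
# with volatile=True keeps every downstream op out of autograd,
# so the statistics cannot touch the gradients
stats_preds = Variable(preds.data, volatile=True)
mean_pred = stats_preds.mean()
```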
Thank you for your clarifying reply.
Yes, I believe that it is.
I will follow up when I’ve tried this out (and if and when I think I
understand it …).
Thank you again. Yes, this works. With this (plus a couple of
other 0.3.0 tweaks) my 0.3.0 training run appears to be statistically
identical to the 1.0.1 version, so it looks like I’m not corrupting
the gradients, or anything.