Why is it recommended to wrap your data with `Variable` each step of the iterations rather than before training starts?

I recently asked on the PyTorch beginner forum whether it is good practice to wrap the data with Variable at each step or to pre-wrap the data before training starts. It seems that it's better (for some reason unknown to me) to wrap at each step rather than before training starts. Does anyone know why that's true?

Context: I saw the practice here:


Few points:

  • As soon as you wrap a Tensor in a Variable, it starts saving all computation history for it. So each operation is slightly more costly (less and less true with recent changes). Moreover, when you call backward, it goes through all of that history, so the bigger it is, the longer it's going to take.
  • It is quite easy to do an operation on your Variable outside of the training loop (moving it to the GPU, for example). In that case, you will end up with an error at the second iteration saying that you are trying to backpropagate through the graph a second time.
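A hypothetical minimal sketch of that second point (the variable names are my own, not from the thread): an autograd-tracked op done outside the loop becomes part of every iteration's graph, so the second `backward()` hits a graph whose buffers were already freed.

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x.exp()  # op outside the loop: shared by every iteration's graph

errored = False
for step in range(2):
    loss = (y ** 2).sum()   # each iteration builds on the shared y
    try:
        loss.backward()     # first call frees y's part of the graph
    except RuntimeError:    # second call: "Trying to backward through
        errored = True      # the graph a second time"
assert errored
```

Recomputing `y` inside the loop (the "wrap each step" advice) gives every iteration its own fresh graph, so `backward()` never hits freed buffers.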

That being said, Variables are going to disappear soon as they will be merged with Tensors, so don't overthink this :smiley:

when are they disappearing?

For the 0.4 release if I am not mistaken. Not sure when this is due though.

Sorry to resurrect this topic. I’m new to PyTorch, switching from TF, and trying to learn it.
So is this still the case? Is something like
`inputs = Variable(inputs, requires_grad=False)` still considered best practice?


Not at all :slight_smile:
Variable doesn't exist anymore (it's a no-op).
What do you want to do?
If you want a new Tensor that does not share history with the current one, you can use inputs = inputs.detach().
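A quick sketch of what `detach()` gives you (assuming any recent PyTorch; variable names are illustrative): the detached tensor shares storage with the original but carries no autograd history.

```python
import torch

a = torch.randn(4, requires_grad=True)
b = a * 3       # b is tracked in a's autograd graph
c = b.detach()  # same underlying data, but cut off from the graph

assert b.requires_grad               # still part of the graph
assert not c.requires_grad           # no history
assert c.data_ptr() == b.data_ptr()  # storage is shared, not copied
```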

Also, feel free to open a new thread if you have questions about how to write specific stuff, or if you want to double-check with us that you're doing things the right way!
