Tensor and tensor.data

Hi, I am trying to reproduce the DCGAN tutorial code.
In the code, there is a weights_init() function that initializes the model's weights:

m.weight.data.normal_(0.0, 0.02)

As far as I know, in 0.4.0 Variable was merged into Tensor.
So I removed .data, since the weight is already a Tensor:

m.weight.normal_(0.0, 0.02)

Then I got an error below.

RuntimeError: Leaf variable was used in an in-place operation

It seems like I am still using a Variable underneath the Tensor.

  • Could anyone help to understand current relationship of Tensor and Variable?
  • Why m.weight.normal_(0.0, 0.02) without .data doesn’t work?

Thanks.


A leaf variable is a variable that you created directly and that is not the result of an operation.
So in your case the parameters of your model are leaf variables and shouldn't be modified in-place.
You can check it with:

m.weight.is_leaf

Using .data is still the way to go to initialize your model. @tom has a nice explanation in this post.

.data wasn’t removed in the latest version and still has similar semantics. Have a look at the Migration Guide.
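As a small sketch of the difference (the layer sizes here are arbitrary): an in-place op directly on a parameter raises, while going through .data works:

```python
import torch
import torch.nn as nn

m = nn.Linear(4, 2)
print(m.weight.is_leaf)  # True: parameters are leaf tensors

try:
    # in-place op on a leaf tensor that requires grad
    m.weight.normal_(0.0, 0.02)
except RuntimeError as e:
    print(e)

# works: .data bypasses autograd's tracking
m.weight.data.normal_(0.0, 0.02)
```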


So, if I understand correctly, both

    self.linear = torch.nn.Linear(num_x, num_y)
    self.linear.weight.data.zero_()
    self.linear.bias.data.zero_()

and

    self.linear = torch.nn.Linear(num_x, num_y)
    tmp = self.linear.weight.detach()
    tmp.zero_()

    tmp = self.linear.bias.detach()
    tmp.zero_()

would do the same thing in PyTorch 0.4, but detach() (the second approach) is generally recommended, since modifying .data can in some cases lead to weird results (e.g., changing the .data of leaf variables during backpropagation and then computing the gradient incorrectly), whereas detach() would produce a "safer" error instead?

Just curious: what's a use case for .data now that Variables have been deprecated? Is it purely kept for backwards-compatibility reasons?

Thanks for the help!
So the key is that PyTorch somehow blocks in-place operations on a leaf variable, and .data seems to be a way to make it a non-leaf variable.

Is there a special reason for the stricter blocking in the leaf variable case?
I read that in-place operations are discouraged, but it's not clear which of the reasons applies to the leaf variable case.


on a leaf variable, and .data seems to be a way to make it a non-leaf variable.

It would still be a leaf variable, but as far as I understand, .data allows you to work around the fact that it is a leaf variable and perform an in-place modification during a forward pass nonetheless, which can be dangerous (since users may be unaware that in certain cases this leads to "unintended" gradient computations).
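As a sketch of that danger, along the lines of the sigmoid example in the migration guide: an in-place change through .data is invisible to autograd and silently corrupts the gradient, while the same change through detach() bumps the version counter and makes backward() raise an error:

```python
import torch

# Dangerous: .data hides the in-place change from autograd
a = torch.ones(2, 3, requires_grad=True)
out = a.sigmoid()
out.data.zero_()          # not seen by autograd's version counter
out.sum().backward()      # uses the zeroed values saved for backward
print(a.grad)             # all zeros: silently wrong gradient

# Safer: detach() shares the version counter, so autograd catches it
b = torch.ones(2, 3, requires_grad=True)
out2 = b.sigmoid()
out2.detach().zero_()     # same storage, version counter IS bumped
try:
    out2.sum().backward()
except RuntimeError as e:
    print(e)              # modified-by-an-inplace-operation error
```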

Is there a special reason for the stricter blocking in the leaf variable case?
I read that in-place operations are discouraged, but it's not clear which of the reasons applies to the leaf variable case.

Have a look at the “What about .data?” section in the PyTorch 0.4 migration guide: https://pytorch.org/2018/04/22/0_4_0-migration-guide.html

Thanks for correcting my wrong wording.
Yes it will definitely stay as a leaf node.

About the migration part: I read it and understand that .data can be dangerous, so detach() is recommended when an in-place operation is necessary.

But when I read it, I thought that part was about the normal inner-node case, and that there might be a more critical issue in the leaf-node case, since PyTorch does not give the strict "leaf variable was used …" error in the normal case.

Now it seems I made too many assumptions.

Thanks for help!

Now that you pinged me: I prefer to initialize inside a with torch.no_grad(): block. 🙂
But yeah, lots of places use .data to modify parameters (initialization, optimizers).

Best regards

Thomas


Could you post param init code with

with torch.no_grad():

?

Most of the stock init functions use it these days, see torch/nn/init.py.

Best regards

Thomas

Can you give a detailed example? Do you mean something like the following?

with torch.no_grad():
    m.weight.normal_(0.0, 0.02)

Is this right?

Yeah, that's the way to go. Previously, the underlying .data was used to initialize the weights, but it seems everything has been merged into the syntax in your example.