In v0.4, is there any reason to still use tensor.data?


(Wenliang Dai) #1

For example, here is a snippet of code from an older version of PyTorch:

# classifier is the classifier of a torchvision pre-trained model
fc6 = nn.Conv2d(512, 4096, kernel_size=7)
fc6.weight.data.copy_(classifier[0].weight.data.view(4096, 512, 7, 7))
fc6.bias.data.copy_(classifier[0].bias.data)

Is there a better way to write this in version 0.4?

Thanks in advance!


(Jerome R) #2

You simply need to remove the .data


(Simon Wang) #3

Operations on .data are hidden from autograd. In this case, if you used weight and bias in a graph before this segment and don't use .data, .detach(), or with torch.no_grad(), autograd may complain about tensors needed for backward being modified in-place.
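Concretely, for the snippet in the original question, a minimal v0.4 sketch (assuming classifier is the pre-trained classifier from the first post) wraps the in-place copy_ calls in torch.no_grad() so autograd never records them:

    import torch
    import torch.nn as nn

    fc6 = nn.Conv2d(512, 4096, kernel_size=7)

    # copy_ is an in-place op, so keep it out of the autograd graph
    with torch.no_grad():
        fc6.weight.copy_(classifier[0].weight.view(4096, 512, 7, 7))
        fc6.bias.copy_(classifier[0].bias)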


(Artyom) #4

If I understood the migration guide correctly, we can simply replace .data with .detach() when the operation is not in-place, and use torch.no_grad() when it is an in-place operation. Like in this example: Detach and .data
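A hypothetical minimal illustration of the two cases:

    import torch

    x = torch.randn(3, requires_grad=True)

    # read-only use: detach() gives a tensor autograd ignores
    y = x.detach() * 2

    # in-place modification: wrap it in no_grad() instead
    with torch.no_grad():
        x.add_(1.0)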


(Simon Wang) #5

Yes you can generally do that, unless you are doing some hacks that you want hidden from autograd :wink:
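One common example of such a hack is a hand-rolled parameter update; a sketch, assuming model and lr already exist:

    # old style: mutate .data so autograd never sees the update
    for p in model.parameters():
        p.data -= lr * p.grad.data

    # v0.4 style: the same update inside no_grad(), without .data
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad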


(dashesy) #6

I have a layer that is forward-only, and using .data was the simplest way to implement it.

    second_loss(func(x.data), y)

I could use the no_grad context, but that would not be as elegant, because I do want gradients for second_loss, just not for func.


(Artyom) #7

Why not use x.detach()?


(dashesy) #8

What if I have second_loss(func(x.detach()), func2(x.detach()), y)? Then it would only matter whether x.detach(), a function whose implementation I do not know, has any overhead.
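If the worry is paying for detach() twice, one option is to detach once and reuse the result (a sketch with the same hypothetical func, func2, and second_loss as above):

    x_det = x.detach()  # cheap: shares storage with x, just drops grad tracking
    second_loss(func(x_det), func2(x_det), y)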


(Thomas V) #9

Given that x.data is a property that makes function calls in the background to re-wrap the underlying tensor in a new variable, I would not know the overhead of x.data either (but I believe both are similar, after looking at the implementation).
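For what it is worth, both return a tensor that shares storage with the original, so neither copies any data; a quick sanity check:

    import torch

    x = torch.randn(4, requires_grad=True)
    a = x.data
    b = x.detach()

    print(a.data_ptr() == x.data_ptr())      # True: .data shares storage with x
    print(b.data_ptr() == x.data_ptr())      # True: .detach() shares it as well
    print(a.requires_grad, b.requires_grad)  # False False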

The migration guide fairly clearly advises using .detach().

Best regards

Thomas