I am confused about when to use conv.weight.data vs. conv.weight. For example, the following code uses
nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
but in many places I also see
nn.init.kaiming_normal_(m.weight.data, mode='fan_out', nonlinearity='relu')
Which one should I use? I am using PyTorch 1.3.1.
To investigate, I checked the source code on GitHub for v1.3.1 (https://github.com/pytorch/pytorch/blob/v1.3.1/torch/nn/init.py#L353):
fan = _calculate_correct_fan(tensor, mode)
gain = calculate_gain(nonlinearity, a)
std = gain / math.sqrt(fan)
with torch.no_grad():
    return tensor.normal_(0, std)
Since the function accepts a tensor, I expected that passing .data would be wrong and should throw a runtime error. But surprisingly, no error is raised. Hence I am confused about which one to use (especially for v1.3.1).
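A quick check (my own example, not from the thread) shows why no error is raised: a Parameter is a subclass of torch.Tensor, and .data returns a plain tensor sharing the same storage, so the init function accepts either one.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3)

# Both objects are torch.Tensor instances, so kaiming_normal_
# accepts either one without complaint.
print(type(conv.weight))       # torch.nn.parameter.Parameter (a Tensor subclass)
print(type(conv.weight.data))  # torch.Tensor

nn.init.kaiming_normal_(conv.weight, mode='fan_out', nonlinearity='relu')
nn.init.kaiming_normal_(conv.weight.data, mode='fan_out', nonlinearity='relu')
```

Neither call raises, so the type signature alone cannot tell you which form is preferred.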
Don’t use the .data attribute, as it might yield unwanted side effects. While you will be able to manipulate the underlying data without raising an error, Autograd won’t be able to track these operations, and you might run into a variety of issues later (we have already seen quite a few of these issues here in the forum).
This attribute is also being removed step by step, as seen in this PR by @albanD.
Thank you for the explanation,
but then how do I do the following operation?
conv_shuffle is an instance of nn.Conv2d, and I want to explicitly set its weights.
However, this results in the following error:
RuntimeError: a leaf Variable that requires grad has been used in an in-place operation.
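The original snippet was not preserved in this thread, but a minimal reconstruction that reproduces this error (assuming the weights were set with an in-place copy_ directly on the Parameter) would look like:

```python
import torch
import torch.nn as nn

conv_shuffle = nn.Conv2d(3, 3, kernel_size=3, padding=1)
new_weights = torch.randn_like(conv_shuffle.weight)  # assumed stand-in for the desired weights

# conv_shuffle.weight is a leaf tensor with requires_grad=True, so an
# in-place operation on it is rejected by autograd:
try:
    conv_shuffle.weight.copy_(new_weights)
except RuntimeError as e:
    print(e)  # "a leaf Variable that requires grad ... in-place operation"
```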
but it is rectified using the following.
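The fix shown here was also stripped from the thread; presumably it routed the copy through .data, along these lines:

```python
import torch
import torch.nn as nn

conv_shuffle = nn.Conv2d(3, 3, kernel_size=3, padding=1)
new_weights = torch.randn_like(conv_shuffle.weight)  # assumed stand-in for the desired weights

# Going through .data sidesteps autograd entirely, so the same in-place
# copy no longer raises -- but autograd also has no record of it, which
# is exactly the kind of silent side effect warned about above.
conv_shuffle.weight.data.copy_(new_weights)

print(torch.equal(conv_shuffle.weight.data, new_weights))  # True
```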
So now that .data is discouraged, what alternative do I have?
You could explicitly wrap this manipulation in a with torch.no_grad() block:
conv = nn.Conv2d(3, 3, 3, 1, 1)
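Filling this out into a runnable sketch (the tensor being copied in is assumed; substitute your own weights):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 3, 3, 1, 1)
new_weights = torch.randn_like(conv.weight)  # assumed stand-in for the desired weights

# Inside torch.no_grad() autograd tracking is disabled, so the in-place
# copy_ on the leaf Parameter is permitted without touching .data.
with torch.no_grad():
    conv.weight.copy_(new_weights)

print(torch.equal(conv.weight, new_weights))  # True
print(conv.weight.requires_grad)             # True: the Parameter still trains normally
```

Unlike the .data workaround, this keeps the Parameter itself intact and makes the intent (temporarily disable gradient tracking) explicit.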