Initializing parameters with weight or weight.data?

I would no longer use the .data attribute and would instead wrap the code in a

with torch.no_grad():
    ...

block if necessary.
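For example, an initialization could look roughly like this. This is only a minimal sketch: the `nn.Linear(4, 2)` layer and the chosen init values are placeholders for illustration, not something from the original question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(4, 2)  # hypothetical layer, used only for illustration

# Modify the parameters in place without recording the ops in the graph.
with torch.no_grad():
    layer.weight.normal_(mean=0.0, std=0.02)
    layer.bias.zero_()

# The parameters are still leaf tensors that require gradients.
print(layer.weight.requires_grad, layer.weight.is_leaf)
```

The built-in functions in torch.nn.init (e.g. nn.init.normal_) should also work here, as they modify the tensor in place without tracking gradients.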
Autograd cannot warn you if you manipulate the underlying tensor.data, since changes made through .data are not tracked, which might lead to silently wrong results.
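Here is a small sketch of the difference, assuming a recent PyTorch version. The helper `grad_after_edit` is a made-up name for this demo. It changes an intermediate tensor that `exp` needs for its backward pass, once through .data and once inside torch.no_grad():

```python
import torch

def grad_after_edit(use_data: bool) -> torch.Tensor:
    x = torch.ones(3, requires_grad=True)
    y = x.exp()  # exp keeps its output around for the backward pass
    if use_data:
        y.data.mul_(2)  # in-place change that autograd does not see
    else:
        with torch.no_grad():
            y.mul_(2)  # in-place change that autograd can detect
    y.sum().backward()
    return x.grad

# Via .data: backward runs, but uses the modified values.
silent = grad_after_edit(use_data=True)
expected = torch.ones(3).exp()  # d/dx exp(x) at x = 1
print(torch.allclose(silent, expected))

# Via no_grad: autograd notices the in-place change and raises.
try:
    grad_after_edit(use_data=False)
    raised = False
except RuntimeError as err:
    raised = True
    print("autograd complained:", type(err).__name__)
```

In the .data case the gradient comes out scaled by the in-place factor without any error, while the no_grad version fails loudly in backward.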
