Since 0.4, using nn.init.eye_ disables gradient?

Hello,

I was mostly just wondering whether this is intended behaviour and I am using something wrong, or whether this is a bug. I have noticed that since upgrading to 0.4, the weight initialization functions nn.init.eye and nn.init.eye_ in torch.nn.init set requires_grad to False on the weight tensor they initialize.

For example, try running:

import torch.nn as nn

L = nn.Linear(5, 5)
print(L.weight.requires_grad)  # True
nn.init.eye_(L.weight)
print(L.weight.requires_grad)  # False on 0.4

This is not intended behavior. Thank you for the bug report.

For now you can do

import torch.nn as nn
L = nn.Linear(5,5)
nn.init.eye_(L.weight)
L.weight.requires_grad_()  # Enable requires_grad

as a workaround.
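For anyone hitting the same issue, the workaround can be wrapped in a small helper so it is applied consistently wherever identity initialization is needed. This is just a sketch; the helper name eye_init_ is my own, not part of torch.nn.init:

```python
import torch
import torch.nn as nn

def eye_init_(linear: nn.Linear) -> nn.Linear:
    """Identity-initialize a Linear layer's weight and re-enable
    autograd tracking (workaround for the 0.4 bug where
    nn.init.eye_ sets requires_grad to False)."""
    nn.init.eye_(linear.weight)
    linear.weight.requires_grad_()  # restore gradient tracking
    return linear

L = eye_init_(nn.Linear(5, 5))
print(L.weight.requires_grad)                           # True
print(torch.allclose(L.weight.detach(), torch.eye(5)))  # True
```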

I’ve opened a GitHub issue here: https://github.com/pytorch/pytorch/issues/8692

Yes, that is what I have been doing. Thank you very much!