Hi,
New here. This may be a silly question, but I am unsure about the default value of requires_grad for the parameters of an nn.Module.
For example:
import torch
import torch.nn as nn

linear = nn.Linear(10, 10)
linear.train()
for params in linear.parameters():
    print('requires_grad of parameters -', params.data.requires_grad)
    print('requires_grad of output of layer -', linear(torch.tensor([0.] * 10)).requires_grad)
    break
Output:
requires_grad of parameters - False
requires_grad of output of layer - True
Why do the parameters of a layer in train mode have requires_grad set to False? And why does the output of the same layer have requires_grad True? I am confused. I just started PyTorch, so I am not sure whether I am making some instantiation mistake.
Also, nothing changes when the layer is put in eval mode:
linear = nn.Linear(10, 10)
linear.eval()
for params in linear.parameters():
    print('requires_grad of parameters -', params.data.requires_grad)
    print('requires_grad of output of layer -', linear(torch.tensor([0.] * 10)).requires_grad)
    break
Output:
requires_grad of parameters - False
requires_grad of output of layer - True
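The False here comes from the .data access, not from the parameters themselves: param.data returns a detached view of the parameter's tensor, and a detached tensor always reports requires_grad as False. The parameters of nn.Linear do require gradients by default, which is also why the layer's output has requires_grad True. A minimal check on the parameter directly:

import torch.nn as nn

linear = nn.Linear(10, 10)
for param in linear.parameters():
    print(param.requires_grad)       # True - parameters require grad by default
    print(param.data.requires_grad)  # False - .data returns a detached tensor
    break

Note that train() and eval() do not touch requires_grad at all; they only switch the behavior of layers such as dropout and batch norm.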
Thanks for the suggestion. It all makes sense now!
Also, if I wanted to manually set the values of the weights, is .data still not the way to go? I have been doing something like this in my code to manually update the weights:
start = 0
for param in DNN1.parameters():
    end = start + param.numel()  # slice sized to this parameter
    param.data = theta_t[start:end].view(param.shape)
    start = end
theta_t is just a tensor I generate from another process.
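Assigning through .data still works, but it sidesteps autograd's bookkeeping and can hide shape or dtype mistakes, so the usual recommendation is to copy under torch.no_grad() instead. A sketch of the same loop, assuming theta_t is a 1-D tensor holding all the parameters back to back:

import torch

# Copy each slice of the flat vector into the matching parameter in place;
# no_grad ensures autograd does not record the assignment.
with torch.no_grad():
    start = 0
    for param in DNN1.parameters():
        end = start + param.numel()
        param.copy_(theta_t[start:end].view(param.shape))
        start = end

torch.nn.utils.vector_to_parameters(theta_t, DNN1.parameters()) does the same slicing and copying in a single call.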