I have a question about two different ways of initializing an nn.Parameter. In particular, would you expect different behaviour from the following two initializations?
a = nn.Parameter(torch.zeros(5, requires_grad=True))
a = nn.Parameter(torch.zeros(5), requires_grad=True)
Thank you in advance!
Would you expect different behaviour during training in these two cases?
nn.Parameter enables gradients by default, so the simplest way to achieve the same thing is just
a = nn.Parameter(torch.zeros(5)).
Your first variant works, but it is misleading: it suggests that changing the inner tensor's flag to
requires_grad=False would make a difference, when in fact nn.Parameter overrides that flag with its own requires_grad argument.
The second variant just spells out nn.Parameter's default argument explicitly.
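A quick sketch to illustrate the point: the requires_grad flag set on the inner tensor is ignored, while the flag passed to nn.Parameter itself is the one that takes effect.

```python
import torch
import torch.nn as nn

# Flag on the inner tensor: ignored. nn.Parameter applies its own
# requires_grad argument, which defaults to True.
a = nn.Parameter(torch.zeros(5, requires_grad=False))
print(a.requires_grad)  # True - the inner flag had no effect

# Flag passed to nn.Parameter: this one actually controls gradients.
b = nn.Parameter(torch.zeros(5), requires_grad=False)
print(b.requires_grad)  # False
```

So for the two variants in the question, training behaviour is identical; the difference only shows up if you ever flip the flag to False, where only the second variant does what it appears to do.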