Here is a short code snippet.
    In : import torch
    In : t1 = torch.rand(2,4)
    In : t1.requires_grad
    Out: False
    In : t1.requires_grad = True
    In : t1.requires_grad
    Out: True
    In : l = torch.nn.Linear(4,2)
    In : l.weight.data.requires_grad
    Out: False
    In : l.weight.data.requires_grad = True
    In : l.weight.data.requires_grad
    Out: False
    In : l.weight.requires_grad
    Out: True
    In : print(type(t1), type(l.weight), type(l.weight.data))
    <class 'torch.Tensor'> <class 'torch.nn.parameter.Parameter'> <class 'torch.Tensor'>
I am confused by some of the output, specifically:
- What's the difference between `torch.Tensor` and `torch.nn.parameter.Parameter` (e.g., `t1` versus the weight of `l`)? E.g., how do they influence the behavior of gradient calculation?
- Why does setting `t1.requires_grad` to `True` work but setting `l.weight.data.requires_grad` to `True` does not, considering that `t1` and `l.weight.data` are both objects of class `torch.Tensor`?
- What is the relationship between `l.weight.data.requires_grad` and `l.weight.requires_grad`? The two do not seem to be synchronized.
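To narrow down the second and third points, here is a small probe. It is only a sketch built on a guess: that each access to `l.weight.data` produces a new tensor object sharing the parameter's storage, so a flag set on one access would not be visible on the next. The variable names (`first`, `second`, and the `flag_*` results) are mine and purely illustrative.

```python
import torch

l = torch.nn.Linear(4, 2)

# Guess under test: every access to .data builds a separate tensor object.
first = l.weight.data
second = l.weight.data
same_object = first is second

# Check whether that object still points at the parameter's memory.
shares_storage = first.data_ptr() == l.weight.data_ptr()

# Set the flag on a reference we keep, then compare three views of it.
first.requires_grad = True
flag_on_held = first.requires_grad              # the object we kept
flag_on_fresh = l.weight.data.requires_grad     # a new .data access
flag_on_param = l.weight.requires_grad          # the Parameter itself

print("same object:", same_object)
print("shares storage:", shares_storage)
print("held / fresh / param:", flag_on_held, flag_on_fresh, flag_on_param)
```

If the guess holds, the flag sticks on the held reference but not on a fresh `l.weight.data` access, which would explain why the assignment in the session above seems to have no effect.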
I have read the related documentation, source code, and forum discussions, but I still find it hard to form a complete picture that answers the questions above.
I would really appreciate it if someone could help.