I am very confused about gradients during back-propagation. According to PyTorch's autograd documentation, if we want a tensor/variable to carry gradient information, we need to set requires_grad=True. I have gone through many examples on the web, for instance the REINFORCE algorithm example or the policy gradient example, and they didn't set requires_grad=True. I am wondering, in that case, when they call loss.backward() without a single tensor marked requires_grad=True, what exactly is happening? Do they even do back-propagation?
Hi, we do not need the gradient of the input. In most cases it is useless, except for special tasks like neural style transfer, where we iteratively change only the input to minimize the total loss. Usually we only want to train the model. Parameters in each layer default to requires_grad=True, so there is nothing to worry about.
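A minimal sketch of the point above, assuming a toy nn.Linear layer and random data (the names layer, x, and loss are illustrative and not taken from the examples in question). It checks that the layer's parameters require gradients by default, that the input does not, and that loss.backward() still fills in the parameter gradients.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy layer and input, only for illustration.
layer = nn.Linear(4, 2)
x = torch.randn(3, 4)  # plain input tensor: requires_grad defaults to False

# Parameters created by nn.Linear require gradients by default.
param_flags = [p.requires_grad for p in layer.parameters()]

# The loss depends on the parameters, so autograd builds a graph for it.
loss = layer(x).sum()
loss.backward()

print(param_flags)              # flags for weight and bias
print(x.requires_grad)          # the input is not tracked
print(layer.weight.grad.shape)  # a gradient was stored for the weight
print(x.grad)                   # nothing is stored for the input
```

So back-propagation does happen: the graph is recorded because the parameters are tracked, even though nobody wrote requires_grad=True by hand.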
Thanks Naruto for your input. Are you saying that if we build the network in the following fashion, we don't need to worry about setting any tensor to have requires_grad=True?
I am very confused because I was reading the autograd tutorial, which emphasizes requires_grad=True, whereas the example below doesn't care about it.
So in short, my question is: do we only need requires_grad=True when we build a network from scratch (as in the PyTorch tutorial), and not when we build the network in the fashion of the code below?
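import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Policy(nn.Module):
    def __init__(self):
        super().__init__()
        self.affine1 = nn.Linear(4, 128)
        self.affine2 = nn.Linear(128, 2)
        self.saved_log_probs = []
        self.rewards = []

    def forward(self, x):
        x = F.relu(self.affine1(x))
        action_scores = self.affine2(x)
        return F.softmax(action_scores, dim=1)

policy = Policy()
optimizer = optim.Adam(policy.parameters(), lr=1e-2)
eps = np.finfo(np.float32).eps.item()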
This is a snippet of the Linear layer. As you can see, the learnable weights are registered as Parameter, which defaults to requires_grad=True. The input of the network needs no gradient (it is useless in most cases). So everything is fine.
class Linear(Module):
    # Abridged excerpt of the layer's constructor.
    def __init__(self, in_features, out_features, bias=True):
        self.in_features = in_features
        self.out_features = out_features
        # Parameter defaults to requires_grad=True.
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        self.bias = Parameter(torch.Tensor(out_features))
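To see the same mechanism outside the library source, here is a small sketch with a made-up module (TinyLinear and its scratch attribute are hypothetical, written only for illustration). It suggests that wrapping a tensor in nn.Parameter and assigning it as an attribute is what makes it show up in parameters() with requires_grad=True, while a plain tensor attribute is left alone.

```python
import torch
import torch.nn as nn

# Hypothetical module, only to illustrate how Parameters get registered.
class TinyLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        # Assigning an nn.Parameter as an attribute registers it with the module.
        self.weight = nn.Parameter(torch.zeros(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # A plain tensor attribute is not registered and is not tracked.
        self.scratch = torch.zeros(out_features)

    def forward(self, x):
        return x @ self.weight.t() + self.bias

m = TinyLinear(4, 2)
names = [name for name, _ in m.named_parameters()]
flags = [p.requires_grad for p in m.parameters()]

print(names)                    # registered parameter names
print(flags)                    # requires_grad flag of each parameter
print(m.scratch.requires_grad)  # flag of the plain tensor attribute
```

This is why optim.Adam(policy.parameters(), ...) in the earlier code receives tensors that already track gradients.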
Thanks Naruto. That clarifies things a lot for me.
One more question: I see many examples just using autograd.Variable, but now Variable is being deprecated. Do you see the need of using
I cannot think of a case; it is beyond me…
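For what it is worth, a minimal sketch of how this can look without Variable, assuming PyTorch 0.4 or later, where a plain tensor accepts requires_grad=True directly. It reuses the earlier point about optimizing the input itself; target, x, and the squared-error objective are made up for illustration and do not come from any of the linked examples.

```python
import torch

# Hypothetical toy objective: move x toward a fixed target.
target = torch.tensor([1.0, -2.0, 3.0])

# A plain tensor created with requires_grad=True is a leaf that autograd
# tracks; no Variable wrapper is involved.
x = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.SGD([x], lr=0.1)

for _ in range(200):
    optimizer.zero_grad()
    loss = ((x - target) ** 2).sum()
    loss.backward()   # fills x.grad
    optimizer.step()  # updates x in place

print(x.detach())   # should end up close to target
print(loss.item())  # should be close to zero
```

Here the explicit requires_grad=True is needed because x is not an nn.Parameter inside a module; it is the thing being optimized.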
Wow, I had the same questions after following the tutorials, thanks for clarifying. I was scratching my head wondering what my network was doing with no requires_grad=True anywhere.