Autograd.grad occasionally returns None for some variables

I’m using torch’s autograd.grad to perform analysis on my network. But I’ve run into a problem: when I run grad() on different data items, I occasionally get None for some layers on certain data items, which then causes errors further down.

I’m confused why this happens. Why does the same network get None gradients on some data items but not on others? There are no noisy items or missing values in the dataset.

Part of my network is as follows (I did not post the whole network to keep this short, but it’s enough to show the problem):

def forward(self, input):
    x = input
    out = self.backbone(x)
    predict_up_conv = self.relu2(self.bn2(self.conv_up_conv(out)))
    predict_down_conv = self.relu3(self.bn3(self.conv_down_conv(out)))
    predict_cls_conv = self.relu4(self.bn4(self.conv_cls_conv(out)))
    predict_up = self.conv_up(predict_up_conv)
    predict_down = self.conv_down(predict_down_conv)
    predict_cls = self.conv_cls(predict_cls_conv)
    if self.phase == 'test':
        predict_cls_softmax = self.softmax(predict_cls)
        return predict_up, predict_down, predict_cls_softmax
    return predict_up, predict_down, predict_cls

the code I use to calculate the gradient is:

grads = grad(y, w, retain_graph=True, create_graph=True, allow_unused=True)

y is the target value (a scalar), and w is the list of all parameters of the network.

grads is the resulting tuple of gradients. In normal cases, it is a tuple of PyTorch tensors whose shapes match the parameters of each layer in the network. However, on some data items the returned grads contains elements that are None, and the associated layers are always self.conv_up_conv, self.conv_down_conv, self.conv_up, and self.conv_down.
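A quick way to see which layers are affected is to pair each returned gradient with its parameter name. This is a minimal sketch on a toy two-layer model (not the network above), where every parameter contributes to y, so no entry is None:

```python
import torch
from torch import nn
from torch.autograd import grad

# Toy model: every parameter participates in computing y.
model = nn.Sequential(nn.Linear(4, 3), nn.Linear(3, 1))
x = torch.randn(2, 4)
y = model(x).sum()

# Pair each gradient with its parameter name to spot None entries.
names, params = zip(*model.named_parameters())
grads = grad(y, params, create_graph=True, allow_unused=True)
missing = [n for n, g in zip(names, grads) if g is None]
print(missing)  # -> [] since all parameters were used
```

On a model where some parameters do not contribute to y, the missing list would name exactly those layers.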

I’m confused about what’s wrong with my code.

Thank you all for helping me!!!


First, retain_graph=True is not needed if you use create_graph=True.
Also, if you expect all the parameters to have been used to compute y, you should not pass allow_unused=True.

Can you share how y is computed from predict_up, predict_down, and predict_cls?
Because if y only depends on predict_cls, then given the structure of your forward, it is expected that these conv layers get no gradients, since their outputs are never used to compute y.
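Here is a small demonstration of that effect, using a hypothetical two-head model that mirrors your forward: the loss only uses one head, so with allow_unused=True the other head’s parameters come back as None:

```python
import torch
from torch import nn
from torch.autograd import grad

class TwoHeads(nn.Module):
    """Toy model: a shared backbone feeding two output heads."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(4, 4)
        self.head_cls = nn.Linear(4, 2)  # used by the loss below
        self.head_up = nn.Linear(4, 2)   # computed, but NOT used by the loss

    def forward(self, x):
        out = self.backbone(x)
        return self.head_up(out), self.head_cls(out)

model = TwoHeads()
pred_up, pred_cls = model(torch.randn(3, 4))
y = pred_cls.sum()  # the loss ignores pred_up entirely

grads = grad(y, list(model.parameters()),
             create_graph=True, allow_unused=True)
for (name, _), g in zip(model.named_parameters(), grads):
    # head_up.weight and head_up.bias print as None; the rest are tensors.
    print(name, None if g is None else tuple(g.shape))
```

The None entries are not a bug in autograd: head_up simply did not participate in the graph that produced y, exactly like your conv_up/conv_down branches if y only depends on predict_cls.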