I’m using PyTorch to perform analysis on my network, but I ran into a problem: when I run `grad()` on different data items, I occasionally get `None` for some layers on certain data items, which then causes errors downstream.
I’m confused about why this happens. Why does the same network get `None` gradients on some data items but not on others? There are no noisy items in the dataset and no missing values.
Part of my network is shown below (to keep this short I did not post the whole network, but it should be enough to show the problem):
```python
def forward(self, input):
    x = input
    out = self.backbone(x)
    predict_up_conv = self.relu2(self.bn2(self.conv_up_conv(out)))
    predict_down_conv = self.relu3(self.bn3(self.conv_down_conv(out)))
    predict_cls_conv = self.relu4(self.bn4(self.conv_cls_conv(out)))
    predict_up = self.conv_up(predict_up_conv)
    predict_down = self.conv_down(predict_down_conv)
    predict_cls = self.conv_cls(predict_cls_conv)
    if self.phase == 'test':
        predict_cls_softmax = self.softmax(predict_cls)
        return predict_up, predict_down, predict_cls_softmax
    else:
        return predict_up, predict_down, predict_cls
```
The code I use to calculate the gradient is:

```python
grads = grad(y, w, retain_graph=True, create_graph=True, allow_unused=True)
```
Here `y` is the target value, and `w` is the list of all parameters of the network.
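To illustrate the symptom, here is a minimal standalone reproduction of the kind of output I see. (The tiny two-branch model below is a made-up stand-in, not my real network: with `allow_unused=True`, parameters that never enter the graph of `y` come back as `None`.)

```python
import torch
from torch.autograd import grad

# Hypothetical two-branch "network": y is computed only from branch a,
# so branch b's parameters never appear in the autograd graph.
a = torch.nn.Linear(4, 1)
b = torch.nn.Linear(4, 1)
x = torch.randn(2, 4)
y = a(x).sum()  # branch b is never used

w = list(a.parameters()) + list(b.parameters())
grads = grad(y, w, retain_graph=True, create_graph=True, allow_unused=True)

# Gradients for a's weight/bias are tensors; b's come back as None.
print([g is None for g in grads])  # → [False, False, True, True]
```

In my case, though, the model structure is the same for every data item, which is why the data-dependent `None` entries surprise me.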
`grads` is the resulting list of gradients. Normally it is a list of PyTorch tensors whose shapes match the parameters of each layer of the network. However, on some data items, when I run this code, some elements of `grads` are `None`, and it is always the same layers that are affected.
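For reference, this is roughly how I identify which layers get `None` (a small diagnostic sketch; the `Sequential` model here is a stand-in for my real network, and the loss deliberately skips one layer to trigger the symptom):

```python
import torch
from torch.autograd import grad

# Hypothetical diagnostic: map None gradients back to parameter names
# so the affected layers can be identified.
model = torch.nn.Sequential(torch.nn.Linear(4, 3), torch.nn.Linear(3, 1))
x = torch.randn(2, 4)
y = model[0](x).sum()  # deliberately bypass the second layer

names, params = zip(*model.named_parameters())
grads = grad(y, params, retain_graph=True, allow_unused=True)
none_layers = [n for n, g in zip(names, grads) if g is None]
print(none_layers)  # names of parameters that received None
```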
I can’t figure out what’s wrong with my code.
Thank you all for helping me!