Zero output from forward pass

import torch
import torch.nn as nn
import torch.nn.functional as F

class My_Model(nn.Module):
    def __init__(self):
        super(My_Model, self).__init__()
        self.conv1 = nn.Conv2d(1, 8, 3, padding=(1, 1))  # in_channels, out_channels, kernel_size
        self.conv2 = nn.Conv2d(8, 4, 3, padding=(1, 1))
        self.conv3 = nn.Conv2d(4, 2, 1)
        self.conv4 = nn.Conv2d(4, 1, 3)  # no padding, so 34 x 34 becomes 32 x 32

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))

        # Upsample the 2-channel feature map in two ways and combine the results.
        bilinear = nn.UpsamplingBilinear2d(scale_factor=2)
        nearest = nn.UpsamplingNearest2d(scale_factor=2)
        x_bilinear = bilinear(x)
        x_nearest = nearest(x)
        x = torch.stack((x_bilinear, x_nearest), dim=2)  # (34, 2, 2, 34, 34)
        x = x.view(34, 4, 34, 34)  # fold the stacked pair into 4 channels (batch size hard-coded)
        x = F.relu(self.conv4(x))
        return x

This is the network I designed for some image inputs. Below is how I am training it.

    from torch.autograd import Variable

    model = My_Model()

    criterion = torch.nn.MSELoss(size_average=False)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.0002, momentum=0.9)

    for loop over data ...
        x_batch = torch.from_numpy(x_batch)
        y_batch = torch.from_numpy(hr_batch)
        x_batch = Variable(x_batch, requires_grad=True).cuda()
        y_batch = Variable(y_batch, requires_grad=False).cuda()

        y_pred = model(x_batch)

        print(y_pred)

        loss = criterion(y_pred, y_batch)
        print('loss = ', loss.data[0])

        # Zero gradients, perform a backward pass, and update the weights.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

The input x_batch has shape 34 x 1 x 17 x 17 and y_batch has shape 34 x 1 x 32 x 32.
However, during training y_pred is always a zero tensor.
Please help.
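
For reference, the shapes quoted above can be checked with a short trace. This is only a sketch under assumptions, not the poster's training code: it rebuilds the four convolutions as free-standing layers, feeds a random tensor in place of `x_batch`, runs on the CPU, and uses `F.interpolate` as a stand-in for the two upsampling modules. It also records what fraction of the values entering the final ReLU are negative, since ReLU maps every negative value to zero. Nothing in the post confirms that this is the cause of the zero output; it is just one quantity worth inspecting.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Same layer configuration as in the post, rebuilt here for illustration.
conv1 = nn.Conv2d(1, 8, 3, padding=1)
conv2 = nn.Conv2d(8, 4, 3, padding=1)
conv3 = nn.Conv2d(4, 2, 1)
conv4 = nn.Conv2d(4, 1, 3)  # no padding

x = torch.randn(34, 1, 17, 17)  # random stand-in for x_batch (assumption)
shapes = {}

with torch.no_grad():
    x = F.relu(conv3(F.relu(conv2(F.relu(conv1(x))))))
    shapes["after conv3"] = tuple(x.shape)

    # Stand-ins for UpsamplingBilinear2d and UpsamplingNearest2d.
    up_bilinear = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=True)
    up_nearest = F.interpolate(x, scale_factor=2, mode="nearest")

    x = torch.stack((up_bilinear, up_nearest), dim=2)
    shapes["after stack"] = tuple(x.shape)

    x = x.view(x.size(0), 4, 34, 34)
    shapes["after view"] = tuple(x.shape)

    pre_relu = conv4(x)  # values before the final ReLU
    shapes["after conv4"] = tuple(pre_relu.shape)

    # Fraction of pre-activation values that the final ReLU would zero out.
    frac_negative = (pre_relu < 0).float().mean().item()

for name, shape in shapes.items():
    print(name, shape)
print("fraction of negative pre-activations:", frac_negative)
```

The trace confirms that a 17 x 17 input is upsampled to 34 x 34 and that the unpadded 3 x 3 convolution trims it to 32 x 32, matching y_batch. In the real model, printing the same fraction for `self.conv4(x)` on an actual batch would show whether the final ReLU is discarding everything.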

I have the same issue with my LSTM code: it always outputs zeros. Is there any update?

@torotoki Your model is obviously very different from his. So you would be better off opening a new thread, and if you don’t show any code, then there is no way we can tell you what is wrong with your code.

@jpeg729 I’m not sure whether this is a different problem (maybe I should open a new thread), but I finally solved this issue.

In my case, the model always outputs zeros when I set the batch size to “1”, but with any other batch size it outputs varied vectors and learns correctly.

I’m still trying to figure out why this happens and which part of my code causes it (the code is about 600 lines long). I’m just noting this here since it might be helpful for someone else.

That is weird. The model should work the same way whatever the batch size. It could be that your model is mixing inputs from different samples at some point in its forward method.
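
One way to test for that kind of mixing is to compare the output for a whole batch against the outputs obtained by feeding the same samples one at a time. The sketch below is only an illustration: `max_batch_discrepancy`, `PerSample`, and `MixesSamples` are hypothetical names made up for this example and are not taken from anyone's code in this thread.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def max_batch_discrepancy(model, batch):
    """Largest absolute difference between batched and one-by-one outputs."""
    model.eval()  # disable dropout and batch-norm updates for a fair comparison
    with torch.no_grad():
        batched = model(batch)
        single = torch.cat(
            [model(batch[i:i + 1]) for i in range(batch.size(0))], dim=0
        )
    return (batched - single).abs().max().item()


class PerSample(nn.Module):
    """Hypothetical model that treats every sample independently."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 3)

    def forward(self, x):
        return self.fc(x)


class MixesSamples(nn.Module):
    """Hypothetical buggy model: subtracting the batch mean couples the samples."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 3)

    def forward(self, x):
        return self.fc(x - x.mean(dim=0, keepdim=True))


batch = torch.randn(5, 4)
ok_gap = max_batch_discrepancy(PerSample(), batch)
bug_gap = max_batch_discrepancy(MixesSamples(), batch)
print("independent model:", ok_gap)
print("mixing model:     ", bug_gap)
```

For the independent model the gap should be at the level of floating-point noise, while the mixing model gives a clearly nonzero gap. In this particular toy example a batch of one is degenerate: the batch mean equals the single sample, so the linear layer only ever sees zeros and returns its bias. That is not a diagnosis of the LSTM code above, just an example of how batch size 1 can behave differently when samples interact.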

That said, if the model seems to be learning correctly, then maybe there is a bug in PyTorch, and in that case a minimal example that reproduces the error would be useful.
