Loss doesn't decrease and the output is zero

I am implementing a 3D deconvolution autoencoder. The issue I am having now is that the training (and testing) loss doesn't seem to decrease, and the output of the neural network is an all-zero matrix.
I wonder if my training process is right, i.e., declare a net, read in the data, wrap the data in Variable, feed the data into the net, calculate the loss, backpropagate, and update the weights.
This is my net:

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv3d(1, 4, 5, padding=2)
        self.conv2 = nn.Conv3d(4, 16, 5, padding=2)
        self.enc1 = nn.Conv3d(16, 32, 5, stride=2, padding=2)
        self.enc2 = nn.Conv3d(32, 64, 5, stride=2, padding=2)
        self.enc3 = nn.Conv3d(64, 128, 5, stride=2, padding=2)
        self.enc4 = nn.Conv3d(128, 256, 5, stride=2, padding=2)
        self.dec4 = nn.ConvTranspose3d(256, 128, 5, stride=2, padding=2)
        self.dec3 = nn.ConvTranspose3d(128, 64, 5, stride=2, padding=2, output_padding=1)
        self.dec2 = nn.ConvTranspose3d(64, 32, 5, stride=2, padding=2)
        self.dec1 = nn.ConvTranspose3d(32, 16, 5, stride=2, padding=2)
        self.conv3 = nn.Conv3d(16, 8, 5, padding=2)
        self.conv4 = nn.Conv3d(8, 1, 5, padding=2)
    def forward(self, x):
        return x

And this is my training process:

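# define the net
net = Net()
# use the GPU if one is available
if torch.cuda.is_available():
    net.cuda()
else:
    print('cuda disabled')
# MSE loss, matching the 'mse' reported in the log line below
criterion = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.0001)

for i_batch, sample in enumerate(dataLoader):
    print('read the data')
    input, target = sample['tr'].type(torch.FloatTensor), sample['gt'].type(torch.FloatTensor)
    # add the channel dimension: (N, D, H, W) -> (N, 1, D, H, W)
    if torch.cuda.is_available():
        input, target = input.unsqueeze(1).cuda(), target.unsqueeze(1).cuda()
    else:
        input, target = input.unsqueeze(1), target.unsqueeze(1)
    input, target = Variable(input), Variable(target)
    print('put the data into net')
    optimizer.zero_grad()  # clear gradients left over from the previous step
    output = net(input)
    loss = criterion(output, target)
    loss = loss*10
    print('back propagate')
    loss.backward()   # compute gradients
    optimizer.step()  # update the weights
    print('iter %d,  training loss: mse %.4f' %(i_batch, loss))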

    if i_batch%50==0:
        test_input=Variable(testsample['tr'].unsqueeze(0).unsqueeze(1).type(torch.FloatTensor).cuda(), volatile=True)
        print('save test sample id %d blur to %s, result to path %s'%(id_test, save_blur_path, save_result_path))

Did you return the output from the forward function?

return x

I think adding this line to forward() would return the model output correctly.

Yes, I did. There was a typo in the question. I actually do have return x in my code.

I also tried a different dataset. After a few iterations, the output decreases to zero. Fairly strange.

Could you try to initialize the layers?

def weights_init(m):
    if isinstance(m, nn.Conv3d):


Thanks! I initialize this way now. Still not working…

def weights_init(m):
    if isinstance(m, nn.Conv3d):
    elif isinstance(m, nn.ConvTranspose3d):
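
For readers following along, a complete initializer of this shape might look like the sketch below. The choice of Xavier-uniform weights and zero biases is an assumption made for illustration (the thread does not show which scheme was used), `TinyNet` is a hypothetical stand-in for the real model, and the `nn.init` function names are those of recent PyTorch releases.

```python
import torch
import torch.nn as nn

def weights_init(m):
    # Assumption: Xavier-uniform weights and zero biases; the thread does not
    # say which initialization scheme was actually used.
    if isinstance(m, (nn.Conv3d, nn.ConvTranspose3d)):
        nn.init.xavier_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

class TinyNet(nn.Module):
    # Hypothetical stand-in for the autoencoder in the question.
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv3d(1, 4, 5, stride=2, padding=2)
        self.dec = nn.ConvTranspose3d(4, 1, 5, stride=2, padding=2)

    def forward(self, x):
        return self.dec(torch.relu(self.enc(x)))

net = TinyNet()
net.apply(weights_init)  # calls weights_init on every submodule and on net itself
```

`nn.Module.apply` walks the module tree, so the same call works unchanged on a larger model.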

Ok, thanks for trying!
It seems you are missing the last conv layer in your forward pass, but I assume it’s a typo like the missing return statement.

In summary: your model does not learn anything, neither on the training sequences nor on the validation/test sequences.
The losses stay the same and the model’s output is all zeros.

In this case, I would try to narrow down the issue by simplifying the problem.
One way to check whether your model architecture works at all is to use very few samples (even just one) and try to overfit badly on them.
If this doesn’t work at all, we can dig a bit deeper.

Could you try that and report your results?
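
The single-sample overfitting check could be sketched roughly as follows. Everything here is a placeholder: `TinyAE` is a hypothetical small model, the random 16*16*16 volume stands in for one real sample (used as both input and target), and plain SGD with lr=0.01 is an arbitrary choice for the demo.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

class TinyAE(nn.Module):
    # Hypothetical small stand-in for the real autoencoder.
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv3d(1, 8, 5, stride=2, padding=2)
        self.dec = nn.ConvTranspose3d(8, 1, 5, stride=2, padding=2, output_padding=1)

    def forward(self, x):
        return self.dec(torch.relu(self.enc(x)))

# One fixed sample, reused at every step (placeholder random data).
sample = torch.randn(1, 1, 16, 16, 16)

net = TinyAE()
criterion = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)

losses = []
for step in range(100):
    optimizer.zero_grad()
    output = net(sample)
    loss = criterion(output, sample)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print('first loss %.4f, last loss %.4f' % (losses[0], losses[-1]))
```

If the loss on one fixed sample does not go down at all, the problem is more likely in the model or the training loop than in the dataset.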


Sure, it is a typo. I am working on that now. The loss is decreasing, but super slowly: the MSE loss took 20 iterations to decrease from 2705 to 2700.

And the rate has now slowed to 0.003 per iteration.

As long as it’s decreasing, it’s a good sign. We can probably speed it up with an adaptive optimizer like Adam, but in the debug phase it’s kind of useless to tune the optimizer.
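
For later reference, switching to Adam is a one-line change. The sketch below uses a hypothetical single-layer model and random data only to show the call; lr=1e-3 is Adam's default in PyTorch, not a tuned value.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
net = nn.Conv3d(1, 1, 5, padding=2)  # hypothetical stand-in model
x = torch.randn(1, 1, 8, 8, 8)       # placeholder data, used as input and target

criterion = nn.MSELoss()
# Takes the place of optim.SGD(net.parameters(), lr=0.0001) from the question.
optimizer = optim.Adam(net.parameters(), lr=1e-3)

before = criterion(net(x), x).item()
for _ in range(200):
    optimizer.zero_grad()
    loss = criterion(net(x), x)
    loss.backward()
    optimizer.step()
after = criterion(net(x), x).item()

print('loss before %.4f, after %.4f' % (before, after))
```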

Is your network architecture taken from some working example, or did you just come up with it?

It is a simple autoencoder I came up with myself :joy:

Maybe I should try a network with verified performance? But it is fairly strange that even when I feed the target as the input (the input is the same as the target output) and use only one sample, it performs so badly.

Ok, nice. Could you explain a bit more about your dataset, i.e.

  • the image sequence size
  • where does your data come from (natural images, medical images?)
  • batch_size

Are you normalizing the input? If not, you should try it.
If you are normalizing, try a tanh in your model output.
Your output might be just “out of bounds”, which could make the training quite hard.
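
One way to read this suggestion is to rescale each volume to [-1, 1] and bound the network output to the same range with tanh. The sketch below assumes per-volume min-max scaling, which is only one option (mean/std standardization is another); `normalize_volume` is a hypothetical helper and the tensors are placeholder data.

```python
import torch

def normalize_volume(vol, eps=1e-8):
    # Min-max scale one volume to [-1, 1]; eps guards against a constant volume.
    vmin, vmax = vol.min(), vol.max()
    return 2.0 * (vol - vmin) / (vmax - vmin + eps) - 1.0

# Placeholder volume with an arbitrary intensity range, standing in for real data.
vol = torch.rand(8, 8, 8) * 255.0
norm = normalize_volume(vol)

# tanh maps any real-valued activation into the range [-1, 1],
# so the model output cannot leave the range of the normalized targets.
raw_output = torch.randn(8, 8, 8) * 50.0
bounded = torch.tanh(raw_output)

print(float(norm.min()), float(norm.max()))
```

In a model, the tanh would be applied to the result of the last layer inside forward().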

I am trying to use 3D convolutional neural nets for a deconvolution task (i.e., something like 3D deblurring).

  • The image sequences are of size 101*101*101.
  • The data are synthetic 3D tree-shaped blood vessel models. I use the code from http://vascusynth.cs.sfu.ca/Data.html to generate 1500 volume models, each of size 101*101*101. I convolve them with a 10*10*10 kernel to simulate the PSF of the imaging process and add noise at 30 dB PSNR. For now, though, I just feed the same data into the net as both the input and the target output. I originally hoped to use 1200 of them as the training set and the rest as the test set.
  • Batch size is four (a larger size like 8 causes a CUDA out-of-memory error).
  • I am not normalizing the input currently; I will try that next.
    Thank you so much for helping me!
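
As a sanity check on the architecture, the spatial size after each strided layer in the question can be traced with the standard output-size formulas for `Conv3d` and `ConvTranspose3d` (dilation 1). The helper names below are hypothetical; the layer parameters are copied from the `Net` definition, and the trace assumes the forward pass applies the layers in the order they are declared.

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Output length of a convolution along one axis (dilation 1).
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel, stride=1, padding=0, output_padding=0):
    # Output length of a transposed convolution along one axis (dilation 1).
    return (size - 1) * stride - 2 * padding + kernel + output_padding

size = 101  # edge length of each volume
sizes = [size]
for _ in range(4):  # enc1..enc4: kernel 5, stride 2, padding 2
    size = conv_out(size, 5, stride=2, padding=2)
    sizes.append(size)
# dec4, dec3, dec2, dec1: only dec3 sets output_padding=1
for output_padding in (0, 1, 0, 0):
    size = deconv_out(size, 5, stride=2, padding=2, output_padding=output_padding)
    sizes.append(size)

print(sizes)  # [101, 51, 26, 13, 7, 13, 26, 51, 101]
```

The stride-1 layers (conv1 to conv4, kernel 5, padding 2) keep the size unchanged, so under these assumptions the decoder mirrors the encoder exactly for 101-voxel edges.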