After a 3D convolution, all of the values in the output are NaN.
Here is the code in the forward function:

print("input", torch.isnan(input).any())
conv = self.con3d1(input)
print("conv1", torch.isnan(conv).any())
And self.con3d1 is defined as:
self.con3d1= nn.Conv3d(in_channels=2, out_channels=64, kernel_size=(6, 3, 3), padding=(2, 1, 1))
For your information, I pad the input tensor by hand (because torch doesn't support "same" padding):
pad1 = torch.zeros((input.shape[0], 2, 1, 16, 8)).to(DEVICE)
input = torch.cat((input, pad1), dim=2)
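As a sanity check, here is a minimal sketch (with a hypothetical batch size and the spatial sizes above) showing that the extra depth slice plus padding=(2, 1, 1) does restore the original spatial shape:

```python
import torch
import torch.nn as nn

conv = nn.Conv3d(in_channels=2, out_channels=64,
                 kernel_size=(6, 3, 3), padding=(2, 1, 1))

x = torch.randn(4, 2, 16, 16, 8)       # (N, C, D, H, W); batch size 4 is hypothetical
pad = torch.zeros(4, 2, 1, 16, 8)      # one extra zero slice along depth
x_padded = torch.cat((x, pad), dim=2)  # depth: 16 -> 17

out = conv(x_padded)
# depth: 17 + 2*2 - 6 + 1 = 16, so the spatial dims match the input
print(out.shape)  # torch.Size([4, 64, 16, 16, 8])
```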
Note that the input data is clean and normalized.
It turns out the gradients were too large, so I implemented gradient clipping, and that solved the problem.
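For reference, a minimal sketch of what I mean by gradient clipping, using torch.nn.utils.clip_grad_norm_ (the model, loss, and optimizer here are stand-ins, not my actual code):

```python
import torch
import torch.nn as nn

# stand-in model and optimizer for illustration
model = nn.Conv3d(2, 64, kernel_size=(6, 3, 3), padding=(2, 1, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(2, 2, 17, 16, 8)
loss = model(x).pow(2).mean()  # dummy loss

opt.zero_grad()
loss.backward()
# rescale all gradients so their global L2 norm is at most max_norm,
# then take the optimizer step on the clipped gradients
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```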
But I am wondering why the gradients would explode in PyTorch in the first place.
I was converting Keras code to PyTorch, and the same 3D convolution layer in Keras ran perfectly:
conv = Conv3D(filters=64, kernel_size=(6, 3, 3), strides=(1, 1, 1), border_mode="same", kernel_initializer='random_uniform')(input)
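One difference worth noting: the Keras layer uses kernel_initializer='random_uniform' (uniform on roughly [-0.05, 0.05]), while PyTorch's Conv3d defaults to a Kaiming-uniform scheme, so the initial weight scales can differ. If that difference is suspected, a minimal sketch of matching the Keras-style init in PyTorch:

```python
import torch.nn as nn

conv = nn.Conv3d(2, 64, kernel_size=(6, 3, 3), padding=(2, 1, 1))
# mimic Keras's 'random_uniform' initializer: weights ~ U(-0.05, 0.05),
# bias zeroed (Keras's default bias_initializer is 'zeros')
nn.init.uniform_(conv.weight, -0.05, 0.05)
nn.init.zeros_(conv.bias)
```

Whether this actually explains the exploding gradients would need testing; it is just the most visible difference between the two layer definitions.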
Any suggestion is helpful! If you need more detailed information, please let me know!