After a 3D convolution, all of the values in the output are NaN.
Here is the code in the forward function:

print("input", torch.isnan(input).any())
conv = self.con3d1(input)
print("conv1", torch.isnan(conv).any())
And self.con3d1 is defined as:
self.con3d1= nn.Conv3d(in_channels=2, out_channels=64, kernel_size=(6, 3, 3), padding=(2, 1, 1))
For your information, I pad the input tensor by hand (because torch doesn't support "same" padding):
pad1 = torch.zeros((input.shape[0], 2, 1, 16, 8)).to(DEVICE)
input = torch.cat((input, pad1), dim=2)
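As a sanity check, here is a minimal sketch (with a hypothetical batch size and the spatial sizes above) showing that the extra depth slice plus padding=(2, 1, 1) does restore the original spatial shape:

```python
import torch
import torch.nn as nn

conv = nn.Conv3d(in_channels=2, out_channels=64,
                 kernel_size=(6, 3, 3), padding=(2, 1, 1))

x = torch.randn(4, 2, 16, 16, 8)       # (N, C, D, H, W); batch size 4 is hypothetical
pad = torch.zeros(4, 2, 1, 16, 8)      # one extra zero slice along depth
x_padded = torch.cat((x, pad), dim=2)  # depth: 16 -> 17

out = conv(x_padded)
# depth: 17 + 2*2 - 6 + 1 = 16, so the spatial dims match the input
print(out.shape)  # torch.Size([4, 64, 16, 16, 8])
```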
Note that the input data is clean and normalized.
It turns out the gradients were too large, so I implemented gradient clipping, and that solved the problem.
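For reference, a minimal sketch of what I mean by gradient clipping, using torch.nn.utils.clip_grad_norm_ (the model, loss, and optimizer here are stand-ins, not my actual code):

```python
import torch
import torch.nn as nn

# stand-in model and optimizer for illustration
model = nn.Conv3d(2, 64, kernel_size=(6, 3, 3), padding=(2, 1, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(2, 2, 17, 16, 8)
loss = model(x).pow(2).mean()  # dummy loss

opt.zero_grad()
loss.backward()
# rescale all gradients so their global L2 norm is at most max_norm,
# then take the optimizer step on the clipped gradients
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
opt.step()
```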
But I am wondering why the gradients would explode in PyTorch in the first place.
I was converting Keras code to PyTorch, and the same 3D convolution layer in Keras ran perfectly:
conv = Conv3D(filters=64, kernel_size=(6, 3, 3), strides=(1, 1, 1), border_mode="same", kernel_initializer='random_uniform')(input)
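One difference worth noting: the Keras layer uses kernel_initializer='random_uniform' (uniform on roughly [-0.05, 0.05]), while PyTorch's Conv3d defaults to a Kaiming-uniform scheme, so the initial weight scales can differ. If that difference is suspected, a minimal sketch of matching the Keras-style init in PyTorch:

```python
import torch.nn as nn

conv = nn.Conv3d(2, 64, kernel_size=(6, 3, 3), padding=(2, 1, 1))
# mimic Keras's 'random_uniform' initializer: weights ~ U(-0.05, 0.05),
# bias zeroed (Keras's default bias_initializer is 'zeros')
nn.init.uniform_(conv.weight, -0.05, 0.05)
nn.init.zeros_(conv.bias)
```

Whether this actually explains the exploding gradients would need testing; it is just the most visible difference between the two layer definitions.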
Any suggestion is helpful! If you need more detailed information, please let me know!