Conv3d Problem: SIGSEGV (Signal 11)

I ran into a problem when using Conv3d to process a [32,32,32]-voxel cube. Here is the code:

layer1 = nn.Sequential(
    nn.AvgPool3d((2, 1, 1), stride=(2, 1, 1))
)
layer2_1 = nn.Sequential(
    nn.Conv3d(1, 64, (3, 3, 3), (1, 1, 1), (1, 1, 1)),
    nn.LeakyReLU(RL_leakyrate),
)

Here my input x is a Variable holding a double tensor (torch.DoubleTensor) of size [4,1,32,32,32].

Then I try 2 operations:

out=layer1(x)

This step runs normally and returns a [4,1,16,32,32] tensor of type double. Then I try the next operation:

out2=layer2_1(out)

And it returns the error message:

“Process finished with exit code 139 (interrupted by signal 11: SIGSEGV).”

Then I tried to debug step by step, and I noticed that the crash happens in the source file functional.py, inside the function conv3d, at line 116:

return f(input,weight,bias)

When I run this line, the program exits with all variables gone. I could not find anyone who has reported this before, so I am raising the problem here to look for some help.
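As a side note, the tensor shapes in this thread are all consistent, which points at a crash inside the native convolution call rather than a Python-level size mismatch. A quick sanity check of both layers' expected output sizes using the standard output-size formula (the helper name is mine, not part of the thread's code):

```python
def out_shape(in_shape, channels, kernel, stride, padding=(0, 0, 0)):
    """Expected [N, C, D, H, W] output of a 3-D pool/conv layer.

    Applies the standard rule floor((size + 2*pad - kernel) / stride) + 1
    to each spatial dimension; `channels` is the output channel count.
    """
    n = in_shape[0]
    spatial = in_shape[2:]
    out = [(s + 2 * p - k) // st + 1
           for s, k, st, p in zip(spatial, kernel, stride, padding)]
    return [n, channels] + out

# layer1: AvgPool3d((2,1,1), stride=(2,1,1)) keeps the channel count (1):
print(out_shape([4, 1, 32, 32, 32], 1, (2, 1, 1), (2, 1, 1)))
# [4, 1, 16, 32, 32]

# layer2_1: Conv3d(1, 64, (3,3,3), (1,1,1), (1,1,1)) preserves spatial size:
print(out_shape([4, 1, 16, 32, 32], 64, (3, 3, 3), (1, 1, 1), (1, 1, 1)))
# [4, 64, 16, 32, 32]
```

So the pooled [4,1,16,32,32] tensor is exactly what Conv3d expects, and the expected conv output is [4,64,16,32,32].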

If more information is needed, please tell me and I will provide the details as clearly as possible.

Thanks

Hi Phil,

I tried this script on pytorch version 0.2.0 and on the pytorch master branch. It ran without errors on both.
Can you tell me what machine you are running this on? Is it an old computer?

import torch
import torch.nn as nn
from torch.autograd import Variable
layer1=nn.Sequential(
    nn.AvgPool3d((2,1,1), stride=(2,1,1))
)
layer2_1=nn.Sequential(
    nn.Conv3d(1,64,(3,3,3),(1,1,1),(1,1,1)),
    nn.LeakyReLU(0.1),
)

layer1.double()
layer2_1.double()

inp = Variable(torch.Tensor(4,1,32,32,32).double())
out=layer1(inp)
out2=layer2_1(out)


layer1.cuda()
layer2_1.cuda()

inp = Variable(torch.Tensor(4,1,32,32,32).double().cuda())
out=layer1(inp)
out2=layer2_1(out)

Thanks for the reply! I find that your code also returns the same error for me. When I run it like this:

import torch
import torch.nn as nn
from torch.autograd import Variable
layer1=nn.Sequential(
    nn.AvgPool3d((2,1,1), stride=(2,1,1))
)
layer2_1=nn.Sequential(
    nn.Conv3d(1,64,(3,3,3),(1,1,1),(1,1,1)),
    nn.LeakyReLU(0.1),
)

layer1.double()
layer2_1.double()
inp = Variable(torch.Tensor(4,1,32,32,32).double())
out=layer1(inp)
out2=layer2_1(out)

the same bug reappears at exactly this step, identical to what I described above:

out2=layer2_1(out)

I also tried

out=layer2_1(inp)

and I removed the nn.LeakyReLU(0.1) part from layer2_1 and tried

out=layer2_1(inp)

The same error reappears in exactly the same way.

I haven’t tried the code with .cuda(); maybe it is a bug only in the CPU version of pytorch? Or maybe I need to reinstall my pytorch installation?

I’m sorry for the late reply; thanks for your attention!

If possible, could you try the current master and see if the problem still persists? Thanks!
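For reference, a source build from master at that time would have looked roughly like this; this is only a sketch, and the exact steps depend on your environment and the instructions in the repository's README:

```shell
# Fetch the current master branch along with its submodules
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
# Build and install into the currently active Python environment
python setup.py install
```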

Thanks for the reply!
My version is 0.2.0_1, and the problem still exists.
My computer doesn’t have a GPU, so I can only try the CPU version.
Do you have any advice?
I have tried 2 machines:

  1. Computer A, without a GPU, with the CPU version 0.2.0_3 installed
  2. Computer B, with a GPU, with the GPU version 0.2.0_3 installed

Then I ran the code above, and here is the result:
On computer A, the error reappears every time (after every reinstall).
On computer B, I tried the code both with and without `.cuda()`; the error vanishes and the code runs normally.

So I think it is a bug in the CPU version on a computer without a GPU, which needs to be fixed.

Well, I’m using Ubuntu 16, and my computer is a new one without a GPU, so my code runs on the CPU version of pytorch, version 0.2.0_1.

Well, if possible, could you try the latest master code (not just 0.2; try a manual install from the current github master)? I don’t have a non-GPU machine at hand. Thanks! :slight_smile:

@11117 I ran your code (the following) on a machine without a GPU. My torch.__version__ is 0.2.0_4 and no errors occurred. Could you update your pytorch or build from master and try again?

import torch
import torch.nn as nn
from torch.autograd import Variable
layer1=nn.Sequential(
    nn.AvgPool3d((2,1,1), stride=(2,1,1))
)
layer2_1=nn.Sequential(
    nn.Conv3d(1,64,(3,3,3),(1,1,1),(1,1,1)),
    nn.LeakyReLU(0.1),
)

layer1.double()
layer2_1.double()
inp = Variable(torch.Tensor(4,1,32,32,32).double())
out=layer1(inp)
out2=layer2_1(out)

Amazing!
I updated my version to 0.2.0_4 and the bug vanishes (using the source code on github).
Thanks a lot!

Amazing!
I tried your way and installed the latest version from the current github, and the bug vanishes!
(But the install page of pytorch still offers version 0.2.0_3, in which the bug is still present; should we update it?)
Thanks a lot! Now I can go on recommending pytorch in our lab!


0.2.0_3 is still our latest release. It will be updated when we release a newer version! There is a lot of new stuff, and many fixes/improvements, in master though. Feel free to try them! :slight_smile:
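For anyone comparing the version strings in this thread: the underscore suffix is a build number, so 0.2.0_1 predates 0.2.0_3 and 0.2.0_4. A small illustrative helper for ordering them (the function name is my own, purely for demonstration):

```python
def parse_build(version):
    """Split a '0.2.0_4'-style version string into a comparable tuple of ints.

    '0.2.0_4' -> (0, 2, 0, 4); a missing build suffix counts as build 0.
    """
    release, _, build = version.partition("_")
    return tuple(int(part) for part in release.split(".")) + (int(build or 0),)

print(parse_build("0.2.0_1") < parse_build("0.2.0_4"))  # True
print(parse_build("0.2.0_3") < parse_build("0.2.0_4"))  # True
```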