I wanted to use weight_norm() from the master branch, so I built the bleeding-edge version of PyTorch from source, but it crashes with a segmentation fault.
The error message is:
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007fffd69bba8c in THCudaFree () from /home/user2/.conda/envs/pytorch_master/lib/python3.6/site-packages/torch/lib/libTHC.so.1
Could anyone tell me the best practice for building PyTorch from source?
Also, if I run python setup.py install directly, it fails with an import error:
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
Current thread 0x00007faeabc48700 (most recent call first):
[1] 12644 abort (core dumped) python setup.py install
My workaround is to deactivate the virtual environment and activate it again. After that, PyTorch installs without any problem.
But no matter what program I run, as long as it uses CudaTensor, it crashes with the segmentation fault shown above.
I wonder whether the conda virtual env is the cause. If I want to install the bleeding-edge version of PyTorch in a virtual env, could you please shed some light on the best way to do it?
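One way to check whether the environment itself is in a bad state is to print the settings that decide where Python looks for its standard library. The sketch below uses only the standard library. The helper names describe_python_env and env_mismatch are made up for illustration, and the idea that a stale PYTHONHOME is behind the 'encodings' error is only one common cause, not a confirmed diagnosis for this machine.

```python
import os
import sys


def describe_python_env(environ=None):
    """Collect the settings that decide where Python finds its standard library."""
    env = os.environ if environ is None else environ
    return {
        "executable": sys.executable,              # interpreter actually running
        "prefix": sys.prefix,                      # root of the active installation
        "PYTHONHOME": env.get("PYTHONHOME"),       # a stale value is one common cause of the 'encodings' error
        "PYTHONPATH": env.get("PYTHONPATH"),
        "CONDA_PREFIX": env.get("CONDA_PREFIX"),   # set by conda when an env is activated
        "CONDA_DEFAULT_ENV": env.get("CONDA_DEFAULT_ENV"),
    }


def env_mismatch(info):
    """Return True if conda reports an active env but the interpreter lives elsewhere."""
    conda_prefix = info["CONDA_PREFIX"]
    if not conda_prefix:
        return False
    return os.path.realpath(info["prefix"]) != os.path.realpath(conda_prefix)


if __name__ == "__main__":
    info = describe_python_env()
    for key, value in info.items():
        print("{:>18}: {}".format(key, value))
    if info["PYTHONHOME"]:
        print("PYTHONHOME is set; try unsetting it before running setup.py")
    if env_mismatch(info):
        print("The running interpreter is not the one from the active conda env")
```

Running it once in the shell where setup.py fails and once after re-activating the env should show which variable differs between the two.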
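A minimal script that reproduces the crash:
import math
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
import torch.backends.cudnn as cudnn
# Random batch of 64 RGB images (32x32), moved to the GPU
input = torch.randn(64, 3, 32, 32).cuda()
input_var = Variable(input)
# Let cuDNN benchmark and pick its convolution algorithms
cudnn.benchmark = True
net = nn.Conv2d(3, 24, kernel_size=3, stride=1,
                padding=1, bias=False).cuda()
net.train()
# Forward pass through the convolution
output_var = net(input_var)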
I finally found where the seg fault comes from! It happens because I set cudnn.benchmark = True. Do you have any idea why?
FYI: I can run v0.12 with cudnn.benchmark = True on the same computer, so the installed cuDNN itself should not be the problem. Is it possible that something goes wrong when linking against the cuDNN library?
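To narrow down a possible linking problem, it may help to compare what each install reports about the CUDA and cuDNN versions it was built against. The following is a rough sketch, not a definitive diagnostic. cudnn_report is a hypothetical helper name, and it assumes the build exposes torch.version.cuda and torch.backends.cudnn.version(), which may not hold for every version.

```python
def cudnn_report():
    """Summarise the CUDA/cuDNN details that the imported PyTorch build reports."""
    report = {"torch_installed": False}
    try:
        import torch
        import torch.backends.cudnn as cudnn
    except ImportError:
        return report  # nothing more to inspect in this environment

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["torch_file"] = torch.__file__  # shows which install was actually imported
    # CUDA version the build was compiled against (None on CPU-only builds)
    report["cuda_build"] = getattr(torch.version, "cuda", None)
    report["cuda_available"] = torch.cuda.is_available()
    try:
        report["cudnn_version"] = cudnn.version()
    except Exception as exc:  # assumption: a mismatched library may raise here
        report["cudnn_version"] = "error: {}".format(exc)
    report["cudnn_enabled"] = cudnn.enabled
    report["cudnn_benchmark"] = cudnn.benchmark
    return report


if __name__ == "__main__":
    for key, value in sorted(cudnn_report().items()):
        print("{:>16}: {}".format(key, value))
```

If the source build and the v0.12 install print different cuDNN versions, that would point toward the build picking up a different library than expected. Leaving cudnn.benchmark at False sidesteps the crash in the meantime, since the crash only shows up with the flag set to True.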