Dear all,
I am trying to build PyTorch from source (a git clone of the GitHub repo), but I get the following error at compile time:
FAILED: caffe2/CMakeFiles/torch.dir/operators/torch_generated_channelwise_conv3d_op_cudnn.cu.o
/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(102): error: identifier "__ldg" is undefined
/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(123): error: identifier "__ldg" is undefined
/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(102): error: identifier "__ldg" is undefined
detected during instantiation of "void caffe2::DepthwiseConv3dGPUKernelNCHW(caffe2::DepthwiseArgs, const T *, const T *, T *, int) [with T=float]"
(380): here
/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(123): error: identifier "__ldg" is undefined
detected during instantiation of "void caffe2::DepthwiseConv3dGPUKernelNCHW(caffe2::DepthwiseArgs, const T *, const T *, T *, int) [with T=float]"
(380): here
/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(184): error: identifier "__ldg" is undefined
detected during instantiation of "void caffe2::DepthwiseConv3dBackpropFilterGPUKernelNCHW(caffe2::DepthwiseArgs, const T *, const T *, T *, int) [with T=float]"
(505): here
/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(303): error: identifier "__ldg" is undefined
detected during instantiation of "void caffe2::DepthwiseConv3dBackpropInputGPUKernelNCHW(caffe2::DepthwiseArgs, const T *, const T *, T *, int) [with T=float]"
(515): here
6 errors detected in the compilation of "/tmp/tmpxft_00003524_00000000-13_channelwise_conv3d_op_cudnn.compute_30.cpp1.ii".
CMake Error at torch_generated_channelwise_conv3d_op_cudnn.cu.o.Release.cmake:279 (message):
Error generating file
/tmp/pytorch/pytorch/build/caffe2/CMakeFiles/torch.dir/operators/./torch_generated_channelwise_conv3d_op_cudnn.cu.o
From what I understand, __ldg is available only on devices with compute capability >= 3.5. I am compiling PyTorch against CUDA 9.2.148 on an NVIDIA V100 GPU, so my hardware definitely meets this requirement.
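One thing I notice in the log: the temporary file in the "6 errors detected" line has compute_30 in its name, which I read as nvcc compiling this file for compute capability 3.0 at that point, whatever GPU is installed. That is only my interpretation of the file name. The small sketch below just makes the comparison explicit; the helper name nvcc_target_arch and the regular expression are my own and not part of nvcc or PyTorch.

```python
import re

# Minimum compute capability for __ldg as stated in the post (3.5 -> 35).
LDG_MIN_ARCH = 35


def nvcc_target_arch(log_line):
    """Return the number following 'compute_' in an nvcc temp-file name, or None.

    Hypothetical helper: it only pattern-matches the file name in the log line.
    """
    match = re.search(r"\.compute_(\d+)\.", log_line)
    return int(match.group(1)) if match else None


# Line copied from the build log above.
log_line = (
    '6 errors detected in the compilation of '
    '"/tmp/tmpxft_00003524_00000000-13_channelwise_conv3d_op_cudnn.compute_30.cpp1.ii".'
)

arch = nvcc_target_arch(log_line)
supports_ldg = arch is not None and arch >= LDG_MIN_ARCH
print(arch, supports_ldg)  # prints: 30 False
```

If that reading is right, the failing compilation pass targets an architecture below 3.5 even though the V100 itself is well above it.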
Any help would be appreciated.