PyTorch 1.2.0 build fails with 'error: identifier "__ldg" is undefined'

Dear all,

I am trying to build PyTorch from source (a git clone of the GitHub repo), but I encounter an error at compile time:

FAILED: caffe2/CMakeFiles/torch.dir/operators/torch_generated_channelwise_conv3d_op_cudnn.cu.o 

/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(102): error: identifier "__ldg" is undefined

/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(123): error: identifier "__ldg" is undefined

/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(102): error: identifier "__ldg" is undefined
          detected during instantiation of "void caffe2::DepthwiseConv3dGPUKernelNCHW(caffe2::DepthwiseArgs, const T *, const T *, T *, int) [with T=float]" 
(380): here

/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(123): error: identifier "__ldg" is undefined
          detected during instantiation of "void caffe2::DepthwiseConv3dGPUKernelNCHW(caffe2::DepthwiseArgs, const T *, const T *, T *, int) [with T=float]" 
(380): here

/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(184): error: identifier "__ldg" is undefined
          detected during instantiation of "void caffe2::DepthwiseConv3dBackpropFilterGPUKernelNCHW(caffe2::DepthwiseArgs, const T *, const T *, T *, int) [with T=float]" 
(505): here

/tmp/pytorch/pytorch/caffe2/operators/channelwise_conv3d_op_cudnn.cu(303): error: identifier "__ldg" is undefined
          detected during instantiation of "void caffe2::DepthwiseConv3dBackpropInputGPUKernelNCHW(caffe2::DepthwiseArgs, const T *, const T *, T *, int) [with T=float]" 
(515): here

6 errors detected in the compilation of "/tmp/tmpxft_00003524_00000000-13_channelwise_conv3d_op_cudnn.compute_30.cpp1.ii".
CMake Error at torch_generated_channelwise_conv3d_op_cudnn.cu.o.Release.cmake:279 (message):
  Error generating file
  /tmp/pytorch/pytorch/build/caffe2/CMakeFiles/torch.dir/operators/./torch_generated_channelwise_conv3d_op_cudnn.cu.o

From what I understand, __ldg is available only on devices with compute capability >= 3.5. I am compiling PyTorch against CUDA 9.2.148 for an NVIDIA V100 GPU (compute capability 7.0), so my hardware definitely meets this requirement.

Any help would be appreciated.

It might be related to this issue.
From the issue:

Set the environment variable with export TORCH_CUDA_ARCH_LIST=7.0 before building, and this should be fixed.
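
For context on why this helps: the last line of the log names a temporary file containing compute_30, which suggests the failing pass was targeting compute capability 3.0, below the 3.5 that __ldg needs, even though the V100 itself is 7.0. Here is a minimal Python sketch of that check. The helper names failing_arch and arch_list_value are made up for illustration and are not part of PyTorch or CUDA; the 3.5 threshold is the one stated in the question.

```python
import re

# Minimum architecture for __ldg, written as major*10 + minor (3.5 -> 35).
LDG_MIN_ARCH = 35


def failing_arch(log_line):
    """Return the arch number found in an nvcc temp-file name such as
    '..._op_cudnn.compute_30.cpp1.ii', or None if no arch is present."""
    match = re.search(r"\.compute_(\d+)\.", log_line)
    return int(match.group(1)) if match else None


def arch_list_value(major, minor):
    """Format a compute capability as a TORCH_CUDA_ARCH_LIST entry."""
    return f"{major}.{minor}"


# Last diagnostic line from the build log in the question.
line = ('6 errors detected in the compilation of '
        '"/tmp/tmpxft_00003524_00000000-13_channelwise_conv3d_op_cudnn'
        '.compute_30.cpp1.ii".')

arch = failing_arch(line)
if arch is not None and arch < LDG_MIN_ARCH:
    # The V100 is compute capability 7.0, so restrict the build to that.
    print("export TORCH_CUDA_ARCH_LIST=" + arch_list_value(7, 0))
```

With the variable set to 7.0, the build should no longer generate code for the older architectures where __ldg is unavailable.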

Thanks for your help; that solved the issue.