I implemented a PyTorch CUDA extension for xnor_gemm.
When I run this GEMM in a small demo.py, there is no problem.
But I get a CUDA illegal memory access error when I call the same function inside the forward function of ALBERT (Hugging Face).
PyTorch version: 1.4.0 (CUDA version 10.1)
CUDA nvcc: 10.0
I tried other versions of PyTorch, such as 1.6.0, but the error still happens.
Here is the small demo:
import torch
import xnor_cuda

def test():
    # Two all-ones matrices on the GPU: (32, 128) x (128, 32)
    a = torch.ones(32, 128).to(device='cuda')
    b = torch.ones(128, 32).to(device='cuda')
    output1 = xnor_cuda.xnor_gemm(a, b)
    return 1

test()
Here is pseudocode for where the error happens; test() is the same as in the small demo:
class A:
    def forward(self):
        test()
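CUDA kernel launches are asynchronous, so an illegal memory access is often reported at a later synchronizing call (a copy or a print, for example) rather than at the kernel that actually faulted. A minimal sketch of how the failure could be localized, assuming nothing about the extension itself: `checked_call` is a hypothetical helper name, and `torch.matmul` only stands in for `xnor_cuda.xnor_gemm` so the snippet runs without the extension installed.

```python
import os

# Assumption: this runs before CUDA is initialized (ideally before importing
# torch). With CUDA_LAUNCH_BLOCKING=1, kernel launches become synchronous,
# so an error is raised at the launch that caused it.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")

import torch


def checked_call(fn, *args):
    """Hypothetical helper: run fn(*args), then synchronize the device so
    any pending CUDA error is raised here instead of at a later call."""
    out = fn(*args)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return out


if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    a = torch.ones(32, 128, device=device)
    b = torch.ones(128, 32, device=device)
    # torch.matmul is a stand-in; the real check would pass xnor_cuda.xnor_gemm.
    out = checked_call(torch.matmul, a, b)
    print(out.shape)
```

If the wrapped call inside forward() raises while the operations before it pass the same check, the fault is more likely in the extension kernel than in the surrounding model code. Running the script under cuda-memcheck would be another way to find the offending kernel.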
The error is:
RuntimeError: CUDA error: an illegal memory access was encountered (copy_kernel_cuda at /pytorch/aten/src/ATen/native/cuda/Copy.cu:180)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7f5316c95193 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x56e3912 (0x7f531c7cb912 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #2: <unknown function> + 0x1a2a41d (0x7f5318b1241d in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x1a266ff (0x7f5318b0e6ff in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #4: at::native::copy_(at::Tensor&, at::Tensor const&, bool) + 0x3e (0x7f5318b10bee in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x199af9d (0x7f5318a82f9d in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #6: <unknown function> + 0x56e2ed2 (0x7f531c7caed2 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #7: <unknown function> + 0x1a2a41d (0x7f5318b1241d in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #8: <unknown function> + 0x1a266ff (0x7f5318b0e6ff in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #9: at::native::copy_(at::Tensor&, at::Tensor const&, bool) + 0x3e (0x7f5318b10bee in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #10: <unknown function> + 0x436ecb8 (0x7f531b456cb8 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #11: <unknown function> + 0x199af9d (0x7f5318a82f9d in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #12: <unknown function> + 0x1cb836d (0x7f5318da036d in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #13: at::native::to(at::Tensor const&, c10::Device, c10::ScalarType, bool, bool, c10::optional<c10::MemoryFormat>) + 0x2a6 (0x7f5318da20d6 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #14: <unknown function> + 0x1ffdbf3 (0x7f53190e5bf3 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #15: <unknown function> + 0x3ce3db2 (0x7f531adcbdb2 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #16: <unknown function> + 0x204864a (0x7f531913064a in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #17: at::Tensor::to(c10::Device, c10::ScalarType, bool, bool, c10::optional<c10::MemoryFormat>) const + 0x1fb (0x7f53622ca24b in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch_python.so)
frame #18: at::print(std::ostream&, at::Tensor const&, long) + 0x917 (0x7f53189df5e7 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #19: <unknown function> + 0x366fd (0x7f52be2546fd in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/xnor_cuda-0.0.0-py3.6-linux-x86_64.egg/xnor_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #20: xnor_gemm_cuda(at::Tensor, at::Tensor) + 0x337 (0x7f52be25535a in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/xnor_cuda-0.0.0-py3.6-linux-x86_64.egg/xnor_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #21: xnor_gemm(at::Tensor, at::Tensor) + 0x57 (0x7f52be247417 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/xnor_cuda-0.0.0-py3.6-linux-x86_64.egg/xnor_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #22: <unknown function> + 0x2b6fd (0x7f52be2496fd in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/xnor_cuda-0.0.0-py3.6-linux-x86_64.egg/xnor_cuda.cpython-36m-x86_64-linux-gnu.so)
frame #23: <unknown function> + 0x32ae1 (0x7f52be250ae1 in /home/piaotairen/.conda/envs/piaoenv36/lib/python3.6/site-packages/xnor_cuda-0.0.0-py3.6-linux-x86_64.egg/xnor_cuda.cpython-36m-x86_64-linux-gnu.so)
<omitting python frames>