CUDA error: illegal memory access with the C++ API when calling .to(k::CUDA) from CUDA code


I have a class with some at::Tensor fields, and some of its methods are implemented in CUDA.
At some point an at::Tensor field has to be moved to the GPU so that a CUDA kernel can run on it. During that operation:

tensor =;

The following error arises:

CUDA error: an illegal memory access was encountered (copy_from_cpu at /opt/conda/conda-bld/pytorch_1549628766161/work/aten/src/ATen/native/cuda/
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x45 (0x7f565fc4ccf5 in ../lib/python3.6/site-packages/torch/lib/
frame #1: (anonymous namespace)::copy_from_cpu(at::Tensor&, at::Tensor const&) + 0x4ab (0x7f566510b86b in ../lib/python3.6/site-packages/torch/lib/
frame #2: void (anonymous namespace)::_copy__cuda<int>(at::Tensor&, at::Tensor const&, bool) + 0xd35 (0x7f56651a7245 in ../lib/python3.6/site-packages/torch/lib/
frame #3: at::native::_s_copy__cuda(at::Tensor&, at::Tensor const&, bool) + 0x1e5 (0x7f566510bbe5 in ../lib/python3.6/site-packages/torch/lib/
frame #4: at::CUDAIntType::s_copy_(at::Tensor&, at::Tensor const&, bool) const + 0x62 (0x7f5663ecd9a2 in ../lib/python3.6/site-packages/torch/lib/
frame #5: at::TypeDefault::copy_(at::Tensor&, at::Tensor const&, bool) const + 0x108 (0x7f56607896c8 in ../lib/python3.6/site-packages/torch/lib/
frame #6: at::TypeDefault::copy(at::Tensor const&, bool, c10::optional<c10::Device>) const + 0x322 (0x7f5660751732 in ../lib/python3.6/site-packages/torch/lib/
frame #7: <unknown function> + 0x719737 (0x7f5660577737 in ../lib/python3.6/site-packages/torch/lib/
frame #8: at::native::to(at::Tensor const&, c10::Device, c10::ScalarType, bool, bool) + 0x4c8 (0x7f566057a148 in ../lib/python3.6/site-packages/torch/lib/
frame #9: at::TypeDefault::to(at::Tensor const&, c10::Device, c10::ScalarType, bool, bool) const + 0x1b (0x7f5660717d0b in ../lib/python3.6/site-packages/torch/lib/

Am I not supposed to call .to() in .cu files?

I use the latest stable PyTorch (1.0.1) with CUDA 9.0.

Can you try moving the tensor to CUDA before calling the CUDA kernel?