Illegal memory access on tensors with large dimensions in a custom CUDA extension

I am trying to use the correlation function from https://github.com/NVlabs/PWC-Net and/or https://github.com/lliuz/ARFlow. I have updated both to use the static-method autograd.Function implementation required by PyTorch 1.6.0 and replaced THCudaTensor/at::Tensor with torch::Tensor.

If I test with reasonably sized tensors (e.g. 16, 8, 128, 128), it fails with an illegal memory access in the backward methods (after resizing the blobs); however, it works fine with smaller-shaped tensors (e.g. 16, 64, 64, 64), even though their total volume is greater.

I haven’t touched the CUDA kernels of either repository, and I assume they should be fine since both are published (and one is from NVIDIA)…

Both of them previously used older versions of CUDA (8 & 9) and PyTorch (0.4 and 1.1); could this also be the issue?

Source for my changes can be found here.

Cheers

Your link is not available and yields a 404.
I don’t know which custom kernels you are using, but since the illegal memory access seems to be size-dependent, I guess some int32/int64 indexing might fail.
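As an illustration only (not code from your repositories), this is the kind of size-dependent failure I mean: an offset computed entirely in 32-bit arithmetic can wrap to a negative value for large tensors, which then shows up as an illegal memory access.

```cuda
#include <cstdint>

// Illustration only (not code from either repository): an offset like
//   int off = ((b * c + ch) * h + y) * w + x;
// is evaluated entirely in 32-bit arithmetic and can wrap to a negative
// value for large tensors; promoting to 64-bit avoids that.
__global__ void scale_kernel(float* data, int n, int c, int h, int w, float s) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int b = blockIdx.z;
    if (x >= w || y >= h || b >= n) return;

    for (int ch = 0; ch < c; ++ch) {
        // 64-bit index arithmetic: the cast applies before any multiplication.
        int64_t off = (((int64_t)b * c + ch) * h + y) * w + x;
        data[off] *= s;
    }
}
```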
You could run the code via cuda-gdb and create an issue in the corresponding repository, where the kernel is provided.

I did a few more tests and some debugging via printing in the kernels, and it seems that for some combinations of input height/width and padding (more precisely, a lack of padding), the correlation forward and backward passes try to access outside of the image, i.e. index -1.
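To make the failure concrete, here is the arithmetic I believe produces the negative index. I'm assuming the usual FlowNet/PWC-Net style parameters (pad_size, kernel_size, max_displacement); the names and the formula are my working assumptions, not something taken from the kernel source.

```cpp
#include <cstdio>

// Leftmost/topmost coordinate the correlation window can read for the first
// output pixel, assuming FlowNet/PWC-Net style parameters (my assumption,
// not the exact kernel variables).
int leftmost_read(int pad_size, int kernel_size, int max_displacement) {
    int kernel_radius = (kernel_size - 1) / 2;
    // Output pixel 0 sits at column pad_size of the padded image; the search
    // window then reaches max_displacement + kernel_radius further left.
    return pad_size - max_displacement - kernel_radius;
}

int main() {
    printf("%d\n", leftmost_read(0, 1, 4));  // -4 -> reads before the image
    printf("%d\n", leftmost_read(4, 1, 4));  //  0 -> stays in bounds
}
```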

I’m not entirely sure who the original author of the correlation kernel is; the earliest version I’ve noticed is NVIDIA’s PWC-Net. The link is 404’ing because I figured this out, made the fix, noted that I need at least some padding (and to increase it if I hit this problem again), and then merged it back into my master branch.

The bad accesses occur where the boundary checks are (unless otherwise stated in a comment). If I can figure out a more elegant solution, I could submit a PR? I believe there should be a way of calculating the minimal required padding.
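As a stop-gap, something like the following host-side check in the C++ binding is what I have in mind. The parameter names (pad_size, kernel_size, max_displacement) and the minimal-padding formula are my working assumptions based on the common FlowNet-style wrapper, not something stated by the original authors.

```cpp
#include <torch/extension.h>

// Working assumption: the correlation window for any output pixel never
// reaches further than kernel_radius + max_displacement outside the image,
// so that is the minimal padding needed to avoid negative indices.
inline int minimal_required_padding(int kernel_size, int max_displacement) {
    int kernel_radius = (kernel_size - 1) / 2;
    return kernel_radius + max_displacement;
}

// Fail early in the binding instead of faulting inside the kernel.
void check_correlation_args(int pad_size, int kernel_size, int max_displacement) {
    TORCH_CHECK(pad_size >= minimal_required_padding(kernel_size, max_displacement),
                "pad_size=", pad_size, " is too small; need at least ",
                minimal_required_padding(kernel_size, max_displacement));
}
```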

That sounds like great debugging. Sure, if you have a proper fix, the authors would probably be really glad to receive it. :slight_smile: