Thanks for your analysis. I took a quick look at your repository and now use copy_ instead of _th_copy_ignoring_overlaps_ in my code.
But now another problem has come up: my extension only works when the tensor is on the current device.
y = fun(x.to(0))  # works fine
y = fun(x.to(1))  # raises an error
I’m not asking you to solve my problem. I will make a careful comparison and run more tests later.