I use the Numba package to write the forward and backward CUDA operations, and wrap them in a module. When I use it, the input tensors, which require grad, go through the module. The setup can be sketched as:
import torch
import torch.nn as nn
from torch.autograd import Function
import numba.cuda

@numba.cuda.jit
def forward_cudakernel(x):
    ...  # kernel body written with Numba

@numba.cuda.jit
def backward_cudakernel(x):
    ...  # kernel body written with Numba

def forward_cuda(x):
    with torch.no_grad():
        x_nb = numba.cuda.as_cuda_array(x)  # zero-copy view of the same GPU memory
        forward_cudakernel[blocks, threads](x_nb)  # blocks/threads: launch configuration
    return x  # the kernel has updated x's memory in place

def backward_cuda(x):
    with torch.no_grad():
        x_nb = numba.cuda.as_cuda_array(x)
        backward_cudakernel[blocks, threads](x_nb)
    return x

class BaseModule(Function):
    @staticmethod
    def forward(ctx, x):
        ...
        return forward_cuda(x)

    @staticmethod
    def backward(ctx, grad_output):
        ...  # computes grads, e.g. via backward_cuda
        return grads

class Module(nn.Module):
    def forward(self, feature):
        fun = BaseModule.apply
        return fun(feature)
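(The real kernel bodies are elided above; just to show the kind of thing I mean, a hypothetical toy kernel that doubles a 1-D array in place would look like this — my actual kernels are more involved:)

@numba.cuda.jit
def toy_scale_kernel(x):  # hypothetical example, not my real kernel
    i = numba.cuda.grid(1)  # absolute index of this thread
    if i < x.size:
        x[i] *= 2.0  # element-wise update, in place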
>>> features = torch.rand(2, 256, 20, 20).cuda()
>>> features.requires_grad_()
>>> net = Module()
>>> out = net(features)
>>> out.mean().backward()
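The conversion itself already reproduces the problem, independent of the module above (a minimal sketch; the exact error text may differ across PyTorch/Numba versions):

>>> t = torch.rand(4).cuda()
>>> numba.cuda.as_cuda_array(t)  # fine: a plain CUDA tensor exposes __cuda_array_interface__
>>> t.requires_grad_()
>>> numba.cuda.as_cuda_array(t)  # raises: the attribute is refused once the tensor requires grad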
Here “out” and “features” need grads so that the module can be trained. However, the attribute “__cuda_array_interface__” (which numba.cuda.as_cuda_array relies on) is not available on a tensor that requires grad, so the conversion fails and the two seem incompatible. So, how can I use “numba” and “pytorch” together? Thanks!