Issue with gradients of sparse matrices using CUDA

Hello everyone,

I can replicate the code from the torch.sparse.mm documentation (PyTorch 1.9.0) on the CPU. However, when I try to run it on the GPU via CUDA, it seems I can no longer get the gradients.

In the code below, if device = 'cpu', then I get the sparse and dense gradients for a and b respectively. If device = 'cuda', I instead get this warning:

<ipython-input-104-da255cd7360a>:1: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more information.

This happens with both the torch.sparse.mm and torch.spmm methods.

Is it the case that gradients through sparse matrices are in general not available on CUDA?

import torch

device = 'cpu'
# sparse: note that requires_grad_() is called before the .to(device) move
a = torch.randint(0, 2, size=(2, 3)).float().to_sparse().requires_grad_(True).to(device)
# dense
b = torch.randn(3, 2, requires_grad=True).to(device)
y = torch.sparse.mm(a, b)
# y = torch.spmm(a, b)
y.sum().backward()
print(a.grad)
print(b.grad)

No, that’s not the case.
As the warning explains, you are trying to access the .grad attribute of a non-leaf tensor, which won’t work.
The reason you are seeing this warning only on the GPU is that to(device) is a no-op when the tensor is already on the CPU, but a real (differentiable) copy when moving to the GPU, so the moved tensors are no longer leaf tensors.
You can additionally check this via print(a.is_leaf) and print(b.is_leaf), which will show True on the CPU and False on the GPU.
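
To make the leaf vs. non-leaf distinction concrete, here is a minimal check using a dense tensor (the shape is arbitrary and just for illustration):

import torch

# requires_grad_() before the move: .to() creates a new, non-leaf tensor
x = torch.randn(3, 2).requires_grad_(True).to('cuda')
print(x.is_leaf)  # False -> x.grad won't be populated by backward()

# move first, then requires_grad_(): the tensor stays a leaf
y = torch.randn(3, 2).to('cuda').requires_grad_(True)
print(y.is_leaf)  # True -> y.grad will be populated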
This code snippet should work:

device = 'cuda'
# sparse: move to the device first, then enable gradients, so a stays a leaf
a = torch.randint(0, 2, size=(2, 3)).float().to_sparse().to(device).requires_grad_(True)
# dense: created directly on the device, so b is a leaf as well
b = torch.randn(3, 2, device=device, requires_grad=True)
y = torch.sparse.mm(a, b)
y.sum().backward()
print(a.grad)
print(b.grad)
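
Alternatively, if you want to keep your original order of operations, the warning itself points at a workaround: call .retain_grad() on the non-leaf tensor so autograd populates its .grad anyway. A minimal sketch (assuming .retain_grad() behaves the same for sparse tensors as for dense ones):

import torch

device = 'cuda'
# requires_grad_() before the move, so this a is a non-leaf tensor
a = torch.randint(0, 2, size=(2, 3)).float().to_sparse().requires_grad_(True).to(device)
a.retain_grad()  # ask autograd to populate .grad on this non-leaf tensor
b = torch.randn(3, 2, device=device, requires_grad=True)
torch.sparse.mm(a, b).sum().backward()
print(a.grad)  # populated despite a being a non-leaf tensor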

Awesome! Thank you for the insightful explanation, Peter!