Hello everyone,

I can replicate the example from the torch.sparse.mm — PyTorch 1.9.0 documentation on the CPU. However, when I try to run it on the GPU with CUDA, it seems I can no longer get the gradients.

In the code below, if `device = 'cpu'`, I get the sparse and dense gradients for `a` and `b` respectively. If `device = 'cuda'`, I instead get this warning and the gradients come back as `None`:

`<ipython-input-104-da255cd7360a>:1: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more information.`

This happens with both the torch.sparse.mm and torch.spmm methods.
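
The warning mentions a non-leaf tensor, so my guess is that the `.to(device)` call after `requires_grad_()` is what differs between the two devices: `.to('cuda')` returns a new (non-leaf) copy, while `.to('cpu')` on a tensor already on the CPU is a no-op. A quick check seems consistent with that:

```
import torch

x = torch.randn(3, requires_grad=True)
# .to('cpu') on a CPU tensor returns the same tensor, which stays a leaf
print(x.to('cpu').is_leaf)   # True
# .to('cuda') creates a copy with a grad_fn, so it is no longer a leaf
print(x.to('cuda').is_leaf)  # False
```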

Or is it the case that gradients through sparse matrices are in general not available on CUDA?

```
import torch

device = 'cpu'  # switching this to 'cuda' produces the warning above

# sparse matrix; note the .to(device) after requires_grad_()
a = torch.randint(0, 2, size=(2, 3)).float().to_sparse().requires_grad_(True).to(device)
# dense matrix
b = torch.randn(3, 2, requires_grad=True).to(device)

y = torch.sparse.mm(a, b)
# y = torch.spmm(a, b)  # same behavior with spmm
y.sum().backward()

print(a.grad)  # sparse gradient on CPU; None (plus the warning) on CUDA
print(b.grad)  # dense gradient on CPU; None on CUDA
```
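
For comparison, here is a variant that constructs both tensors directly on the device before calling `requires_grad_()`, so they stay leaves. This is only a sketch based on my guess above; I'd expect `a.grad` and `b.grad` to be populated on CUDA as well in this case:

```
import torch

device = 'cuda'
# build on the target device first, then mark as a leaf that requires grad
a = torch.randint(0, 2, size=(2, 3), device=device).float().to_sparse().requires_grad_(True)
b = torch.randn(3, 2, device=device, requires_grad=True)

y = torch.sparse.mm(a, b)
y.sum().backward()

print(a.is_leaf, b.is_leaf)  # True True
print(a.grad)  # sparse gradient
print(b.grad)  # dense gradient
```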