I have a custom `torch.autograd.Function` that outputs a `torch.sparse.FloatTensor` gradient in the backward pass, and I'd like to use built-in sparse optimizers such as `SparseAdam` and `Adagrad` to update my variables with those sparse gradients. `SGD` has no problem using my custom `torch.sparse.FloatTensor` gradient, but `SparseAdam` and `Adagrad` expect the gradient to have a `_sparse_mask` attribute. I get this error when running `optimizer.step()`: `AttributeError: 'torch.sparse.FloatTensor' object has no attribute '_sparse_mask'`.

I couldn't find much information about `_sparse_mask` in the documentation. What's the proper way to output custom sparse tensors from backward so they work with the sparse optimizers?
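
For reference, here is a minimal sketch of the kind of setup I mean. `SparseGradSum` and the shapes are just illustrative placeholders, not my real function; the point is only that backward returns a `torch.sparse.FloatTensor`:

```python
import torch


class SparseGradSum(torch.autograd.Function):
    # Toy stand-in for my real function: forward just sums the weights,
    # backward hands autograd a sparse gradient with a single nonzero.
    @staticmethod
    def forward(ctx, weight):
        ctx.size = weight.shape
        return weight.sum()

    @staticmethod
    def backward(ctx, grad_output):
        indices = torch.LongTensor([[0], [0]])  # one nonzero at (0, 0)
        values = grad_output.view(1)
        return torch.sparse.FloatTensor(indices, values, ctx.size)


weight = torch.randn(4, 3, requires_grad=True)
SparseGradSum.apply(weight).backward()
print(weight.grad)  # a torch.sparse.FloatTensor

# SGD steps fine with this gradient...
torch.optim.SGD([weight], lr=0.1).step()

# ...but this is where I hit
# AttributeError: 'torch.sparse.FloatTensor' object has no attribute '_sparse_mask'
torch.optim.SparseAdam([weight], lr=0.1).step()
```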