I am training a model with a custom loss function which involves sparse tensors being created.
Specifically, in the forward pass of the loss function:
I ravel my prediction and label arrays, stack them, and use the stacked result as the indices of a `sparse_coo_tensor`, with `torch.ones` as the values and shape given by `[torch.amax(raveled_labels) + 1, torch.amax(raveled_pred) + 1]`. I then create a few more sparse tensors by performing some `index_select` ops on that first sparse tensor, apply `sparse.sum` and `mul` ops to the newly created sparse tensors, and do some more arithmetic with the returned values to produce a final scalar.
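To make the question concrete, here is a minimal sketch of the forward pass described above. The function name, the particular `index_select`/`sparse.sum` combination, and the final arithmetic are my assumptions for illustration, not the actual loss:

```python
import torch

def sparse_confusion_loss(pred, labels):
    # Hypothetical reconstruction of the described forward pass.
    raveled_pred = pred.ravel().long()
    raveled_labels = labels.ravel().long()

    # Stack raveled labels/predictions as COO indices; values are all ones.
    indices = torch.stack([raveled_labels, raveled_pred])  # shape (2, N)
    values = torch.ones(raveled_pred.numel())
    shape = (int(raveled_labels.amax()) + 1, int(raveled_pred.amax()) + 1)
    s = torch.sparse_coo_tensor(indices, values, shape).coalesce()

    # Derived sparse tensors via index_select, then sparse.sum and mul.
    rows = s.index_select(0, torch.arange(shape[0]))
    row_sums = torch.sparse.sum(rows, dim=1).to_dense()
    total = torch.sparse.sum(s)

    # Some final dense arithmetic yielding a single number.
    return total / (row_sums.sum() + 1e-8)
```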
Correct me if I am wrong, but the creation of a sparse tensor in the forward pass of a loss function does not have autograd support; all the other ops do work with autograd. It would be really neat if autograd/backward worked through this entire chain of ops – it might just revolutionize computer vision.
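One reason to suspect the graph breaks at tensor creation, if I understand correctly: COO indices must be integer tensors, and casting a float prediction to `long` detaches it from autograd. A quick check (toy values, my own construction) seems to confirm the created tensor carries no gradient history:

```python
import torch

# Float predictions that participate in autograd.
pred = torch.tensor([0.0, 1.0], requires_grad=True)

# Casting to long for use as indices detaches from the autograd graph.
idx = torch.stack([pred.long(), pred.long()])
vals = torch.ones(2)  # plain values, no grad history either

s = torch.sparse_coo_tensor(idx, vals, (2, 2))
print(s.requires_grad)  # False: no gradient can flow through integer indices
```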
If I try to train a model with this custom loss function as is (i.e., without writing my own backward), I get a `CUDA error: invalid configuration argument`.
Thanks, everyone! Happy to answer more questions about this use case!