I am training a model with a custom loss function that creates sparse tensors.
Specifically, in the forward pass of the loss function I:

1. ravel my prediction and label arrays,
2. stack them and use them as the indices to create a `sparse_coo_tensor`, with the values given by `torch.ones` and the shape given by `[torch.amax(raveled_labels) + 1, torch.amax(raveled_pred) + 1]`,
3. create a few more sparse tensors by performing `index_select` ops on that first sparse tensor,
4. perform `mul` ops on the newly created sparse tensors, and
5. do some more arithmetic with the returned values to produce the final scalar loss.
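For concreteness, here is a minimal sketch of that chain of ops, assuming the predictions are already integer class indices (the function name and the final arithmetic are placeholders of my own, not the actual loss):

```python
import torch

def sparse_confusion_loss(pred, labels):
    # 1. Ravel predictions and labels (both integer class-index tensors)
    raveled_pred = pred.reshape(-1)
    raveled_labels = labels.reshape(-1)

    # 2. Stack them as 2 x N indices and build the sparse tensor
    indices = torch.stack([raveled_labels, raveled_pred])
    values = torch.ones(indices.shape[1])
    size = (int(torch.amax(raveled_labels)) + 1,
            int(torch.amax(raveled_pred)) + 1)
    sp = torch.sparse_coo_tensor(indices, values, size).coalesce()

    # 3. Derive further sparse tensors via index_select (here: all rows)
    rows = sp.index_select(0, torch.arange(size[0]))

    # 4. Elementwise mul of sparse tensors, then 5. reduce to a scalar
    prod = rows.mul(rows)
    return prod.coalesce().values().sum()
```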
Correct me if I am wrong, but the creation of a sparse tensor in the forward pass of a loss function does not have autograd support; all of the other ops do work with autograd. It would be really neat if autograd/backward worked with this entire chain of ops; it might just revolutionize computer vision.
If I try to train a model with this custom loss function as is (i.e. without writing my own backward), I get a `CUDA error: invalid configuration argument`.
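In case it helps frame the question: the usual way to supply your own backward is to wrap the whole loss in a `torch.autograd.Function`. Below is a toy skeleton of that pattern, with my own placeholder loss (count of off-diagonal entries) and a deliberately trivial zero gradient, since the index construction itself is not differentiable; the class name and gradient are illustrative assumptions, not the actual solution:

```python
import torch

class SparseCountLoss(torch.autograd.Function):
    """Skeleton: forward builds a sparse tensor from integer indices,
    backward supplies a hand-written gradient."""

    @staticmethod
    def forward(ctx, scores, labels):
        # scores: float predictions; derive integer class indices from them
        pred_idx = scores.argmax(dim=-1).reshape(-1)
        lab_idx = labels.reshape(-1)
        indices = torch.stack([lab_idx, pred_idx])
        values = torch.ones(indices.shape[1])
        size = (int(lab_idx.max()) + 1, int(pred_idx.max()) + 1)
        sp = torch.sparse_coo_tensor(indices, values, size).coalesce()
        ctx.save_for_backward(scores)
        # Placeholder scalar loss: number of off-diagonal (mismatch) entries
        dense = sp.to_dense()
        return dense.sum() - dense.diagonal().sum()

    @staticmethod
    def backward(ctx, grad_output):
        (scores,) = ctx.saved_tensors
        # Placeholder gradient (zeros): the index construction is not
        # differentiable, so a real backward needs your own derivation here
        return torch.zeros_like(scores), None
```

Calling it via `SparseCountLoss.apply(scores, labels)` gives you a loss whose `.backward()` runs your hand-written gradient instead of autograd tracing through the sparse ops.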
Thanks everyone! Happy to answer more questions about this use case!