Backprop Through Sparse Tensor Is Not Memory Efficient?

Thanks for the clarification!

Fortunately, in my case the backprop operation is quite simple, so I implemented a custom backward pass to solve the memory issue.
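The custom backward itself is not shown in this thread. As a rough illustration of the general idea only (not the code referred to above), here is a framework-agnostic sketch in plain Python. It assumes the operation is a sparse matrix times a dense vector with the matrix stored in COO form; the function names `spmv_forward` and `spmv_backward` are hypothetical. The point is that the gradient is computed only at the stored nonzero positions, so no dense gradient the size of the full matrix is ever allocated.

```python
def spmv_forward(rows, cols, vals, x, n_rows):
    """Compute y = A @ x for A stored in COO form (rows, cols, vals)."""
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y


def spmv_backward(rows, cols, vals, x, grad_y):
    """Hand-written backward pass for y = A @ x.

    Returns gradients for the stored nonzero values and for x.
    Memory use is O(nnz + len(x)); no dense n_rows x n_cols
    gradient is built.
    """
    # dL/dA[r, c] = grad_y[r] * x[c], evaluated only at stored positions.
    grad_vals = [grad_y[r] * x[c] for r, c in zip(rows, cols)]
    # dL/dx[c] = sum over stored entries in column c of A[r, c] * grad_y[r].
    grad_x = [0.0] * len(x)
    for r, c, v in zip(rows, cols, vals):
        grad_x[c] += v * grad_y[r]
    return grad_vals, grad_x


# Toy example (assumed data): A = [[2, 0, 0], [0, 0, 3]] in COO form.
rows, cols, vals = [0, 1], [0, 2], [2.0, 3.0]
x = [1.0, 5.0, 4.0]
y = spmv_forward(rows, cols, vals, x, n_rows=2)
grad_vals, grad_x = spmv_backward(rows, cols, vals, x, grad_y=[1.0, 1.0])
```

In an autograd framework the same two functions would typically be wrapped in that framework's custom-op mechanism, so that the hand-written backward replaces the automatically derived one.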
