Hi there,
I am implementing Graph Attention Convolution and ran into trouble with the PyTorch framework.
The setting is: I have an indices tensor i (shape: 2 x m) and a values tensor v (shape: m).
v is a Variable that requires grad. (It is in fact produced by an attention network.)
Now I need to construct a sparse.Tensor(i, v) and use it for further operations, but the construction step is non-differentiable. (I want to convert to a sparse tensor because I need torch.mm for an efficient sparse * dense forward pass, which I don't know how to implement efficiently myself.)
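To make the setup concrete, here is a minimal sketch of what I am trying to do. The sizes (n = 3 nodes, m = 4 nonzero entries, feature dim 8) are purely illustrative, and I am using torch.sparse_coo_tensor / torch.sparse.mm as the construction and multiplication:

```python
import torch

# Illustrative sizes: n nodes, m nonzero (edge) entries.
n, m = 3, 4
i = torch.tensor([[0, 0, 1, 2],
                  [1, 2, 2, 0]])        # indices, shape 2 x m
v = torch.rand(m, requires_grad=True)   # values, shape m; produced by attention, needs grad

# The construction in question: a sparse n x n matrix from (i, v) ...
A = torch.sparse_coo_tensor(i, v, (n, n))

# ... so that the forward pass is an efficient sparse * dense product:
x = torch.randn(n, 8)                   # dense node features
out = torch.sparse.mm(A, x)             # shape n x 8
```

The question is how to keep the gradient chain from out back to v intact across the sparse-tensor construction.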
Could you suggest some solutions for this?
Thank you.
(PyTorch currently works perfectly for me, except for its poor support for sparse tensors …)