Implementation of Autograd for Sparse Tensor

Hi there,

I am currently implementing the Graph Attention Network, which is a variant of GCN. Only a few operations are available for sparse tensors, so I have had to implement many Functions for sparse operations myself.

One main problem is that my method needs the gradient of sparse * dense with respect to the sparse operand. Say the sparse matrix S (n x n) has m stored entries, the dense matrix D is (n x k), and grad_output G has shape (n x k). Currently I use index_select over the m entries (rows of G picked by each entry's row index, rows of D picked by its column index) to build G’ (m x k) and D’ (m x k), then compute grad(S) = (G’ * D’).sum(dim=1), which gives one gradient value per stored entry.
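For concreteness, the approach described above could look roughly like the sketch below. It assumes S is passed in COO form as an index tensor of shape (2, m) plus a value tensor of shape (m,); the class name `SparseValuesMatmul` and the `n_rows` argument are made up for illustration and are not part of PyTorch.

```python
import torch


class SparseValuesMatmul(torch.autograd.Function):
    """Y = S @ D, with S given in COO form as (indices, values).

    Hypothetical helper: gradients flow to `values` and `dense` only;
    `indices` is treated as a fixed sparsity pattern.
    """

    @staticmethod
    def forward(ctx, indices, values, dense, n_rows):
        row, col = indices[0], indices[1]
        ctx.save_for_backward(row, col, values, dense)
        out = torch.zeros(n_rows, dense.shape[1],
                          dtype=dense.dtype, device=dense.device)
        # Y[row[e]] += values[e] * D[col[e]] for every stored entry e
        out.index_add_(0, row, values.unsqueeze(1) * dense.index_select(0, col))
        return out

    @staticmethod
    def backward(ctx, grad_output):
        row, col, values, dense = ctx.saved_tensors
        g = grad_output.index_select(0, row)  # G' (m x k)
        d = dense.index_select(0, col)        # D' (m x k)
        grad_values = (g * d).sum(dim=1)      # one gradient per stored entry
        # Gradient w.r.t. D is S^T @ G, accumulated entry by entry
        grad_dense = torch.zeros_like(dense)
        grad_dense.index_add_(0, col, values.unsqueeze(1) * g)
        # No gradient for indices or n_rows
        return None, grad_values, grad_dense, None


# Usage: 3 stored entries in a 3 x 3 matrix, dense operand of shape (3, 2)
indices = torch.tensor([[0, 1, 2], [2, 0, 1]])
values = torch.randn(3, requires_grad=True)
dense = torch.randn(3, 2, requires_grad=True)
out = SparseValuesMatmul.apply(indices, values, dense, 3)
out.sum().backward()
```

The memory cost is the two (m x k) intermediates G’ and D’, so this is only a reasonable choice while m * k fits comfortably in memory.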

Another main problem: it seems sparse.FloatTensor cannot be constructed from a cuda.LongTensor and a cuda.FloatTensor? Each time I construct one, I have to call .cpu() once.
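As a point of comparison, here is a minimal sketch of building a sparse COO tensor directly from index and value tensors that already live on the target device. It assumes a recent PyTorch release that provides the `torch.sparse_coo_tensor` factory, which is a different entry point from the `sparse.FloatTensor` constructor mentioned above; the example falls back to the CPU when no GPU is present.

```python
import torch

# Pick the GPU when available so the sketch also runs on CPU-only machines
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# COO layout: indices has shape (2, m), values has shape (m,)
indices = torch.tensor([[0, 1, 2], [2, 0, 1]], device=device)
values = torch.tensor([1.0, 2.0, 3.0], device=device)

# The result lives on the same device as its inputs; no .cpu() round trip
sparse = torch.sparse_coo_tensor(indices, values, size=(3, 3))
```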

I was wondering what the best current way to implement this is, and whether PyTorch will support sparse autograd.