I see that torch/nn/modules/sparse.py has a TODO for SparseLinear, torch/nn/_functions/thnn/auto.py has a SparseLinear, and there is even a SparseLinear.c file. However, I did not find a good guide describing the different levels of abstraction between those stages, so I was hoping I could get a few pointers here before trying to learn it from the source code.

Could you describe the steps one would need to take to finish the SparseLinear module? I implemented one on my own, but I suspect it is much uglier and clumsier than what the TODO above is supposed to be, and I have not really touched the C code.

You would need to implement sparse * sparse matrix multiplication. There are some libraries out there that do it, but in general, I think the easiest way to get this to work is to convert the sparse matrices to CSR format (PyTorch sparse matrices are in COO format) and perform the sparse multiply on the CSR representation.
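
To make the idea concrete, here is a rough pure-Python sketch of the two steps: converting COO triplets to CSR arrays, then multiplying two CSR matrices row by row. It is only an illustration, not what the TODO is supposed to become; the function names `coo_to_csr` and `csr_matmul` are made up for this example and are not part of PyTorch.

```python
from collections import defaultdict


def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triplets to CSR arrays (indptr, indices, data).

    Hypothetical helper for illustration; duplicate (row, col) entries are summed.
    """
    per_row = [defaultdict(float) for _ in range(n_rows)]
    for r, c, v in zip(rows, cols, vals):
        per_row[r][c] += v

    indptr, indices, data = [0], [], []
    for row in per_row:
        for c in sorted(row):          # keep column indices sorted within a row
            indices.append(c)
            data.append(row[c])
        indptr.append(len(indices))    # row i spans indptr[i]:indptr[i + 1]
    return indptr, indices, data


def csr_matmul(a, b):
    """Multiply two CSR matrices row by row, returning the product in CSR form."""
    a_ptr, a_idx, a_val = a
    b_ptr, b_idx, b_val = b

    indptr, indices, data = [0], [], []
    for i in range(len(a_ptr) - 1):
        acc = defaultdict(float)       # sparse accumulator for output row i
        for k in range(a_ptr[i], a_ptr[i + 1]):
            j = a_idx[k]               # nonzero A[i, j] selects row j of B
            for m in range(b_ptr[j], b_ptr[j + 1]):
                acc[b_idx[m]] += a_val[k] * b_val[m]
        for c in sorted(acc):
            indices.append(c)
            data.append(acc[c])
        indptr.append(len(indices))
    return indptr, indices, data


# A = [[1, 0], [0, 2]] and B = [[0, 3], [4, 0]], both given as COO triplets
A = coo_to_csr(2, [0, 1], [0, 1], [1.0, 2.0])
B = coo_to_csr(2, [0, 1], [1, 0], [3.0, 4.0])
C = csr_matmul(A, B)                   # CSR arrays of [[0, 3], [8, 0]]
```

CSR helps here because the inner loop needs all nonzeros of one row of B at once, and `indptr` gives that slice directly, whereas COO would require a search or a sort first. A real implementation would do this in C or call an existing sparse library rather than loop in Python.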