I am implementing a sparse random projection using a random sparse matrix, following Li, Hastie, and Church, "Very sparse random projections."
In principle this operation should be much faster than a dense random projection, but experimentally I observe the opposite. In my example, I multiply a 5k vector by a 10k x 5k matrix. With a dense matrix, torch.mm(dense_mat, vec) executes in around 500 microseconds, while with a sparse matrix, torch.mm(sparse_mat, vec) takes around 40 ms.
If I increase the sizes further, the dense multiplication remains much faster than the sparse one.
Does anybody have insight into why this happens?
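For reference, here is a minimal script that sketches the comparison. The sizes match the numbers above; the sparsity level and the way the sparse matrix is generated are my assumptions, since the post does not show how sparse_mat is built (the paper suggests a density around 1/sqrt(d)).

```python
import time
import torch

n_rows, n_cols = 10_000, 5_000
density = 0.01  # assumed sparsity level, not taken from the post

# Dense projection matrix
dense_mat = torch.randn(n_rows, n_cols)

# Random sparse COO matrix of the same shape (sampled with replacement,
# so the true nnz may be slightly below the target after coalescing)
nnz = int(n_rows * n_cols * density)
indices = torch.stack([
    torch.randint(0, n_rows, (nnz,)),
    torch.randint(0, n_cols, (nnz,)),
])
values = torch.randn(nnz)
sparse_mat = torch.sparse_coo_tensor(indices, values, (n_rows, n_cols)).coalesce()

# torch.mm requires 2-D inputs, so the vector is a (n_cols, 1) matrix
vec = torch.randn(n_cols, 1)

def bench(fn, reps=20):
    fn()  # warm-up
    start = time.perf_counter()
    for _ in range(reps):
        fn()
    return (time.perf_counter() - start) / reps

dense_out = torch.mm(dense_mat, vec)
sparse_out = torch.mm(sparse_mat, vec)

print("dense  mm:", bench(lambda: torch.mm(dense_mat, vec)))
print("sparse mm:", bench(lambda: torch.mm(sparse_mat, vec)))
```

Both products return a (10000, 1) result, so any timing gap comes purely from the multiplication kernels, not from output shape or conversion.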