Hi, I have noticed that the torch.mm
operation on a tensor with sparse_csr
layout is significantly slower in PyTorch 1.10.2 than in PyTorch 1.9.1.
Here is a small code sample I used to test it:
import torch
import timeit
print(torch.__version__)
torch.manual_seed(1233)
crows = torch.randint(0, 50, (1000000,)).cumsum(0).int()
crows[0] = 0
cols = torch.randint(0, 100000, (crows[-1],)).sort()[0].int()
mock_vals = torch.rand_like(cols, dtype=torch.float)
csr_mat = torch._sparse_csr_tensor(crow_indices=crows, col_indices=cols, values=mock_vals)
dense_mat = torch.rand(csr_mat.shape[-1], 20)
print(timeit.timeit("torch.mm(csr_mat, dense_mat)", globals=globals(), number=100))
In PyTorch 1.9.1, this code yields:
1.9.1+cpu
1.9244596790522337
In PyTorch 1.10.2 (same machine), almost the same code yields the following (except that this time I use torch.sparse_csr_tensor
instead of torch._sparse_csr_tensor,
which became public in 1.10):
1.10.2+cpu
5.861249879002571
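For completeness, here is a version-agnostic variant of the benchmark that can be run unchanged on both versions. It is only a sketch: `make_csr` is a helper name I introduced, and I shrank the tensor sizes so it runs quickly; the construction is otherwise the same as above.

```python
import timeit
import torch

torch.manual_seed(1233)

# Smaller dimensions than in the original report, just so the sketch
# runs quickly; the structure of the benchmark is unchanged.
n_rows, n_cols = 10000, 1000
crows = torch.randint(0, 5, (n_rows + 1,)).cumsum(0).int()
crows[0] = 0
cols = torch.randint(0, n_cols, (int(crows[-1]),)).sort()[0].int()
vals = torch.rand(cols.shape, dtype=torch.float)

# The CSR constructor was still private (torch._sparse_csr_tensor) in
# 1.9.x; prefer the public name when it exists and fall back otherwise.
make_csr = getattr(torch, "sparse_csr_tensor", None)
if make_csr is None:
    make_csr = torch._sparse_csr_tensor

csr_mat = make_csr(crow_indices=crows, col_indices=cols, values=vals,
                   size=(n_rows, n_cols))
dense_mat = torch.rand(csr_mat.shape[-1], 20)

print(torch.__version__)
print(timeit.timeit("torch.mm(csr_mat, dense_mat)", globals=globals(), number=100))
```

Running this one script under each interpreter keeps the comparison apples-to-apples, since the only difference left between the two runs is the PyTorch version itself.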
As you can see, there is a significant difference in timings between the two versions. I would like to upgrade to PyTorch 1.10.2, and I would also like to keep using the sparse CSR matrix format, as it provides a speedup in my project. However, I'd expect matrix multiplication to be at least as fast in the newer PyTorch version. Have I overlooked something?