If your sizes are relatively small, then you can use something like this to create a sparse matrix using the default strided tensor layout.
import torch

value = torch.rand(64, 10)          # shape [64, 10]
ids = torch.randint(0, 100, (10,))  # shape [10], e.g. [94, 13, 20, 6, 27, 45, 15, 7, 53, 2]; upper bound is exclusive
sparse_tensor = torch.zeros(64, 100)
sparse_tensor[:, ids] = value       # scatter the columns of `value` into the chosen columns
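One thing to watch for: torch.randint samples with replacement, so ids can contain duplicates, in which case several columns of value would be written to the same target column and all but the last would be lost. A minimal sketch of the same indexing, using torch.randperm to draw unique ids (an assumption added here, not part of the original):

```python
import torch

torch.manual_seed(0)
value = torch.rand(64, 10)
# randperm draws without replacement, so no two columns of `value`
# collide on the same column of the result
ids = torch.randperm(100)[:10]

sparse_tensor = torch.zeros(64, 100)
sparse_tensor[:, ids] = value

# every column of `value` landed in its own column of the result
assert torch.equal(sparse_tensor[:, ids], value)
```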
At this size, it will not make much of a difference in memory whether you define it like this or as a sparse tensor. For matrix multiplication you can then use @, torch.matmul, or torch.mm.
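As a quick sketch, all three spellings compute the same product on a 2-D strided tensor (the second operand here is a hypothetical dense matrix added for illustration):

```python
import torch

torch.manual_seed(0)
value = torch.rand(64, 10)
ids = torch.randint(0, 100, (10,))
sparse_tensor = torch.zeros(64, 100)
sparse_tensor[:, ids] = value

other = torch.rand(100, 32)  # an arbitrary right-hand operand

a = sparse_tensor @ other
b = torch.matmul(sparse_tensor, other)
c = torch.mm(sparse_tensor, other)

# all three give the same [64, 32] result
assert torch.allclose(a, b) and torch.allclose(b, c)
assert a.shape == (64, 32)
```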
However, if you do have very large sparse matrices, you can create either a torch.sparse_coo_tensor or a torch.sparse_csr_tensor. According to the documentation, torch.sparse_csr_tensor does not support CUDA, so I will show how to build a torch.sparse_coo_tensor for your case.
value = torch.rand(64, 10)          # shape [64, 10]
ids = torch.randint(0, 100, (10,))  # shape [10], e.g. [94, 13, 20, 6, 27, 45, 15, 7, 53, 2]

# First you need to turn your indices into coordinates.
# There are several ways to do this, as shown in the documentation;
# for this example I built the coordinate list
# [[0, 94], [0, 13], [0, 20], ..., [63, 7], [63, 53], [63, 2]]
idx = [[i, int(j)] for i in range(64) for j in ids]
sparse_tensor = torch.sparse_coo_tensor(list(zip(*idx)), value.view(-1), (64, 100))
Also, if you want to see which operations support gradients, you can look here.