# Creating a sparse tensor from a CSR matrix

Is there a straightforward way to go from a `scipy.sparse.csr_matrix` (the kind returned by an sklearn `CountVectorizer`) to a `torch.sparse.FloatTensor`?

Currently, I’m just using `torch.from_numpy(X.todense())`, but for large vocabularies that eats up quite a bit of RAM.

You could convert the `csr` matrix to `coo` format, and then process that a little before passing it to the sparse tensor constructor. I believe scipy’s `coo` format is similar to PyTorch’s sparse tensors: both store the coordinates of the nonzero entries alongside their values.
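
A minimal sketch of that conversion step, using a small made-up matrix in place of a real `CountVectorizer` output (the matrix `X` here is an assumption for illustration). It only shows which arrays the `coo` representation exposes:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical 2x3 matrix standing in for a CountVectorizer result.
X = csr_matrix(np.array([[0, 0, 3],
                         [4, 0, 5]]))

# Same data, stored as (row, col, value) triplets.
coo = X.tocoo()

print(coo.row)   # row index of each nonzero entry
print(coo.col)   # column index of each nonzero entry
print(coo.data)  # the nonzero values themselves
```

The `row`, `col`, and `data` arrays all have one element per nonzero entry, so no dense copy of the matrix is ever created.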

@jbarrow @richard I’m attempting to solve the same problem, but I’m getting a little lost after converting to COO format. I’m not sure which attributes of the new matrix to pass to the sparse tensor constructor.


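The sparse tensor constructor is:
`torch.sparse.FloatTensor(indices, values, size)`.
For a 2-D tensor, `indices` is a `2 x nnz` LongTensor of (row, column) coordinates, `values` holds the matching nonzero entries, and `size` is the shape of the full tensor.
An example can be found here: http://pytorch.org/docs/master/sparse.html?highlight=sparse%20tensor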


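```python
import torch
import numpy as np
from scipy.sparse import coo_matrix

# Small 2x3 example matrix in COO format: values, then (rows, cols)
coo = coo_matrix(([3,4,5], ([0,1,1], [2,0,2])), shape=(2,3))

values = coo.data
# Stack into shape (2, nnz): row indices on top, column indices below
indices = np.vstack((coo.row, coo.col))

i = torch.LongTensor(indices)
v = torch.FloatTensor(values)
shape = coo.shape

# Build the sparse tensor; to_dense() is only here to check the result
torch.sparse.FloatTensor(i, v, torch.Size(shape)).to_dense()
```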

Output

```
0 0 3
4 0 5
[torch.FloatTensor of size 2x3]
```
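
For the original question (starting from a `csr_matrix`), the same steps could be wrapped in a small helper. This is only a sketch: `csr_to_torch_sparse` is a made-up name, and it assumes a PyTorch version that provides `torch.sparse_coo_tensor`, used here in place of the `torch.sparse.FloatTensor` constructor shown above.

```python
import numpy as np
import torch
from scipy.sparse import csr_matrix


def csr_to_torch_sparse(X):
    """Hypothetical helper: scipy CSR matrix -> torch sparse COO float tensor."""
    coo = X.tocoo()
    # Indices must be an integer tensor of shape (2, nnz).
    indices = torch.tensor(np.vstack((coo.row, coo.col)), dtype=torch.long)
    values = torch.tensor(coo.data, dtype=torch.float32)
    return torch.sparse_coo_tensor(indices, values, coo.shape)


# Made-up matrix standing in for a CountVectorizer result.
X = csr_matrix(np.array([[0, 0, 3],
                         [4, 0, 5]]))
t = csr_to_torch_sparse(X)
print(t.to_dense())  # densify only to inspect a small example
```

On a real vocabulary-sized matrix you would skip the `to_dense()` call, since avoiding the dense copy is the whole point.
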