I have an input_matrix, which is a SciPy sparse matrix in CSR format. It is a binary representation and consists only of 1’s and 0’s.
> input_matrix
<1500x24995 sparse matrix of type '<type 'numpy.float32'>'
with 1068434 stored elements in Compressed Sparse Row format>
I load it into a DataLoader using the code below:
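import torch
from torch.utils.data import DataLoader
cuda = torch.cuda.is_available()
kwargs = {'num_workers': 1, 'pin_memory': True} if cuda else {}
# toarray() converts the CSR matrix to a dense NumPy array before loading
input_loader = DataLoader(input_matrix.toarray(), batch_size=32, shuffle=True, **kwargs)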
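A quick way to sanity-check this loading step is to collect every distinct value the loader yields, instead of relying on a printed batch. The sketch below is only an illustration: `toy_matrix` is a small random binary matrix standing in for `input_matrix`, and the `num_workers` / `pin_memory` arguments are left out for brevity.

```python
import numpy as np
import scipy.sparse as sp
import torch
from torch.utils.data import DataLoader

# Hypothetical stand-in for input_matrix: a small binary CSR matrix.
rng = np.random.default_rng(0)
dense = (rng.random((100, 50)) < 0.3).astype(np.float32)
toy_matrix = sp.csr_matrix(dense)

# Same loading pattern as in the post, minus the CUDA-specific kwargs.
loader = DataLoader(toy_matrix.toarray(), batch_size=32, shuffle=True)

# Collect every distinct value seen across all batches.
seen = set()
for batch in loader:
    seen.update(torch.unique(batch).tolist())

print(sorted(seen))
```

If the dense array really contains only 0’s and 1’s, `seen` should contain nothing else, since the loader only stacks rows into batches.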
Now, when I inspect input_loader in the interpreter, I see 0’s and 1’s, but also other values such as 2’s.
> input_loader
1 1 1 ... 0 0 0
0 0 0 ... 0 0 0
0 1 1 ... 0 0 0
... ⋱ ...
0 2 2 ... 0 0 0
0 0 0 ... 0 0 0
1 1 1 ... 0 0 0
[torch.FloatTensor of size 32x24995]
If it helps, when I convert the csr_matrix into a tensor using torch.from_numpy(input_matrix), I do not see any values other than 0’s and 1’s.
0 1 1 ... 0 0 0
0 0 0 ... 0 0 0
0 1 0 ... 0 0 0
... ⋱ ...
1 1 1 ... 0 0 0
0 0 0 ... 0 0 0
1 1 1 ... 0 0 0
[torch.FloatTensor of size 1500x24995]
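Note that a printout like the one above is truncated and shows only the corners of the tensor, so it cannot rule out other values elsewhere. A minimal sketch of checking the full set of values is given below. The toy matrix is an assumption made purely for illustration: it is built from coordinate triplets with one repeated position, which SciPy sums when building the CSR matrix. Nothing in the post confirms that input_matrix was built this way.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical toy data: position (0, 1) is listed twice.
rows = np.array([0, 0, 1])
cols = np.array([1, 1, 2])
vals = np.ones(3, dtype=np.float32)

# SciPy sums duplicate (row, col) entries when building the CSR matrix.
toy_matrix = sp.csr_matrix((vals, (rows, cols)), shape=(2, 3))

# List every distinct value in the dense array, not just the printed corners.
values_present = np.unique(toy_matrix.toarray())
print(values_present.tolist())  # [0.0, 1.0, 2.0]
```

Running the same `np.unique` check on `input_matrix.toarray()` would show whether the 2’s are already present before the DataLoader is involved.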
Is the method I am using to load the data correct? If not, how can I correctly load the data into a DataLoader?