Dataloader returning non-CUDA Tensors

Abhai_Kollara · February 21, 2018, 8:14am

I created a simple dataset that holds two CUDA tensors, one of shape (1000, 5) and the other of shape (1000,), but when I load them using a DataLoader, only the first one is returned as a CUDA tensor. I think this may have something to do with the default collate_fn of the DataLoader. Any fixes?

import torch
from torch.utils.data import Dataset, DataLoader
import numpy as np

x = torch.cuda.FloatTensor(np.random.randn(1000, 5))
y = torch.cuda.LongTensor(np.random.randint(0,10, size=(1000, )))

class data(Dataset):
    def __init__(self, inputs, targets):
        self.x = inputs
        self.y = targets

    def __len__(self):
        return self.x.size()[0]

    def __getitem__(self, idx):
        return (self.x[idx], self.y[idx])

d = data(x, y)
dl = DataLoader(d, batch_size=5)

for item in dl:
    print(item)
    break

Output

[
 0.9943 -0.6458 -0.7077 -0.9228 -1.1230
-0.1050 -0.1452  0.0680 -0.5407 -0.5932
 2.7039  0.0426  0.0692  1.3255  0.0133
-0.3883  0.5905  1.4192 -0.7205 -1.5623
 1.8158  0.5436  0.8788  2.3459  0.9836
[torch.cuda.FloatTensor of size 5x5 (GPU 0)]
,
 6
 7
 2
 2
 7
[torch.LongTensor of size 5]
]

SnowWalkerJ · February 23, 2018, 9:22am

This is a bug in DataLoader. Reshaping the target tensor so that it is two-dimensional should work around it:
y = torch.cuda.LongTensor(np.random.randint(0,10, size=(1000, 1)))

Abhai_Kollara · February 23, 2018, 2:30pm

Yes, it works. But the batch still can’t be used directly with loss functions like CrossEntropyLoss and NLLLoss, which expect their target to have shape (N,) and not (N, 1). The target has to be reshaped back to its original (N,) shape.
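
For reference, here is a minimal sketch of that last step, reshaping the target back to (N,) before computing the loss. The logits and target below are made-up stand-ins for one batch (5 samples, 10 classes), not output from the dataset above, and everything runs on the CPU so the snippet works without a GPU; the same calls should apply to CUDA tensors.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for one batch: random logits for 10 classes,
# and a target of shape (N, 1) as produced by the workaround above.
batch_size, num_classes = 5, 10
logits = torch.randn(batch_size, num_classes)
target_2d = torch.randint(0, num_classes, (batch_size, 1))

# Flatten the target back to shape (N,) before passing it to the loss.
# target_2d.squeeze(1) would give the same result here.
target_1d = target_2d.view(-1)

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target_1d)

print(tuple(target_2d.shape), tuple(target_1d.shape))
print(loss.item())
```

Using view(-1) (or squeeze(1)) only changes the shape, not the data or the device, so it can be applied to each batch inside the training loop.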