Does torch.utils.data.DataLoader only work in Python 2?

df is a pandas DataFrame that contains “data” and “label” columns.

I want to use DataLoader to split the data into batches of batch_size:

batch_size = 32
test_loader = DataLoader(dataset=df, batch_size = batch_size,
                        shuffle = True, num_workers=2)

for i, data in enumerate(test_loader):
    print(data['data'])

But I get an error like the one below:

~/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py in __next__(self)
334 self.reorder_dict[idx] = batch
335 continue
→ 336 return self._process_next_batch(batch)
337
338 next = __next__ # Python 2 compatibility

How can I fix this error?
Thank you.

Hi,

The dataset should be a torch.utils.data.Dataset subclass, not a DataFrame.
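
As a rough illustration of the point above, here is a minimal sketch of wrapping a DataFrame in a Dataset subclass. The class name `FrameDataset` and the toy frame `toy_df` are made up for this example, and it assumes every entry in the “data” column is an equal-length list of numbers and every “label” is an integer; adjust the dtypes to match your real data.

```python
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader


class FrameDataset(Dataset):
    """Hypothetical wrapper exposing a DataFrame's rows as (data, label) tensors."""

    def __init__(self, frame):
        # Convert each column to a tensor once, up front.
        # Assumes equal-length numeric lists in "data" and integer labels.
        self.x = torch.tensor(frame["data"].tolist(), dtype=torch.float32)
        self.y = torch.tensor(frame["label"].tolist(), dtype=torch.long)

    def __getitem__(self, index):
        # One sample: a feature tensor and its label.
        return self.x[index], self.y[index]

    def __len__(self):
        # DataLoader needs a plain Python int here.
        return len(self.x)


# Toy stand-in for df: 100 rows, 4 features per row.
toy_df = pd.DataFrame({
    "data": [[float(i), 0.0, 1.0, 2.0] for i in range(100)],
    "label": [i % 2 for i in range(100)],
})

# num_workers=0 keeps the sketch single-process; raise it once this works.
loader = DataLoader(FrameDataset(toy_df), batch_size=32, shuffle=True, num_workers=0)

# Collect the shape of every batch the loader yields.
batch_shapes = [(tuple(x.shape), tuple(y.shape)) for x, y in loader]
```

With 100 rows and batch_size = 32, the loader yields three full batches and one final batch of 4 samples.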


Oh, thank you. I have another question:

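from ast import literal_eval

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader

def change(data):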
    temp = []
    for i in data:
        temp.append(i)
    return temp

class ReadyDataset(Dataset):
    def __init__(self):
        df = pd.read_csv("train_after.csv", converters={"data": literal_eval, "label": literal_eval})
        self.len = df.count()
        x = df['data']
        y = df['label']
        self.x_data = torch.from_numpy(np.array(change(x)))
        self.y_data = torch.from_numpy(np.array(change(y)))
        
    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]
    
    def __len__(self):
        return self.len

batch_size = 32
dataset = ReadyDataset()
test_loader = DataLoader(dataset=dataset, batch_size = batch_size,
                        shuffle = True, num_workers=2)

Now creating the DataLoader raises no error, but this code:

for i, data in enumerate(test_loader, 0):
    x, y = data
    print(x, y)

raises another error:

TypeError: 'Series' object cannot be interpreted as an integer

Do you have a stack trace showing where this error comes from? Series is not a PyTorch class, so an object from another library is causing the issue.
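
Without the stack trace this is only a guess, but one candidate is the `self.len = df.count()` line in the `ReadyDataset` posted above. `DataFrame.count()` returns a pandas Series of per-column non-null counts rather than an integer, and Python's `len()` rejects a `__len__` that returns anything other than an integer. The sketch below reproduces that behavior on a made-up frame; `toy_df` and `BadLen` are hypothetical names used only for illustration.

```python
import pandas as pd

# Toy stand-in for the CSV contents.
toy_df = pd.DataFrame({"data": [[1, 2], [3, 4], [5, 6]], "label": [0, 1, 0]})

count_result = toy_df.count()  # Series: non-null count for each column
row_count = len(toy_df)        # int: number of rows


class BadLen:
    """Mimics a Dataset whose __len__ returns df.count()."""

    def __len__(self):
        return toy_df.count()


try:
    len(BadLen())
    error_message = None
except TypeError as exc:
    # len() requires __len__ to return an integer.
    error_message = str(exc)

print(type(count_result).__name__, row_count)
print(error_message)
```

If this is indeed the cause, replacing `df.count()` with `len(df)` (or `df.shape[0]`) in `__init__` should give `__len__` an integer to return.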