df is a pandas DataFrame with "data" and "label" columns,
and I want to use DataLoader to split the data into batches of batch_size:
batch_size = 32
test_loader = DataLoader(dataset=df, batch_size=batch_size,
                         shuffle=True, num_workers=2)
for i, data in enumerate(test_loader):
    print(data['data'])
But this raises an error like the one below:
~/.local/lib/python3.5/site-packages/torch/utils/data/dataloader.py in __next__(self)
    334                 self.reorder_dict[idx] = batch
    335                 continue
--> 336             return self._process_next_batch(batch)
    337
    338     next = __next__  # Python 2 compatibility
How can I fix this error?
Thank you.
albanD (Alban D), November 28, 2018, 11:24am (#2)
Hi,
The dataset should be a torch.utils.data.Dataset subclass, not a DataFrame.
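For illustration, here is a minimal sketch of such a subclass wrapping a DataFrame. The class name `FrameDataset` and the toy frame are hypothetical, and the sketch assumes every "data" cell is an equal-length list of numbers and every "label" is a single integer; the real data may need different conversion.

```python
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader


class FrameDataset(Dataset):
    """Hypothetical wrapper exposing DataFrame rows as (data, label) tensors."""

    def __init__(self, df):
        # Convert both columns to tensors once, up front.
        # Assumption: "data" cells are equal-length lists of numbers.
        self.x = torch.tensor(df["data"].tolist(), dtype=torch.float32)
        self.y = torch.tensor(df["label"].tolist(), dtype=torch.long)

    def __len__(self):
        # Must return a plain Python int.
        return len(self.x)

    def __getitem__(self, index):
        return self.x[index], self.y[index]


# Toy frame standing in for the real data (hypothetical values).
df = pd.DataFrame({
    "data": [[float(i), float(i + 1)] for i in range(100)],
    "label": [i % 2 for i in range(100)],
})

# num_workers=0 keeps the sketch single-process.
loader = DataLoader(FrameDataset(df), batch_size=32, shuffle=True, num_workers=0)
for x, y in loader:
    print(x.shape, y.shape)
```

The DataLoader only needs `__len__` and `__getitem__` with integer indices here, which a raw DataFrame does not provide in the expected form.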
Oh, thank you. I have another question:
def change(data):
    temp = []
    for i in data:
        temp.append(i)
    return temp

class ReadyDataset(Dataset):
    def __init__(self):
        df = pd.read_csv("train_after.csv", converters={"data": literal_eval, "label": literal_eval})
        self.len = df.count()
        x = df['data']
        y = df['label']
        self.x_data = torch.from_numpy(np.array(change(x)))
        self.y_data = torch.from_numpy(np.array(change(y)))

    def __getitem__(self, index):
        return self.x_data[index], self.y_data[index]

    def __len__(self):
        return self.len

batch_size = 32
dataset = ReadyDataset()
test_loader = DataLoader(dataset=dataset, batch_size=batch_size,
                         shuffle=True, num_workers=2)
Creating the loader now works without error, but this code:
for i, data in enumerate(test_loader, 0):
    x, y = data
    print(x, y)
raises another error:
TypeError: 'Series' object cannot be interpreted as an integer
albanD (Alban D), November 28, 2018, 12:57pm (#4)
Do you have a stack trace showing where this error comes from? Series is not a PyTorch class, so an object from another library is causing the issue.
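One plausible source, judging only from the code posted above and not from a stack trace: `df.count()` returns a pandas Series (one non-null count per column) rather than an integer, so a `__len__` that returns it would fail as soon as `len(dataset)` is evaluated. The sketch below reproduces that behavior without PyTorch; the toy frame and the class name `BrokenLen` are hypothetical stand-ins.

```python
import pandas as pd

# Toy frame standing in for train_after.csv (hypothetical values).
df = pd.DataFrame({"data": [[1, 2], [3, 4], [5, 6]], "label": [0, 1, 0]})

print(type(df.count()).__name__)  # Series: one non-null count per column
print(len(df))                    # 3: the number of rows, a plain int


class BrokenLen:
    """Mimics the __len__ pattern from the question."""

    def __init__(self, df):
        self.len = df.count()  # a Series, not an int

    def __len__(self):
        return self.len


try:
    len(BrokenLen(df))
    message = None
except TypeError as err:
    # len() requires __len__ to return something usable as an integer.
    message = str(err)

print(message)
```

If this is indeed the cause, returning `len(df)` (or `len(self.x_data)`) from `__len__` should avoid the error; a stack trace would confirm where it is actually raised.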