Hi,
Do you know how to extract the raw data or Dataset (in the original order when the Dataloader was created) from a shuffled Dataloader?
Best regards
Did you mean extracting the inputs and targets from a DataLoader?
Yes, that’s what I mean. I want to extract the raw features/targets in the same order as the original data from which the dataset was constructed, and from which the shuffled DataLoader was created.
I’m looking for something like
dl = DataLoader(..., shuffle=True)
dl.data or dl.dataset # ordered and original data, not shuffled
Best regards
Apologies, but I don’t know how to revert the shuffling process of DataLoader.
I think we can extract the raw files like this:
for input, target in dl:
    ...
When building the DataLoader, if the shuffle argument is not passed, I think the data will be ordered like the raw files, because shuffle=False is the default argument in DataLoader (https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader). Only if you set it to True will the raw dataset be shuffled.
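As a minimal sketch of that default behaviour: the toy TensorDataset and the names features, targets and ordered_loader below are made up for illustration and are not the data from the question.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy data standing in for the real features/targets.
features = torch.arange(10).float().unsqueeze(1)
targets = torch.arange(10)
dataset = TensorDataset(features, targets)

# shuffle is not passed, so it keeps its default value (False)
# and the samples are read in index order.
ordered_loader = DataLoader(dataset, batch_size=4)

# Concatenate the batches back into one tensor of targets.
ordered_targets = torch.cat([t for _, t in ordered_loader])
print(torch.equal(ordered_targets, targets))
```

This only helps if you can build a second, unshuffled loader; it does not undo the shuffling of an existing one.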
Yes, I know that. That’s why I posted:
I’m looking for something like
dl = DataLoader(..., shuffle=True)
dl.data or dl.dataset # or something like this, to get ordered and original data, not shuffled
Your proposed approach using dl.dataset should work:
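import torch
from torch.utils.data import Dataset, DataLoader

class MyDataset(Dataset):
    def __init__(self):
        self.data = torch.arange(10)

    def __getitem__(self, index):
        x = self.data[index]
        return x

    def __len__(self):
        return len(self.data)

dataset = MyDataset()
loader = DataLoader(dataset, shuffle=True, batch_size=2)

# Iterating the loader yields shuffled batches
for data in loader:
    print(data)

# Iterating loader.dataset yields the samples in their original order
for data in loader.dataset:
    print(data)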
If you just want the raw data from the dataset, you don’t need the DataLoader at all. You can index the dataset directly. For example, if your dataset is called DS and its __getitem__() returns an image and a target, you can loop over the dataset in the same order as the raw data:
for i in range(len(DS)):
    img, target = DS[i]
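To make that concrete, here is a rough sketch that collects everything in the original order through dl.dataset, even though dl itself shuffles. It assumes a toy TensorDataset with made-up features and targets tensors, not the dataset from the question, and it assumes each sample is a (features, target) pair of tensors.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy data; replace with your own Dataset.
features = torch.randn(6, 3)
targets = torch.arange(6)
DS = TensorDataset(features, targets)
dl = DataLoader(DS, batch_size=2, shuffle=True)

# dl.dataset is the very object that was passed to the DataLoader,
# so indexing it ignores the loader's shuffling.
samples = [dl.dataset[i] for i in range(len(dl.dataset))]
all_features = torch.stack([x for x, _ in samples])
all_targets = torch.stack([y for _, y in samples])

print(dl.dataset is DS)
print(torch.equal(all_targets, targets))
```

If __getitem__ applies random transforms, the values returned this way are still in the original order but may differ from the untransformed raw data.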