I have a list of data paths, but not all of the files exist in the directory yet. The plan is to copy each file in from another location shortly before the dataloader needs it (e.g. one batch ahead), and remove it again once training on that data is done, because the full dataset is too large to keep on disk at once. However, this raises `OSError: file not found`. How can I fix this, or achieve the same goal in another way?
Below is a toy example that reproduces the problem.
```python
import glob
import os

import numpy as np
from torch.utils.data import DataLoader, Dataset


class CustomDataset(Dataset):
    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        # Raises OSError when the file has not been copied in yet.
        return np.loadtxt(self.data[index])


if __name__ == '__main__':
    if os.path.exists('./3.txt'):
        os.remove('./3.txt')  # start from a clean state

    # 3.txt is listed up front even though it does not exist yet.
    files = glob.glob('./*.txt') + ['./3.txt']
    print(files)

    dataset = CustomDataset(files)
    dataloader = DataLoader(dataset=dataset, batch_size=1, shuffle=False,
                            num_workers=1, drop_last=True, pin_memory=True,
                            persistent_workers=True)

    for i, x in enumerate(dataloader):
        print("before : ", i, x)
        if i == 1:
            # Create 3.txt one batch ahead -- but the worker has already
            # tried to prefetch it and raised by this point.
            with open('./3.txt', 'w') as f:
                f.write('3')
        print("after : ", i, x)

    input(dataset.data)
```