Returning a Tensor from a Dataset is orders of magnitude slower (about 60× in the example below) than returning something like a numpy array, when the Dataset is used with a DataLoader with num_workers > 0.
This happens even if the Tensor shares memory with a numpy array (i.e. it was created via from_numpy), but it does not happen when no workers are used.
How can I deal with this issue? (The code is run on Linux.)
A minimal example:
import torch
import numpy as np


class MyDataSet(torch.utils.data.Dataset):
    def __init__(self):
        super().__init__()

    def __len__(self):
        return 100000

    def __getitem__(self, idx):
        arr = np.arange(250)
        tensor = torch.arange(250)
        # return arr  # ~1 s for the full loop
        return tensor  # ~60 s for the full loop


def collate_wrapper(batch):
    # Identity collate: return the list of items unchanged.
    return batch


ds = MyDataSet()
data_loader = torch.utils.data.DataLoader(
    ds,
    num_workers=2,
    shuffle=False,
    batch_size=64,
    collate_fn=collate_wrapper,
    prefetch_factor=1,
)

c = 0
for _ in data_loader:
    c += 1
print(c)
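One workaround I am considering, sketched below under the assumption that the per-item cost comes from sending many small Tensors from the workers back to the main process: keep returning numpy arrays from __getitem__ and do the numpy-to-Tensor conversion once per batch in the collate function, so only one Tensor per batch crosses the worker boundary (NumpyDataSet and collate_to_tensor are hypothetical names for this sketch):

```python
import torch
import numpy as np


class NumpyDataSet(torch.utils.data.Dataset):
    def __len__(self):
        return 100000

    def __getitem__(self, idx):
        # Return a numpy array; conversion to a Tensor is deferred
        # to the collate function, once per batch instead of per item.
        return np.arange(250)


def collate_to_tensor(batch):
    # Stack the numpy arrays into one (batch_size, 250) array and
    # convert it to a single Tensor.
    return torch.from_numpy(np.stack(batch))


data_loader = torch.utils.data.DataLoader(
    NumpyDataSet(),
    num_workers=2,
    shuffle=False,
    batch_size=64,
    collate_fn=collate_to_tensor,
    prefetch_factor=1,
)
```

Note that collate_fn still runs inside the worker process, so the batch Tensor itself is still transferred, but it is one Tensor per batch rather than 64.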