How to speed up an element-wise update?

In my custom dataset, the data is loaded from Pandas. There is an augmentation step that creates a new item based on a simple rule:

import numpy as np

def augmentation_by_color(x, colors):
    # Remap every element of x through the colors lookup table.
    x = np.array(x)
    return np.array([colors[xi] for xi in x.flatten()]).reshape(x.shape)

colors = np.random.permutation([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
new_data = augmentation_by_color(dataitem_from_pandas, colors)
# new_data to tensor and return for training on CUDA

Everything works on the CPU, and the result is then converted to a tensor that is processed on CUDA. How can I speed this up? Should I do all operations on tensors (convert the Pandas item to a tensor and then do the permutation there)? If so, can you suggest code?

If loading the data takes most of the time, should I increase num_workers?


Yes, I would say that doing this as a single indexing operation would make it much faster.
Maybe something as simple as colors[x]? Or colors[x.flatten()].reshape(x.shape)?
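For reference, here is a small sketch comparing the per-element loop with the single fancy-indexing call; NumPy's advanced indexing preserves the shape of the index array, so no flatten/reshape is needed (the array values are illustrative):

```python
import numpy as np

x = np.array([[3, 1], [0, 9]])       # integer data with values in 0..9
colors = np.random.permutation(10)   # lookup table: old value -> new value

# Original element-wise loop:
looped = np.array([colors[xi] for xi in x.flatten()]).reshape(x.shape)

# Single vectorized lookup -- advanced indexing keeps x's shape:
vectorized = colors[x]

assert (looped == vectorized).all()
```

The vectorized version does one lookup in C instead of one Python-level iteration per element, which is where the speedup comes from.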

Thank you, that made it much faster. Do you think I should keep this on the CPU or move it to the GPU? And in general, how can I speed up dataloaders?

If this is done in the preprocessing step of the dataloader, you can keep it on the CPU.
In general, large indexing ops will be faster on the CPU unless the data is already on the GPU.
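As a minimal sketch of how this could sit in a Dataset (the class name and DataFrame layout are made up, and the data is assumed to hold integer codes in 0..9), the lookup stays on the CPU inside __getitem__ and only the finished tensor is moved to CUDA later in the training loop:

```python
import numpy as np
import torch
from torch.utils.data import Dataset

class ColorAugmentedDataset(Dataset):
    """Hypothetical dataset: each DataFrame row holds integer-coded values."""
    def __init__(self, dataframe):
        # Materialize once as a NumPy array to avoid per-item Pandas overhead.
        self.data = dataframe.to_numpy()

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        x = self.data[idx]
        colors = np.random.permutation(10)  # fresh permutation per item
        augmented = colors[x]               # vectorized lookup, on CPU
        return torch.from_numpy(augmented)  # call .to("cuda") in the loop
```

Keeping the augmentation here lets DataLoader workers run it in parallel on the CPU while the GPU trains.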

How can increasing num_workers help?

It will help if your CPU is underutilized, but it can be detrimental if the CPU is already fully used.
The only way to know is to test different values and see which one gives you the best result, as it will depend on many things (disk speed, preprocessing, caching, etc.).
