Let’s say I have 5 CPUs, a dataset of length 100, and a batch size of 8. Let’s say the DataLoader draws 8 indices in [0, 99] for a batch, for example {1, 2, 33, 55, 61, 62, 77, 78}.
Now I want to be able to control which CPU workers are assigned to fetch the corresponding data. I want CPU 1 to fetch indices in [0, 19], CPU 2 to fetch indices in [20, 39], CPU 3 to fetch indices in [40, 59], CPU 4 to fetch indices in [60, 79], and CPU 5 to fetch indices in [80, 99].
Consequently, CPU 1 should be assigned to fetch {1, 2}, CPU 2 should be assigned to fetch {33}, CPU 3 should be assigned to fetch {55}, CPU 4 should be assigned to fetch {61, 62, 77, 78}, and CPU 5 should not be assigned to fetch any indices.
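
To make the intended mapping concrete, here is a minimal sketch in plain Python. The function name partition_by_worker and its parameters are hypothetical and are not part of the PyTorch API. It only computes which indices fall into which worker's range; it does not by itself make a DataLoader dispatch them that way. As far as I know, the stock DataLoader hands each worker a whole batch of indices rather than splitting one batch across workers, so this routing would have to live in custom code.

```python
def partition_by_worker(batch_indices, dataset_len=100, num_workers=5):
    """Group batch indices by the worker that owns their contiguous range.

    Hypothetical helper: workers are numbered 1..num_workers and worker k
    owns indices [(k - 1) * chunk, k * chunk - 1].
    """
    # Ceiling division, so the last worker absorbs any remainder.
    chunk = -(-dataset_len // num_workers)
    assignment = {worker: [] for worker in range(1, num_workers + 1)}
    for idx in sorted(batch_indices):
        if not 0 <= idx < dataset_len:
            raise IndexError(f"index {idx} is outside [0, {dataset_len - 1}]")
        assignment[idx // chunk + 1].append(idx)
    return assignment


batch = {1, 2, 33, 55, 61, 62, 77, 78}
assignment = partition_by_worker(batch)
print(assignment)
# {1: [1, 2], 2: [33], 3: [55], 4: [61, 62, 77, 78], 5: []}
```

Worker 5 ends up with an empty list, which matches the case above where CPU 5 fetches nothing for this batch.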