Hi,
I have a huge image dataset and would like to train on my RTX 2060 and RTX 3060 simultaneously.
Since they have quite different memory and compute capabilities, I wonder whether this is viable at all, or whether I should just train on the 3060.
Given how different they are, would it be better to run each process with its own batch size fitted to its VRAM? And would that make it necessary to customize the DistributedSampler so that the two GPUs receive an uneven split of the dataset?
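To make the question concrete, here is roughly what I imagine such a sampler would look like (untested sketch; the class name and the per-rank fractions are made up, not anything from the PyTorch API):

```python
import torch
from torch.utils.data import Sampler


class UnevenDistributedSampler(Sampler):
    """Split a dataset across ranks by per-rank fractions instead of evenly.

    Hypothetical example: fractions=(0.6, 0.4) would give rank 0 (the 3060)
    60% of the samples and rank 1 (the 2060) 40%.
    """

    def __init__(self, dataset, fractions, rank, shuffle=True, seed=0):
        assert abs(sum(fractions) - 1.0) < 1e-6, "fractions must sum to 1"
        self.dataset = dataset
        self.shuffle = shuffle
        self.seed = seed
        self.epoch = 0
        # Precompute this rank's contiguous slice of the (shuffled) index list.
        n = len(dataset)
        counts = [int(round(f * n)) for f in fractions]
        counts[-1] = n - sum(counts[:-1])  # absorb rounding error in last rank
        self.start = sum(counts[:rank])
        self.count = counts[rank]

    def __iter__(self):
        if self.shuffle:
            # Seed with epoch so every rank draws the same permutation,
            # keeping the per-rank slices disjoint.
            g = torch.Generator()
            g.manual_seed(self.seed + self.epoch)
            perm = torch.randperm(len(self.dataset), generator=g).tolist()
        else:
            perm = list(range(len(self.dataset)))
        return iter(perm[self.start:self.start + self.count])

    def __len__(self):
        return self.count

    def set_epoch(self, epoch):
        # Call once per epoch, like with the stock DistributedSampler.
        self.epoch = epoch
```

Each rank would then build its DataLoader with its own `batch_size` and this sampler, instead of passing the usual `DistributedSampler`. Does that sound like the right direction, or is there a built-in way?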
Thanks for your help.