Help with custom dataloaders

Hello everyone, I need help

I have two datasets of images, indoor and outdoor, and they don't have the same number of examples.

Each dataset has images that contain a certain number of classes (minimum 1, maximum 4). These classes can appear in both datasets, and each class has 4 categories: red, blue, green, white. Example:
Indoor - cats, dogs, horses
Outdoor - dogs, humans

I am trying to train a model where I tell it, "here is an image that contains a cat, tell me its color", regardless of where the image was taken (indoors, outdoors, in a car, on the moon).

To do that, I need to present examples to my model so that every batch has only one class (cat, dog, horse or human), but I want to sample from all datasets (two in this case) that contain that class and mix them. How can I do this?

It has to take into account that the number of examples in each dataset is different, and that some classes appear in one dataset while others can appear in more than one. Each batch must contain only one class.

I would appreciate any help. I have been trying to solve this for a few days now.

Then, you need to create a dictionary to map each category (cat, dog, horse, human) to an array of image paths.
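
A minimal sketch of such a dictionary, in case it helps. The `records` list, its file paths, and the `build_class_index` helper are all made up for illustration; in practice the entries would come from however your annotations are stored.

```python
from collections import defaultdict

# Hypothetical annotations: (image_path, dataset_name, objects present in the image).
records = [
    ("indoor/0001.jpg", "indoor", ["cat", "dog"]),
    ("indoor/0002.jpg", "indoor", ["horse"]),
    ("outdoor/0001.jpg", "outdoor", ["dog", "human"]),
    ("outdoor/0002.jpg", "outdoor", ["human"]),
]


def build_class_index(records):
    """Map each object name to the list of image paths that contain it."""
    paths_by_class = defaultdict(list)
    for path, _dataset, classes in records:
        for cls in classes:
            # An image with several objects is listed under each of them.
            paths_by_class[cls].append(path)
    return dict(paths_by_class)


paths_by_class = build_class_index(records)
```

Here "dog" ends up with paths from both datasets, which is the mixing asked for in the question, while "cat" only has indoor paths.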

Thank you for your suggestion,
Here is what I did (in case anyone tries to do this):
(based on the Stack Overflow question "How to load data from multiply datasets in pytorch")

  1. Concatenate the datasets (after translation).
     Generate the index ranges corresponding to each dataset (1-15 for dataset 1, 16-31 for dataset 2, 32-40 for dataset 3, etc.).
  2. Assign indices to each class (class1: 1-15, 32-40; class2: 1-15, 16-31; etc.), making sure that the number of items in each class is divisible by the batch size.
  3. Shuffle the indices of each class using SubsetRandomSampler and create a BatchSampler for each class.
  4. Build one array of all the indices of all classes by randomly sampling from the batch samplers until they are depleted, while recording the class order: [class1, class3, class1, class2, class5], inds [[], [], [], [], []].
  5. Create a custom single sampler with these indices.
  6. Create a custom batch loader using the sampler.
     While iterating over the loader, each batch comes with its corresponding class.
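
For anyone who wants a starting point, here is a rough sketch of the idea behind the steps above. It is not my exact code: the class name `SingleClassBatchSampler`, the `indices_by_class` layout, and the example index ranges are made up for illustration. It also simplifies two things: it drops the incomplete last batch of each class instead of padding the class to a multiple of the batch size, and it shuffles the finished list of batches instead of drawing from one BatchSampler per class.

```python
import random


class SingleClassBatchSampler:
    """Yield batches of dataset indices where every batch belongs to one class."""

    def __init__(self, indices_by_class, batch_size, seed=None):
        # indices_by_class: class name -> indices into the concatenated dataset.
        self.indices_by_class = indices_by_class
        self.batch_size = batch_size
        self.rng = random.Random(seed)
        self.class_order = []

    def _make_batches(self):
        batches = []
        for cls, indices in self.indices_by_class.items():
            shuffled = list(indices)
            self.rng.shuffle(shuffled)
            # Drop the incomplete tail so every batch has exactly batch_size items.
            usable = len(shuffled) - len(shuffled) % self.batch_size
            for start in range(0, usable, self.batch_size):
                batches.append((cls, shuffled[start:start + self.batch_size]))
        # Interleave the classes in random order.
        self.rng.shuffle(batches)
        return batches

    def __iter__(self):
        self.class_order = []
        for cls, batch in self._make_batches():
            self.class_order.append(cls)  # record which class each batch came from
            yield batch

    def __len__(self):
        return sum(len(v) // self.batch_size for v in self.indices_by_class.values())


# Hypothetical layout: dataset 1 holds indices 0-15, dataset 2 holds 16-27.
indices_by_class = {
    "cat": list(range(0, 8)),
    "dog": list(range(0, 8)) + list(range(16, 24)),
    "human": list(range(16, 28)),
}
sampler = SingleClassBatchSampler(indices_by_class, batch_size=4, seed=0)
batches = list(sampler)
```

Since `DataLoader` accepts any iterable of index lists as `batch_sampler`, something like `DataLoader(ConcatDataset([ds1, ds2]), batch_sampler=sampler)` should work with this object. Note that `class_order` is filled as the sampler is consumed, so with prefetching it can run ahead of the batch you are currently looking at.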