Prepare dataloader

image

  • Every image has exactly two labels, drawn from 35 possible labels.
  • img1 to img30 in each folder are the frames of a single video.

How do I build a PyTorch dataloader for this dataset?

You should be looking at Torchvision’s ImageFolder: torchvision.datasets — Torchvision 0.8.1 documentation

The use case is different, as each image has two labels! But I agree that a good starting point would be to look at the internals of ImageFolder :)
@rjrohit996 how are the labels specified? Generally the labels are built from the folder structure, but in your case you have two labels. Where do they come from?

I am trying to do something like video captioning, but not exactly that.
In my dataset, each image has exactly two objects and one relationship between them.
For example, person-ride-bicycle, i.e. each image contains one triple of the form object1-relationship-object2.

object1 and object2 each have 35 classes/labels to choose from
relationship has 85 classes/labels to choose from

More about the dataset:
Let’s consider the first folder, 0000. It has 30 images, which are frames of the same video (all of them have the same relationship, let’s say person-ride-bicycle).

I have an annotation for each folder. For example,
for folder 0000, I have the annotation 12-45-7 (object1-relationship-object2),
for folder 0001, I have the annotation 22-54-30,
and so on…
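
To make the annotation format above concrete, here is a minimal sketch of how a string like 12-45-7 could be turned into a typed triple. The names `Triple`, `parse_annotation` and `parse_annotations` are made up for illustration, and I am assuming the annotations are available as a mapping from folder name to string; how they are actually stored is not stated in this thread.

```python
from typing import Dict, NamedTuple


class Triple(NamedTuple):
    """One annotation: object1-relationship-object2, all as class ids."""
    object1: int
    relationship: int
    object2: int


def parse_annotation(text: str) -> Triple:
    """Parse a string such as "12-45-7" into a Triple of ints."""
    parts = text.strip().split("-")
    if len(parts) != 3:
        raise ValueError(f"expected object1-relationship-object2, got {text!r}")
    return Triple(*(int(part) for part in parts))


def parse_annotations(raw: Dict[str, str]) -> Dict[str, Triple]:
    """Assumed input: folder name -> annotation string, e.g. {"0000": "12-45-7"}."""
    return {folder: parse_annotation(text) for folder, text in raw.items()}
```

With the two annotations quoted above, `parse_annotations({"0000": "12-45-7", "0001": "22-54-30"})` gives one `Triple` per folder, which can later be converted into the three target tensors.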

The architecture looks something like this…
The relationship below is dog-plays-frisbee.

@ptrblck can you help? How can I load the frames of a particular video as a single batch? All frames have the same label.