Using my own dataset with torchvision.datasets.ImageFolder

Hello everyone,
I'm trying to do a lip reading task using PyTorch.

So I want to use my own lip database as the training set. It is laid out like this:
cut/ speaker1 / utterance 1
cut/ speaker1 / utterance 2

cut/ speaker30 / utterance 49
cut/ speaker30 / utterance 50

Each utterance directory (1 to 50) contains the lip images 1.jpg to 40.jpg.

In this case, is torchvision.datasets.ImageFolder the right choice?

dataset = torchvision.datasets.ImageFolder(root='./cut', transform=transforms.ToTensor())
data_loader = torch.utils.data.DataLoader(dataset=dataset, batch_size=batch_size)

Yes, if each utterance corresponds to a different class and you would like to classify each image separately.
However, if you would like to classify a “sequence” of these images into one of the classes, I would rather write a custom Dataset and use e.g. a sliding window approach.

Thanks for your reply.
Is there any reference documentation for your suggestion?

You can find a small example for a comparable use case here. In your case, you shouldn’t use the sampler, since you are not dealing with different persons.
Let me know if you can adapt this code for your use case.
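
As a rough illustration of loading windowed data without a sampler, the sketch below passes only shuffle=True to the DataLoader. The random tensors are stand-ins for real lip image windows, and the sizes (20 windows of 8 frames, 3x32x32 pixels, 50 classes) are assumptions chosen just to show the resulting batch shapes.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 20 windows of 8 frames, each frame 3x32x32, with labels 0..49.
windows = torch.randn(20, 8, 3, 32, 32)
labels = torch.randint(0, 50, (20,))
dataset = TensorDataset(windows, labels)

# No sampler argument: shuffle=True lets the DataLoader draw indices at random.
loader = DataLoader(dataset, batch_size=4, shuffle=True)

batch_windows, batch_labels = next(iter(loader))
print(batch_windows.shape)  # torch.Size([4, 8, 3, 32, 32])
print(batch_labels.shape)   # torch.Size([4])
```

Each batch then has the layout [batch, window, C, H, W], which is what a sequence model would consume.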

Thanks! I'll try this code for my case.