Own database using torchvision.datasets.ImageFolder

LeeYongHyeok · November 27, 2018, 3:39pm

Hello everyone,
I’m working on a lip-reading task using PyTorch.

I want to use my own lip database as the training set. It is organized like this:
cut/ speaker1 / utterance 1
cut/ speaker1 / utterance 2
…
cut/ speaker30 / utterance 49
cut/ speaker30 / utterance 50

Each utterance directory (1~50) contains the lip images 1.jpg through 40.jpg.

In this case, is torchvision.datasets.ImageFolder the right choice?

dataset = torchvision.datasets.ImageFolder(root='./cut', transform=transforms.ToTensor())
data_loader = torch.utils.data.DataLoader(dataset=dataset, batch_size=batch_size)

ptrblck · November 27, 2018, 3:48pm

Yes, if each utterance corresponds to a different class and you would like to classify each image separately.
However, if you would like to classify a “sequence” of these images into one of the classes, I would rather write a custom Dataset and use e.g. a sliding window approach.
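
A minimal sketch of such a sliding-window dataset is shown here. It is not the code from the linked example. It assumes the layout from the question (root/speakerX/utteranceY/<n>.jpg with numeric file names) and that each utterance name is one class shared across speakers. The names LipWindowDataset, window, stride and default_loader are hypothetical. Note also that ImageFolder takes its class labels from the immediate subdirectories of root, so with root='./cut' the labels would come from the speaker folders rather than the utterance folders.

```python
from pathlib import Path


def default_loader(path):
    """Load one frame as an RGB PIL image (Pillow is imported lazily)."""
    from PIL import Image
    with Image.open(path) as img:
        return img.convert("RGB")


class LipWindowDataset:
    """Map-style dataset yielding (list_of_frames, utterance_label).

    Assumed layout: root/speakerX/utteranceY/<n>.jpg, frames numbered in order.
    """

    def __init__(self, root, window=5, stride=1,
                 loader=default_loader, transform=None):
        self.window = window
        self.loader = loader
        self.transform = transform
        self.samples = []  # list of (frame_paths, label)

        # Every directory two levels below root is treated as one utterance.
        utt_dirs = sorted(p for p in Path(root).glob("*/*") if p.is_dir())
        # Assumption: the utterance folder name is the class, for all speakers.
        names = sorted({d.name for d in utt_dirs})
        self.class_to_idx = {name: i for i, name in enumerate(names)}

        for d in utt_dirs:
            # Sort numerically so that 10.jpg comes after 9.jpg.
            frames = sorted(d.glob("*.jpg"), key=lambda p: int(p.stem))
            label = self.class_to_idx[d.name]
            # Slide a fixed-size window over the frames of this utterance;
            # utterances shorter than the window contribute no samples.
            for start in range(0, len(frames) - window + 1, stride):
                self.samples.append((frames[start:start + window], label))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        paths, label = self.samples[index]
        frames = [self.loader(p) for p in paths]
        if self.transform is not None:
            frames = [self.transform(f) for f in frames]
        return frames, label
```

Since the class implements __len__ and __getitem__, it can be passed to torch.utils.data.DataLoader like any map-style dataset. With transform=transforms.ToTensor(), the frames of one window could additionally be combined with torch.stack inside __getitem__ if a single tensor per window is preferred.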

LeeYongHyeok · November 27, 2018, 3:55pm

Thanks for your reply.
Are there any reference documents for your suggestion?

ptrblck · November 27, 2018, 4:30pm

You can find a small example for a comparable use case here. In your case, you shouldn’t use the sampler, since you are not dealing with different persons.
Let me know if you can adapt this code for your use case.

LeeYongHyeok · November 27, 2018, 4:33pm

thanks! I’ll try this code in my case.