I want to select only 10% of ImageNet, which I load with datasets.ImageFolder as follows:
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val']}
I know I should use torch.utils.data.Subset, but I am not sure how to pass image_datasets to it, especially because the train and val sets are not the same size. I would appreciate any suggestions.
Thanks for the response, @ptrblck.
The problem is that when I use Subset, the DataLoader raises an error.
I want to modify this code:
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val']}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=args.batch_size,
shuffle=True, num_workers=2 ) for x in ['train', 'val']}
I want to modify it so that only 10% of train and 10% of val are used. I used Subset as follows:
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val']}
image_train = data_utils.Subset(image_datasets['train'], 60000)
image_val = data_utils.Subset(image_datasets['val'], 2500)
dataloaders_train = {x: torch.utils.data.DataLoader(image_train[x], batch_size=args.batch_size, shuffle=True, num_workers=2 ) for x in ['train']}
dataloaders_val = {x: torch.utils.data.DataLoader(image_val[x], batch_size=args.batch_size,shuffle=True, num_workers=2 ) for x in ['val']}
but this gives an error,
TypeError: 'int' object is not subscriptable
I am not sure what I am missing here. Any suggestions?
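For what it's worth, here is a minimal sketch of where that message most likely comes from. Subset expects a sequence of indices as its second argument, not a count, and it looks up indices[idx] on every access, so a bare int only fails once an item is requested. The toy TensorDataset and its size of 100 are assumptions standing in for the ImageFolder dataset. Note also that image_train is already a dataset, so indexing it with image_train[x] where x is 'train' triggers the same lookup.

```python
import torch
from torch.utils.data import Subset, TensorDataset

# Toy stand-in for the ImageFolder dataset (assumption: 100 samples).
dataset = TensorDataset(torch.randn(100, 1))

# A bare int is accepted at construction time but fails on item access,
# because Subset evaluates indices[idx] internally.
broken = Subset(dataset, 10)
try:
    broken[0]
    error_message = None
except TypeError as err:
    error_message = str(err)
print(error_message)

# A sequence of indices works as intended.
working = Subset(dataset, list(range(10)))
print(len(working))
```

So the second argument should be something like a list or range of sample indices, and the resulting Subset should be handed to the DataLoader directly.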
import torch
from torch.utils.data import TensorDataset

# Random tensors stand in for the ImageFolder datasets here
image_datasets = {x: TensorDataset(torch.randn(100, 1)) for x in ["train", "val"]}
lens = {x: len(image_datasets[x]) for x in ["train", "val"]}
# Shuffle all indices, then keep the first 10%, so the subset is drawn from the whole dataset
image_datasets = {x: torch.utils.data.Subset(image_datasets[x], indices=torch.randperm(lens[x])[:int(lens[x]*0.1)].tolist()) for x in ["train", "val"]}
dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=10, shuffle=True, num_workers=2 ) for x in ["train", "val"]}
dataloader_train = dataloaders["train"]
dataloader_val = dataloaders["val"]
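If the 10% selection should be reproducible and sampled across the whole dataset (ImageFolder orders samples by class, so taking only the leading indices would cover just the first classes), one option is a small helper like the one below. This is only a sketch: random_fraction, its seed parameter, and the toy dataset sizes are hypothetical names and values chosen for illustration, not part of torch.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset


def random_fraction(dataset, fraction, seed=0):
    """Return a Subset holding a random `fraction` of `dataset` (hypothetical helper)."""
    generator = torch.Generator().manual_seed(seed)
    n_keep = int(len(dataset) * fraction)
    # Shuffle all indices, then keep the first n_keep of them.
    indices = torch.randperm(len(dataset), generator=generator)[:n_keep].tolist()
    return Subset(dataset, indices)


# Toy datasets of different sizes stand in for the train and val ImageFolders.
full = {
    "train": TensorDataset(torch.randn(1000, 1)),
    "val": TensorDataset(torch.randn(50, 1)),
}
small = {x: random_fraction(full[x], 0.1) for x in full}
loaders = {x: DataLoader(small[x], batch_size=10, shuffle=(x == "train")) for x in small}

sizes = {x: len(small[x]) for x in small}
print(sizes)
```

Because the fraction is applied to each split's own length, the unequal sizes of train and val are handled without hard-coding counts such as 60000 or 2500.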