Hi,
I need to limit the number of images loaded by all_data = datasets.ImageFolder(path).
For example, the folder contains 1000 images (including all subfolders) and I want to load only 800 of them. How can I do that? Later on I want to use iter and next as well.
You could wrap the all_data object into torch.utils.data.Subset and pass the wanted 800 indices to it.
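A minimal sketch of the Subset idea. It assumes a dummy TensorDataset as a stand-in for the ImageFolder object, so it runs without any image files; the tensor shapes and the use of torch.randperm to pick the indices are illustrative choices, not something from the thread.

```python
import torch
from torch.utils.data import Subset, TensorDataset

# Stand-in for the ImageFolder dataset (assumption): 1000 dummy samples with labels.
all_data = TensorDataset(torch.randn(1000, 3), torch.randint(0, 2, (1000,)))

# Draw 800 distinct indices at random and wrap the dataset with them.
indices = torch.randperm(len(all_data))[:800].tolist()
dataset_subset = Subset(all_data, indices)

print(len(dataset_subset))  # 800
```

The Subset only stores the indices and looks samples up lazily, so nothing is copied or loaded up front.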
I did that, like this:
dataset_subset = torch.utils.data.Subset(all_data, np.random.choice(len(all_data), 800, replace=False))
train_data, test_data = random_split(dataset_subset, [train_data, test_data])
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
The problem is that I cannot iterate through it:
trainiter = iter(train_loader)
features, labels = next(trainiter)
The error was:
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'PIL.Image.Image'>
How can we make iter and next work with the ImageFolder object?
Your ImageFolder is missing the transform needed to return a supported type, so it returns PIL.Images instead. Add transform=torchvision.transforms.ToTensor() and it should most likely work.
I did that as well:
image_transforms = {
'train':torchvision.transforms.Compose([
torchvision.transforms.Resize(size=image_size),
torchvision.transforms.RandomHorizontalFlip(),
torchvision.transforms.RandomCrop(size=image_size),
torchvision.transforms.ToTensor(),
torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
std=(0.229, 0.224, 0.225))
The issue is the same.
Could you post a minimal, executable code snippet to reproduce the issue, please?
import numpy as np
import torchvision
import torch
from torch.utils.data import Dataset, DataLoader, random_split, sampler
import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
from torchvision import datasets, models
import random
import os
batch_size = 16
image_size = 299
image_transforms = {
    'train': torchvision.transforms.Compose([
        torchvision.transforms.Resize(size=image_size),
        torchvision.transforms.RandomHorizontalFlip(),
        torchvision.transforms.RandomCrop(size=image_size),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                         std=(0.229, 0.224, 0.225))
    ]),
    'val': torchvision.transforms.Compose([
        torchvision.transforms.Resize(size=image_size),
        torchvision.transforms.CenterCrop(size=image_size),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                         std=(0.229, 0.224, 0.225))
    ])
}
path = 'D:/imgs/'
all_data = datasets.ImageFolder(root=path)
dataset_subset = torch.utils.data.Subset(all_data, np.random.choice(len(all_data), 1000, replace=False))
train_data=800
test_data=200
train_data, test_data = random_split(dataset_subset, [train_data, test_data])
train_data.dataset.transform = image_transforms['train']
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)
trainiter = iter(train_loader)
features, labels = next(trainiter)
print(features.shape, labels.shape)
TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found <class 'PIL.Image.Image'>
Try to pass the transformation directly to the ImageFolder instead of trying to manipulate the internal attribute afterwards.
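A likely reason the attribute assignment has no effect in the snippet above: random_split returns Subset objects, so train_data.dataset is the intermediate dataset_subset (itself a Subset), not the ImageFolder. The sketch below illustrates this with a hypothetical ToyDataset that applies self.transform the way ImageFolder does; ToyDataset and double are made-up names for the demo.

```python
import torch
from torch.utils.data import Dataset, Subset, random_split


class ToyDataset(Dataset):
    """Hypothetical stand-in for ImageFolder: applies self.transform in __getitem__."""

    def __init__(self, n, transform=None):
        self.data = torch.arange(n, dtype=torch.float32)
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        x = self.data[idx]
        if self.transform is not None:
            x = self.transform(x)
        return x


def double(x):
    # Hypothetical transform used only to make the effect visible.
    return x * 2


all_data = ToyDataset(10)
dataset_subset = Subset(all_data, list(range(10)))
train_data, test_data = random_split(dataset_subset, [8, 2])

# Same pattern as in the thread: this sets an attribute on the intermediate
# Subset, which never uses it; the base dataset is left untouched.
train_data.dataset.transform = double
print(train_data.dataset is dataset_subset)  # True
print(all_data.transform)                    # None

# Passing the transform at construction time reaches __getitem__ as intended.
fixed = ToyDataset(10, transform=double)
print(fixed[3])  # tensor(6.)
```

In the original snippet the equivalent change would be all_data = datasets.ImageFolder(root=path, transform=image_transforms['train']), with the train_data.dataset.transform line removed.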