Trying to iterate through my custom dataset

Hi all,

I’m just starting out with PyTorch and am, unfortunately, a bit confused when it comes to using my own training/testing image dataset for a custom algorithm. For starters, I am making a small “hello world”-esque convolutional shirt/sock/pants classifying network. I’ve only loaded a few images and am just making sure that PyTorch can load them and scale them down properly to usable 32x32 images. My ImageFolder is set up like so:

imgs/socks/sockimages.jpeg
imgs/pants/pantsimages.jpeg
imgs/shirt/shirtimages.jpeg

and a similar setup for my testing images folder. According to my current knowledge, the image loader built into PyTorch should read the labels from the subfolder names within the training/test images. However, I’m getting a TypeError complaining that my iterator is not iterable. Here’s my code and the error:

import torch
import torchvision
import torchvision.datasets as dset
import torchvision.transforms as transforms

transform = transforms.Compose(
[transforms.ToTensor(),
 transforms.Scale((32,32)),
 transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = dset.ImageFolder(root="imgs", transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)

testset = dset.ImageFolder(root='tests', transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=True, num_workers=2)

classes=('shirt','pants','sock')

import matplotlib.pyplot as plt
import numpy as np

# functions to show an image
def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))

# get some random training images
dataiter = iter(trainloader)
images, labels = dataiter.next()

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))

#Error:
TypeError: 'builtin_function_or_method' object is not iterable
It says that the error is in reference to the line containing dataiter.next(), meaning that Python believes dataiter cannot be iterated?

Please help! Thanks in advance,

-David Sillman, new to PyTorch

I just tried this basic sanity check, and it seems to work fine:

import torch
import torchvision
a=torchvision.datasets.MNIST(root='.', download=True, transform=torchvision.transforms.ToTensor())
iter(torch.utils.data.DataLoader(a)).next()

For some reason, dataiter in your case ends up being a built-in function; I'm not sure why.


Yes, the MNIST example worked for me too. For some reason, though, my custom images seem to be causing my dataiter or trainloader (or something else) to not be iterable?

Perhaps someone can try running the code themselves to see whether the issue is somehow just on my side? It's clear that Python thinks one of my variables is actually a method.

EDIT : Here is my FULL error message:

Traceback (most recent call last):
  File "classifier.py", line 32, in <module>
    (images, labels) = DataLoaderIter(trainloader).__next__()
  File "//anaconda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 174, in __next__
    return self._process_next_batch(batch)
  File "//anaconda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 198, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
  File "//anaconda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 34, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "//anaconda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 34, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "//anaconda/lib/python3.6/site-packages/torchvision/datasets/folder.py", line 67, in __getitem__
    img = self.transform(img)
  File "//anaconda/lib/python3.6/site-packages/torchvision/transforms.py", line 29, in __call__
    img = t(img)
  File "//anaconda/lib/python3.6/site-packages/torchvision/transforms.py", line 130, in __call__
    w, h = img.size
TypeError: 'builtin_function_or_method' object is not iterable

ERROR DEBUGGING 1
I’m getting the same error message when using the official torchvision datasets such as CIFAR10. It’s caused by transforms.Scale((32,32)): if you remove the Scale, it works fine. It seems like a bug.
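For reference, here is a minimal sketch that reproduces the observation with CIFAR10 (the pipeline runs once Scale is dropped):

import torch
import torchvision.datasets as dset
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

cifar = dset.CIFAR10(root='.', download=True, transform=transform)
loader = torch.utils.data.DataLoader(cifar, batch_size=4)
images, labels = next(iter(loader))  # no error once Scale is removed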

ERROR DEBUGGING 2
I then learned that Scale works on PIL Images, so I tried applying ToPILImage first.
Now I have:

transform_trf = transforms.Compose([transforms.ToPILImage(),
                                    transforms.Scale(size=[28, 28]),
                                    transforms.ToTensor(),
                                    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                                    ])

This gives the error AssertionError: pic should be Tensor or ndarray.

ERROR DEBUGGING 3

Naturally I figured a tensor was needed first, so I added a ToTensor() operation at the start.

transform_trf = transforms.Compose([transforms.ToTensor(),
                                    transforms.ToPILImage(),
                                    transforms.Scale(size=[28, 28]),
                                    transforms.ToTensor(),
                                    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                                    ])

Now there’s this error, which is hard to debug: TypeError: unsupported operand type(s) for /: 'list' and 'int'

I was hoping someone could clarify what’s happening here.

@smth @apaszke


Hi,

the error message is not terribly intuitive, but the reason is that the size parameter should be an integer giving the length of the smaller edge (see torchvision.transforms.Scale in the documentation).
It may be worth extending the Scale transform or adding a check for this, since it seems to be a common error case.
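For example, here is a sketch assuming the older torchvision API, where Scale only accepts an int (the length of the smaller edge) and must be applied to a PIL Image before ToTensor; the image path is taken from the original post:

from PIL import Image
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.Scale(32),   # an int, not a tuple/list, on older torchvision
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

img = Image.open('imgs/socks/sockimages.jpeg')  # example path from the original post
tensor = transform(img)  # 3 x H x W tensor with the smaller edge scaled to 32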

Best regards

Thomas


Hello,

I also encounter this error.
My images are 256x256 8bpp PNG files.
The curious thing is that I am running the same code on two different PCs, and it works on the non-CUDA system.
On the CUDA-enabled system, if I add the scale operation first,
'train': transforms.Compose([ transforms.Scale((64, 64)), transforms.ToTensor() ]),

I get the error message: TypeError: unsupported operand type(s) for /: 'tuple' and 'int'

If I first convert to tensor and then scale,
'train': transforms.Compose([ transforms.ToTensor(), transforms.Scale((64, 64)) ]),
I get the TypeError: 'builtin_function_or_method' object is not iterable error.

Finally, as @ritchieng found out, if I remove the scaling, it works without any problems.
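For completeness, here is a sketch of an order that should work on the older torchvision (an int size for Scale, applied to the PIL Image before ToTensor); the data_transforms dict name is just illustrative:

import torchvision.transforms as transforms

data_transforms = {
    'train': transforms.Compose([transforms.Scale(64),   # int size: smaller edge scaled to 64
                                 transforms.ToTensor()]),
}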

Best regards,
Sel

Hello @ritchieng,
For me this issue was caused by an old (v0.1.8) torchvision package from pip. In that version, the Scale transform did not check whether size was an int.

I installed torchvision from source (master branch) and the TypeError: unsupported operand type(s) for /: 'list' and 'int' problem is resolved.

Regards,
Sel

The torchvision package I am currently using is v0.1.12.
I still get the error:
TypeError: unsupported operand type(s) for /: 'list' and 'int'

May I know of any other ways to resolve it?

Most likely, your torchvision package is still not up to date (0.1.12 sounds like a torch version rather than a torchvision one), so you may need to install torchvision from source.
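A quick sanity check is to print the version that Python is actually importing (a sketch; recent torchvision releases expose a __version__ attribute, very old ones may not):

import torchvision
print(torchvision.__version__)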

Best regards

Thomas

did you solve this issue?

@isalirezag updating your torchvision package should solve the issue.


@smth, is there a way to access a subset of the dataset?

For instance, I defined:

hTestLoader = torch.utils.data.DataLoader(
    datasets.MNIST('../../DataSets/MNIST/', train=False, download=True,
                   transform=transforms.ToTensor()),
    batch_size=testBatchSize, shuffle=False)

Assume its length is 100.
How could one access a random subset of, let's say, 5 of the batches?

class FullTrainningDataset(torch.utils.data.Dataset):
    """Wraps a dataset and exposes the contiguous slice [offset, offset + length)."""
    def __init__(self, full_ds, offset, length):
        self.full_ds = full_ds
        self.offset = offset
        self.length = length
        assert len(full_ds) >= offset + length, "Parent dataset not long enough"
        super(FullTrainningDataset, self).__init__()

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        return self.full_ds[i + self.offset]

validationRatio = 0.11

def trainTestSplit(dataset, val_share=validationRatio):
    # First (1 - val_share) of the dataset goes to training, the rest to validation.
    val_offset = int(len(dataset) * (1 - val_share))
    print("Offset: " + str(val_offset))
    return (FullTrainningDataset(dataset, 0, val_offset),
            FullTrainningDataset(dataset, val_offset, len(dataset) - val_offset))


batch_size=128

from torch.utils.data import TensorDataset, DataLoader

# train_imgs = torch.from_numpy(full_img_tr).float()
train_imgs = XnumpyToTensor(full_img)                       # user-defined helper (numpy data -> tensor)
train_targets = YnumpyToTensor(data['is_iceberg'].values)   # user-defined helper for the labels
dset_train = TensorDataset(train_imgs, train_targets)


train_ds, val_ds = trainTestSplit(dset_train)

train_loader = torch.utils.data.DataLoader(train_ds, batch_size=batch_size, shuffle=False, num_workers=1)
val_loader = torch.utils.data.DataLoader(val_ds, batch_size=batch_size, shuffle=False, num_workers=1)

print(train_loader)
print(val_loader)
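Alternatively, for the original question about grabbing a random subset of batches, torch.utils.data.sampler.SubsetRandomSampler could be used. This is only a sketch that reuses the MNIST test set from the question; testBatchSize is assumed to be defined there, so a value is hard-coded here:

import numpy as np
import torch
from torch.utils.data.sampler import SubsetRandomSampler
from torchvision import datasets, transforms

testBatchSize = 20  # assumed value; defined elsewhere in the original question
mnist_test = datasets.MNIST('../../DataSets/MNIST/', train=False, download=True,
                            transform=transforms.ToTensor())

# Draw enough random indices for roughly 5 batches.
indices = np.random.choice(len(mnist_test), size=5 * testBatchSize, replace=False)
subset_loader = torch.utils.data.DataLoader(mnist_test,
                                            batch_size=testBatchSize,
                                            sampler=SubsetRandomSampler(indices))

for images, labels in subset_loader:
    print(images.shape, labels.shape)  # about 5 random batches from the full test set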


I use the most recent version of torchvision (v0.20) and I still get the same error "TypeError: 'builtin_function_or_method' object is not iterable"

import torchvision.transforms as t
train_trans = t.Compose([
    t.ToTensor(),
    t.RandomCrop(224),
    t.ColorJitter(
        brightness = 0.4,
        contrast = 0.4,
        saturation = 0.4),
    t.RandomHorizontalFlip(),
    t.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])

I notice it is the placement of the transforms.ToTensor() function that is causing the issue. My workaround is to put ToTensor just before transforms.Normalize, with all the other transform functions before it.

import torchvision.transforms as t
train_trans = t.Compose([
    t.RandomCrop(224),
    t.ColorJitter(
        brightness = 0.4,
        contrast = 0.4,
        saturation = 0.4),
    t.RandomHorizontalFlip(),
    t.ToTensor(),
    t.Normalize(mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
])

IT IS NOT A BUG. As per the documentation, Scale is a transformation that applies to a PIL Image, not a tensor! You will get these errors if you turn your PIL Image into a tensor before applying these transformations. See the other thread on this site here.

Change the order like this:

transform = transforms.Compose(
[transforms.Scale((32,32)),
 transforms.ToTensor(),
 transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

Can you not directly loop through the dataset object?


Yes, it works:

from pathlib import Path
import torch
import torchvision

root = Path('~/data/').expanduser()
mnist = torchvision.datasets.MNIST(root=root, download=True, transform=torchvision.transforms.ToTensor())
# iter(torch.utils.data.DataLoader(mnist)).next()
for x, y in mnist:
    assert isinstance(x, torch.Tensor)
    assert isinstance(y, int)
for x, y in iter(mnist):
    assert isinstance(x, torch.Tensor)
    assert isinstance(y, int)