ImageFolder incompatible with albumentations

Hi! I am trying to use albumentations transforms with the ImageFolder utility, but I am stuck at this error. Can somebody help me out here?
My data is structured in the following way:

root/dog/xxy.png
root/dog/[...]/xxz.png

root/cat/123.png
root/cat/nsdf3.png
root/cat/[...]/asd932_.png 

Thus, I am using the ImageFolder class.


transform = A.Compose([

    A.RandomCrop(width=256, height=256),

    A.HorizontalFlip(p=0.5),

    A.RandomBrightnessContrast(p=0.2),

])

data = ImageFolder(root='/content/flower_data/train',transform=transform)

loader = DataLoader(data)

for x, y in loader:

    print(x.shape) # image

    print(y)

    if y==1:break # image label

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-77-041b03c16ee0> in <module>()
      8 data = ImageFolder(root='/content/flower_data/train',transform=transform)
      9 loader = DataLoader(data)
---> 10 for x, y in loader:
     11     print(x.shape) # image
     12     print(y)

4 frames
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in __getitem__(self, index)
    178         sample = self.loader(path)
    179         if self.transform is not None:
--> 180             sample = self.transform(sample)
    181         if self.target_transform is not None:
    182             target = self.target_transform(target)

TypeError: __call__() takes 1 positional argument but 2 were given

The error seems quite strange, as it points to the transformation itself, which apparently doesn’t accept any additional input arguments.
Could you load a single image and try to pass it directly to transform without the ImageFolder dataset?


Hi, thanks for the reply @ptrblck! Yes, I am able to run it on a single image without any errors. Can you suggest a way to make it work with the PyTorch ImageFolder, since I will need to send images in batches during training? Currently the code works for one image at a time. My other concern is that the transform call needs the image, but when I use ImageFolder I don't have access to the image path to pass to these transforms. An alternative would be to write a custom dataset class that stores each image path and label in a text file, so that an index can access the path of each image and pass the loaded image to the transforms; this is my backup approach. So at the moment I am looking to make it work with ImageFolder alone, if possible. Sorry for the long paragraph.

EDIT:
Additional info: all my images are JPEGs, and I am planning to use DenseNets and similar CNNs as my model.

Thanks for taking the time to help the forum; your answers are very helpful. :smile:

image_path = '/content/flower_data/train/1/image_06734.jpg'

image = cv2.imread(image_path)

image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Augment an image

transformed = transform(image=image)

transformed_image = transformed["image"]

plt.imshow(transformed['image'])

# plt.imshow(x.permute(1, 2, 0))

Update:
Hi again. I went with the backup approach, i.e. creating a text file with paths. I am sharing it here in case others end up in a similar situation. Can someone tell me if there is any way to improve its efficiency? I am concerned that I have to open the text file each time I fetch an image, so per epoch it is opened (batch_size * number of batches) times, and in total (number of samples * epochs) times. Will this affect the computational time, and if so, how adversely? Can someone suggest a modification that solves this? Thanks!

class Flowers(Dataset):

  def __init__(self, path, json_path=None, transform=None):
    self.path = path
    self.json_path = json_path
    self.transform = transform
    self.textmaker()

  def textmaker(self):
    # Write one "image_path,label" line per image to train.txt
    f = open("train.txt", "w+")
    for species in os.listdir(self.path):
      spec_folder = os.path.join(self.path, species)
      for image in os.listdir(spec_folder):
        image_path = os.path.join(spec_folder, image)
        f.write(str(image_path) + ',' + str(species) + '\n')
    f.close()

  def __getitem__(self, idx):
    text = open("train.txt")
    lines = text.readlines()
    text.close()
    line = lines[idx]
    image_path = line.split(",")[0]
    label = torch.tensor(int(line.split(",")[1].rstrip("\n")))
    image = cv2.imread(image_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    if self.transform:
      # albumentations returns a dict; unpack the transformed image
      image = self.transform(image=image)["image"]
    return image, label

  def __len__(self):
    N = 0
    for dirpath, dirnames, filenames in os.walk(self.path):
      N += len(filenames)
    return N
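Regarding the efficiency concern above: the text file can be parsed once up front instead of being reopened on every __getitem__ call. A minimal sketch of that idea (the helper name load_index is my own, and it assumes the same "path,label" line format written by textmaker):

```python
def load_index(index_path):
    # Parse each "image_path,label" line once and return a list of
    # (path, int_label) tuples. A Dataset can store this list in
    # __init__ and simply index it in __getitem__, so fetching a
    # sample no longer opens the file at all.
    with open(index_path) as f:
        return [(path, int(label))
                for path, label in (line.strip().split(",")
                                    for line in f if line.strip())]
```

In __init__ you would then set self.samples = load_index("train.txt"), and __getitem__ becomes image_path, label = self.samples[idx]; __len__ can return len(self.samples), avoiding the os.walk on every call as well.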

I’m not sure I understand the current issue correctly.
Based on your description the transformations are working fine on a single sample.
However, apparently you are seeing errors when using a DataLoader, which is strange, since the transformations would still be applied on a single sample in the default setup (i.e. without using a BatchSampler etc.).

Which transformation would need an image path? Usually the transformations are applied on an image or tensor directly.

I will need to send images in batches during training […] So at the moment i am looking to make it work with ImageFolder alone if possible.

Hi! I was able to find the issue and rectify it in my case; you are right, the problem is not with PyTorch transforms. Let me summarise it for your understanding.
Say I create an ImageFolder, with my data strictly in the structure PyTorch expects for ImageFolder.

data = ImageFolder(root='/content/flower_data/train',transform=transform)

Here Transforms are from Albumentations Library,

transform = A.Compose([
    A.RandomCrop(width=256, height=256),
    ToTensorV2(),
])

Then I end up with this issue,

TypeError: __call__() takes 1 positional argument but 2 were given

The code works fine with PyTorch transforms. When I create my own custom dataset class, like I mentioned in one of my answers above, the problem with the albumentations library is also solved. I checked the Albumentations issues on GitHub, and it turns out a few more people had this issue, with no conclusive solution. At the moment the fix is either to create your own Dataset class and then use their library (it works just like regular PyTorch transforms there), or to use the ImageFolder dataset with PyTorch transforms. Hope this clarifies it. Cheers!