T.Compose | TypeError: __call__() takes 2 positional arguments but 3 were given

I have been getting this odd error saying that I passed too many arguments into my __call__() method. I am thoroughly perplexed because I am sure I only passed two arguments.

Here is my Compose class:

class Compose(object):
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image, target):
        for t in self.transforms:
            image, target = t(image, target)
        return image, target

Here’s the function that calls the method:

def get_transform(train):
    transforms = []
    if train:
        # during training, randomly flip the training images
        # and ground-truth for data augmentation
        transforms.append(T.RandomHorizontalFlip(0.5))
    return T.Compose(transforms)

Here’s the error msg:

    TypeError: __call__() takes 2 positional arguments but 3 were given

Any help would be appreciated! <3

There are a few issues in the example code:

  • Even though you are defining your own Compose class, it seems you are still using the torchvision.transforms.Compose one, so you might want to remove the T namespace.
  • Your custom Compose object takes two inputs. However, the underlying torchvision.transforms don’t, so you would have to call the transformation on the image and target separately.
  • Since you are using (at least one) random transformation, the image and target will not be transformed using the same random number, which might be wrong (e.g. for a segmentation use case). If you want to apply the same random transformation to both inputs, have a look at this post.
  • RandomHorizontalFlip and Resize work on PIL.Images, so ToTensor should be applied last.

Here is a small dummy example:

class Compose(object):
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, image, target):
        for t in self.transforms:
            image = t(image)
            target = t(target)
        return image, target

import torch
import torchvision.transforms as transforms

transform = []
transform = Compose(transform)

to_image = transforms.ToPILImage()
x = to_image(torch.randn(3, 24, 24))
y = to_image(torch.randn(3, 24, 24))

transform(x, y)
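On the third bullet: one way to apply the same random transformation to both inputs is to sample the random decision once and reuse it for both. A minimal sketch, using plain torch.flip rather than torchvision's transforms, with a hypothetical RandomHorizontalFlipPair name:

```python
import random

import torch

class RandomHorizontalFlipPair:
    """Flip image and target together with a single shared coin flip (sketch)."""
    def __init__(self, p=0.5):
        self.p = p

    def __call__(self, image, target):
        if random.random() < self.p:
            # flip both tensors along the last (width) dimension,
            # so image and target stay aligned
            image = torch.flip(image, dims=[-1])
            target = torch.flip(target, dims=[-1])
        return image, target

flip = RandomHorizontalFlipPair(p=1.0)  # p=1.0 so the flip always fires here
img = torch.arange(12.0).reshape(1, 3, 4)
mask = torch.arange(12.0).reshape(1, 3, 4)
img_f, mask_f = flip(img, mask)
```

A class like this would slot straight into the custom Compose above, since it also takes (image, target) and returns both.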

Thank you! This did get me a little further. However, when I try to pass my target variable to the Resize transform, I run into a problem: my target is not a PIL image but a dictionary with bounding boxes, labels, iscrowd, etc.

It looks like I will have to make a custom resize function for my target.
Do you know of a way to resize BBox coordinates so they correspond to the resized image?
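If the boxes are in pixel coordinates, a common approach is to scale them by the ratio of the new size to the old size. A sketch (resize_boxes is a hypothetical helper, not a torchvision API):

```python
import torch

def resize_boxes(boxes, orig_size, new_size):
    """Scale [x_min, y_min, x_max, y_max] pixel boxes to match a resized image.

    boxes: tensor of shape (N, 4); orig_size / new_size: (width, height) tuples.
    """
    orig_w, orig_h = orig_size
    new_w, new_h = new_size
    # x coordinates scale with width, y coordinates with height
    scale = torch.tensor(
        [new_w / orig_w, new_h / orig_h, new_w / orig_w, new_h / orig_h]
    )
    return boxes * scale

boxes = torch.tensor([[10.0, 20.0, 50.0, 80.0]])
scaled = resize_boxes(boxes, orig_size=(100, 100), new_size=(200, 50))
```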

I just realized that my dataset gives me the boxes as proportions of the image. All I have to do is apply the resize to the image and then multiply the given bbox proportions by the height and width of the transformed image.
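That multiplication can be sketched as follows (denormalize_boxes is a hypothetical name; it assumes boxes stored as [x_min, y_min, x_max, y_max] fractions):

```python
import torch

def denormalize_boxes(boxes, new_w, new_h):
    # boxes are fractions [x_min, y_min, x_max, y_max] in [0, 1];
    # multiplying by the resized width/height gives pixel coordinates
    scale = torch.tensor([new_w, new_h, new_w, new_h], dtype=torch.float32)
    return boxes * scale

boxes = torch.tensor([[0.25, 0.5, 0.75, 1.0]])
pixel_boxes = denormalize_boxes(boxes, new_w=200, new_h=100)
```

Because the fractions are independent of the original resolution, this works for any target size you resize the image to.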