What is the main difference between transforms from torchvision.transforms and torchvision.transforms.functional?

There are a lot of similar functions, e.g. five_crop, affine, etc. What are the main differences? When should you use one over the other?

The functional API is stateless, i.e. you call the functions directly and pass all necessary arguments on every call.
On the other hand, the transforms in torchvision.transforms are mostly classes, which either have default parameters or store the parameters you’ve provided.
For example, with Normalize you create the object once and it reuses the stored parameters on each call. With the functional approach, you have to pass the parameters every time:

# Class-based
transform = transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
data = transform(data)

# Functional
data = TF.normalize(data, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))

You can use the functional API to transform your data and target with the same random values, e.g. for random cropping:

i, j, h, w = transforms.RandomCrop.get_params(image, output_size=(512, 512))
image = TF.crop(image, i, j, h, w)
mask = TF.crop(mask, i, j, h, w)

Sometimes it’s also just your coding style.
While some find the functional API to be cleaner, some prefer the classes.

You can use the functional API to transform your data and target with the same random values, e.g. for random cropping

Thank you, this was what I was looking for.

TF is import torchvision.transforms.functional as TF, correct?

Yes, sorry for not mentioning it.