I implemented some affine transforms for PyTorch: specifically Rotation(), Translation(), Shear(), and Zoom(), plus an overarching Affine() transform that can compose all of those while only using one interpolation.
Right now it’s not maximally efficient because I cast to and from NumPy. Eventually I’ll implement this all in torch, so it can be performed efficiently on the GPU.
Here is a link to the gist:
You’ll see that the transforms take an optional y argument, which lets you perform the same affine transform on both the input and the target image. You’ll need a special dataset class for this, which I’ve provided as a starting point in the following gist for both in-memory (TensorDataset) and out-of-memory (FolderDataset) data. It basically just required adding a co_transform argument:
I’ve spot-checked these transforms, so I’m confident they aren’t blatantly wrong, but if you come across a bug, please add a comment on the gist.
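For anyone curious what the co_transform pattern looks like before opening the gist, here is a minimal sketch of an in-memory dataset that applies one joint transform to both tensors. This is an illustrative reconstruction, not the gist’s exact code; the class and argument names (TensorDataset, co_transform) come from the post above, but the body is my own sketch:

```python
import torch
from torch.utils.data import Dataset

class TensorDataset(Dataset):
    """Minimal in-memory dataset applying one joint transform to (x, y)."""

    def __init__(self, inputs, targets, co_transform=None):
        self.inputs = inputs
        self.targets = targets
        self.co_transform = co_transform

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, index):
        x, y = self.inputs[index], self.targets[index]
        if self.co_transform is not None:
            # the same callable sees both tensors, so any random
            # parameters (rotation angle, shift, ...) are shared
            x, y = self.co_transform(x, y)
        return x, y
```

The key point is that the transform is sampled once per item and applied to both x and y, which is what you want for e.g. segmentation masks.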
A short example of the easiest use case:
import torch

affine_transform = Affine(rotation_range=30,
                          translation_range=(0.2, 0.2),
                          zoom_range=(0.8, 1.2))
x = torch.ones(10, 3, 50, 50)
y = torch.ones(10, 3, 50, 50)
data = TensorDataset(x, y, co_transform=affine_transform)
x_sample, y_sample = data[0]  # the transform is applied to both x and y
Nice! We’ll think about how this could be integrated into torchvision. Note that NumPy conversions are nearly free, because the array/tensor you get after the conversion shares its data with the original object.
Cool, that’s very simple! The gist above is quite old, though: the affine transforms in my torchsample package are now written entirely in PyTorch, so they can be included in nn.Modules and/or run on the GPU. Whatever works for you, though!