A gist of affine transforms in pytorch


(Nick) #1

Hi all,

I implemented some affine transforms for pytorch – specifically, Rotation(), Translation(), Shear(), and Zoom(), and an over-arching Affine() transform which can perform all of those transforms while only using one interpolation.

Right now it’s not maximally efficient because I cast to and from numpy. Eventually I’ll implement this all in torch so that it can run efficiently on the gpu.
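To illustrate the "only one interpolation" point: a minimal sketch, assuming each transform is written as a 3x3 homogeneous matrix. The helper names below are hypothetical and are not taken from the gist. Multiplying the matrices first means the image only has to be resampled once, with the combined matrix.

```python
import numpy as np

def rotation_matrix(degrees):
    """3x3 homogeneous rotation about the origin."""
    theta = np.deg2rad(degrees)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def translation_matrix(tx, ty):
    """3x3 homogeneous translation by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def shear_matrix(degrees):
    """3x3 homogeneous shear along x."""
    k = np.tan(np.deg2rad(degrees))
    return np.array([[1.0, k, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

def zoom_matrix(zx, zy):
    """3x3 homogeneous scaling by (zx, zy)."""
    return np.array([[zx, 0.0, 0.0],
                     [0.0, zy, 0.0],
                     [0.0, 0.0, 1.0]])

# Compose all four transforms into one matrix (applied right to left).
T = translation_matrix(2.0, -1.0)
R = rotation_matrix(30.0)
S = shear_matrix(10.0)
Z = zoom_matrix(1.2, 0.8)
combined = T @ R @ S @ Z

# Mapping a point with the combined matrix matches applying each step in turn,
# so an image would need just one resampling pass with `combined`.
point = np.array([3.0, 4.0, 1.0])
via_combined = combined @ point
step_by_step = T @ (R @ (S @ (Z @ point)))
```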

Here is a link to the gist:

You’ll see that the transforms take an optional y argument, which allows you to perform the same affine transform on both the input and the target image. You’ll need a special dataset class for this, which I’ve provided as a starting point in the following gist for both in-memory (TensorDataset) and out-of-memory (FolderDataset) data. It basically just required adding a co_transform argument:
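A rough sketch of the co_transform idea, assuming NumPy arrays instead of tensors. PairedDataset and RandomShift are hypothetical stand-ins, not the classes from the gist. The key point is that the random parameters are drawn once per call and then applied to both x and y.

```python
import numpy as np

class PairedDataset:
    """Hypothetical stand-in for a dataset that accepts a co_transform."""
    def __init__(self, inputs, targets, co_transform=None):
        assert len(inputs) == len(targets)
        self.inputs = inputs
        self.targets = targets
        self.co_transform = co_transform

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        x, y = self.inputs[idx], self.targets[idx]
        if self.co_transform is not None:
            # One call, so both arrays see the same random parameters.
            x, y = self.co_transform(x, y)
        return x, y

class RandomShift:
    """Toy co-transform: one random circular shift applied to x and, if given, y."""
    def __init__(self, max_shift, seed=None):
        self.max_shift = max_shift
        self.rng = np.random.default_rng(seed)

    def __call__(self, x, y=None):
        dx = int(self.rng.integers(-self.max_shift, self.max_shift + 1))
        dy = int(self.rng.integers(-self.max_shift, self.max_shift + 1))

        def shift(a):
            # Shift along the last two (height, width) axes.
            return np.roll(a, (dy, dx), axis=(-2, -1))

        if y is None:
            return shift(x)
        return shift(x), shift(y)

inputs = np.random.rand(4, 1, 8, 8)
targets = inputs.copy()
data = PairedDataset(inputs, targets, co_transform=RandomShift(3, seed=0))
x_sample, y_sample = data[0]
```

Because the targets here are copies of the inputs, x_sample and y_sample stay identical after the transform, which is a quick way to check that both received the same shift.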

I’ve spot-checked these transforms so I’m confident they aren’t blatantly wrong, but if you come across a bug then add a comment in the gist.

Short example of the easiest use case

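import torch
# Affine and TensorDataset are the classes defined in the gists above
# (not torch.utils.data.TensorDataset, which has no co_transform argument)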
affine_transform = Affine(rotation_range=30, translation_range=(0.2,0.2), zoom_range=(0.8,1.2))
x = torch.ones(10,3,50,50)
y = torch.ones(10,3,50,50)
data  = TensorDataset(x, y, co_transform=affine_transform)

x_sample, y_sample = data[0] # the transforms should be applied to both x and y

(Adam Paszke) #2

Nice! We’ll think about how that could be integrated into torchvision. Note that numpy conversions are nearly free, because the array/tensor you get after the conversion shares its data with the original object.
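A small check of the shared-memory point, assuming CPU tensors (a tensor on the gpu has to be copied to the CPU before it can become a numpy array):

```python
import numpy as np
import torch

array = np.zeros((2, 3), dtype=np.float32)

# from_numpy wraps the existing buffer instead of copying it.
tensor = torch.from_numpy(array)

# Writing through the tensor is visible in the original array.
tensor[0, 0] = 5.0

# Going back to numpy also reuses the same buffer for a CPU tensor.
back = tensor.numpy()
shares = np.shares_memory(array, back)
```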


(Kaiyin Zhong) #3

I made something simpler:

from PIL import Image
from scipy import misc
import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage
from skimage.transform import warp, AffineTransform

f = misc.face(gray=True)
plt.hist(f.flatten())

class RandomAffineTransform(object):
    def __init__(self,
                 scale_range,
                 rotation_range,
                 shear_range,
                 translation_range
                 ):
        self.scale_range = scale_range
        self.rotation_range = rotation_range
        self.shear_range = shear_range
        self.translation_range = translation_range

    def __call__(self, img):
        img_data = np.array(img)
        scale_x = np.random.uniform(*self.scale_range)
        scale_y = np.random.uniform(*self.scale_range)
        scale = (scale_x, scale_y)
        rotation = np.random.uniform(*self.rotation_range)
        shear = np.random.uniform(*self.shear_range)
        translation = (
            np.random.uniform(*self.translation_range),
            np.random.uniform(*self.translation_range)
        )
        af = AffineTransform(scale=scale, shear=shear, rotation=rotation, translation=translation)
        img_data1 = warp(img_data, af.inverse)
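        # for uint8 input, warp returns floats in [0, 1]; cast back to uint8 so PIL can build the image
        return Image.fromarray(np.uint8(img_data1 * 255))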

I ended up adding some randomness into it:

from PIL import Image
from scipy import misc
import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage
from skimage.transform import warp, AffineTransform


class RandomAffineTransform(object):
    def __init__(self,
                 scale_range,
                 rotation_range,
                 shear_range,
                 translation_range
                 ):
        self.scale_range = scale_range
        self.rotation_range = rotation_range
        self.shear_range = shear_range
        self.translation_range = translation_range

    def __call__(self, img):
        img_data = np.array(img)
        h, w = img_data.shape[:2]  # works for both grayscale and multi-channel images
        scale_x = np.random.uniform(*self.scale_range)
        scale_y = np.random.uniform(*self.scale_range)
        scale = (scale_x, scale_y)
        rotation = np.random.uniform(*self.rotation_range)
        shear = np.random.uniform(*self.shear_range)
        translation = (
            np.random.uniform(*self.translation_range) * w,
            np.random.uniform(*self.translation_range) * h
        )
        af = AffineTransform(scale=scale, shear=shear, rotation=rotation, translation=translation)
        img_data1 = warp(img_data, af.inverse)
        img1 = Image.fromarray(np.uint8(img_data1 * 255))
        return img1

(Nick) #4

Cool! That’s very simple! The gist above is very old. The affine transforms in my torchsample package are now written entirely in pytorch, so they can be included in nn.Modules and/or run on the gpu. Whatever works for you though!
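For reference, a sketch of an affine transform done purely with tensor ops. This is not torchsample's implementation. It uses torch.nn.functional.affine_grid and grid_sample, and assumes a PyTorch version recent enough to accept the align_corners argument. rotate_batch is a hypothetical helper name.

```python
import math
import torch
import torch.nn.functional as F

def rotate_batch(images, degrees):
    """Rotate a batch of (N, C, H, W) images about their centre with one interpolation."""
    theta = math.radians(degrees)
    cos, sin = math.cos(theta), math.sin(theta)
    # 2x3 affine matrix in the normalized [-1, 1] coordinates that affine_grid expects.
    matrix = torch.tensor([[cos, -sin, 0.0],
                           [sin,  cos, 0.0]],
                          dtype=images.dtype, device=images.device)
    matrix = matrix.unsqueeze(0).repeat(images.size(0), 1, 1)  # (N, 2, 3)
    # Build the sampling grid, then resample the input once.
    grid = F.affine_grid(matrix, images.size(), align_corners=False)
    return F.grid_sample(images, grid, mode="bilinear",
                         padding_mode="zeros", align_corners=False)

images = torch.rand(2, 3, 16, 16)
rotated = rotate_batch(images, 30.0)
unchanged = rotate_batch(images, 0.0)
```

Since everything stays in torch, the same function runs on the gpu when images lives there.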


(Alex Rogozhnikov) #5

It seems that these lines are not needed.
Thanks for a neat implementation!


(Royi) #6

Is there a built-in way to do it in PyTorch?


(Royi) #7

For now, using PIL, I defined a lambda function:

# draw a scalar angle uniformly from [-imageRotAngle, imageRotAngle] degrees
imageRotate = lambda mI: mI.rotate(np.random.uniform(-imageRotAngle, imageRotAngle))

trainSetTransform = transforms.Compose([transforms.RandomCrop(28, padding = imageCropPad), transforms.Lambda(imageRotate), transforms.ToTensor()])

On MNIST data, this shifts and rotates the image.
I think rotation should be built into PyTorch.
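The same idea as a small callable class, which can be a bit easier to reuse than a lambda. This is a sketch using only PIL and the standard library. RandomRotate is a hypothetical name. Later torchvision releases also include transforms.RandomRotation.

```python
import random
from PIL import Image

class RandomRotate:
    """Rotate a PIL image by an angle drawn uniformly from [-max_angle, max_angle] degrees."""
    def __init__(self, max_angle):
        self.max_angle = max_angle

    def __call__(self, image):
        angle = random.uniform(-self.max_angle, self.max_angle)
        # rotate keeps the original image size by default (expand=False).
        return image.rotate(angle)

image = Image.new("L", (28, 28))  # blank MNIST-sized grayscale image
rotated = RandomRotate(15)(image)
```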


(Will) #8

Thank you for your effort.
Could you please show how to apply it in transforms.Compose?
I tried it like this:

transforms.Compose([
    transforms.RandomSizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transform.Affine(rotation_range=10, translation_range=0.1),
    transforms.ToTensor()])

but ended up with an error: TypeError: 'tuple' object is not callable.

Do you know why?

My PyTorch version is 0.3.
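The snippet alone doesn't show what caused the error. As background, a pipeline like Compose simply calls each entry in order, so every entry has to be a callable that accepts the output of the previous one. Below is a sketch with a stand-in Compose (hypothetical, not torchvision's class) showing one way this exact TypeError can arise: a tuple ending up in the list where a transform is expected.

```python
class Compose:
    """Stand-in that mirrors how a transform pipeline calls each step in order."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        for t in self.transforms:
            x = t(x)
        return x

class AddOffset:
    """Toy transform: any object with __call__ can sit in the pipeline."""
    def __init__(self, offset):
        self.offset = offset

    def __call__(self, x):
        return x + self.offset

# Every entry is callable, so the pipeline runs.
good = Compose([AddOffset(1), AddOffset(10)])
result = good(0)

# A tuple slipped into the list is not callable and fails when the pipeline runs.
bad = Compose([AddOffset(1), (0.1, 0.1)])
try:
    bad(0)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```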