Can someone help with porting this Torch snippet to PyTorch?

Hi, I'm trying to train on ImageNet the same way it is done in Torch (based on this). All of the transformations are available in PyTorch except one: the AlexNet-style lighting method.
The Torch implementation can be found at fb.resnet.torch/datasets/transforms.lua#L183
However, I don't know how to port this to PyTorch!
Here is the Torch implementation:

-- Lighting noise (AlexNet-style PCA-based noise)
function M.Lighting(alphastd, eigval, eigvec)
   return function(input)
      if alphastd == 0 then
         return input
      end

      local alpha = torch.Tensor(3):normal(0, alphastd)
      local rgb = eigvec:clone()
         :cmul(alpha:view(1, 3):expand(3, 3))
         :cmul(eigval:view(1, 3):expand(3, 3))
         :sum(2)
         :squeeze()

      input = input:clone()
      for i=1,3 do
         input[i]:add(rgb[i])
      end
      return input
   end
end

I found what seems to be a PyTorch port (preprocess.py), but it does not work and complains about this line:

alpha = img.new().resize_(3).normal_(0, self.alphastd)

and gives the error:

AttributeError: 'Image' object has no attribute 'new'

Here is what the port looks like:

__imagenet_pca = {
    'eigval': torch.Tensor([0.2175, 0.0188, 0.0045]),
    'eigvec': torch.Tensor([
        [-0.5675,  0.7192,  0.4009],
        [-0.5808, -0.0045, -0.8140],
        [-0.5836, -0.6948,  0.4203],
    ])
}

# Lighting data augmentation taken from here - https://github.com/eladhoffer/convNet.pytorch/blob/master/preprocess.py
class Lighting(object):
    """Lighting noise(AlexNet - style PCA - based noise)"""

    def __init__(self, alphastd, eigval, eigvec):
        self.alphastd = alphastd
        self.eigval = eigval
        self.eigvec = eigvec

    def __call__(self, img):
        if self.alphastd == 0:
            return img

        alpha = img.new().resize_(3).normal_(0, self.alphastd)
        rgb = self.eigvec.type_as(img).clone()\
            .mul(alpha.view(1, 3).expand(3, 3))\
            .mul(self.eigval.view(1, 3).expand(3, 3))\
            .sum(1).squeeze()
        return img.add(rgb.view(3, 1, 1).expand_as(img))

Can anyone please help me with this?
Thanks a lot in advance.

img seems to be a PIL.Image. Try converting it to a tensor before the new() call, e.g. with:

import numpy as np
import torch
import torchvision.transforms.functional as TF

img = torch.from_numpy(np.array(img))
# or
img = TF.to_tensor(img)

The former approach keeps the data as it was loaded (uint8 in the range [0, 255], laid out as H x W x C), while the latter rescales the image to [0, 1], casts it to float, and moves the channels first (C x H x W).

Thank you very much. That indeed helped me spot the culprit!
I had to use transforms.ToTensor() before calling the Lighting() transform. Without it, a PIL image was passed to Lighting(), which expects a tensor, and that is why the error occurred.
So it should have been like this:

    train_tfms = transforms.Compose([
            transforms.RandomResizedCrop(size),
            transforms.RandomHorizontalFlip(),
            transforms.ColorJitter(.4,.4,.4),
            transforms.ToTensor(),
            Lighting(0.1, __imagenet_pca['eigval'], __imagenet_pca['eigvec']),
            normalize,
        ])
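
For anyone landing here later, below is a minimal sketch of the same lighting noise written with plain tensor operations. It is not the code from preprocess.py: the class name `LightingSketch` and the optional `alpha` argument (there only so the random draw can be fixed for checking) are my own assumptions. It assumes the input is a float tensor of shape 3 x H x W, i.e. that it runs after transforms.ToTensor(). The offset added to channel i is the sum over j of eigvec[i, j] * alpha[j] * eigval[j], which is what the Lua version computes with cmul, sum(2), and squeeze.

```python
import torch


class LightingSketch:
    """AlexNet-style PCA lighting noise for a float 3 x H x W tensor (illustrative)."""

    def __init__(self, alphastd, eigval, eigvec):
        self.alphastd = alphastd
        self.eigval = eigval
        self.eigvec = eigvec

    def __call__(self, img, alpha=None):
        if self.alphastd == 0:
            return img
        if alpha is None:
            # One random coefficient per principal component, drawn from N(0, alphastd^2).
            alpha = torch.randn(3, dtype=img.dtype, device=img.device) * self.alphastd
        # Match the dtype and device of the image.
        eigval = self.eigval.to(img)
        eigvec = self.eigvec.to(img)
        # rgb[i] = sum_j eigvec[i, j] * alpha[j] * eigval[j]
        rgb = eigvec @ (alpha * eigval)
        # Broadcast the per-channel offset over height and width; returns a new tensor.
        return img + rgb.view(3, 1, 1)


# Usage with the ImageNet PCA values quoted earlier in the thread.
eigval = torch.tensor([0.2175, 0.0188, 0.0045])
eigvec = torch.tensor([
    [-0.5675,  0.7192,  0.4009],
    [-0.5808, -0.0045, -0.8140],
    [-0.5836, -0.6948,  0.4203],
])
img = torch.rand(3, 8, 8)  # stand-in for the output of transforms.ToTensor()
out = LightingSketch(0.1, eigval, eigvec)(img)
```

The matrix-vector product replaces the expand/mul/sum chain but should give the same per-channel offsets.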