Pre-processing using torchvision.transforms.functional

I want to use torchvision.transforms.functional functions, like adjust_gamma and so on. Should I add them into def __getitem__(self, idx)?

If you would like to use adjust_gamma on the data samples, yes, you can add it to your __getitem__ like any other transformation.
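E.g. a minimal sketch (the dataset class and the paths list are made up for illustration):

import torchvision.transforms.functional as TF
from torch.utils.data import Dataset
from PIL import Image

class GammaDataset(Dataset):
    # hypothetical dataset; `paths` is a list of image file paths
    def __init__(self, paths):
        self.paths = paths

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = Image.open(self.paths[idx])
        img = TF.adjust_gamma(img, gamma=0.8)  # functional transform applied directly
        return TF.to_tensor(img)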

So all functional transforms should be added to __getitem__, while the torchvision.transforms classes are added using Compose?

transforms.Compose is one way to just apply a sequence of transformations sequentially.
It really depends on your use case.
E.g. the functional transforms API is useful for segmentation use cases, as you can apply “random” transformations on the data and target using the same random parameters.
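E.g. a minimal sketch of a paired random flip (assuming image and mask are PIL images or tensors):

import random
import torchvision.transforms.functional as TF

def paired_flip(image, mask):
    # one random draw decides the flip for both image and mask
    if random.random() > 0.5:
        image = TF.hflip(image)
        mask = TF.hflip(mask)
    return image, mask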

Actually, yes, my case is segmentation. Will using both of the transforms give a better result?

By “using both of the transforms” do you mean applying the transformation on the image and target?
If so, that would be necessary, since otherwise you would e.g. randomly crop the image and target with different parameters, thus destroying the correspondence between the two.

Have a look at this example.

No, I mean using transforms.Compose and the functional transforms together.

You can use a mix of both or just the functional API.
The results won’t differ, as it’s basically just coding style.
Internally the torchvision.transforms methods call their functional counterpart.

Just make sure to apply random transformations via the functional API for a segmentation use case.

I am trying to add normalization using the functional API, but it doesn't seem to be the right method, I think:

if random.random() > 0.5:
    image = TF.normalize(image, 0.34, 0.27, inplace=False)
    mask = TF.normalize(mask, 0.08, 0.28, inplace=False)

Currently you are randomly normalizing the data. Are you sure you would like to apply it randomly?

If image and mask are single channel images, you should pass the mean and std as:

image = TF.normalize(image, (0.34, ), (0.27, ), inplace=False)

Also, if you are working on a segmentation use case, the mask shouldn’t be normalized, since this will corrupt your class labels.

No, I do not want to apply it randomly; I would like to normalize the full dataset. I am more familiar with the torchvision.transforms methods, but you mentioned that the functional API is better for segmentation tasks. Also, when I tried torchvision.transforms with a UNet model it gave me a negative loss, while when I used the functional API the loss was low at the beginning and then started to be NaN.

In that case just apply TF.normalize on your data without the if-clause.
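I.e. something like this (reusing your statistics; the mask stays untouched so the class labels remain intact):

image = TF.normalize(image, (0.34,), (0.27,), inplace=False)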

Yes, if you would like to randomly apply transformations, e.g. transforms.RandomCrop, you should use the functional API instead, as you can sample the parameters and apply the same parameters on the image as well as the mask.
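E.g. a sketch using RandomCrop.get_params to sample the crop parameters once and reuse them (the function name is made up):

import torchvision.transforms as transforms
import torchvision.transforms.functional as TF

def paired_random_crop(image, mask, size=(224, 224)):
    # sample the crop parameters once ...
    i, j, h, w = transforms.RandomCrop.get_params(image, output_size=size)
    # ... and apply the same crop to both image and mask
    image = TF.crop(image, i, j, h, w)
    mask = TF.crop(mask, i, j, h, w)
    return image, mask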

The torchvision.transforms modules internally call their functional methods, so if you've applied exactly the same transformations, both should yield the same results.
However, if it’s possible, you could post the code so that we could have a look.

Thank you so much.
The code is too long to write here, so I am sharing the Google Colab link with you:
https://colab.research.google.com/drive/1cwJPCta7-2FXg9mOLQqzRViye5KevDuU

Thanks for the code!
It looks like you are dealing with RGB images.
Using a single value in TF.normalize will only normalize the first channel, so you might want to provide 3 values for the mean and stddev.
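E.g. (the per-channel values here are placeholders; compute the real statistics from your dataset):

image = TF.normalize(image, (0.34, 0.34, 0.34), (0.27, 0.27, 0.27), inplace=False)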

Thank you!
I corrected the normalization for the 3 channels, but I am still having loss = NaN.

Do you get the NaNs using both approaches or just using the functional API?
Could you check your input for NaN values using input != input?

Just using the functional API.

Type: SList
String form: [‘/bin/bash: input: command not found’]
Length: 1
File: /usr/local/lib/python3.6/dist-packages/IPython/utils/text.py
Docstring:
List derivative with a special access attributes.

These are normal lists, but with the special attributes:

  • .l (or .list) : value as list (the list itself).
  • .n (or .nlstr): value as a string, joined on newlines.
  • .s (or .spstr): value as a string, joined on spaces.
  • .p (or .paths): list of path objects (requires path.py package)

Any values which require transformations are computed only once and
cached.

You should run the code without the question mark:

import numpy as np
import torch

a = np.array([0, 1, np.nan])
x = torch.from_numpy(a)
print((x != x).any())
> tensor(1, dtype=torch.uint8)

This can be used as a condition to check for NaNs in your input data.
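E.g. a minimal sketch looping over your dataset (the names follow your posted code; adjust as needed):

from torch.utils.data import DataLoader

loader = DataLoader(full_dataset, batch_size=2)
for i, (img, mask) in enumerate(loader):
    # NaN is the only value that compares unequal to itself
    if (img != img).any() or (mask != mask).any():
        print('NaN found in batch {}'.format(i))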

Anyway, since the torchvision.transforms.Normalize module doesn’t create NaNs, your input data should be fine.
Could you post the code using the Normalize module so that I could debug both code snippets?

import os
from PIL import Image
from torch.utils import data
import torchvision.transforms as tfms

bs = 2
num_epochs = 100
learning_rate = 1e-3
mom = 0.9

class MYDataLoader(data.Dataset):
    def __init__(self, root_dir, seg_dir, transforms=None):
        self.root_dir = root_dir
        self.seg_dir = seg_dir
        self.transforms = transforms
        # sort both listings so images and labels stay paired by index
        self.files = sorted(os.listdir(self.root_dir))
        self.labels = sorted(os.listdir(self.seg_dir))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        img_name = self.files[idx]
        label_name = self.labels[idx]
        img = Image.open(os.path.join(self.root_dir, img_name))
        label = Image.open(os.path.join(self.seg_dir, label_name))
        if self.transforms:
            img = self.transforms(img)
            label = self.transforms(label)
        return img, label

full_dataset = MYDataLoader('/data/training/images',
                            '/data/training/labels',
                            transforms=tfms.Compose([
                                tfms.Resize((256, 256)),
                                tfms.ColorJitter(hue=.05, saturation=.05),
                                tfms.RandomHorizontalFlip(),
                                tfms.Grayscale(num_output_channels=1),
                                tfms.ToTensor(),
                                tfms.Normalize([0.5], [0.5])
                            ]))

This is the code with the Normalize module.