Making augmentation transform dependent on epoch number

Dear all,

I have a question about data augmentation with the DataLoader. I want to apply a transformation with a certain probability p that decreases either as the epochs progress or when the loss is higher than a certain threshold. However, due to the concurrent nature of the DataLoader workers, I cannot figure out how to apply this to the transform class, as I don’t know how to get the epoch or loss as inputs to my function. So my questions are:

  1. Is it possible to apply augmentation with a probability that changes based on the epoch number or the loss function?
  2. How could it be implemented?

Below is the pseudocode of my transform function:

import numbers
import random

class custom_transform(object):
    def __init__(self, size, p):
        if isinstance(size, numbers.Number):
            self.size = (int(size), int(size))
        else:
            self.size = size
        self.p = p

    def __call__(self, inputs, target):
        # apply the augmentation with probability p
        if random.random() < self.p:
            # GENERATE TRANSFORMED inputs, target
            return inputs, target
        else:
            return inputs, target

@albanD @ptrblck your ideas are very welcome!

Many Thanks,
Stefano

Hey,

I can’t think of a built-in way to do this. But you should be able to have a global variable, shared by all the workers, that contains this p: the workers would read it, and the main training process would set it.
Maybe a 1-element Tensor will work, as its storage can be shared across the processes.
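
For what it’s worth, here is a minimal sketch of that idea (ToyDataset is a hypothetical stand-in for your dataset, and the linear decay schedule is just a placeholder; since p becomes a 1-element tensor, the comparison random.random() < self.p in your __call__ still works, though self.p.item() would be more explicit):

import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):  # hypothetical stand-in for your dataset
    def __init__(self, transform):
        self.transform = transform

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        inputs, target = torch.randn(3, 4, 4), torch.tensor(0)
        return self.transform(inputs, target)

# 1-element tensor holding the current probability; share_memory_()
# moves its storage to shared memory, so updates made by the main
# process are visible to the DataLoader workers
p = torch.tensor([1.0])
p.share_memory_()

transform = custom_transform(size=4, p=p)
loader = DataLoader(ToyDataset(transform), batch_size=4, num_workers=2)

num_epochs = 10
for epoch in range(num_epochs):
    # placeholder schedule: decay p linearly over the epochs; you could
    # instead set it from the last epoch's loss
    p.fill_(max(0.0, 1.0 - epoch / num_epochs))
    for inputs, target in loader:
        pass  # training step

Since the tensor is updated in place, this should also let you change p mid-epoch, e.g. based on the current loss.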


@albanD’s idea sounds good.

If your workflow allows you to change the probability only after each epoch has finished, you could also directly manipulate loader.dataset. At the beginning of each epoch, the workers of the DataLoader are respawned and will thus use the changed dataset.
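
For example (a minimal sketch reusing the hypothetical ToyDataset from the previous post; it assumes the dataset exposes the transform as .transform and that persistent_workers is left at its default of False, so the workers really are re-created each epoch):

from torch.utils.data import DataLoader

dataset = ToyDataset(custom_transform(size=4, p=1.0))  # p stays a plain float here
loader = DataLoader(dataset, batch_size=4, num_workers=2)

num_epochs = 10
for epoch in range(num_epochs):
    # loader.dataset is the same object that was passed in; the workers
    # are re-created when iteration starts and pick up the new value
    loader.dataset.transform.p = max(0.0, 1.0 - epoch / num_epochs)
    for inputs, target in loader:
        pass  # training step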
