How to add noise to MNIST dataset when using pytorch

copyrightly · November 1, 2019, 6:28am

I want to add noise to MNIST. I am using the following code to read the dataset:

import torch
from torchvision import datasets, transforms

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('../data', train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ])),
    batch_size=64, shuffle=True)

I’m not sure how to add (gaussian) noise to each image in MNIST.

ptrblck · November 1, 2019, 12:33pm

You could create a custom transformation:

class AddGaussianNoise(object):
    def __init__(self, mean=0., std=1.):
        self.std = std
        self.mean = mean
        
    def __call__(self, tensor):
        return tensor + torch.randn(tensor.size()) * self.std + self.mean
    
    def __repr__(self):
        return self.__class__.__name__ + '(mean={0}, std={1})'.format(self.mean, self.std)

and just add it to transforms.Compose:

transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
    AddGaussianNoise(0., 1.)
])

copyrightly · November 12, 2019, 2:46am

Hi Ptrblck, may I ask another question? What if I want to add noise to just a fraction of the training samples, not all of them? Thank you!

ptrblck · November 12, 2019, 2:49am

If you would like to add it randomly, you could specify a probability inside the transformation and pass this probability while instantiating it.
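
A minimal sketch of the probability-based variant, assuming a hypothetical class name `RandomGaussianNoise` and a hypothetical parameter `p` (neither appears elsewhere in this thread):

```python
import torch


class RandomGaussianNoise:
    """Add Gaussian noise to a tensor with probability p (hypothetical name)."""

    def __init__(self, mean=0.0, std=1.0, p=0.5):
        self.mean = mean
        self.std = std
        self.p = p

    def __call__(self, tensor):
        # Draw one uniform number per call; leave the sample untouched
        # with probability 1 - p
        if torch.rand(1).item() >= self.p:
            return tensor
        return tensor + torch.randn_like(tensor) * self.std + self.mean

    def __repr__(self):
        return f"{self.__class__.__name__}(mean={self.mean}, std={self.std}, p={self.p})"
```

An instance such as `RandomGaussianNoise(0., 1., p=0.3)` could then be placed in `transforms.Compose` in the same position as `AddGaussianNoise`.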

On the other hand, if you would like to apply it just for specified data indices, you might need to apply the noise inside the training loop and use the data index (return the index additionally in your Dataset).
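
The index-based variant could look roughly like the sketch below. The wrapper name `IndexedDataset`, the `noisy_indices` tensor, and the stand-in zero data are all assumptions made for illustration; in practice the wrapped dataset would be `datasets.MNIST(...)`.

```python
import torch
from torch.utils.data import DataLoader, Dataset, TensorDataset


class IndexedDataset(Dataset):
    """Wrap a dataset so each item also carries its index (hypothetical helper)."""

    def __init__(self, base):
        self.base = base

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        data, target = self.base[idx]
        return data, target, idx


# Stand-in data with MNIST-like shapes (assumption, to avoid a download)
base = TensorDataset(torch.zeros(8, 1, 28, 28), torch.zeros(8, dtype=torch.long))
loader = DataLoader(IndexedDataset(base), batch_size=4, shuffle=True)

noisy_indices = torch.tensor([0, 3, 5])  # assumed set of samples to corrupt
std = 1.0

for data, target, idx in loader:
    # Boolean mask over the batch: True where the sample index is in noisy_indices
    mask = torch.isin(idx, noisy_indices)
    data[mask] += torch.randn_like(data[mask]) * std
    # ... the forward and backward pass would follow here
```

Because the noise is applied to the collated batch, the underlying dataset stays unchanged and fresh noise is drawn every epoch.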

copyrightly · November 12, 2019, 5:54am

I tried to add it randomly and used the following code:
transforms.RandomApply(AddGaussianNoise(args.mean, args.std), p=0.5)
But I received an error:

assert isinstance(transforms, (list, tuple))
AssertionError

Is there anything wrong with my code? Thank you!

ptrblck · November 12, 2019, 5:38pm

Could you try to pass AddGaussianNoise as a list or tuple?

copyrightly · November 12, 2019, 9:02pm

Thank you! I changed it to [AddGaussianNoise(args.mean, args.std)]. Now it works!

saba · February 10, 2020, 2:43am

Hi Dear Ptrblck,

I have a question. I want to add noise to my original training dataset to get a more robust model. Is it better to add the noise after data normalization or before it? My normalization is zero mean and unit variance.

ptrblck · February 10, 2020, 2:44am

I would probably add it after the normalization, as you can easily define the standard deviation and mean of your (white) noise.

saba · February 10, 2020, 2:46am

After normalization, each value of the noise has a different effect on the training, but before normalization the effect of the noise on the training is the same, isn't it?

ptrblck · February 10, 2020, 2:47am

The effect would be the same, but I think it might be easier to define the noise relative to your samples if each data sample already has zero mean and unit variance.
Anyway, I don’t think it should make a difference if you define the noise using the mean of the unnormalized inputs and their stddev.
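
The equivalence can be checked numerically. With normalization (x - mean) / std, noise of standard deviation s added before normalization corresponds to noise of standard deviation s / std added afterwards. The sketch below uses the MNIST statistics from this thread; the noise level `noise_std` and the random stand-in image are assumptions.

```python
import torch

mean, std = 0.1307, 0.3081   # MNIST statistics used in the thread
noise_std = 0.1              # assumed noise level in raw pixel units

x = torch.rand(1, 28, 28)    # stand-in for an image after ToTensor()
eps = torch.randn_like(x)    # one shared standard-normal sample

# Noise added before normalization, in raw pixel units
before = ((x + eps * noise_std) - mean) / std
# The same noise added after normalization, rescaled by the normalization std
after = (x - mean) / std + eps * (noise_std / std)
```

Both tensors agree up to floating-point error, so the choice mainly determines the units in which the noise level is specified.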

saba · February 10, 2020, 2:54am

My inputs are patches. Would it be good to define Gaussian noise with the same mean and std for each patch? The patches are 11117.

ptrblck · February 10, 2020, 3:00am

The approach sounds reasonable, but I can’t say whether it will work well or badly.
Feel free to share the results of your experiments.

saba · February 12, 2020, 5:10am

Hi Ptrblck,

I want to create a 3-dimensional Gaussian with a defined size and standard deviation. My code in MATLAB is:

```
 % Gaussian3D Creates 3-D Gaussian Kernel of specified
 % width sigma_array=[sigma_x, sigma_y, sigma_z] with
 % profile length of size_array=[size_x, size_y, size_z]
 a1=t1;
 b1=t2;
 c1=t3;
 if all(sigma_array > 0) && length(size_array) == length(sigma_array)
  % Make one 1D Gaussian kernel per dimension
  sigma_x=sigma_array(1);
  size_x=size_array(1);
  sigma_y=sigma_array(2);
  size_y=size_array(2);
  sigma_z=sigma_array(3);
  size_z=size_array(3);

  Kx = fspecial('gaussian', [1 round(size_x)], sigma_x);
  Ky = fspecial('gaussian', [1 round(size_y)], sigma_y);
  Kz = fspecial('gaussian', [1 round(size_z)], sigma_z);
  % Since the Gaussian kernel is separable, orient each 1D kernel
  % along its own axis and combine them with convn
  G=convn(reshape(Kz,1,1,[]),convn(Kx(:),Ky));
 end
end
```

Would you please tell me what the equivalents of convn and fspecial are in PyTorch?

ptrblck · February 12, 2020, 6:49am

You could use torch.distributions.multivariate_normal.MultivariateNormal or alternatively sample from torch.randn and scale with the stddev as well as shift with the mean.
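
If the goal is the Gaussian kernel itself rather than random samples, one possible sketch builds it as an outer product of three normalized 1-D Gaussians, which is roughly the role fspecial and convn play in the MATLAB snippet. The helper names `gaussian_1d` and `gaussian_3d` and the example sizes are assumptions, not an existing PyTorch API.

```python
import torch


def gaussian_1d(size, sigma):
    """Normalized 1-D Gaussian of the given length, centered on the middle sample."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords**2 / (2 * sigma**2))
    return g / g.sum()


def gaussian_3d(size_array, sigma_array):
    """Outer product of three 1-D Gaussians (hypothetical helper)."""
    kx, ky, kz = (gaussian_1d(n, s) for n, s in zip(size_array, sigma_array))
    # Broadcasting builds the separable kernel: G[i, j, k] = kx[i] * ky[j] * kz[k]
    return kx[:, None, None] * ky[None, :, None] * kz[None, None, :]


G = gaussian_3d([11, 11, 7], [1.5, 1.5, 1.5])
```

To filter a volume with this kernel, `G[None, None]` can be passed as the weight to `torch.nn.functional.conv3d`, which expects weights of shape (out_channels, in_channels, D, H, W).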

saba · February 12, 2020, 6:49am

Hi Ptrblck,

I wrote this code for a Gaussian in PyTorch, but I cannot see my Gaussian.

```
import torch
import numpy as np


def Gaussian3d(sigma_array, size_array):
    # grid coordinates 1..size_array along each axis
    size_array1 = torch.arange(1, size_array + 1, dtype=torch.float32)
    x, y, z = torch.meshgrid(size_array1, size_array1, size_array1, indexing="ij")

    # shift the coordinates so the grid is centered on zero
    x = x - size_array / 2 - 0.5
    y = y - size_array / 2 - 0.5
    z = z - size_array / 2 - 0.5

    G = torch.exp(-(x**2 / (2 * sigma_array[0]**2)
                    + y**2 / (2 * sigma_array[1]**2)
                    + z**2 / (2 * sigma_array[2]**2)))
    return G


# call the function only after it has been defined
sigma_array = np.array([.5, .5, .5])
size_array = 11
G = Gaussian3d(sigma_array, size_array)
```

Do you see any problem?

saba · February 12, 2020, 6:53am

Is it right now?

import torch
import numpy as np


def Gaussian3d(sigma_array, size_array):
    # coordinates -5..5 are already centered on zero,
    # so no further shift by size_array/2 is applied
    half = size_array // 2
    size_array1 = torch.arange(-half, half + 1, dtype=torch.float32)
    x, y, z = torch.meshgrid(size_array1, size_array1, size_array1, indexing="ij")

    G = torch.exp(-(x**2 / (2 * sigma_array[0]**2)
                    + y**2 / (2 * sigma_array[1]**2)
                    + z**2 / (2 * sigma_array[2]**2)))
    return G


# call the function only after it has been defined
sigma_array = np.array([1.5, 1.5, 1.5])
size_array = 11
G = Gaussian3d(sigma_array, size_array)

saba · February 12, 2020, 7:32am

Many thanks for your reply. Sorry, I need to find the local maxima in 3 dimensions. In MATLAB I use “imregionalmax”. My input is 12022080, and the output is a binary array with the same size as the input, which means that wherever it is 1 there is a local maximum in the input.

What is the equivalent in PyTorch? I need the same output, that is, a binary 3D array that is 1 wherever there is a local maximum in the input.

I really appreciate your help
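
One common approximation, sketched here under assumptions, compares each voxel with the maximum of its 3x3x3 neighborhood using max pooling. The function name `local_maxima_3d` is hypothetical, and the result is not guaranteed to match imregionalmax on flat plateaus, where every voxel of a constant region is marked.

```python
import torch
import torch.nn.functional as F


def local_maxima_3d(volume):
    """Binary mask that is 1 where a voxel equals the max of its 3x3x3 neighborhood."""
    x = volume[None, None]  # add batch and channel dimensions
    pooled = F.max_pool3d(x, kernel_size=3, stride=1, padding=1)
    return (x == pooled)[0, 0].to(torch.uint8)


torch.manual_seed(0)
vol = torch.rand(6, 7, 8)  # random stand-in volume (assumption)
mask = local_maxima_3d(vol)
```

The mask has the same shape as the input volume, and the global maximum is always among the marked voxels.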

Ripley · October 7, 2020, 2:20am

Hi, I saw your solution and it helps a lot! Thank you so much! However, I am quite new to Python, starting from zero knowledge. Would you be able to explain what the function under __call__ does, and in general what the noise adjusts?

Oh, and also, will adjusting the mean and std affect the normalization of the image when we pass it into our dataloader? Thank you!

ptrblck · October 7, 2020, 3:49am

AddGaussianNoise adds Gaussian noise with the specified mean and std to the input tensor during the preprocessing of the data.
torch.randn creates a tensor filled with random numbers from the standard normal distribution (zero mean, unit variance) as described in the docs. In AddGaussianNoise.__call__ this noise tensor will be multiplied with self.std and self.mean will be added to scale and shift the distribution.
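
The scale-and-shift step can be checked empirically. In this small sketch the values of `mean` and `std` are chosen only for illustration:

```python
import torch

torch.manual_seed(0)
mean, std = 2.0, 0.5  # example values, chosen for illustration

# torch.randn samples from N(0, 1); multiplying by std scales the spread
# and adding mean shifts the center
noise = torch.randn(100_000) * std + mean

empirical_mean = noise.mean().item()
empirical_std = noise.std().item()
```

With this many samples, the empirical mean and standard deviation land close to the requested 2.0 and 0.5.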