How to add noise to the MNIST dataset when using PyTorch

Thank you for the prompt response!

Hi ptrblck,
recently I came across federated learning with differential privacy, which involves adding noise.
def train(args, model_bob, model_alice, device, federated_train_loader, epoch):
    optimizer_bob = optim.SGD(model_bob.parameters(), lr=args.lr)
    optimizer_alice = optim.SGD(model_alice.parameters(), lr=args.lr)
    model_bob.train()
    model_alice.train()
    # data_bob / data_alice (presumably the two workers' data splits) are not defined in this snippet,
    # even though federated_train_loader is passed in
    for batch_idx, ((input_bob, target_bob), (input_alice, target_alice)) in enumerate(zip(data_bob, data_alice)):

        input_bob, target_bob, input_alice, target_alice = input_bob.to(device), target_bob.to(device),\
                                                           input_alice.to(device), target_alice.to(device)
        optimizer_bob.zero_grad()
        optimizer_alice.zero_grad()
        # print(input_bob.size())
        output_bob = model_bob(input_bob)
        output_alice = model_alice(input_alice)
        loss_bob = F.nll_loss(output_bob, target_bob)
        loss_alice = F.nll_loss(output_alice, target_alice)
        loss_bob.backward()
        loss_alice.backward()
        for pram_bob, pram_alice in zip(model_bob.parameters(), model_alice.parameters()):
            # print(pram_bob.grad.size())
            grad_avg = (pram_bob.grad + pram_alice.grad) / 2.0
            grad_avg = grad_avg + torch.rand_like(grad_avg) * 0.0173
            pram_bob.grad = grad_avg
            pram_alice.grad = grad_avg

        optimizer_bob.step()
        optimizer_alice.step()

        print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss_bob: {:.6f}\tLoss_alice: {:.6f}'.format(
            epoch, batch_idx * args.batch_size, len(data_bob) * args.batch_size,
            100. * batch_idx / len(data_bob), loss_bob.item(), loss_alice.item()))

This was the train function.
Do you know what they are doing to the gradients here?
They are averaging them and doing some calculation which I wasn't able to understand.

I’m unfortunately not familiar enough with federated learning approaches and don’t know how the noise addition was calculated or why the gradients are averaged in the first place.
Assuming you were using this code from a source code repository, you might want to ask the authors of the implementation (and share the response here if possible :wink: ).
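Purely mechanically, though, the snippet averages the two models' per-parameter gradients, adds a small amount of uniform noise (torch.rand_like scaled by 0.0173), and writes the same noisy averaged gradient back to both models before their optimizers step. A minimal standalone sketch of just that step, with toy models standing in for model_bob / model_alice:

import torch
import torch.nn as nn

# two toy models with identical architectures (stand-ins for model_bob / model_alice)
model_bob = nn.Linear(4, 2)
model_alice = nn.Linear(4, 2)

# fake losses just to populate the .grad attributes
x = torch.randn(8, 4)
model_bob(x).sum().backward()
model_alice(x).sum().backward()

noise_scale = 0.0173  # same constant as in the quoted code
for p_bob, p_alice in zip(model_bob.parameters(), model_alice.parameters()):
    # 1) average the two per-parameter gradients element-wise
    grad_avg = (p_bob.grad + p_alice.grad) / 2.0
    # 2) add noise; note that torch.rand_like samples uniformly from [0, 1),
    #    so this is scaled uniform noise, not Gaussian noise
    grad_avg = grad_avg + torch.rand_like(grad_avg) * noise_scale
    # 3) write the same noisy averaged gradient back to both models
    #    before optimizer_bob.step() / optimizer_alice.step() are called
    p_bob.grad = grad_avg
    p_alice.grad = grad_avg.clone()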

Hello ptrblck,

I have a question about adding noise to the test images. Do you know which kinds of noise would cause more misclassifications? I applied Gaussian noise and it doesn't seem to increase the misclassification rate.

No, I don't know which "type" of noise would hurt the classification the most, but I would expect to see a decrease in accuracy once the noise level in the input images is large enough.
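One way to check this empirically is to evaluate a trained classifier on the test set several times while increasing the std of the added noise and watching the accuracy. A rough sketch, assuming a trained MNIST classifier called model and an AddGaussianNoise transform like the one used later in this thread (the std values are made up):

import torch
from torchvision import datasets, transforms

class AddGaussianNoise:
    """Adds N(mean, std^2) noise to a tensor image."""
    def __init__(self, mean=0., std=1.):
        self.mean = mean
        self.std = std
    def __call__(self, tensor):
        return tensor + torch.randn_like(tensor) * self.std + self.mean

@torch.no_grad()
def test_accuracy(model, std, device='cpu'):
    transform = transforms.Compose([
        transforms.ToTensor(),
        AddGaussianNoise(0., std),
    ])
    testset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
    loader = torch.utils.data.DataLoader(testset, batch_size=256)
    correct = 0
    for data, target in loader:
        data, target = data.to(device), target.to(device)
        pred = model(data).argmax(dim=1)
        correct += (pred == target).sum().item()
    return correct / len(testset)

# accuracy should start dropping once the std gets large enough
# for std in [0.0, 0.1, 0.3, 0.5, 1.0]:
#     print(std, test_accuracy(model, std))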

Thank you for your reply

I'm trying to add increasing amounts of Gaussian noise, but for some reason I keep getting an error and only one image prints. Is there something wrong with my code?

for x in range(10):
    testset = dsets.MNIST(root='./data', train=False, download=False, transform=transforms.Compose([
        transforms.ToTensor(),
        AddGaussianNoise(0, x*.01)
    ]))

    imageAlt, _ = testset[x]
    plt.imshow(imageAlt.numpy()[0], cmap='gray')

Could you post the error message you are seeing?

I omitted some things like file names but these are the actual error message parts.

UserWarning: train_data has been renamed data
warnings.warn("train_data has been renamed data")
UserWarning: train_labels has been renamed targets
warnings.warn("train_labels has been renamed targets")
ConvergenceWarning: lbfgs failed to converge (status=1):
STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.
Increase the number of iterations (max_iter) or scale the data as shown in:
6.3. Preprocessing data — scikit-learn 1.3.0 documentation
Please also refer to the documentation for alternative solver options:
1.1. Linear Models — scikit-learn 1.3.0 documentation
n_iter_i = _check_optimize_result(

The warnings are explaining that the internal dataset.train_data was renamed to .data, so you might want to change your code as it could break in future releases.
The last error message points to a convergence issue using LBFGS, so you might need to fine-tune the hyperparameters of this optimizer.
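A hedged sketch of both fixes, assuming your script accesses dataset.train_data / dataset.train_labels somewhere and that the ConvergenceWarning comes from scikit-learn's LogisticRegression with the lbfgs solver (the names dataset, images, labels, and the max_iter value are placeholders):

# the old attribute names still work but warn; the new ones are .data and .targets
images = dataset.data        # instead of dataset.train_data
labels = dataset.targets     # instead of dataset.train_labels

# if the ConvergenceWarning comes from scikit-learn's LogisticRegression (lbfgs solver),
# increasing max_iter and/or scaling the flattened images usually helps
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(images.reshape(len(images), -1).numpy())
clf = LogisticRegression(solver='lbfgs', max_iter=1000).fit(X, labels.numpy())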

for x in range(10):
    testset = dsets.MNIST(root='./data', train=False, download=False, transform=transforms.Compose([
        transforms.ToTensor(),
        AddGaussianNoise(0, x*.01)
    ]))

    imageAlt, _ = testset[0]
    plt.imshow(imageAlt.numpy()[0], cmap='gray')

I got rid of all the errors, but I still only get one altered MNIST image. I tried to do it outside a for loop and only the last image prints. Is it possible that plt.imshow can only plot one image per run? If so, is there an alternative?

For anybody else who runs into the same problem with plt.imshow: just adding plt.figure() before each call makes it work.

for x in range(10):
    # use cross entropy to measure accuracy, or false positives vs. positives
    # dnr should be power of deterministic signal / (power of deterministic signal + noise)

    testset = dsets.MNIST(root='./data', train=False, download=False, transform=transforms.Compose([
        transforms.ToTensor(),
        AddGaussianNoise(0, x*.01)
    ]))

    plt.figure()
    imageAlt, _ = testset[0]
    plt.imshow(imageAlt.numpy()[0], cmap='gray')
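As an alternative to calling plt.figure() ten times, the noise levels can also be drawn into a single figure with subplots, reusing the same imports and the same AddGaussianNoise transform as above:

fig, axes = plt.subplots(2, 5, figsize=(10, 4))
for x, ax in enumerate(axes.flat):
    testset = dsets.MNIST(root='./data', train=False, download=False, transform=transforms.Compose([
        transforms.ToTensor(),
        AddGaussianNoise(0, x*.01)
    ]))
    imageAlt, _ = testset[0]
    ax.imshow(imageAlt.numpy()[0], cmap='gray')
    ax.set_title('std={:.2f}'.format(x*.01))
    ax.axis('off')
plt.show()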

Hi ptrblck,

I am working on a road-scenario dataset where there is an effect called blooming: light spreading vertically on an object (also above and below it) when the intensity is high, for example on road signs, number plates, and white sign boards (vertical noise).

To address this problem, I first want to add vertical noise above and below the objects in the COCO dataset, train on it, and then probably use the pretrained weights on my custom dataset.

Can you please tell me how to add noise only to particular objects, and how to add it only vertically, restricted to the object's width?

I would assume you can add the noise in the Dataset.__getitem__, where the sample (including the bounding boxes for the objects) is loaded. I don't have an example ready.
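To make that a bit more concrete, below is a rough, untested sketch of such a Dataset wrapper that adds Gaussian noise only above and below each object, restricted to the horizontal extent of its bounding box. The class name, the 'boxes' key, and the (x1, y1, x2, y2) pixel box format are assumptions about how your dataset returns its targets:

import torch
from torch.utils.data import Dataset

class VerticalBloomingNoise(Dataset):
    """Adds Gaussian noise above and below each object, restricted to the
    horizontal extent (width) of its bounding box."""
    def __init__(self, base_dataset, std=0.1):
        self.base = base_dataset
        self.std = std

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        # assumption: the wrapped dataset returns (image tensor [C, H, W], target dict
        # with 'boxes' as an [N, 4] tensor of (x1, y1, x2, y2) pixel coordinates),
        # and the image values are normalized to [0, 1]
        img, target = self.base[idx]
        img = img.clone()
        _, H, W = img.shape
        for box in target['boxes']:
            x1, y1, x2, y2 = [int(v) for v in box.tolist()]
            x1, x2 = max(x1, 0), min(x2, W)   # noise band covers the object width only
            y1, y2 = max(y1, 0), min(y2, H)
            # add noise above the box ...
            img[:, :y1, x1:x2] += torch.randn_like(img[:, :y1, x1:x2]) * self.std
            # ... and below the box
            img[:, y2:, x1:x2] += torch.randn_like(img[:, y2:, x1:x2]) * self.std
        return img.clamp(0, 1), target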