Inconsistent gradient values for the same input

I am trying to implement an iterative fast gradient sign adversarial attack on a model. When I run the code for the same model, I get different accuracies for the same image. On further inspection, I found that the gradient often differs for the same input. Is there any reason for this to happen?

runningAdv = torch.clamp(
    runningAdv + ((inputcopy.grad > deterministicEps).type(torch.FloatTensor) * self.perIterStep
                  - (inputcopy.grad < -deterministicEps).type(torch.FloatTensor) * self.perIterStep).to(self.opts.gpuid),
    -self.epsilon, self.epsilon)

Initially I set deterministicEps to 0, and when this caused differences I thought it might be due to precision errors, since the gradients were of very small magnitude (around 1e-13). However, even with deterministicEps set as high as 1e-8, the output is still variable.
The model is in model.eval() mode, and I generate the gradients as follows:

inp = inp.float().to(self.gpuid)
target = target.float().to(self.gpuid)
runningAdv = torch.zeros_like(inp).to(self.gpuid)
for i in range(self.niter):
    # Re-project the perturbed input into the valid image range and make it a fresh leaf tensor
    inputcopy = torch.clamp(inp + runningAdv, 0, 1)
    inputcopy.requires_grad = True
    out = self.model(inputcopy)
    loss = self.loss(out, target)
    loss.backward()
    # Take a signed step wherever the gradient magnitude exceeds deterministicEps,
    # then clamp the accumulated perturbation to [-epsilon, epsilon]
    runningAdv = torch.clamp(
        runningAdv + ((inputcopy.grad > deterministicEps).type(torch.FloatTensor) * self.perIterStep
                      - (inputcopy.grad < -deterministicEps).type(torch.FloatTensor) * self.perIterStep).to(self.opts.gpuid),
        -self.epsilon, self.epsilon)

Why should there be changes in the gradient for the same image? Even if the magnitude of the gradient is very small, shouldn't its value remain consistent when I use the same input and model?
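
To double check, this is roughly the comparison I run (model, loss_fn, inp and target are placeholders for my actual objects), and the gradients from two identical backward passes do come out different:

import torch

def input_grad(model, loss_fn, inp, target):
    # Fresh leaf tensor so the gradient is computed from scratch each time
    x = inp.clone().detach().requires_grad_(True)
    loss_fn(model(x), target).backward()
    return x.grad.detach().clone()

g1 = input_grad(model, loss_fn, inp, target)
g2 = input_grad(model, loss_fn, inp, target)
print((g1 - g2).abs().max())  # non-zero, even though the input and weights are identical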

Hi,

Are you using Float or Double tensors? Keep in mind that below 1e-7, a single-precision float is not precise, and such values should not be considered correct.
Some operations (especially on the GPU) are non-deterministic and so can give different results, where the difference is on the order of floating point precision: 1e-7 to 1e-8.
If you use cudnn, you can set cudnn.deterministic = True to get deterministic results.
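
Roughly, the setup looks like this (benchmark mode should also be disabled, otherwise cudnn may auto-tune and pick a different algorithm between runs):

import torch

torch.backends.cudnn.deterministic = True  # restrict cudnn to deterministic algorithms
torch.backends.cudnn.benchmark = False     # disable auto-tuning, which can change the chosen algorithm per run
torch.manual_seed(0)                       # fix the RNG if any randomness is involved

Note that this only covers cudnn; a few other CUDA operations can still be non-deterministic.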

Thanks, could you tell me more about cudnn deterministic mode (or point me to a source)? I want to use it for benchmarking and want to know how precise the output will be.

On converting the model and the input to double format, the execution time increases by around 6-10 times. Is there any way to avoid this problem without such a loss in performance?
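
For reference, the conversion itself is just along these lines (model, inp and target stand in for my actual objects):

model = model.double()   # promote all parameters and buffers to float64
inp = inp.double()       # input dtype has to match the parameter dtype
target = target.double()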

@sahil_shah Unfortunately, that depends on your hardware; you can see here for a more complete answer.

@Naman-ntc Getting deterministic results is tricky; there are many posts on this forum discussing it.
cudnn can internally use many algorithms: some are deterministic (running twice on the same hardware will give you the same results), some are not (running twice on the same hardware can give you different results). You can set the torch.backends.cudnn.deterministic flag to True or False to use only the deterministic algorithms or all of them.
The output precision will be the same as for any other algorithm: up to the precision of the input type.
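
For a rough sense of that scale, torch.finfo gives the machine epsilon of each dtype:

import torch

print(torch.finfo(torch.float32).eps)  # ~1.19e-07: differences around this scale are expected in float32
print(torch.finfo(torch.float64).eps)  # ~2.22e-16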