I am working on adversarial attacks in PyTorch. I am performing iterative gradient-sign-based attacks, but as cuDNN is not deterministic, the sign of the input gradient may vary, and over many iterations this accumulates and gives very different results. For a detailed discussion see discussion - 1 and discussion - 2.
I looked at other PyTorch implementations, but they also follow a similar procedure.
As answered in those links by @albanD (really appreciated!), I first tried setting
torch.backends.cudnn.deterministic = True, but it didn't work out.
Then I also thought of giving cunn a try, so I set
torch.backends.cudnn.enabled = False, but it also didn't help: the results are still non-reproducible and vary significantly.
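For reference, the reproducibility settings I tried look roughly like this (a minimal sketch; the seed value is arbitrary):

```python
import torch

# Seed all RNGs so random initial perturbations are repeatable
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)  # all CUDA devices; harmless without a GPU

# Ask cuDNN for deterministic kernels and disable the auto-tuner,
# which can otherwise pick a different algorithm on each run
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
```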
Since adversarial attacks don't involve any kind of training, only backpropagation (at least the iterative gradient-sign methods!), I think I should be able to get consistently reproducible results.
Why didn't changing those flags work out? How can I get deterministic behaviour? (I could find only these two methods when searching.)
Thanks a lot in advance!
Update / follow-up:
Good catch with the Upsample.
I think there is currently no one systematically investing in determinism.
(For most cases the problem is much less severe because of a continuous dependence on gradients, but the “fast gradient sign” is very discontinuous by design.)
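To illustrate that discontinuity: a tiny numerical difference, of the size a non-deterministic reduction order can produce, is enough to flip the sign of a near-zero gradient entry, and the gradient-sign step then moves in the opposite direction for that pixel. A minimal sketch (the values here are made up for illustration):

```python
import torch

# Two "gradients" that differ only by float-level noise near zero,
# e.g. from a non-deterministic reduction order
g1 = torch.tensor([ 1e-8, 0.3, -0.5])
g2 = torch.tensor([-1e-8, 0.3, -0.5])  # first entry perturbed by tiny noise

# The values are almost identical ...
print(torch.allclose(g1, g2, atol=1e-7))  # True

# ... but the eps * sign(grad) update differs by 2*eps in the first entry,
# and iterating the attack lets such differences accumulate
eps = 0.01
step1 = eps * g1.sign()
step2 = eps * g2.sign()
print(step1 - step2)
```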
Thanks! I didn't know until yesterday that non-determinism existed (in PyTorch at least). It has been really crazy since then.
And thanks a lot for the fix!!!
@tom the hack you have mentioned is to expand the tensor (appropriately).
That won't be equivalent to a nearest-neighbour interpolation, right? (the default for nn.Upsample). (I was so happy with getting the behaviour fixed that I overlooked this, lol!)
Can the nearest-neighbour interpolation be performed via some fixed convolution kernel after expanding the tensor (such that it is close enough to nearest-neighbour interpolation)?
Or otherwise I guess it may be a good idea to have a 3x3 kernel with learnable parameters after expanding (random, but I expect it to work :P)
Oh, I’m sorry, I thought it was for scale_factor=2:
a = torch.arange(9, dtype=torch.float).view(1,1,3,3)
b = torch.nn.functional.interpolate(a, scale_factor=2)
b2 = a[:, :, :, None, :, None].expand(-1, -1, -1, 2, -1, 2).reshape(a.size(0), a.size(1), a.size(2)*2, a.size(3)*2)
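Incidentally, regarding the earlier question about a fixed convolution kernel: nearest-neighbour upsampling with scale_factor=2 can indeed be written as a transposed convolution with a fixed (non-learned) all-ones 2x2 kernel. A sketch, using groups so channels stay independent:

```python
import torch
import torch.nn.functional as F

a = torch.arange(9, dtype=torch.float).view(1, 1, 3, 3)
nearest = F.interpolate(a, scale_factor=2)  # mode='nearest' is the default

# Fixed 2x2 all-ones kernel with stride 2: each input pixel is copied into
# its own 2x2 output block, which is exactly nearest-neighbour upsampling.
c = a.size(1)
kernel = torch.ones(c, 1, 2, 2)  # conv_transpose2d weight: (in, out/groups, kH, kW)
via_conv = F.conv_transpose2d(a, kernel, stride=2, groups=c)

print(torch.equal(nearest, via_conv))  # True
```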
Ohh, my bad, I thought Upsample also interpolates the values. It seems that depends on the mode keyword. Should have checked! Thanks
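For anyone else tripped up by this: whether the values are actually blended is controlled by the mode argument of nn.Upsample / F.interpolate, and the default 'nearest' only replicates pixels. A quick check:

```python
import torch
import torch.nn as nn

a = torch.arange(9, dtype=torch.float).view(1, 1, 3, 3)

up_nearest = nn.Upsample(scale_factor=2, mode='nearest')
up_bilinear = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)

# 'nearest' just replicates each pixel into a 2x2 block;
# 'bilinear' blends neighbouring values, so the outputs differ
print(torch.equal(up_nearest(a), up_bilinear(a)))  # False
```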