I am working on adversarial attacks with PyTorch.
I was initially using FloatTensors, but my backpropagation was not consistent.
After searching I found https://github.com/pytorch/pytorch/issues/5351 . Because the wrong values are used iteratively, my answers differ significantly across experiments.
After I moved to DoubleTensors my answers are very consistent, but everything has become about 3-4 times slower. This is somewhat expected, but is there any workaround, since 3-4x is a really huge factor?
Since autograd differs between PyTorch and LuaTorch, does gradcheck in LuaTorch pass with float operands?
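For reference, a minimal sketch (with a hypothetical toy model) of what the switch to doubles looks like: both the model parameters and the inputs have to be converted to float64, and the gradients then come out in float64 as well.

```python
import torch

# Hypothetical toy model; .double() converts its parameters to float64.
model = torch.nn.Linear(10, 2).double()

# The attack input must also be float64, or the forward pass will fail
# with a dtype mismatch.
x = torch.randn(1, 10, dtype=torch.double, requires_grad=True)

loss = model(x).sum()
loss.backward()
print(x.grad.dtype)  # torch.float64
```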
This problem is not specific to PyTorch.
Numerical differentiation with finite differences (which is what gradcheck does) can be quite far from the true gradient, depending on the function, the point at which you differentiate, and the value of epsilon you use.
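To make this concrete, here is a small sketch using `torch.autograd.gradcheck` on a toy cubic function (the function and tolerances here are illustrative, not from your setup). gradcheck compares the analytical gradient against a finite-difference estimate, which is why it is normally run with double-precision inputs:

```python
import torch

# Toy function for illustration; gradient of (x**3).sum() is 3*x**2.
def f(x):
    return (x ** 3).sum()

# Double precision keeps the finite-difference estimate accurate enough
# to match the analytical gradient within the tolerances.
x = torch.randn(5, dtype=torch.double, requires_grad=True)
print(torch.autograd.gradcheck(f, (x,), eps=1e-6, atol=1e-4))  # True
```

With float32 inputs the same check often fails, not because autograd is wrong, but because the finite-difference reference itself is too noisy at that precision.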
The autograd in any framework you use will do essentially the same computation. There can be minor differences (floating-point error) because float operations are not associative, but the answers will be almost the same.
DoubleTensor operations are much slower (especially on some GPUs), because they require different hardware units, and GPUs don't have many of them because they are rarely needed.
Hi, yeah, I am using a 1080 Ti GPU, and it does have much lower FLOPS for double precision.
As I said, I am working on adversarial attacks, so even a 1e-8 difference in gradient magnitudes can matter: I use torch.sign, and a 1e-8 shift can flip a value from positive to negative.
Over multiple such iterations this difference accumulates and becomes quite large!
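A tiny sketch of the failure mode (the numbers are made up to illustrate the mechanism): a gradient component whose true value is close to zero can flip sign under a perturbation on the order of 1e-8, and torch.sign turns that tiny numerical difference into a full step in the opposite direction.

```python
import torch

# A gradient component whose true value is tiny and positive.
g_double = torch.tensor([3e-9], dtype=torch.double)
print(torch.sign(g_double).item())  # 1.0

# Simulate ~1e-8 of accumulated float32 error: the sign flips,
# so a sign-based attack step now moves in the opposite direction.
g_float = (g_double - 1e-8).float()
print(torch.sign(g_float).item())   # -1.0
```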
Well, if you need 1e-8 precision you don’t really have a choice; you need to work with doubles, I’m afraid.