Differences in network prediction output for the same input

I was playing with an autoencoder. The decoder part is defined like this:

    self.decoder = torch.nn.Sequential(
        torch.nn.Linear(10, 125),
        torch.nn.ReLU(),
        torch.nn.Linear(125, 250),
        torch.nn.ReLU(),
        torch.nn.Linear(250, 500),
        torch.nn.ReLU(),
        torch.nn.Linear(500, 1000),
        torch.nn.ReLU(),
        torch.nn.Linear(1000, 28 * 28),
    )
After training the network without problems, if I do:

    myInput1 = torch.autograd.Variable(torch.Tensor([1, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
    myOutput1 = autoencoder.decoder(myInput1)
    print(myOutput1.data - myOutput1.data)

I obviously get all zeros.
But if I run the exact same input through the network again and subtract the two outputs:

    myOutput2 = autoencoder.decoder(myInput1)
    print(myOutput1.data - myOutput2.data)

I get small numbers, but not zeros.
My question is: why is there a different result? Why is it not deterministic if the network is not changing between executions?

This is all on the same machine, without training the network again.

Obviously, I’m missing something here. Thanks in advance.

How small are the differences?
Are they floating point precision-level differences?

Yes, really small (but the input is exactly the same):

    1.00000e-08 *
      0.1863
      0.7451
      0.0000
     -0.3725
      0.1863
    .....
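
For what it's worth, here is a quick way to confirm that these are just float32 precision-level differences rather than a real change in the network. This is only a sketch, reusing myOutput1 and myOutput2 from the snippets above (torch is assumed to be imported already):

    # Quantify the discrepancy and compare within float32 tolerances.
    print((myOutput1.data - myOutput2.data).abs().max())   # on the order of 1e-8
    print(torch.allclose(myOutput1.data, myOutput2.data))  # expected True with default tolerances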

If you use multiprocessing (used by default if you run on CPU), multi-GPU, or some non-deterministic GPU operations, the accumulation between different workers can happen in a different order. Remember that for floats, (a+b)+c != a+(b+c), so these kinds of small errors can appear.
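
Here is a minimal sketch of that non-associativity point, using plain Python floats (nothing specific to the model above):

    # Floating-point addition is not associative: the grouping changes the rounding.
    a, b, c = 0.1, 0.2, 0.3
    print((a + b) + c)                 # prints something like 0.6000000000000001
    print(a + (b + c))                 # prints 0.6
    print((a + b) + c == a + (b + c))  # False

When a parallel backend accumulates partial sums in whatever order its workers finish, the same kind of last-bit discrepancy ends up in the output, which is consistent with the ~1e-8 differences shown above. One way to test this hypothesis is to restrict PyTorch to a single CPU thread with torch.set_num_threads(1) and rerun the comparison.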
