I observe this behavior too.
I can see that the CNN and FC outputs are equal only up to an absolute tolerance (atol) of about 1e-5.

@albanD: do you have some comments on this?
With the double data type, I would expect the absolute tolerance to be much smaller than 1e-5. Am I missing something here?

Running on current colab, this is what I see:
The same thing as you: a difference of ~1e-6 for float. But after adding torch.set_default_dtype(torch.double) at the beginning, it goes down to ~1e-15. @InnovArul, did you properly set the dtype to double?
So it looks like this is the expected loss of precision from floating-point arithmetic.
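For reference, the ~1e-6 vs ~1e-15 gap lines up with the machine epsilon of each dtype. A quick check (using numpy's finfo here, since it reports the same IEEE-754 values that torch's float32/float64 use):

```python
import numpy as np

# Machine epsilon: the smallest relative rounding step for each IEEE-754 dtype.
# float32 rounds at ~1e-7, so end-to-end differences around 1e-6 are normal;
# float64 rounds at ~2e-16, matching the observed ~1e-15 gap.
print(np.finfo(np.float32).eps)  # → 1.1920929e-07
print(np.finfo(np.float64).eps)  # → 2.220446049250313e-16
```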

Also note that the deeper your network is, the larger this difference will be, as most operations amplify a small difference introduced at the beginning.
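A toy sketch of that amplification (a hypothetical stand-in for repeated layers, not the actual network from this thread): 1.1 is not exactly representable in float32, and that tiny representation error roughly doubles with every squaring, just as small differences compound through depth:

```python
import numpy as np

x32 = np.float32(1.1)   # stored with a tiny float32 rounding error
x64 = np.float64(1.1)   # same value at double precision
for depth in range(1, 6):
    x32, x64 = x32 * x32, x64 * x64
    # The relative gap between the two grows with each "layer" of computation.
    print(depth, abs(float(x32) - x64) / x64)
```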

No, contrary to numpy, we default to float, as it makes a big difference in runtime (especially on GPU) and provides enough precision for most (if not all) deep learning applications.
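To make the differing defaults concrete (numpy shown directly; the torch behavior in the comments is the one described above):

```python
import numpy as np

# numpy defaults to 64-bit floats:
print(np.array([1.1]).dtype)  # → float64

# torch, by contrast, defaults to 32-bit floats:
#   torch.tensor(1.1).dtype  ->  torch.float32
# and torch.set_default_dtype(torch.double) switches the default to float64.
```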