Manual convolution does not match the results of PyTorch Conv2d (floating point)

Hi, my manual implementation of PyTorch's Conv2d always shows some precision difference compared with the official PyTorch conv2d implementation.

Code to reproduce the issue is attached:

import math
import torch
import torch.nn as nn

def conv2d_matmul_fp(sample_input, weight, padding, stride, dilation):
    N, C, X, Y = sample_input.size()
    K, _, R, S = weight.size()

    # standard Conv2d output-size formula
    out_size = (math.floor((X + padding[0]*2 - dilation[0]*(R - 1) - 1) / stride[0]) + 1,
                math.floor((Y + padding[1]*2 - dilation[1]*(S - 1) - 1) / stride[1]) + 1)

    # im2col: unfold the input into patch columns of shape (N, C*R*S, L)
    simple_in_unfold = torch.nn.functional.unfold(sample_input, kernel_size=(R, S), dilation=dilation, padding=padding, stride=stride)
    # batched matmul: (K, C*R*S) x (N, C*R*S, L) -> (N, K, L)
    res = torch.matmul(weight.view(K, -1), simple_in_unfold)
    return res.reshape(N, K, out_size[0], out_size[1])

def Conv2d_layer_matmul(sample_input, conv_layer):
    # pull the weight and convolution hyperparameters from an existing nn.Conv2d layer
    weight = conv_layer.state_dict()["weight"]
    padding = conv_layer.padding
    stride = conv_layer.stride
    dilation = conv_layer.dilation
    return conv2d_matmul_fp(sample_input, weight, padding, stride, dilation)

# Define sample model
sample_fp_conv2d = nn.Conv2d(3, 64, kernel_size=7, stride=(2,2), padding=(3,3), bias=False)
sample_fp_conv2d.eval()
# Extract the weights
weight = sample_fp_conv2d.state_dict()['weight']

# Define sample input data
# sample_input = val_data[0]
sample_input = torch.randn(1,3,224,224)

# Define sample result
sample_res = sample_fp_conv2d(sample_input)
print(sample_res.size())

res = Conv2d_layer_matmul(sample_input, sample_fp_conv2d)

# Compare both results: indices where the outputs differ beyond the tolerance
print(torch.where(~torch.isclose(res, sample_res, rtol=1e-3)))

I would really appreciate it if anyone could help.

Hi Jianming!

I have not looked at your code.

However, this kind of discrepancy is common and is usually caused by
floating-point round-off error: the order of operations in your implementation
will almost certainly differ from the order used inside Conv2d. Even though
the two sets of operations should be mathematically equivalent, the results
can differ because of round-off error.
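
As a quick standalone illustration (not taken from your code) of how the
order of additions alone changes floating-point results:

# floating-point addition is not associative, so the reduction order matters
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))   # prints False

# the same effect shows up when reducing a tensor in a different order
import torch
torch.manual_seed(0)
x = torch.randn(10000)
pairwise = x.sum()                    # PyTorch's internal (blocked) reduction
sequential = torch.tensor(0.0)
for v in x:                           # naive left-to-right accumulation
    sequential = sequential + v
print((pairwise - sequential).abs().item())   # typically small but nonzero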

An easy way to test for this is to redo the calculation in double precision
(preferably using the same Conv2d weight matrix and random
sample_input tensor from your single-precision calculation, converted
to double precision). If performing the calculation in double precision
reduces the discrepancy by several orders of magnitude, you would
have strong evidence that your issue is due to round-off error.
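
As a rough (untested) sketch of that check, assuming the layer and input from
your post are still in scope (nn.Module.double() converts the weights to
float64 in place):

conv_fp64 = sample_fp_conv2d.double()      # same weights, promoted to float64
input_fp64 = sample_input.double()         # same input values, promoted to float64

ref_fp64 = conv_fp64(input_fp64)                        # official Conv2d in float64
res_fp64 = Conv2d_layer_matmul(input_fp64, conv_fp64)   # your matmul version in float64

# if round-off is the culprit, this should shrink by several orders of magnitude
print((res_fp64 - ref_fp64).abs().max())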

Best.

K. Frank

Thanks for your kind reply.

I have tried both torch.float32 and torch.float64 for the weights and input in my own implementation, and both of them show some difference from the official conv2d API. In fact, double precision introduces an even larger difference compared with the official Conv2d implementation in PyTorch.

Might I ask whether there are reasons other than precision that could cause such a difference? And is there a way to make a custom implementation match the official Conv2d exactly?

Thanks again for your help!

Best
Jianming

Hi Jianming!

If I understand correctly, you have used torch.float32 and torch.float64
for your implementation, but only used torch.float32 for the “official
conv2d API.”

My suggestion, in more detail, is:

Calculation 1:

Use torch.float32 for both your implementation and “the official conv2d.”
Measure the discrepancy. For example, calculate the mean and the
maximum of the absolute values of the differences between the two
results.
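
For example, a sketch using the variable names from your first post:

# Calculation 1: both implementations in float32
ref_fp32 = sample_fp_conv2d(sample_input)
res_fp32 = Conv2d_layer_matmul(sample_input, sample_fp_conv2d)
diff_fp32 = (res_fp32 - ref_fp32).abs()
print(diff_fp32.mean().item(), diff_fp32.max().item())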

Calculation 2:

Use torch.float64 for both your implementation and “the official
conv2d.” (Ideally perform this computation using the same weight
and sample_input that you used in the torch.float32 calculation,
except converted to torch.float64.) The point is that you want to
perform both convolution implementations in double precision – that
is, torch.float64. Measure the discrepancy the same way as you
did for the torch.float32 calculation. Does using torch.float64
reduce the discrepancy?
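
And a corresponding sketch for the double-precision run, again borrowing the
names from your first post:

# Calculation 2: both implementations in float64, reusing the same weights and input
conv_fp64 = sample_fp_conv2d.double()      # promotes the existing weights to float64
input_fp64 = sample_input.double()         # promotes the existing input to float64
ref_fp64 = conv_fp64(input_fp64)
res_fp64 = Conv2d_layer_matmul(input_fp64, conv_fp64)
diff_fp64 = (res_fp64 - ref_fp64).abs()
print(diff_fp64.mean().item(), diff_fp64.max().item())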

Just to be clear, we do not expect comparing a torch.float32 run of one
implementation against a torch.float64 run of the other to show a reduced
discrepancy.

Best.

K. Frank