Torch vs SciPy FFT gives different results

I’m trying to write a full torch implementation for scikit-image’s phase_cross_correlation algorithm.

It’s fast as hell, and it kinda works, but it performs worse (quality-wise).
While debugging, I was horrified to realize that torch.fft.fftn and scipy.fftpack.fftn sometimes don’t return the same results. I’ll elaborate with an example:

Testing on random arrays, I get good results:

import torch
import scipy.fftpack as s_fft
import numpy as np

rnd = np.random.rand(5_000, 32, 32, 3)
t_rnd = torch.from_numpy(rnd)

np_fft = s_fft.fftn(rnd)
torch_fft = torch.fft.fftn(t_rnd)

torch.testing.assert_allclose(torch.from_numpy(np_fft), torch_fft)

AssertionError: Tensor-likes are not equal!

Mismatched elements: 15182859 / 15360000 (98.8%)
Greatest absolute difference: 6.906214515135596e-10 at index (1000, 0, 0, 0)
Greatest relative difference: 1.234703274957512e-11 at index (0, 0, 16, 0)

Ok, some small numeric error is to be expected so this is basically equal.
But then when I test with some real image data (e.g. CIFAR) I get this:

from torchvision.datasets import CIFAR10

ds = CIFAR10('cifar')
np_cifar = ds.data[:5_000]
cifar = torch.from_numpy(np_cifar)

np_fft = s_fft.fftn(np_cifar)
torch_fft = torch.fft.fftn(cifar)

torch.testing.assert_allclose(torch.from_numpy(np_fft), torch_fft)

Mismatched elements: 15359985 / 15360000 (100.0%)
Greatest absolute difference: 48.80068015457514 at index (1000, 0, 0, 0)
Greatest relative difference: 0.003518688442983903 at index (999, 20, 27, 1)

On my actual data (which I cannot share) it was as bad as 0.4 relative difference.
And after performing a few more operations (like multiplying by the conjugate and taking the inverse FFT) the results differ quite widely.

So, what the hell???

numpy 1.22.4
scipy 1.8.1
torch 1.13.1+cpu
CPU is 10th gen Intel i7

Try to narrow down whether the mismatches occur at specific locations (in the batch and/or in the spatial dimensions), as I would guess the largest mismatches come from the borders.
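A rough sketch of how that narrowing-down could look: take the element-wise absolute difference, find the index of the worst element, and reduce the difference per batch item. The helper name `locate_mismatch` is made up for illustration, and the small tensors below are only stand-ins for the two FFT outputs.

```python
import numpy as np
import torch

def locate_mismatch(a: torch.Tensor, b: torch.Tensor):
    """Return (index of the worst element, max abs difference per batch item)."""
    diff = (a - b).abs()                   # works for real and complex tensors
    flat_idx = int(torch.argmax(diff))     # argmax over the flattened tensor
    worst = tuple(int(i) for i in np.unravel_index(flat_idx, tuple(diff.shape)))
    per_item = diff.reshape(diff.shape[0], -1).amax(dim=1)
    return worst, per_item

# Stand-in data: two identical tensors with one deliberately perturbed element.
a = torch.zeros(4, 8, 8, 3, dtype=torch.complex128)
b = a.clone()
b[2, 0, 5, 1] += 0.5

worst, per_item = locate_mismatch(a, b)
print(worst)     # (2, 0, 5, 1)
print(per_item)  # nonzero only for batch item 2
```

Passing torch.from_numpy(np_fft) and torch_fft instead of the stand-ins would show whether the large differences cluster at particular batch indices or spatial positions.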

I was suspicious of the borders as well, but there was nothing I could do about it.
For my data at work, I managed to mitigate the issue by subtracting the mean of each item in the batch.
I don’t even know why I tried it or why it worked, but it did.
This approach didn’t work for me in the CIFAR example though.

Do you think I should open a GitHub issue about this?

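I don’t think there is an actual bug here, just that you are implicitly mixing dtype conventions between numpy and PyTorch. The CIFAR array is uint8: SciPy converts integer input to double precision, while torch.fft.fftn promotes integer tensors to the default float32, so the PyTorch result is computed in single precision only. With the following modified code snippet: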

import torch
from torchvision.datasets import CIFAR10
import scipy.fftpack as s_fft
import numpy as np

ds = CIFAR10('cifar', download=True)
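np_cifar = ds.data[:5_000]  # uint8 NumPy array, shape (5000, 32, 32, 3)
cifar = torch.from_numpy(np_cifar)

np_fft = s_fft.fftn(np_cifar)  # SciPy converts integer input to double precision
torch_fft = torch.fft.fftn(cifar.double())  # cast to float64 so PyTorch matches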

torch.testing.assert_allclose(torch.from_numpy(np_fft), torch_fft)

I get

Mismatched elements: 15320160 / 15360000 (99.7%)
Greatest absolute difference: 1.2410069878908557e-07 at index (2000, 0, 0, 0)
Greatest relative difference: 1.1741440376456833e-11 at index (4001, 12, 5, 2)
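To see the dtype effect in isolation, here is a small sketch that uses random uint8 data as a stand-in for the CIFAR array (so no download is needed) and compares PyTorch against itself. It assumes the default dtype has not been changed from float32.

```python
import torch

torch.manual_seed(0)
# Random uint8 "images" with the same layout as the CIFAR array, but fewer of them.
x = torch.randint(0, 256, (16, 32, 32, 3), dtype=torch.uint8)

fft_default = torch.fft.fftn(x)          # integer input is promoted to float32
fft_double = torch.fft.fftn(x.double())  # explicit float64 input

print(fft_default.dtype)  # torch.complex64
print(fft_double.dtype)   # torch.complex128

# The single-precision result drifts away from the double-precision one.
err = (fft_default.to(torch.complex128) - fft_double).abs().max()
print(f"max abs difference: {err.item():.3e}")
```

Checking the .dtype of both FFT outputs before comparing them is a cheap way to catch this kind of silent precision mismatch.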