The output of torch::stft (C++) and that of torch.stft (Python) don't match

I am rewriting some PyTorch code in libtorch. My environments are listed below:

pytorch on AlmaLinux 8.4

  • python: 3.7.0
  • pytorch: 1.13.0.dev20220804
  • torch: 1.2.0

libtorch on Windows 11

  • libtorch: 1.11.0+cpu

I am having an issue with the STFT: the outputs of torch::stft (C++) and torch.stft (Python) don't match.

I checked both with a simple example:

+++++++++ Python +++++++++

import numpy as np
import torch

win_length = 512

# One period of a sine wave, 512 samples (float64)
t = np.arange(0, 512)
y = np.sin(2*np.pi*t/512)
F = torch.stft(torch.from_numpy(y),
               n_fft=512,
               hop_length=256,
               win_length=win_length,
               pad_mode='reflect',
               window=torch.hann_window(win_length),
               center=True,
               normalized=False,
               onesided=True
              )
# Here F has shape (freq, frame, 2); the last dimension holds (real, imag)
F_real = F[:, :, 0]
print(f"shape: {np.shape(F_real)}")
print(f"{F_real[:10, 1]}")
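
As a sanity check on the shape this snippet prints, the expected size can be worked out by hand. The sketch below assumes the usual framing arithmetic for a centered, one-sided STFT (n_fft // 2 samples of padding per side, n_fft // 2 + 1 frequency bins). The helper name stft_shape is made up for illustration and is not part of PyTorch.

```python
def stft_shape(n_samples, n_fft, hop_length, center=True, onesided=True):
    """Return (n_freq_bins, n_frames) for a 1-D signal.

    Hypothetical helper for illustration, not a PyTorch function.
    """
    # center=True pads n_fft // 2 samples on each side before framing
    padded = n_samples + 2 * (n_fft // 2) if center else n_samples
    n_frames = 1 + (padded - n_fft) // hop_length
    n_freq = n_fft // 2 + 1 if onesided else n_fft
    return n_freq, n_frames


print(stft_shape(512, n_fft=512, hop_length=256))  # (257, 3)
```

With center=False the same 512-sample signal yields a single frame, so an implementation that does not pad needs the input padded by hand to produce the same three frames.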

+++++++++ C++ +++++++++

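#include <fstream>
#include <iostream>
#include <string>
#include <torch/torch.h>

using torch::indexing::Slice;

// The statements below are a fragment (assumed to sit inside a function such as main()).

/* Load the sine wave.
 * The sine curve used in the Python code above was written out as a .csv file.
 */
const int len_x = 512;   // number of samples in sin.csv (same value as win_length)
double* x = new double[len_x];
std::fstream newfile;
std::string buf;
newfile.open("sin.csv", std::ios::in);
int i = 0;
if (newfile.is_open()) {   // check whether the file opened
    // read one value per line, never writing past the end of x
    while (i < len_x && std::getline(newfile, buf)) {
        x[i] = std::stod(buf);
        i += 1;
    }
    newfile.close();   // close the file object
}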

/* STFT parameters */
int n_fft = 512;
int hop_length = 256;
int win_length = 512;

torch::Tensor input = torch::from_blob(x, { len_x, 1 }, torch::kFloat64);

/* pad
 * "The C++ version of STFT does not have padding implemented by default whereas the Python version does, in “reflect” mode particularly."
 * https://discuss.pytorch.org/t/some-problem-by-use-torch-stft-with-c-api-but-output-shape-is-different-to-python-api/111352/3
 */
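// Reflect-pad n_fft / 2 samples on each side, mirroring center=True with pad_mode='reflect' in Python.
int length = static_cast<int>(input.size(0));
int pad = n_fft / 2;
// Left pad: samples 1..pad reversed. Right pad: samples (length-pad-1)..(length-2) reversed.
torch::Tensor left_pad = input.slice(0, 1, pad + 1).flip(0);
torch::Tensor right_pad = input.slice(0, length - pad - 1, length - 1).flip(0);
torch::Tensor input2 = torch::cat({ left_pad, input, right_pad }, 0);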

int n_row = input2.sizes()[0];
int n_col = input2.sizes()[1];

/* STFT */
auto y = torch::stft(input2.transpose(0, 1), n_fft, hop_length, win_length);
auto y_real = y.index({ 0, Slice(), Slice(), 0 });
n_row = y_real.sizes()[0];
n_col = y_real.sizes()[1];
std::cout << "y_real: " << n_row << " x " << n_col << std::endl;
for (i = 0; i < 10; i++) {
    std::cout << i << ": " << y_real[i][1].item<double>() << std::endl;
}
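
The slice/flip padding in the C++ code can be checked on a tiny list. This is a plain-Python sketch of the same index arithmetic, not libtorch code, and reflect_pad is a made-up name used only for illustration.

```python
def reflect_pad(x, pad):
    """Reflect-pad a list by `pad` samples per side without repeating the edge sample.

    Hypothetical helper mirroring the slice/flip index arithmetic of the C++ code.
    """
    if not 0 <= pad < len(x):
        raise ValueError("pad must be in [0, len(x))")
    n = len(x)
    left = x[1:pad + 1][::-1]           # samples 1..pad, reversed
    right = x[n - pad - 1:n - 1][::-1]  # samples n-pad-1..n-2, reversed
    return left + x + right


print(reflect_pad([0, 1, 2, 3, 4], 2))  # [2, 1, 0, 1, 2, 3, 4, 3, 2]
```

With a 512-sample input and pad = 256 this produces 1024 samples, which is the length the centered STFT frames in Python are cut from.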

The outputs are:
+++++++++ Python +++++++++
shape: torch.Size([257, 3])
tensor([ 1.1079e-05, 8.2265e-08, -5.5716e-06, 9.1402e-09, 1.7675e-07,
-1.2560e-07, 1.6873e-07, -3.7544e-08, -4.7826e-07, 2.7178e-07],
dtype=torch.float64)

+++++++++ C++ +++++++++
0: 6.18596e-16
1: -2.84217e-14
2: -4.93762e-16
3: 6.30612e-15
4: -4.82855e-16
5: -8.2439e-15
6: 1.04552e-15
7: 4.53907e-17
8: -1.4303e-16
9: 9.79987e-16

Why do they look so different??

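The problem is solved. The outputs didn't match because I applied a Hann window in Python (window=torch.hann_window(win_length)) but passed no window to torch::stft in C++. When no window is given, the frames are left unwindowed, as if the window were 1 everywhere.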
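
To see how much the window changes the numbers, here is a small standard-library sketch (no PyTorch) that takes the DFT of the same 512-sample sine with and without a periodic Hann window, which is the form torch.hann_window produces by default. The helper name dft_bin is made up for illustration. In exact arithmetic the real part of this DFT is zero in both cases for a one-period sine, so the sketch prints the imaginary parts, where the difference shows up.

```python
import cmath
import math

N = 512
signal = [math.sin(2 * math.pi * n / N) for n in range(N)]
# Periodic Hann window: w[n] = 0.5 - 0.5 * cos(2*pi*n / N)
hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]


def dft_bin(frame, k):
    """Naive DFT of a single frequency bin k (hypothetical helper)."""
    n_fft = len(frame)
    return sum(v * cmath.exp(-2j * math.pi * k * n / n_fft)
               for n, v in enumerate(frame))


windowed = [s * w for s, w in zip(signal, hann)]
for k in range(3):
    plain = dft_bin(signal, k)
    tapered = dft_bin(windowed, k)
    print(f"bin {k}: no window imag = {plain.imag:9.3f}, "
          f"Hann imag = {tapered.imag:9.3f}")
```

Up to rounding, bin 1 has an imaginary part of -256 without the window and -128 with it, and bin 2 goes from 0 to 64, so an STFT computed without the window cannot be expected to agree with one computed with it.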