Hi all,
I’m developing a neural vocoder whose loss is computed on STFT magnitudes. After switching to PyTorch 0.4.1, the loss becomes NaN after the first batch. Reducing the learning rate does not help.
Here is a minimal script that reproduces it:
import numpy as np
import torch

def cal_spec(signal, n_fft=2048, hop_length=256, win_length=1024):
    # Create the window on the same device as the signal to avoid a device mismatch.
    window = torch.hann_window(win_length).to(signal.device)
    complex_spectrogram = torch.stft(
        signal, n_fft=n_fft, hop_length=hop_length, win_length=win_length,
        window=window, center=False)
    # In PyTorch 0.4.1 the last dimension holds the (real, imaginary) parts.
    power_spectrogram = complex_spectrogram[:, :, :, 0] ** 2 + complex_spectrogram[:, :, :, 1] ** 2
    return torch.sqrt(power_spectrogram)

# Gradients captured by the hooks, keyed by tensor name.
grads = {}

def add_grad(name, x):
    grads[name] = x

def reg(name):
    # Build a hook that stores the incoming gradient under `name`.
    return lambda x: add_grad(name, x)

# file can be downloaded from:
# https://drive.google.com/file/d/1qxTIKLcSShBcfX3kIgtf5scJSlQKtrPa/view?usp=sharing
d = np.load('a.npy.npz')
x = torch.tensor(d['pred'], requires_grad=True)
y = torch.tensor(d['target'], requires_grad=False)

pred_spec = cal_spec(x)
target_spec = cal_spec(y)
x.register_hook(reg('x'))
pred_spec.register_hook(reg('pred_spec'))

loss = torch.mean(torch.abs(pred_spec - target_spec))
loss.backward()
Inspecting the captured gradients after loss.backward(), grads['pred_spec'] is finite, but grads['x'] is NaN.
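One possible explanation, which I have not confirmed on this data, is that some STFT bins have exactly zero power. The derivative of sqrt at 0 is infinite, and multiplying that by the zero derivative of the squared terms gives NaN during backprop, even though the forward values and the gradient of pred_spec look fine. The sketch below reproduces this without torch.stft; the helper `magnitude` and the `eps` value are hypothetical, chosen only for illustration.

```python
import torch

def magnitude(re, im, eps=0.0):
    # Same formula as in cal_spec, with an optional (hypothetical) eps
    # added under the square root.
    return torch.sqrt(re ** 2 + im ** 2 + eps)

# All-zero real/imaginary parts stand in for silent STFT bins.
re = torch.zeros(4, requires_grad=True)
im = torch.zeros(4, requires_grad=True)
magnitude(re, im).sum().backward()
# sqrt'(0) = inf, and inf * (2 * 0) = NaN in the backward pass.
nan_without_eps = bool(torch.isnan(re.grad).any())

re2 = torch.zeros(4, requires_grad=True)
im2 = torch.zeros(4, requires_grad=True)
magnitude(re2, im2, eps=1e-8).sum().backward()
# With eps > 0 the sqrt derivative stays finite, so the gradient is 0.
nan_with_eps = bool(torch.isnan(re2.grad).any())

print(nan_without_eps, nan_with_eps)
```

If this is what happens in my case, adding a small constant to power_spectrogram before torch.sqrt (or clamping it to a small minimum) should remove the NaN, at the cost of a slight bias in near-silent bins.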
Thanks.