Hi there,
I’m using PyTorch to run a custom differentiable “forward” routine that I’m writing myself.
Essentially, the routine synthesizes audio with a bank of sine oscillators (like Magenta’s DDSP). The audio works fine and the error seems to backpropagate through the net (I checked the gradients with a plotting snippet I found online; search for plot_grad_flow).
However, when autograd runs I get this warning, which suggests something is wrong:
<some path>/torch/autograd/__init__.py:251: UserWarning: An output with one or more elements was resized since it had shape [], which does not match the required output shape [4, 48001]. This behavior is deprecated, and in a future PyTorch release outputs will not be resized unless they have zero elements. You can explicitly reuse an out tensor t by resizing it, inplace, to zero elements with t.resize_(0). (Triggered internally at ../aten/src/ATen/native/Resize.cpp:28.)
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
Providing a minimal working example is not easy, but this is the synthesis part of the forward method:
x = self.fullyLayers.forward(x)
# x is the output of a neural net, a tensor of shape (15,)
freqs = x[:NOSC] # get frequency values from the NN output
... # I get a decay vector and an amplitude vector for all oscillators from x
freqs = freqs.reshape((-1,1)) # transpose of a 1D vector (will become Nx1)
... # doing the same for decays and amps
# DENORM FREQUENCY
self.freqs_denorm = 20.0 * SOME_CONSTANT_PITCH * (freqs + 0.5 + 1E-6) # treat them as multiples of F0 (start from F0/2 by adding +1)
self.freqs_denorm[0] = SOME_CONSTANT_PITCH
... # I also apply some denormalization to decay and amp to translate from the NN output range to a physical quantity
# HERE IS THE SYNTHESIS PART
t = torch.linspace(0, 1, 48000+1) # shape of t is (48001,)
e = torch.exp(torch.tensor(1.0))
decay_matrix = self.amps_denorm * e ** (-t*self.decays_denorm) # all of the following tensors will be (NOSC,48001)
freq_matrix = self.freqs_denorm.repeat((1,48000+1))
omegas = freq_matrix * (2.0 * torch.pi)
omegas = omegas / float(SAMPLING_FREQ)
phases = torch.cumsum(omegas,1)
audio = decay_matrix * torch.sin(phases+torch.pi/2)
audioS = torch.sum(audio, 0) # here I sum all the oscillator outputs, so audioS is (48001,)
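Since a full example is hard to extract, here is a rough self-contained sketch of the synthesis step that can be run on its own. Several things in it are assumptions and not part of the real code: NOSC = 4 is taken from the [4, 48001] shape in the warning, SAMPLING_FREQ = 48000 is assumed, and the input values and the dummy loss are made up. It also swaps `e ** (...)` for `torch.exp(...)`, which is the same math but avoids the 0-dim tensor `e`; whether that has any effect on the warning has not been verified.

```python
import math
import torch

NOSC = 4               # assumption: taken from the [4, 48001] shape in the warning
SAMPLING_FREQ = 48000  # assumption: one second of audio at 48 kHz


def synth(freqs, decays, amps, n_samples=SAMPLING_FREQ + 1):
    """Sum of decaying sinusoids; all inputs are (NOSC, 1) tensors in physical units."""
    t = torch.linspace(0, 1, n_samples)                  # (n_samples,)
    # torch.exp(x) replaces e ** x from the post: same math, no 0-dim tensor `e`
    decay_matrix = amps * torch.exp(-t * decays)         # (NOSC, n_samples)
    omegas = freqs.expand(-1, n_samples) * (2.0 * math.pi / SAMPLING_FREQ)
    phases = torch.cumsum(omegas, dim=1)                 # running phase per oscillator
    audio = decay_matrix * torch.sin(phases + math.pi / 2)
    return audio.sum(dim=0)                              # (n_samples,)


# Hypothetical inputs standing in for the denormalized network outputs
freqs = torch.tensor([[110.0], [220.0], [330.0], [440.0]], requires_grad=True)
decays = torch.full((NOSC, 1), 3.0, requires_grad=True)
amps = torch.full((NOSC, 1), 0.25, requires_grad=True)

audioS = synth(freqs, decays, amps)
audioS.pow(2).mean().backward()   # dummy loss, only to exercise the backward pass
print(audioS.shape, freqs.grad.shape)
```

Replacing the `torch.exp` line with the original `e ** (...)` form gives a second variant to compare against.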
So one of the last tensors, of shape (4, 48001), is not behaving as I would expect, but I don’t know where to start. The backward pass runs in C++, so I can’t step through it with a debugger.
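One way to narrow this down without a C++ debugger might be to backpropagate from each intermediate tensor in turn and record which warnings appear. The sketch below uses Python's standard `warnings` module for that; `backward_warnings` is a hypothetical helper name, and the toy graph only stands in for the real tensors (decay_matrix, phases, audio). It assumes PyTorch forwards its C++ warnings to the Python `warnings` machinery, which is how the UserWarning above appears to be reported.

```python
import warnings
import torch


def backward_warnings(tensor):
    """Backpropagate tensor.sum() and return the warning messages it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")            # record every warning, even repeats
        tensor.sum().backward(retain_graph=True)   # keep the graph for the next probe
    return [str(w.message) for w in caught]


# Toy graph; in the real code the probes would be decay_matrix, phases, audio, ...
x = torch.ones(3, requires_grad=True)
stages = {"doubled": x * 2, "squared": (x * 2) ** 2}
for name, stage in stages.items():
    x.grad = None                                  # avoid accumulating across probes
    print(name, backward_warnings(stage))
```

Probing from the earliest tensor to the latest, the first probe that returns the resize message should contain the op responsible. If a warning is only emitted once per process, the probes need to run in a fresh process before any other backward call.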
Please note that my batch size is 1.
BTW, this issue has been raised previously, but without a reply.