I’ve tried to set up a VAE for audio files, but I get the error “input should be between 0 and 1” when using the binary_cross_entropy() function, even though I didn’t have any of these issues when testing with MNIST image files.
I suspect the error to result from the following function:
def reparameterize(self, mu, logvar):
    std = logvar.mul(0.5).exp_()
    eps = Variable(std.data.new(std.size()).normal_())
    return eps.mul(std).add_(mu)
because that’s where my data first goes out of the 0 to 1 range in the whole code. The processing continues as follows:
def forward(self, x, sampleCount):
    x = x.view(1, 1, sampleCount).float()  # reshape the audio data to fit Conv
    mu, logvar = self.encode(x)
    z = self.reparameterize(mu, logvar)
    return z, mu, logvar
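For reference, the same reparameterization trick can be written without the deprecated Variable API; this is a sketch assuming a recent PyTorch version, with made-up tensor shapes:

```python
import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1)."""
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)  # standard normal noise, same shape as std
    return mu + eps * std

# With mu = 0 and logvar = 0, z is roughly standard normal,
# so it is NOT confined to the [0, 1] range.
mu, logvar = torch.zeros(1, 8), torch.zeros(1, 8)
z = reparameterize(mu, logvar)
```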
next is the loss function, which causes the error:
BCE = F.binary_cross_entropy(z.view(-1, sampleCount), x.view(-1, sampleCount).float())
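The mismatch can be reproduced in isolation; here is a minimal sketch with made-up values standing in for the real tensors:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[0.57, 0.52, 0.55]])   # target: values in [0, 1], like the audio
z = torch.tensor([[-0.98, 1.36, 2.56]])  # raw latent: unbounded, like the VAE's z

# Passing the unbounded z as input raises
# "all elements of input should be between 0 and 1":
try:
    F.binary_cross_entropy(z, x)
except RuntimeError as e:
    print(e)

# Squashing z through a sigmoid first keeps the call valid:
loss = F.binary_cross_entropy(torch.sigmoid(z), x)
print(loss.item())
```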
can anyone please help me?
I’ve already tried using binary_cross_entropy_with_logits, which didn’t crash my code, but it didn’t yield any usable results.
I think the error is pointing out that the span of x is not between 0 and 1, because the target needs to be either 0 or 1, in your case the variable x. Could you check what the min and max values of x are before you pass it to the loss function? And can you paste the error?
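A quick way to do that check, as a sketch (the tensor here is just a stand-in for the audio tensor from the post):

```python
import torch

x = torch.rand(1, 16000)  # stand-in for the audio tensor x
# For a binary_cross_entropy target, both values must lie in [0, 1]:
print(x.min().item(), x.max().item())
```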
Thanks for the reply! I’m sorry I didn’t paste the error, I forgot it… my bad.
The error is: all elements of input should be between 0 and 1
The tensor x before passing it through the loss function: [0.5729, 0.5729, 0.5729, …, 0.5160, 0.5266, 0.5555] (min 0, max 1)
However, z (the input for binary_cross_entropy) looks like this: [-0.9888, 0.1598, 1.3676, …, 1.4632, -2.4632, 2.5672] (min -4.68, max 5.9), which is causing the error.
If I understood you correctly, both target and input should be between 0 and 1, which means that my data is not correctly transformed for that function? If so, is there a better loss function to use, since audio data is pretty hard to squeeze into the 0 to 1 range while keeping the information loss to a minimum? That would also explain why it worked well on the MNIST data, because that’s easy to transform to the 0 to 1 range, I guess.
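For real-valued signals such as audio, one commonly used reconstruction term that does not constrain the value range is mean squared error; this is just a sketch with random stand-in tensors, not code from the thread:

```python
import torch
import torch.nn.functional as F

recon = torch.randn(1, 16000)   # hypothetical decoder output, unbounded
target = torch.randn(1, 16000)  # hypothetical audio waveform, also unbounded

# MSE accepts any real-valued input/target, so no 0-1 scaling is needed:
loss = F.mse_loss(recon, target)
print(loss.item())
```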
Just to clarify, binary_cross_entropy() does not require the target values to be either 0 or 1. Rather, it requires these values to be probabilities, that is, values in the range [0.0, 1.0]. In practice, it is often the case that the target values are exactly 0 or 1 (and can be understood as binary class labels), but forcing your input values to be exactly 0 or 1 will break differentiability, and hence backpropagation.
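A small check of that claim, as a sketch with made-up values: binary_cross_entropy() happily accepts “soft” probabilistic targets, not just hard 0/1 labels.

```python
import torch
import torch.nn.functional as F

pred = torch.tensor([0.9, 0.2, 0.6])         # predicted probabilities
soft_target = torch.tensor([0.8, 0.1, 0.5])  # targets anywhere in [0, 1]

loss = F.binary_cross_entropy(pred, soft_target)
print(loss.item())
```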
Two more general questions: Why are you reparameterizing your data? Conceptually, why are you using binary_cross_entropy() as your loss, and how is this supposed to interact with your reparameterization?
I fixed my issue: I had a function which imported the files and scaled the data to the 0 to 1 range, and it was not behaving properly. Everything is working now. Thanks to everybody helping; even though it didn’t solve my issue (because I didn’t upload enough of the code), it still helped me understand how PyTorch works.
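For anyone hitting the same problem, a min-max scaling helper like the one described might look like this (a hypothetical sketch, not the poster’s actual code):

```python
import torch

def scale_to_unit_range(x: torch.Tensor) -> torch.Tensor:
    """Min-max scale a tensor into [0, 1]."""
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:  # avoid division by zero on constant signals
        return torch.zeros_like(x)
    return (x - x_min) / (x_max - x_min)

audio = torch.randn(16000)           # stand-in waveform
scaled = scale_to_unit_range(audio)
print(scaled.min().item(), scaled.max().item())  # 0.0 1.0
```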