Getting same loss every time on training many to one RNN

saurabhvyas · October 2, 2017, 10:16am

I am new to PyTorch and still learning. I have created a simple many-to-one LSTM network. I pass a sequence of audio frames as input and want to use the last output state as the class probabilities vector (basically, binary audio classification).

I have manually extracted features from 2 wav files and stored them in vectors, but I am having a problem with optimizing and backpropagating: I always get the same loss. I know there are only 2 training examples, but shouldn’t the loss be decreasing?

Here is the relevant portion of the code:

mfcc1 = audio_to_mfcc('/home/saurabh/Documents/audio_classification/data/lizzie.wav')
#print(mfcc1.shape)
mfcc2 = audio_to_mfcc('/home/saurabh/Documents/audio_classification/data/boy.wav')
#print(mfcc2.shape)



temp =  mfcc1[ : , np.newaxis , :]
temp2=  mfcc2[ : , np.newaxis , :]

#print(temp2.shape)
input_var = Variable(torch.Tensor(temp))
input2_var = Variable(torch.Tensor(temp2))

for epoch in range(num_epochs):
    outputs = rnn(input_var)
    outputs2= rnn(input2_var)
#print(outputs[999])
    final_output = outputs[999]
    final_output2= outputs2[998]

    final_output_numpy=final_output.data.numpy()[np.newaxis,:]
    final_output = torch.from_numpy(final_output_numpy)
    
    final_output_numpy2=final_output2.data.numpy()[np.newaxis,:]
    final_output2 = torch.from_numpy(final_output_numpy2)
    
#print(final_output_numpy.shape)

#print(final_output.size())

    label = Variable(torch.LongTensor([0]))
    label2 = Variable(torch.LongTensor([1]))
#print (label.size())
    loss = criterion(Variable(final_output, requires_grad=True), label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(loss)
    
    loss2=criterion(Variable(final_output2, requires_grad=True), label2)
    optimizer.zero_grad()
    loss2.backward()
    optimizer.step()
    
    print(loss2)

And here is the output:

torch.Size([1000, 1, 60])
torch.Size([999, 1, 60])
Variable containing:
 0.6557
[torch.FloatTensor of size 1]

Variable containing:
 0.7708
[torch.FloatTensor of size 1]

torch.Size([1000, 1, 60])
torch.Size([999, 1, 60])
Variable containing:
 0.6557
[torch.FloatTensor of size 1]

Variable containing:
 0.7708
[torch.FloatTensor of size 1]

smth · October 2, 2017, 2:13pm

You can't re-wrap your Variables like this; the new Variable loses its history.

You need to do all operations on a Variable, so that autograd can trace the result back to the model's parameters.

x = torch.Tensor(...)
y = x * 2
y = Variable(y,requires_grad=True)
# In the above example, y does not know that x created it.

# Correct usage below:
x = Variable(torch.Tensor(...), requires_grad=True)
y = x * 2
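A minimal sketch of the same point, assuming a recent PyTorch release (0.4 or later), where Variable has been merged into Tensor and requires_grad is set on the tensor itself. The parameter `w` is a hypothetical stand-in for a model weight. It compares the gradient that reaches `w` when the output is round-tripped through NumPy and re-wrapped against the gradient when the output is used directly.

```python
import torch

# Hypothetical stand-in for a model parameter.
w = torch.ones(3, requires_grad=True)

# Re-wrapping: go through NumPy and build a fresh tensor from the raw values.
out = w * 2
rewrapped = torch.from_numpy(out.detach().numpy()).requires_grad_(True)
rewrapped.sum().backward()
grad_after_rewrap = w.grad  # backward stopped at `rewrapped`, never reached w

# Direct use: stay inside autograd the whole way.
out = w * 2
out.sum().backward()
grad_direct = w.grad.clone()

print(grad_after_rewrap)  # None
print(grad_direct)        # tensor([2., 2., 2.])
```

With no gradient reaching the parameters, optimizer.step() has nothing to update, which matches the symptom of an identical loss on every epoch.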

saurabhvyas · October 2, 2017, 2:15pm

Thanks for your intuitive explanation

saurabhvyas · October 2, 2017, 3:52pm

Okay, so I have updated the code and made sure I don’t perform operations on tensors, but rather on Variables. Here is the complete code as a gist (exported from a Jupyter notebook as .py):

https://gist.github.com/saurabhvyas/40bea8449fc92b6053294cdb8bff3394

The loss is still identical after 2 epochs.

I see that others are also facing a similar problem.

quanpn90 · October 2, 2017, 4:09pm

Hi,

I think you shouldn’t wrap your Variable like this:

t1=Variable(torch.FloatTensor(outputs.data[999]),requires_grad=True)

because the new Variable won’t know the relationship between t1 and outputs. I think you can index the Variable outputs directly, or use split to get the piece you want from the sequence.
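As a rough illustration of the two options mentioned here, the sketch below takes the last time step once by indexing and once with torch.split. It assumes a recent PyTorch release where tensors carry autograd history themselves; `weights` and the (seq_len, n_classes) shape are made up for the example and are not taken from the gist.

```python
import torch

# Hypothetical stand-in for the RNN output, shape (seq_len, n_classes).
weights = torch.randn(5, 2, requires_grad=True)
outputs = weights * 1.0  # non-leaf tensor with a grad_fn, like a model output

# Option 1: index the last time step, then add a batch dimension.
last_by_index = outputs[-1].unsqueeze(0)  # shape (1, 2)

# Option 2: split into chunks of length 1 along dim 0 and take the last one.
last_by_split = torch.split(outputs, 1, dim=0)[-1]  # shape (1, 2)

# Both results are still connected to `weights` through autograd.
print(last_by_index.grad_fn is not None, last_by_split.grad_fn is not None)
```

Either form keeps the history, so calling backward() on a loss computed from it would propagate to `weights`.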

saurabhvyas · October 2, 2017, 4:32pm

Thanks, I think I now understand it more clearly.

With the following changes it works perfectly:

    t1 = outputs[999]
    t1 = t1.unsqueeze(0)
    # print(t1)

    t2 = outputs2[998]
    t2 = t2.unsqueeze(0)
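For anyone landing here later, this is a self-contained sketch of how the fixed loop might look on a recent PyTorch release. It is not the code from the gist: the ManyToOne model, the hidden size, the Adam optimizer, and the random tensors standing in for the MFCC features are all assumptions made for illustration. Using outputs[-1] instead of a hard-coded 999 or 998 also handles sequences of different lengths.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class ManyToOne(nn.Module):
    """Hypothetical model: LSTM over the frames, linear layer per time step."""
    def __init__(self, n_features=60, hidden=16, n_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):
        out, _ = self.lstm(x)          # (seq_len, batch=1, hidden)
        return self.fc(out).squeeze(1)  # (seq_len, n_classes)

model = ManyToOne()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Random stand-ins for the two MFCC sequences: (seq_len, batch=1, 60).
examples = [
    (torch.randn(20, 1, 60), torch.tensor([0])),
    (torch.randn(19, 1, 60), torch.tensor([1])),
]

losses = []
for epoch in range(5):
    for features, label in examples:
        outputs = model(features)
        # Index the output directly so the autograd history is kept.
        final = outputs[-1].unsqueeze(0)  # (1, n_classes)
        loss = criterion(final, label)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses.append(loss.item())

print(losses)
```

Because `final` is never converted to NumPy or re-wrapped, loss.backward() fills in gradients for every LSTM and linear parameter, and the optimizer can actually change them.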