Backpropagation for LSTM with fewer targets than inputs

I have an LSTM that should predict a value. However, the prediction should only be made after several calls, i.e. something like this:

I am providing my LSTM with input of the shape [sequence length, batch size, features]. As far as I understand, the returned output will then contain a prediction for each element in the sequence (output (seq_len, batch, hidden_size * num_directions)). While I'm happy to just ignore the values I don't need, I'm not sure how to do the backpropagation. More specifically, I don't know what to pass to my loss function as the target.


You can simply select the indices you are interested in and compute the loss on those only.
For example:

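import torch

a = torch.randn(20, requires_grad=True)
print(a)
# Select every second element (indices 1, 3, 5, ...)
b = a[torch.arange(1, a.size(0), 2)]
print(b)
# Backpropagate through the selected elements only
b.sum().backward()
print(a.grad)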

After a backward pass through b, a.grad will alternate between 0 and 1, because only every second index is in the arange, so only those elements receive a gradient.
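Applied to the LSTM case from the question, the same idea might look like the sketch below. The sizes, the linear layer `head`, the MSE loss and the choice of `target_steps` are all assumptions made for illustration; they are not taken from the original post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for this sketch
seq_len, batch_size, n_features, hidden_size = 10, 4, 3, 8

lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size)
head = nn.Linear(hidden_size, 1)  # assumed regression head on top of the LSTM
loss_fn = nn.MSELoss()            # assumed loss function

# requires_grad on the input is only here so we can inspect x.grad afterwards
x = torch.randn(seq_len, batch_size, n_features, requires_grad=True)

# Assumption: targets exist only for time steps 4 and 7
target_steps = torch.tensor([4, 7])
targets = torch.randn(len(target_steps), batch_size, 1)

output, _ = lstm(x)              # (seq_len, batch, hidden_size)
selected = output[target_steps]  # keep only the steps that have a target
predictions = head(selected)     # (len(target_steps), batch, 1)

# The loss only sees the selected steps, so the target has the same shape
loss = loss_fn(predictions, targets)
loss.backward()

# Inputs after the last selected step cannot influence the loss
print(x.grad[8:].abs().sum())  # gradient for steps 8 and 9
print(x.grad[:8].abs().sum())  # gradient for steps 0 to 7
```

The outputs at the ignored steps never enter the loss, so no dummy targets are needed for them. The gradient still flows back through time from the selected steps to all earlier inputs and to the LSTM weights.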

Best regards

