RNN many-to-one query

Hi, I am pretty new to PyTorch and I'm trying to do sentiment analysis.

My Input data is
array([[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 8],
[ 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1],
[ 0, 0, 0, 0, 0, 0, 9, 5, 1, 6, 7],
[ 0, 0, 0, 0, 9, 5, 1, 6, 7, 16, 17],
[ 0, 0, 0, 0, 0, 0, 2, 1, 3, 10, 4],
[ 2, 1, 3, 10, 4, 18, 19, 20, 21, 22, 23],
[11, 1, 6, 7, 24, 12, 5, 1, 13, 25, 26],
[ 2, 2, 2, 2, 8, 3, 3, 3, 4, 4, 4],
[ 0, 0, 0, 0, 0, 0, 0, 14, 1, 27, 14],
[ 0, 11, 1, 15, 2, 28, 12, 13, 15, 3, 4]])

Output Data
tensor([1., 0., 1., 1., 0., 0., 1., 0., 0., 1.]), where 1 is a positive review and 0 is a negative review.

My Model

import torch
import torch.nn as nn

class RNN(nn.Module):

    def __init__(self, n_vocab, n_embed, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(n_vocab + 1, n_embed)
        self.rnn = nn.RNN(n_embed, hidden_size, num_layers=1, batch_first=True)

    def forward(self, x):
        x = self.embedding(x)   # batch_size x seq_length x embedding_dimension
        x, _ = self.rnn(x)      # batch_size x seq_length x hidden_size
        x = torch.sigmoid(x[:, :, -1][:, -1])
        return x

n_vocab = len(char_to_int)  # 28
n_embed = 100
hidden_size = 8
model = RNN(n_vocab, n_embed, hidden_size)
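For reference, here's a quick shape check of the forward pass as written, using the model above and the first two rows of the input array (my own check, not part of the original code):

x = torch.tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 8],
                  [0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 1]], dtype=torch.long)
out = model(x)
print(out.shape)   # torch.Size([2]) -- one value in (0, 1) per review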

I just need to know whether the indexing I did at the end is correct (x = torch.sigmoid(x[:, :, -1][:, -1])).

Sorry for such a naive question.

Are you trying to take the last hidden state of the RNN? If that's your goal, you should use x[:, -1, :], or simply use the hidden state returned by rnn (the "_" in your code).
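To illustrate the suggestion above, here is a minimal sketch with made-up sizes, assuming a single-layer, unidirectional RNN with batch_first=True: the last time step of the output equals the final hidden state that nn.RNN returns.

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=100, hidden_size=8, num_layers=1, batch_first=True)
x = torch.randn(10, 11, 100)               # batch x seq_length x embedding_dim
output, hidden = rnn(x)                    # output: [10, 11, 8], hidden: [1, 10, 8]
print(torch.allclose(output[:, -1, :], hidden[-1]))   # True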


Hi

Thanks for replying!

Using the slicing x[:, -1, :], I got the matrix below.

tensor([[-0.7032, 0.9809, -0.0366, -0.8463, 0.8746, -0.9249, 0.9830, 0.9964],
[ 0.5889, 0.9578, 0.9545, 0.9815, -0.9984, 0.9589, 0.8505, 0.9543],
[-0.3179, 0.9790, 0.9406, 0.9769, -0.6671, -0.7646, 0.9281, -0.9901],
[ 0.5905, 0.8615, 0.7148, -0.7225, 0.2396, 0.4143, -0.1313, -0.9152],
[-0.9196, -0.9925, -0.8548, -0.9785, 0.3584, 0.9860, -0.4431, -0.9985],
[-0.9984, 0.4017, -0.9045, 0.4047, -0.6658, -0.4878, -0.9790, 0.7657],
[ 0.8931, 0.6347, -0.7855, -0.9594, -0.9998, 0.8930, 0.3539, -0.0567],
[-0.9782, -0.9661, -0.8166, -0.9830, -0.6294, 0.9168, 0.4004, -0.9968],
[-0.8964, -0.8088, 0.9799, 0.8978, -0.2576, 0.8573, -0.7513, -0.9112],
[-0.9389, -0.9517, -0.6023, -0.9468, 0.3026, 0.9553, -0.1915, -0.9959]],
grad_fn=)

Don't we have to take the last values of all rows (that's what my slicing does)? Or do we need to apply a linear transformation? What if I don't want to apply a linear transformation?

Basically, my question is: we have 8 hidden-unit outputs for each input sequence, so which hidden output do we need to select?

If I need to tell whether the row below is a positive or negative review, which value do we need to consider?
(-0.7032, 0.9809, -0.0366, -0.8463, 0.8746, -0.9249, 0.9830, 0.9964)

I'm not quite sure what your question is. Usually, we take the last hidden state (output) and apply a linear transformation to it to get the output shape. Basically, you want to change the shape of the last step from [B, H] to [B, O], where H is the hidden size of the LSTM and O is the number of output classes.
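For concreteness, here is a rough sketch of that setup on top of the model from this thread (my own variant, not the original code): a linear layer maps the last hidden state [B, H] to [B, 1] for the binary pos/neg label.

import torch
import torch.nn as nn

class RNNClassifier(nn.Module):
    def __init__(self, n_vocab, n_embed, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(n_vocab + 1, n_embed)
        self.rnn = nn.RNN(n_embed, hidden_size, num_layers=1, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)          # [B, H] -> [B, O] with O = 1

    def forward(self, x):
        x = self.embedding(x)                        # [B, T, E]
        output, hidden = self.rnn(x)                 # output: [B, T, H]
        last = output[:, -1, :]                      # last time step: [B, H]
        return torch.sigmoid(self.fc(last)).squeeze(1)   # [B], probability of positive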


Thanks, I think that answers my question. I'm just wondering: what if we don't want to apply a linear transformation?

Thanks for answering my question.

Hi @G.M,
One more question: why do we need to use the function below?

def init_hidden(self):
    return torch.zeros(1, self.hidden_size)

What is the difference if we use it or don't use it?

This is for initializing the hidden and cell states of the LSTM. Usually, we initialize them to zeros. By default, PyTorch also initializes those states to zeros if you don't pass them, so you actually don't need the function.
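A small check of that point (my own sketch with made-up sizes; note that for batched input the expected hidden-state shape is [num_layers, batch, hidden_size]): passing explicit zeros and passing nothing give the same result.

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=100, hidden_size=8, num_layers=1, batch_first=True)
x = torch.randn(10, 11, 100)

h0 = torch.zeros(1, 10, 8)          # explicit zeros: [num_layers, batch, hidden_size]
out_explicit, _ = rnn(x, h0)
out_default, _ = rnn(x)             # no initial state -> PyTorch uses zeros

print(torch.allclose(out_explicit, out_default))   # True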

Well, if you don't re-initialize the hidden state after each batch, you should at least detach() it after each batch. Otherwise the backprop graph keeps growing, batch after batch.

Do you mean to detach the output? Can you explain a bit further about why the graph would grow continuously? Thanks

So basically, after every epoch (number of iterations), it will reinitialize the hidden state back to zero if I use the init_hidden function?

If I don't use the init_hidden function, will every new epoch have the hidden state of the previous epoch?

This previous post explains it quite well.

2 Likes

You usually call init_hidden() or detach() after each batch.

Yes, without init_hidden(), you feed the last hidden state of the previous batch as the first hidden state for the current batch. However, detach() ensures that the hidden state is treated as a constant and the loss is not backpropagated into the previous batch(es); see the forum post I linked to above.
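A rough sketch of that pattern (entirely my own example with fake data, not the model from this thread): a stateful setup that carries the hidden state across batches and detaches it before each step. Dropping the detach() here produces the usual "trying to backward through the graph a second time" error, because the new loss would still be connected to the previous batch's graph.

import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=8, num_layers=1, batch_first=True)
fc = nn.Linear(8, 1)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(list(rnn.parameters()) + list(fc.parameters()), lr=0.1)

hidden = torch.zeros(1, 2, 8)                  # [num_layers, batch, hidden_size]
for step in range(5):                          # stand-in for iterating over batches
    x = torch.randn(2, 11, 4)                  # fake batch: [batch, seq_len, features]
    y = torch.randint(0, 2, (2,)).float()

    hidden = hidden.detach()                   # keep the values, drop the backprop history
    output, hidden = rnn(x, hidden)
    logits = fc(output[:, -1, :]).squeeze(1)   # last time step -> [batch]

    loss = criterion(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()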


Hi, in the post you mentioned, I found that the PyTorch LSTM initializes the hidden states to zeros if I don't provide them as arguments, so that means I don't necessarily have to initialize them after each batch. In @sgaur's case, he/she is not providing the hidden states as arguments, so I guess there's no need to initialize/detach them each batch.


If PyTorch handles the hidden state like this in that case, then it should work. I just got used to doing it manually all the time :slight_smile:
