I’ve been trying to train a model to classify surnames by language with a char-level RNN.
But over 10 epochs the model shows almost the same train and validation loss.
And in the confusion matrix all predictions fall into a single column.
Here’s my notebook on Colab.
What am I doing wrong?
Can someone give me some advice?
Actually, my notebook is a modified version of that nice notebook, in which there’s no optimizer object and SGD is done by manually subtracting the gradients.
This way, first we zero the net’s gradients:
rnn.zero_grad()
After that we get the output of the net and compute the loss:
loss = criterion(output, category_tensor)
Next we compute the derivatives via backward():
loss.backward()
and finally the manual subtraction:
for p in rnn.parameters():
    p.data.add_(p.grad.data, alpha=-learning_rate)
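Putting those four steps together, a minimal self-contained sketch of such a training step might look like this (the RNN architecture, sizes, and learning rate here are placeholders for illustration, not the notebook’s exact values):

```python
import torch
import torch.nn as nn

# Assumed sizes for illustration only
n_letters, n_hidden, n_categories = 57, 128, 18
learning_rate = 0.005

class RNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.i2h = nn.Linear(n_letters + n_hidden, n_hidden)
        self.i2o = nn.Linear(n_letters + n_hidden, n_categories)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, x, hidden):
        combined = torch.cat((x, hidden), 1)
        return self.softmax(self.i2o(combined)), self.i2h(combined)

rnn = RNN()
criterion = nn.NLLLoss()

def train_step(category_tensor, line_tensor):
    hidden = torch.zeros(1, n_hidden)
    rnn.zero_grad()                        # 1. zero the gradients
    for i in range(line_tensor.size(0)):   # 2. run the net over the name, char by char
        output, hidden = rnn(line_tensor[i], hidden)
    loss = criterion(output, category_tensor)
    loss.backward()                        # 3. compute derivatives
    for p in rnn.parameters():             # 4. manual SGD update
        p.data.add_(p.grad.data, alpha=-learning_rate)
    return output, loss.item()

# Usage: a fake 5-character "name" as a one-hot placeholder tensor
line = torch.zeros(5, 1, n_letters)
target = torch.tensor([3])
out, l = train_step(target, line)
```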
What I changed is that I replaced the random choice of data points with train/validation datasets and added loss output.
But I can’t find the place where I broke something.
Can you please take another look at it?
I am not really familiar with RNNs, but looking at your code, I see hidden.detach_(), which detaches the tensor from the computation graph in place and may therefore cause your error. Here’s a link that explains what detach() does.
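A tiny toy example (not the OP’s model) shows the effect: after an in-place detach_(), backward() can no longer reach anything that produced the tensor, so upstream parameters stop receiving gradients.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

# Without detach: the gradient flows back through h to w
h = w * 3.0
(h * 4.0).backward()
assert w.grad.item() == 12.0

# With in-place detach: h is severed from the graph, so anything
# built on it carries no gradient back to w
w.grad = None
h = w * 3.0
h.detach_()
assert h.requires_grad is False
out = h * 4.0
assert out.requires_grad is False  # calling backward() here would update nothing
```

This is why detaching the hidden state *between* independent samples is harmless, but detaching it *inside* the per-character loop would block learning.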
Yes, I did.
But there was no splitting of the dataset into train/valid, and data points were chosen at random.
So I only added the data splitting.
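Since the split is the only change, it’s worth checking that it didn’t skew the class balance. A hedged sketch of a per-class (stratified) split — the `(name, language)` pair format is an assumption, not necessarily the notebook’s:

```python
import random
from collections import defaultdict

def stratified_split(data, valid_frac=0.2, seed=0):
    """Split (name, language) pairs so every language appears in both sets."""
    by_class = defaultdict(list)
    for name, lang in data:
        by_class[lang].append((name, lang))
    rng = random.Random(seed)
    train, valid = [], []
    for lang, items in by_class.items():
        rng.shuffle(items)
        k = max(1, int(len(items) * valid_frac))  # at least one validation item per class
        valid.extend(items[:k])
        train.extend(items[k:])
    return train, valid

# Usage with toy data: 10 names per language
data = [("Sato", "Japanese")] * 10 + [("Smith", "English")] * 10
train, valid = stratified_split(data)
```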
The most confusing part is that only one category column is “active” in the confusion matrix (screenshots above) after every training run.
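A single active column means the net predicts the same class for everything, which is easy to confirm with a prediction histogram before building the full confusion matrix. A small sketch (the log-softmax outputs here are fake stand-ins for the model’s):

```python
import torch
from collections import Counter

def prediction_histogram(logits_list):
    """Count the argmax class over a list of (1, n_categories) outputs."""
    preds = [int(torch.argmax(lg, dim=1)) for lg in logits_list]
    return Counter(preds)

# Three fake outputs that all strongly favour class 2 --
# a collapsed model yields a single-key histogram like this
fake_outputs = [torch.log_softmax(torch.tensor([[0.1, 0.2, 3.0]]), dim=1)
                for _ in range(3)]
hist = prediction_histogram(fake_outputs)  # Counter({2: 3})
```

If the histogram on the real validation set has one dominant key, the usual suspects are a too-high learning rate, a class-imbalanced split, or gradients not actually reaching the parameters (e.g. the detach_() issue mentioned above).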