Calculate test loss on seq2seq GRU

Hello,

I am a beginner in ML. How can I calculate the test loss for this tutorial?
https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html

Thank you very much in advance!

/Angelos

You could use the code snippet for evaluation and provide your test sentences instead of randomly sampling some sentences.

Basically, I would like to plot the train loss and the test loss in the same graph.
Right now my implementation only tracks the train loss.

/Angelos

To visualize the training and test loss inside the same figure, you could add the loss calculation to the mentioned code snippet and store the training and test losses in separate lists.
Note that you should detach the (training) loss before storing its value in a list. Otherwise each stored tensor keeps its computation graph alive and memory usage keeps growing (which might end in an OOM error):

train_losses.append(loss.detach().cpu().item())

Once you have both lists, you can convert them to numpy arrays and plot via matplotlib.
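
As a rough sketch of the bookkeeping described above: the toy linear regression below only stands in for the seq2seq model, and the data, model, learning rate, and the file name `losses.png` are all placeholders I made up for illustration. The part that carries over is appending one float per epoch to each list and plotting both lists on the same axes.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy data standing in for the translation pairs (assumption, not the tutorial's data).
x = torch.randn(200, 3)
y = x @ torch.tensor([[1.0], [-2.0], [0.5]]) + 0.1 * torch.randn(200, 1)
x_train, y_train = x[:160], y[:160]
x_test, y_test = x[160:], y[160:]

model = nn.Linear(3, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

train_losses, test_losses = [], []
for epoch in range(50):
    # Training step: the loss is attached to the graph until we extract a float.
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x_train), y_train)
    loss.backward()
    optimizer.step()
    train_losses.append(loss.detach().cpu().item())

    # Test step: no gradients needed, and the targets are the held-out y_test.
    model.eval()
    with torch.no_grad():
        test_losses.append(criterion(model(x_test), y_test).item())

# Convert to numpy arrays and draw both curves in the same figure.
plt.plot(np.array(train_losses), label="train loss")
plt.plot(np.array(test_losses), label="test loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.savefig("losses.png")
```

For the tutorial you would replace the two loss computations with the per-sentence losses from the training and evaluation loops (averaged however you prefer); the list handling and the plotting stay the same.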

What should I put as target then?
Would it be something like this:

loss += criterion(decoder_output, decoded_words)

I assume you have a test set with input data and the corresponding targets.
If that’s the case, you should pass the test targets to the criterion.

If you don’t have test targets, could you explain the use case a bit?
E.g. how did you collect the test target and how would you like to calculate the loss?

There are no targets, that’s why I am a little bit confused. I feed the decoder’s predictions back to itself as the tutorial does.

The tutorial uses sentences in one language as the input and targets of the corresponding sentence in another language, no?
If your test set doesn’t have the translations, you won’t be able to calculate the loss.

Yes, for training.
But for testing, the evaluation function just feeds the decoder’s predictions back into the decoder and doesn’t calculate a test loss.
So I am assuming that the target for the test set is the “decoded_words” which is the translation of the sentence. Am I right?

No, the evaluate method just prints the predictions and the target so that you could manually compare these.
To calculate the loss, you would have to compute it with the criterion (as done during training):

loss += criterion(decoder_output, target_tensor[di])

However, if you don’t have the targets, you won’t be able to calculate the loss and would have to output the predictions only.
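
To make the difference concrete, here is a minimal sketch of an evaluation loop that also computes the loss. `TinyEncoder`, `TinyDecoder`, and `compute_test_loss` are hypothetical stand-ins I wrote for illustration (no attention, tiny vocabularies), not the tutorial's classes. The point is that the decoder still gets its own predictions as input, while the criterion is compared against the reference translation in `target_tensor`.

```python
import torch
import torch.nn as nn

SOS_token, EOS_token = 0, 1  # same token convention as the tutorial


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the tutorial's encoder."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)

    def forward(self, token, hidden):
        embedded = self.embedding(token).view(1, 1, -1)
        return self.gru(embedded, hidden)


class TinyDecoder(nn.Module):
    """Hypothetical stand-in for the tutorial's decoder (no attention)."""
    def __init__(self, hidden_size, vocab_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, token, hidden):
        embedded = torch.relu(self.embedding(token).view(1, 1, -1))
        output, hidden = self.gru(embedded, hidden)
        return torch.log_softmax(self.out(output[0]), dim=1), hidden


def compute_test_loss(encoder, decoder, input_tensor, target_tensor, criterion):
    """Average per-token loss for one (input, target) pair, without teacher forcing."""
    encoder.eval()
    decoder.eval()
    with torch.no_grad():
        hidden = torch.zeros(1, 1, encoder.gru.hidden_size)
        for ei in range(input_tensor.size(0)):
            _, hidden = encoder(input_tensor[ei], hidden)

        decoder_input = torch.tensor([[SOS_token]])
        target_length = target_tensor.size(0)
        loss = 0.0
        for di in range(target_length):
            decoder_output, hidden = decoder(decoder_input, hidden)
            # The target is the reference translation, not the decoded words.
            loss += criterion(decoder_output, target_tensor[di]).item()
            # Feed the model's own prediction back in as the next input.
            decoder_input = decoder_output.argmax(dim=1).view(1, 1)
    return loss / target_length


torch.manual_seed(0)
encoder = TinyEncoder(vocab_size=10, hidden_size=8)
decoder = TinyDecoder(hidden_size=8, vocab_size=12)
# Made-up token indices, shaped (length, 1) like the tutorial's sentence tensors.
input_tensor = torch.tensor([[4], [7], [2], [EOS_token]])
target_tensor = torch.tensor([[5], [3], [EOS_token]])
test_loss = compute_test_loss(encoder, decoder, input_tensor, target_tensor, nn.NLLLoss())
print(f"test loss: {test_loss:.4f}")
```

This sketch always runs for the full target length rather than stopping at a predicted EOS, so every target token contributes to the loss; that is a design choice on my side, not something the tutorial prescribes.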

Yes, that’s why I asked in a previous comment whether this is right:

loss += criterion(decoder_output, decoded_words)

I understood that I need the targets; my question is how I can get them.

Yeah, sorry for the confusion. I tried to explain that you would have to have the target values to calculate the loss.

You would need to create them manually or download the targets, if you are working on a public dataset (they usually provide inputs and the corresponding targets).
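
If your data already comes as (input sentence, translation) pairs, one option I would consider is to hold out part of the pairs before training, so the held-out translations serve as test targets. The helper below is only a sketch under that assumption; `split_pairs`, the 20% fraction, and the example sentences are made up for illustration.

```python
import random


def split_pairs(pairs, test_fraction=0.2, seed=0):
    """Shuffle (input, target) pairs and split them into train and test lists."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)  # seeded, so the split is reproducible
    n_test = max(1, int(len(pairs) * test_fraction))
    return pairs[n_test:], pairs[:n_test]


# Toy sentence pairs, purely illustrative.
pairs = [
    ("je suis froid", "i am cold"),
    ("il est grand", "he is tall"),
    ("elle est ici", "she is here"),
    ("nous sommes prets", "we are ready"),
    ("tu es en retard", "you are late"),
]
train_pairs, test_pairs = split_pairs(pairs)
print(len(train_pairs), "training pairs,", len(test_pairs), "test pairs")
```

You would then train only on `train_pairs` and pass the second element of each pair in `test_pairs` to the criterion as the target.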