Visualising the attention heatmaps from the beginner tutorial?

https://pytorch.org/tutorials/beginner/translation_transformer.html

I am currently working through this tutorial, but I see that it is missing some things compared to the TensorFlow example ("Transformer model for language understanding | Text").

Basically, I would like to visualise the attention heatmaps at the encoder and decoder layers, but I am not sure which output to use for that. I was hoping that someone already had an implementation that I could look at.
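To make the question concrete, here is a minimal sketch of the kind of thing I am after, assuming the model is built from `nn.TransformerEncoderLayer` with the default `norm_first=False`. As far as I can tell, the layers call their attention module with `need_weights=False`, so the weights never reach the output. The sketch calls each layer's `self_attn` module directly instead. The sizes, the random `src` tensor, and the helper names `encoder_self_attention` and `plot_heatmap` are made up for illustration; a real run would also need to pass the source mask and padding mask as `attn_mask` and `key_padding_mask`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy sizes, chosen only for illustration (not the tutorial's hyperparameters).
d_model, nhead, seq_len, batch = 16, 4, 5, 1
encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, dim_feedforward=32)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
encoder.eval()  # disable dropout so the attention rows sum to 1

# Stand-in for embedded source tokens, shape (seq_len, batch, d_model).
src = torch.rand(seq_len, batch, d_model)


def encoder_self_attention(encoder, src):
    """Return one (batch, seq_len, seq_len) attention matrix per encoder layer."""
    maps = []
    x = src
    with torch.no_grad():
        for layer in encoder.layers:
            # Assumes norm_first=False, so the layer input feeds self-attention directly.
            _, weights = layer.self_attn(x, x, x, need_weights=True)
            maps.append(weights)  # averaged over heads by default
            x = layer(x)          # input for the next layer
    return maps


def plot_heatmap(weights, title="attention"):
    """Hypothetical helper: draw a 2-D attention matrix with matplotlib."""
    import matplotlib.pyplot as plt
    plt.imshow(weights.numpy(), cmap="viridis")
    plt.xlabel("key position")
    plt.ylabel("query position")
    plt.title(title)
    plt.colorbar()
    plt.show()


attn_maps = encoder_self_attention(encoder, src)
# plot_heatmap(attn_maps[0][0], "encoder layer 0, first sentence")
```

I am not sure whether this is the intended way, or how best to do the same for the decoder's self-attention and encoder-decoder attention.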

Also, how could I use the test iterators to test the model? Thank you!
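For the testing part, this is roughly the pattern I have in mind: a sketch only, with a toy `nn.Linear` model and random batches standing in for the real translation model and test split. The function name `evaluate` and the `(inputs, targets)` batch layout are assumptions for illustration; the tutorial's model also takes the target sequence and masks, so the forward call would differ.

```python
import torch
import torch.nn as nn


def evaluate(model, batches, loss_fn):
    """Average the loss over an iterable of (inputs, targets) batches."""
    model.eval()  # turn off dropout and similar training-only behaviour
    total, count = 0.0, 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for inputs, targets in batches:
            logits = model(inputs)
            total += loss_fn(logits, targets).item()
            count += 1
    return total / max(count, 1)


torch.manual_seed(0)
toy_model = nn.Linear(8, 3)  # placeholder for the real model
toy_test_batches = [(torch.rand(4, 8), torch.randint(0, 3, (4,))) for _ in range(5)]
test_loss = evaluate(toy_model, toy_test_batches, nn.CrossEntropyLoss())
```

Would the same loop work if `batches` were a DataLoader built on the test split?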