Understanding of output from the language modeling tutorial

I’m trying to understand the output of the model in the “Language Modeling with nn.Transformer and torchtext” tutorial (Language Modeling with nn.Transformer and torchtext — PyTorch Tutorials 2.1.0+cu121 documentation). Basically, I’m feeding the model the start of a text and trying to decode its response back into plain text.

# model, vocab, tokenizer, and ntokens are taken from the tutorial
line = "This game was a lot of "
line_t = torch.tensor(vocab(tokenizer(line)), dtype=torch.long)
output = model(line_t.to(device))
output_flat = output.view(-1, ntokens)

In this case the output shape is [6, 6, 28782] and the flattened shape is [36, 28782]; 28782 is the vocabulary size. Since the input shape is [6] (one index per token), I’m not sure where the extra dimension of size 6 comes from. How do I decode/translate this output into text? As I understand it, the model is supposed to predict the next word, but I’m not sure how to read that prediction out of the output above.
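For reference, this is the decoding logic I was planning to use once I understand the shape. It's only a sketch with random logits standing in for the model output: I'm assuming the output layout is [seq_len, batch, ntokens] as in the tutorial's training loop, so the logits at position i are the prediction for the token after position i, and vocab.lookup_token maps an index back to a string (I'm not using it below since the vocab isn't loaded here).

```python
import torch

# Stand-in for the real model output, assuming the tutorial's
# [seq_len, batch, ntokens] layout with a batch of 1.
ntokens = 10  # placeholder; the real vocab has 28782 entries
output = torch.randn(6, 1, ntokens)

# The logits at position i predict the token at position i + 1,
# so the prediction for the word after the whole prompt is the
# argmax over the vocabulary at the last position.
next_token_id = output[-1, 0].argmax().item()

# With the tutorial's vocab this would then be:
# next_word = vocab.lookup_token(next_token_id)
print(next_token_id)
```

Is something along these lines the right way to do it, or am I misreading which dimension is the sequence position?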

I hope that makes sense. Any help is appreciated!