I am trying to get attention weights for input sentences from a transformer model I have built on my own corpus of text. I used the PyTorch tutorial "Language Translation with nn.Transformer and torchtext" (PyTorch Tutorials 2.0.1+cu117 documentation) as a guide, and the model trains and predicts well.
I now need to access the encoder's attention weights but am struggling and would appreciate any guidance. Specifically, I need the attention scores for an input sentence for each head of each encoder layer, as well as the attention weights of the final encoder layer.
I have managed to access each layer's attention scores as follows:
```python
src = text_transform['main'](src).view(-1, 1)
num_tokens = src.shape[0]
src_mask = torch.zeros(num_tokens, num_tokens).type(torch.bool)

# Start from the encoder *input* (token embedding + positional encoding);
# transformer.encode(src, src_mask) would already be the full encoder output
input_embeddings = transformer.positional_encoding(transformer.src_tok_emb(src))

for layer in transformer.transformer.encoder.layers:
    emb, att = layer.self_attn(input_embeddings, input_embeddings,
                               input_embeddings, need_weights=True)
    input_embeddings = layer(input_embeddings)  # advance to the next layer's input
```
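For reference, here is a minimal self-contained version of the same loop on a plain `nn.TransformerEncoder` (the sizes are made up for illustration and are not the tutorial's):

```python
import torch
import torch.nn as nn

# Toy encoder standing in for the tutorial model (illustrative sizes only)
emb_size, n_heads, n_layers, seq_len = 16, 4, 3, 5
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(emb_size, n_heads), n_layers)
encoder.eval()  # disable dropout so the weights are deterministic per input

x = torch.randn(seq_len, 1, emb_size)  # (seq, batch, emb) — seq-first layout
per_layer_weights = []
with torch.no_grad():
    for layer in encoder.layers:
        # attention weights computed at this layer's actual input
        _, w = layer.self_attn(x, x, x, need_weights=True)
        per_layer_weights.append(w)
        x = layer(x)  # feed forward to obtain the next layer's input

print(len(per_layer_weights))      # one weight tensor per layer
print(per_layer_weights[0].shape)  # (batch, seq, seq)
```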
but this only gives me one set of attention scores per encoder layer — 6 sets in total — so they are presumably either from the first head only, or averaged across the heads, and either way that is not what I need.
I would like the attention scores for each encoder layer and for each head individually. Any advice would be much appreciated.
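From reading the `nn.MultiheadAttention.forward` docs, I suspect the averaging comes from its `average_attn_weights` argument, which defaults to `True`; passing `average_attn_weights=False` should return one weight matrix per head. A minimal sketch of what I mean (standalone, with made-up sizes):

```python
import torch
import torch.nn as nn

# Standalone check of nn.MultiheadAttention's weight shapes (made-up sizes)
emb_size, n_heads, seq_len = 16, 4, 5
attn = nn.MultiheadAttention(emb_size, n_heads)  # seq-first by default
x = torch.randn(seq_len, 1, emb_size)            # (seq, batch, emb)

# Default: weights averaged over heads -> (batch, seq, seq)
_, avg_w = attn(x, x, x, need_weights=True)
# Per-head weights -> (batch, num_heads, seq, seq)
_, head_w = attn(x, x, x, need_weights=True, average_attn_weights=False)

print(avg_w.shape)   # torch.Size([1, 5, 5])
print(head_w.shape)  # torch.Size([1, 4, 5, 5])
```

Is passing `average_attn_weights=False` to each layer's `self_attn` the right way to get the per-head weights, or is there a cleaner approach?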