nn.MultiheadAttention

Can someone explain what embed_dim is in nn.MultiheadAttention? The docs say that it is the total dimension of the model, but I am not quite sure what that means.

Would anyone be able to give a minimal example using nn.MultiheadAttention?

https://pytorch.org/docs/master/nn.html#multiheadattention


It's the size of the feature dimension of the tensor after passing through an embedding layer. In the paper "Attention Is All You Need", it corresponds to d_model.
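Here's a minimal sketch of self-attention with nn.MultiheadAttention, using d_model=512 and 8 heads as in the paper (embed_dim must be divisible by num_heads). Note the module expects inputs shaped (seq_len, batch, embed_dim) by default:

```python
import torch
import torch.nn as nn

embed_dim = 512   # d_model in "Attention Is All You Need"
num_heads = 8     # embed_dim must be divisible by num_heads

mha = nn.MultiheadAttention(embed_dim, num_heads)

# Default input layout is (seq_len, batch_size, embed_dim)
seq_len, batch_size = 10, 32
x = torch.rand(seq_len, batch_size, embed_dim)

# Self-attention: query, key, and value are all the same tensor
attn_output, attn_weights = mha(x, x, x)

print(attn_output.shape)   # torch.Size([10, 32, 512])
print(attn_weights.shape)  # torch.Size([32, 10, 10]), averaged over heads
```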
