nn.MultiheadAttention

Can someone explain what embed_dim is in nn.MultiheadAttention? The docs say that it is the total dimension of the model, but I am not quite sure what that means.

Would anyone be able to give a minimal example using nn.MultiheadAttention?

https://pytorch.org/docs/master/nn.html#multiheadattention


It's the size of the feature dimension of the tensor after passing through an embedding layer. In the paper "Attention Is All You Need", it corresponds to d_model.
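Here's a minimal sketch of self-attention with nn.MultiheadAttention, using d_model=512 and 8 heads as in the paper (embed_dim must be divisible by num_heads). Note the module expects inputs shaped (seq_len, batch, embed_dim) by default:

```python
import torch
import torch.nn as nn

embed_dim = 512   # d_model in "Attention Is All You Need"
num_heads = 8     # embed_dim must be divisible by num_heads

mha = nn.MultiheadAttention(embed_dim, num_heads)

# Default input layout is (seq_len, batch_size, embed_dim)
seq_len, batch_size = 10, 32
x = torch.rand(seq_len, batch_size, embed_dim)

# Self-attention: query, key, and value are all the same tensor
attn_output, attn_weights = mha(x, x, x)

print(attn_output.shape)   # torch.Size([10, 32, 512])
print(attn_weights.shape)  # torch.Size([32, 10, 10]), averaged over heads
```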
