Can someone explain what embed_dim is in nn.MultiheadAttention? The docs say that it is the total dimension of the model, but I am not quite sure what that means. Would anyone be able to give a minimal example using nn.MultiheadAttention?
It is the number of dimensions each token vector has after passing through an embedding layer. In the paper "Attention Is All You Need", it corresponds to d_model.
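
Here is a minimal self-attention sketch (the sizes are arbitrary, just for illustration). embed_dim is the size of each token vector, and it must be divisible by num_heads:

```python
import torch
import torch.nn as nn

embed_dim = 16   # size of each token's feature vector (d_model in the paper)
num_heads = 4    # embed_dim must be divisible by num_heads

mha = nn.MultiheadAttention(embed_dim, num_heads)

# By default, inputs have shape (seq_len, batch, embed_dim)
seq_len, batch = 5, 2
x = torch.randn(seq_len, batch, embed_dim)

# Self-attention: query, key, and value are all the same tensor
out, attn_weights = mha(x, x, x)
print(out.shape)           # torch.Size([5, 2, 16]) -- last dim is still embed_dim
print(attn_weights.shape)  # torch.Size([2, 5, 5]) -- weights averaged over heads
```

Internally each head operates on embed_dim // num_heads features (here 16 / 4 = 4 per head), and the heads are concatenated back to embed_dim in the output.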