The documentation says that the inputs to nn.MultiheadAttention are query, key and value.
I have the following 2 doubts:
- How do I specify the inputs (word embeddings) on which the multi-head attention has to be performed?
- What should the key, query and value inputs to the function be? Aren't they the weights that the network will learn?
@ADONAI_TZEVAOT
For self-attention, the key, query and value are all the same tensor. So if your input is, for example, 'how are you' and you have an embedding dim of 300, the key, query and value will each have shape [3, 300] (or [3, 1, 300] with a batch dimension), since you have 3 words.
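A minimal sketch of this, assuming an embedding dim of 300 and 4 attention heads (the head count is an arbitrary choice here, it just has to divide the embedding dim):

```python
import torch
import torch.nn as nn

embed_dim, num_heads, seq_len = 300, 4, 3  # 3 tokens: "how are you"

mha = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)

# One batch of 3 word embeddings, shape [batch, seq_len, embed_dim].
# In practice x would come from an nn.Embedding layer; random here.
x = torch.randn(1, seq_len, embed_dim)

# Self-attention: pass the same tensor as query, key and value.
attn_output, attn_weights = mha(x, x, x)

print(attn_output.shape)   # torch.Size([1, 3, 300])
print(attn_weights.shape)  # torch.Size([1, 3, 3]) - one weight per token pair
```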
For the second doubt: key, query and value are not the weights, they are your inputs. PyTorch creates the projection weights behind the scenes and learns them during training. You can watch this video for a detailed explanation: Self Attention with torch.nn.MultiheadAttention Module - YouTube
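To see that the learned weights live inside the module rather than in your q/k/v inputs, you can inspect its parameters (sketch assuming embed_dim=300, so the stacked q/k/v projection is [3*300, 300]):

```python
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=300, num_heads=4)

# in_proj_weight stacks the query, key and value projection matrices.
print(mha.in_proj_weight.shape)   # torch.Size([900, 300])
# out_proj is the final linear layer applied to the attention output.
print(mha.out_proj.weight.shape)  # torch.Size([300, 300])
```

These parameters are initialized randomly and updated by the optimizer; the embeddings you pass in are only multiplied by them.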