The documentation says that the inputs to the multi-head attention function are
key, query and value.
I have the following two questions:
- How do I specify the inputs (word embeddings) on which multi-head attention is to be performed?
- What should the key, query and value inputs to the function be? Aren’t they the weights that the network will learn?
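To make the second question concrete, here is a rough plain-Python sketch of one possible reading. It is an assumption on my part, not something the documentation states: for self-attention the same word embeddings are passed as query, key and value, and the learned weights are separate projection matrices stored inside the layer. `ToyMultiHeadAttention`, its attribute names and all the sizes are hypothetical and not taken from any library.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for embeddings or untrained weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyMultiHeadAttention:
    """Hypothetical stand-in for a library layer (assumption, not a real API)."""

    def __init__(self, embed_dim, num_heads, seed=0):
        assert embed_dim % num_heads == 0
        rng = random.Random(seed)
        self.num_heads = num_heads
        self.head_dim = embed_dim // num_heads
        # Under this reading, THESE are the weights the network would learn.
        self.w_q = random_matrix(embed_dim, embed_dim, rng)
        self.w_k = random_matrix(embed_dim, embed_dim, rng)
        self.w_v = random_matrix(embed_dim, embed_dim, rng)
        self.w_o = random_matrix(embed_dim, embed_dim, rng)

    def __call__(self, query, key, value):
        # query, key and value are data (e.g. embeddings), not weights.
        q = matmul(query, self.w_q)
        k = matmul(key, self.w_k)
        v = matmul(value, self.w_v)
        scale = math.sqrt(self.head_dim)
        heads = []
        for h in range(self.num_heads):
            lo, hi = h * self.head_dim, (h + 1) * self.head_dim
            qh = [row[lo:hi] for row in q]
            kh = [row[lo:hi] for row in k]
            vh = [row[lo:hi] for row in v]
            kh_t = [list(col) for col in zip(*kh)]  # transpose of kh
            scores = matmul(qh, kh_t)
            weights = [softmax([s / scale for s in row]) for row in scores]
            heads.append(matmul(weights, vh))
        # Concatenate the per-head outputs for each query position.
        concat = [sum((head[i] for head in heads), []) for i in range(len(query))]
        return matmul(concat, self.w_o)


rng = random.Random(1)
embeddings = random_matrix(3, 4, rng)  # 3 tokens, embedding size 4 (made up)
attn = ToyMultiHeadAttention(embed_dim=4, num_heads=2)
output = attn(embeddings, embeddings, embeddings)  # self-attention call
print(len(output), len(output[0]))  # prints: 3 4
```

If this reading is right, the embeddings would simply be passed three times and the output would have one vector per input token. Is that how the real function is meant to be used?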