I’m trying to do the following:
Say I have an input vector with 12 dimensions and I want to output a vector with 3 dimensions. Instead of fully connecting the input and output, I would like to compute the first output feature from only the first 4 input features, the second output feature from the 5th to 8th input features, and so on.
The output dimension should not be hardcoded, but variable! So I can't simply split the input vector into 3 equal pieces by hand; the number of groups has to follow from the output dimension.
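To make the intended computation concrete, here is a minimal NumPy sketch of the block-wise connection described above. It is only an illustration under some assumptions: the input dimension is taken to be divisible by the output dimension, each output is a plain weighted sum of its own group (no bias, no nonlinearity), and the name `grouped_projection` is made up for this example. It is not the pooling layer from the paper.

```python
import numpy as np

def grouped_projection(x, weights):
    """Map each group of consecutive input features to one output feature.

    x:       array of shape (batch, in_dim)
    weights: array of shape (num_groups, group_size),
             with num_groups * group_size == in_dim (assumed)
    returns: array of shape (batch, num_groups)
    """
    num_groups, group_size = weights.shape
    if x.shape[-1] != num_groups * group_size:
        raise ValueError("in_dim must equal num_groups * group_size")
    # Give each group of consecutive features its own axis.
    groups = x.reshape(x.shape[0], num_groups, group_size)
    # Weighted sum inside each group only; groups never mix.
    return (groups * weights).sum(axis=-1)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 12))   # batch of 2 vectors, 12 features each
out_dim = 3                    # variable, not hardcoded in the function
weights = rng.normal(size=(out_dim, x.shape[1] // out_dim))
y = grouped_projection(x, weights)
print(y.shape)
```

Changing `out_dim` to 4 or 6 only changes the shape of `weights`; the function itself stays the same, which is the behaviour I am after.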
Concretely, I want to implement the self multi-head attention pooling in this paper.
Could anyone give me a hint?
Thanks very much!