Thanks for the response. Your solution is still the same as mine, right? You just use a list comprehension.
Cosine similarity is not for layer outputs, but just for two different vectors.

The output shape would be `(3, 10, 128)`, where the `(10, 128)` part holds the 1280 cosine similarity values that you need. Unsqueeze the appropriate dims to get the desired shape.
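The logic behind this is that
`l2_distance(l2_normalized_vector_1, l2_normalized_vector_2)^2 = 2 - 2*CosineSimilarity(UnNormalized_vector_1, UnNormalized_vector_2)`
so the squared distance between the normalized vectors is a fixed affine function of the cosine similarity (not a plain constant multiple), and either value can be recovered from the other.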
See [this](https://stats.stackexchange.com/questions/146221/is-cosine-similarity-identical-to-l2-normalized-euclidean-distance) for more on it.
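
If it helps, the link between the L2 distance of normalized vectors and cosine similarity can be checked numerically. For unit vectors the squared distance works out to `2 - 2*cos`, so `cos = 1 - dist^2 / 2`. Below is a minimal sketch in plain Python; the helper names and the two example vectors are my own placeholders, not anything from your model or from a library.

```python
import math

# Hypothetical helpers for illustration only (not a library API).
def l2_normalize(v):
    """Scale v to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarity(a, b):
    """Cosine similarity of two un-normalized vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def squared_l2_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Arbitrary example vectors (assumed values, any non-zero vectors work).
a = [1.0, 2.0, 3.0]
b = [-2.0, 0.5, 4.0]

cos = cosine_similarity(a, b)
dist_sq = squared_l2_distance(l2_normalize(a), l2_normalize(b))

# Cosine similarity recovered from the normalized squared distance.
cos_from_dist = 1.0 - dist_sq / 2.0
print(cos, cos_from_dist)
```

The two printed numbers should agree up to floating-point error, which is why a batched distance computation on normalized tensors can stand in for a loop of pairwise cosine similarity calls.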