Upsampling word embedding question

The two upsampling modules available in PyTorch seem to be UpsamplingNearest2d and UpsamplingBilinear2d.

Does one of these work better than the other for upsampling a word/text embedding to the size of a larger image embedding, so that the two embeddings can be combined (via add or concat)?

I tried both, and also tried simply repeating the whole text embedding.
Repeating worked best, followed by bilinear, followed by nearest neighbor.
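
For anyone who wants to reproduce the comparison, here is a rough sketch of the three variants. The shapes are made up for illustration, and reshaping the text vector into a small square map before upsampling is only my assumption about how the 2d modules could be applied to a flat embedding. Treat it as one possible setup, not the exact code I ran.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
batch, text_dim = 2, 16
channels, height, width = 16, 8, 8

text = torch.randn(batch, text_dim)                  # flat text embedding
image = torch.randn(batch, channels, height, width)  # image feature map

# Variant 1: repeat the whole text embedding at every spatial location.
# Result has shape (batch, text_dim, height, width).
text_tiled = text[:, :, None, None].expand(-1, -1, height, width)

# Variants 2 and 3: view the embedding as a small 1-channel 2D map
# (assumption: text_dim is a perfect square, here 16 = 4 * 4),
# then upsample it to the image's spatial size.
side = 4
text_map = text.view(batch, 1, side, side)
text_nearest = nn.UpsamplingNearest2d(size=(height, width))(text_map)
text_bilinear = nn.UpsamplingBilinear2d(size=(height, width))(text_map)

# Combine with the image embedding.
fused_add = image + text_tiled                        # add needs text_dim == channels
fused_cat = torch.cat([image, text_tiled], dim=1)     # concat along channels
fused_cat_bilinear = torch.cat([image, text_bilinear], dim=1)

print(fused_add.shape, fused_cat.shape, fused_cat_bilinear.shape)
```

Note that repeating keeps every component of the text embedding intact at each pixel, while the two upsampling variants spread the components over space, so each location only sees an interpolated piece of the embedding. That may be part of why repeating did better, though I have not verified this.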