In my project, I need to feed a tensor of shape [10, 17] to a transformer layer, which means there are 10 time steps and each time step is a 17-dimensional vector.

Now, I have two tensors, X1 of shape [5, 64] and X2 of shape [5, 17], and I would like to concatenate them into a single [10, 17] tensor that will be consumed by the transformer layer.

Currently, I am wondering whether I can convert X1 to shape [5, 17] first. Maybe this can be achieved by applying a torch.nn.Linear(64, 17) to each row of X1 (each row being one 64-dimensional time step), but I am not sure how to implement it nicely with torch. Could someone give me some help?
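
A minimal sketch of this idea, assuming random placeholder data and that X1's time steps should come before X2's (the names `proj`, `X1_proj`, and `X` are only illustrative). torch.nn.Linear acts on the last dimension, so calling it on the whole [5, 64] tensor should already project every row at once, without an explicit loop:

```python
import torch
import torch.nn as nn

# Placeholder inputs with the shapes from the question (random values, for illustration only)
X1 = torch.randn(5, 64)
X2 = torch.randn(5, 17)

# Learnable projection from 64 to 17 features; nn.Linear operates on the last dimension,
# so each of the 5 rows of X1 is mapped independently with the same weights
proj = nn.Linear(64, 17)
X1_proj = proj(X1)  # shape [5, 17]

# Stack along the time dimension (dim=0); putting X1 first is an assumption
X = torch.cat([X1_proj, X2], dim=0)  # shape [10, 17]

print(X.shape)
```

If this is used inside a model, `proj` would presumably need to be registered as a submodule (for example, created in `__init__`) so that its weights are trained together with the transformer layer.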