Suppose I have a BERT embedding of shape (32, 100, 768) and I want to pad it to (32, 120, 768).
Should I pad it with torch.zeros(32, 20, 768), i.e. a block where all values are zero?
I know the input IDs can be padded up front, but I want to know how to pad the embeddings after they have been generated.
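For concreteness, here is a sketch of the two approaches I have in mind (the tensor here is just a random stand-in for the real BERT output):

```python
import torch
import torch.nn.functional as F

# Stand-in for a BERT output of shape (batch, seq_len, hidden) = (32, 100, 768)
embeddings = torch.randn(32, 100, 768)

# Option 1: concatenate an explicit zero block along the sequence dimension
pad_block = torch.zeros(32, 20, 768)
padded_cat = torch.cat([embeddings, pad_block], dim=1)

# Option 2: F.pad — the pad tuple is read from the last dimension backwards:
# (left, right) for dim -1, then (left, right) for dim -2 (the sequence dim)
padded_fpad = F.pad(embeddings, (0, 0, 0, 20))

print(padded_cat.shape)                      # torch.Size([32, 120, 768])
print(torch.equal(padded_cat, padded_fpad))  # True
```

Both give the same (32, 120, 768) result; is one of them the right way to do this, or is there a better approach?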
Thank you for any advice in this direction.