Must the TokenEmbedding size equal embed_size * vocab_size in a Transformer model?

I have long sequences and a huge vocabulary, which results in out-of-memory (OOM) errors.
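
To illustrate what I mean, here is a minimal sketch of the standard setup as I understand it (the vocab_size and embed_size values are made up for illustration; I'm using a plain PyTorch nn.Embedding as the token embedding):

```python
import torch.nn as nn

# Hypothetical sizes, just for illustration.
vocab_size = 250_000   # size of my (large) vocabulary
embed_size = 1024      # model / embedding dimension

# A standard token embedding table holds vocab_size * embed_size parameters.
token_embedding = nn.Embedding(vocab_size, embed_size)

num_params = vocab_size * embed_size
print(f"embedding parameters: {num_params:,}")                   # 256,000,000
print(f"approx. memory in fp32: {num_params * 4 / 1e9:.2f} GB")  # ~1.02 GB
```

Is this vocab_size * embed_size table size a hard requirement, or is there a way to make the token embedding smaller?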