Issue with nn.TransformerDecoder layer

Hi all,

I am getting an error with the usage of the nn.TransformerDecoder layer.

I initialize the layer as follows:

self.transformer_decoder_layer = nn.TransformerDecoderLayer(2048, 8)
self.transformer_decoder = nn.TransformerDecoder(self.transformer_decoder_layer, num_layers=6)

However, in the forward method, when I run the "self.transformer_decoder" layer as follows:

tgt_mem = self.transformer_decoder(tgt_emb, mem)

where tgt_emb.shape = torch.Size([8, 9, 2048])
and mem.shape = torch.Size([8, 68, 2048])

I get the following error:

File "/gpfs/hpc/home/hasan90/nPAPER2/IMG+TXT/model/MultiModal/Decoder.py", line 59, in forward
tgt_mem = self.transformer_decoder(tgt_emb, mem)
File "/gpfs/hpc/home/hasan90/build/miniconda2/envs/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/gpfs/hpc/home/hasan90/build/miniconda2/envs/python3/lib/python3.6/site-packages/torch/nn/modules/transformer.py", line 216, in forward
memory_key_padding_mask=memory_key_padding_mask)
File "/gpfs/hpc/home/hasan90/build/miniconda2/envs/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/gpfs/hpc/home/hasan90/build/miniconda2/envs/python3/lib/python3.6/site-packages/torch/nn/modules/transformer.py", line 329, in forward
key_padding_mask=memory_key_padding_mask)[0]
File "/gpfs/hpc/home/hasan90/build/miniconda2/envs/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/gpfs/hpc/home/hasan90/build/miniconda2/envs/python3/lib/python3.6/site-packages/torch/nn/modules/activation.py", line 783, in forward
attn_mask=attn_mask)
File "/gpfs/hpc/home/hasan90/build/miniconda2/envs/python3/lib/python3.6/site-packages/torch/nn/functional.py", line 3213, in multi_head_attention_forward
k = k.contiguous().view(-1, bsz * num_heads, head_dim).transpose(0, 1)
RuntimeError: shape '[-1, 72, 256]' is invalid for input of size 1114112

How can I solve this problem?

Thanks,

Ohh, my bad,

I just read the docs of the Transformer class — I was passing the tensors in the wrong layout. They must be in (seq_len, batch, feat_size) format, not (batch, seq_len, feat_size).
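For anyone hitting the same error, here is a minimal standalone sketch of the fix, using the shapes from my post (the variable names are just for illustration): transpose the batch and sequence dimensions before calling the decoder.

```python
import torch
import torch.nn as nn

# Same configuration as in the original post
decoder_layer = nn.TransformerDecoderLayer(d_model=2048, nhead=8)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)

# Tensors as I originally had them: (batch, seq_len, feat_size)
tgt_emb = torch.randn(8, 9, 2048)   # batch=8, target length=9
mem = torch.randn(8, 68, 2048)      # batch=8, memory length=68

# nn.TransformerDecoder expects (seq_len, batch, feat_size),
# so swap the first two dimensions before the call
tgt_mem = decoder(tgt_emb.transpose(0, 1), mem.transpose(0, 1))
print(tgt_mem.shape)  # torch.Size([9, 8, 2048])
```

The error message makes sense in hindsight: with (batch, seq, feat) input, the decoder read batch=9 and, with nhead=8, tried to reshape memory into (-1, 9*8=72, 2048/8=256), which doesn't divide the memory tensor's 8*68*2048 elements evenly. Note that newer PyTorch versions also accept batch-first tensors directly if you construct the layers with `batch_first=True`.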
