I receive the error below when doing a forward pass through a Transformer encoder. According to this post — Transformer encoder error when encoding long sequence (more than 1024 tokens) · Issue #83142 · pytorch/pytorch · GitHub — it happens because my sequence length is greater than 1024. Indeed, my sequence is of length 1300.
The same GitHub thread links to a fix: Fix issue in softmax.cu with transformer error when mask seqlen > 1024 by erichan1 · Pull Request #83639 · pytorch/pytorch · GitHub.
But I don’t see a solution I can apply there. Which PyTorch version includes the fix, and how do I resolve this on my side? I am currently on PyTorch 1.12.0 with CUDA 11.6.
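For anyone checking whether their install is affected: a minimal sketch that reproduces the failing conditions (eval mode, no grad, a `src_key_padding_mask` longer than 1024 tokens). The model dimensions here are placeholders, not the ones from my code; on a build that includes the fix this runs cleanly, while an affected build raises the `RuntimeError` above.

```python
import torch
import torch.nn as nn

# Small placeholder encoder; only the sequence length matters for the bug.
layer = nn.TransformerEncoderLayer(d_model=16, nhead=2, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=1)
encoder.eval()  # the fused fast path is only taken in eval/inference mode

x = torch.randn(2, 1300, 16)                    # sequence length 1300 > 1024
mask = torch.zeros(2, 1300, dtype=torch.bool)   # padding mask, nothing masked

with torch.no_grad():
    out = encoder(x, src_key_padding_mask=mask)
print(out.shape)
```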
```
Traceback (most recent call last):
  File "/home_nfs/haziq/Keystate-Forecasting/train.py", line 234, in <module>
    va_output = loop(net=net,inp_data=va_data,optimizer=None,counter=va_counter,epoch=i,args=args,mode="val")
  File "/home_nfs/haziq/Keystate-Forecasting/train.py", line 155, in loop
    out_data = net(inp_data, mode=mode)
  File "/home_nfs/haziq/cenvs/ptg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home_nfs/haziq/Keystate-Forecasting/models/action2pose/stf.py", line 215, in forward
    transformer_encoder_out = self.transformer_encoder(key, src_key_padding_mask=mask) # [batch, 2 + pose_padded_length * obj_wrist_padded_length, transformer_encoder_units]
  File "/home_nfs/haziq/cenvs/ptg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home_nfs/haziq/cenvs/ptg/lib/python3.8/site-packages/torch/nn/modules/transformer.py", line 238, in forward
    output = mod(output, src_mask=mask, src_key_padding_mask=src_key_padding_mask)
  File "/home_nfs/haziq/cenvs/ptg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home_nfs/haziq/cenvs/ptg/lib/python3.8/site-packages/torch/nn/modules/transformer.py", line 437, in forward
    return torch._transformer_encoder_layer_fwd(
RuntimeError: Mask shape should match input shape; transformer_mask is not supported in the fallback case.
```
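In case it helps others: the traceback shows the error comes from `torch._transformer_encoder_layer_fwd`, the fused fast path, which (as far as I understand) is only selected in eval mode without gradients. So until one can upgrade to a release containing the PR, a possible interim workaround is to keep the encoder in `train()` mode during inference, which makes the forward pass fall through to the plain Python implementation. This is a sketch under that assumption; note `dropout=0.0` so that `train()` mode does not change the numerics.

```python
import torch
import torch.nn as nn

# dropout=0.0 so train() mode is numerically identical to eval() here
layer = nn.TransformerEncoderLayer(d_model=16, nhead=2, dropout=0.0, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=1)

# Staying in train() mode should skip torch._transformer_encoder_layer_fwd,
# so the >1024-token mask never reaches the fused kernel.
encoder.train()

x = torch.randn(2, 1300, 16)
mask = torch.zeros(2, 1300, dtype=torch.bool)
with torch.no_grad():
    out = encoder(x, src_key_padding_mask=mask)
print(out.shape)
```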