Transformer Embedding - IndexError: index out of range in self

I’m training a transformer model from scratch. I was able to run a “mini” train and dev set on the CPU without any errors, but when I run the training loop on the larger dataset, I encounter this error (output also includes my debug print statements):

-- Inside Encoder Init --
d_vocab:  5493
d_embedding:  128
-- Inside Decoder Init --
d_vocab:  3194
d_embedding:  128
-- Inside Encoder Forward --
embedding in:  torch.Size([64, 60])
embedding out:  torch.Size([64, 60, 128])
-- Inside Decoder Forward --
embedding in:  torch.Size([64, 60])
Traceback (most recent call last):
  File "~/transformer_training.py", line 176, in <module>
    out = model(src,trg, mask)
  File "~/myenv2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1186, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/transformer.py", line 237, in forward
    x = self.decoder(x, trg, mask)
  File "~/myenv2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1186, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/transformer.py", line 205, in forward
    x = self.embedding(x)
  File "~/myenv2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1186, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/transformer.py", line 20, in forward
    x = self.embedding(x)
  File "~/myenv2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1186, in _call_impl
    return forward_call(*input, **kwargs)
  File "~/myenv2/lib/python3.9/site-packages/torch/nn/modules/sparse.py", line 159, in forward
    return F.embedding(
  File "~/myenv2/lib/python3.9/site-packages/torch/nn/functional.py", line 2197, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
IndexError: index out of range in self

NOTE: This does not occur when I use a smaller train and dev set for debug purposes (terminal output):

-- Inside Encoder Init --
d_vocab:  1435
d_embedding:  128
-- Inside Decoder Init --
d_vocab:  1446
d_embedding:  128
-- Inside Encoder Forward --
embedding in:  torch.Size([64, 60])
embedding out:  torch.Size([64, 60, 128])
-- Inside Decoder Forward --
embedding in:  torch.Size([64, 60])
embedding out:  torch.Size([64, 60, 128])

Hello again,

In your error trace, the failure happens in the decoder stage:

  File "~/transformer.py", line 20, in forward
    x = self.embedding(x)

Can you add print(torch.max(x)) just before the line x = self.embedding(x)?

I suspect the error occurs because x contains an id that is >= 3194. Your decoder embedding was initialized with d_vocab: 3194, so valid indices are 0 through 3193; if any value is greater than or equal to 3194, PyTorch will raise the IndexError shown in the stack trace.

It means that somewhere in your token-to-id conversion, you are assigning ids greater than or equal to 3194.
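For illustration, here is a minimal sketch of the failure mode using the decoder's vocabulary size from your output (3194); the tensors are made-up examples, not your data:

```python
import torch
import torch.nn as nn

# Decoder embedding as in the traceback: 3194 rows, so valid ids are 0..3193
embedding = nn.Embedding(3194, 128)

ok = torch.tensor([[0, 100, 3193]])
print(embedding(ok).shape)  # works: torch.Size([1, 3, 128])

bad = torch.tensor([[0, 100, 3194]])  # 3194 is one past the last valid index
try:
    embedding(bad)
except IndexError as e:
    print(e)  # "index out of range in self"
```

This is exactly why the small dataset works: its vocabularies (1435/1446) were built from the same data being fed in, so no id ever exceeds the embedding size.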

One way to debug this is to check the max token id of each batch before sending it to the model. Once a batch's max is greater than or equal to 3194, you can find which sample it comes from via

# x -> (batch_size, seq_len)
torch.max(x, dim=1)  # per-sample max over the sequence dimension

and looking for the index where a value >= 3194 appears. With the sample index you can trace back to the ids, then to the tokens, and then to the original sentence as well.
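The check above can be sketched as a small helper; this is a hedged example with a made-up batch (your own dataloader and vocabulary size would go here), reducing over the sequence dimension (dim=1) so each sample collapses to a single max value:

```python
import torch

def find_bad_samples(trg, vocab_size):
    """Return batch indices of samples containing ids outside [0, vocab_size)."""
    # trg -> (batch_size, seq_len); max over the sequence dimension
    per_sample_max, _ = torch.max(trg, dim=1)
    return (per_sample_max >= vocab_size).nonzero(as_tuple=True)[0]

# Made-up batch: sample 1 contains an out-of-range id for vocab size 3194
trg = torch.tensor([[1, 2, 3],
                    [10, 3194, 5],
                    [7, 8, 9]])
print(find_bad_samples(trg, 3194))  # tensor([1])
```

Running this check inside your training loop before out = model(src, trg, mask) will tell you which samples to trace back to their original sentences.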

Hope it helps.