Hi,
I am getting the following error:
Traceback (most recent call last):
  File "train.py", line 549, in <module>
    task.execute()
  File "train.py", line 271, in execute
    train_loss, train_acc = self.train(phase)
  File "train.py", line 147, in train
    answer = self.model(batch)
  File "/home/eprox/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/eprox/uom/clnli/clnli-code/rnn-impl/models/bilstm.py", line 69, in forward
    premise_embed = self.embedding(batch.premise)
  File "/home/eprox/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/eprox/.local/lib/python3.6/site-packages/torch/nn/modules/sparse.py", line 114, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "/home/eprox/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1484, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: index out of range: Tried to access index 12209 out of table with 4979 rows. at /pytorch/aten/src/TH/generic/THTensorEvenMoreMath.cpp:418
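As far as I can tell, this error means that one of the input token indices (12209) is larger than the last valid row of the embedding table. A minimal snippet along these lines reproduces the same failure (sizes chosen to match my embedding; on newer PyTorch versions the message reads "index out of range in self" instead):

    import torch
    import torch.nn as nn

    emb = nn.Embedding(4980, 1024)   # same shape as my pretrained table
    bad = torch.tensor([12209])      # index beyond the last valid row (4979)
    emb(bad)                         # raises the index-out-of-range error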
Module code:
class BiLSTM(nn.Module):
    def __init__(self, sent_embed, d_hidden, dp_ratio, device):
        super(BiLSTM, self).__init__()
        self.hidden_size = d_hidden
        self.directions = 2
        self.num_layers = 2
        self.concat = 4
        self.device = device
        self.out_dim = 3
        self.embedding = nn.Embedding.from_pretrained(sent_embed)
        print('SentEmbed:'); print(sent_embed.shape)
        # Output: torch.Size([4980, 1024])
        self.embed_dim = sent_embed.size(1)
        print(self.embedding)  # Output: Embedding(4980, 1024)
        print('EMBED DIM IS: {}'.format(self.embed_dim))  # EMBED DIM IS: 1024
        self.projection = nn.Linear(self.embed_dim, self.hidden_size)
        self.lstm = nn.LSTM(self.hidden_size, self.hidden_size, self.num_layers,
                            bidirectional=True, batch_first=True, dropout=dp_ratio)
        self.relu = nn.LeakyReLU()
        self.dropout = nn.Dropout(p=dp_ratio)
        self.lin1 = nn.Linear(self.hidden_size * self.directions * self.concat, self.hidden_size)
        self.lin2 = nn.Linear(self.hidden_size, self.hidden_size)
        self.lin3 = nn.Linear(self.hidden_size, self.out_dim)
        for lin in [self.lin1, self.lin2, self.lin3]:
            nn.init.xavier_uniform_(lin.weight)
            nn.init.zeros_(lin.bias)
        self.out = nn.Sequential(
            self.lin1,
            self.relu,
            self.dropout,
            self.lin2,
            self.relu,
            self.dropout,
            self.lin3
        )
    def forward(self, batch):
        print('*premiseshape:'); print(batch.premise.shape)
        # Output: torch.Size([128, 35])
        print('*embeddingshape:'); print(self.embedding)
        # Output: Embedding(4980, 1024)
        print('*projectionshape:'); print(self.projection)
        # Output: Linear(in_features=1024, out_features=100, bias=True)
        premise_embed = self.embedding(batch.premise)  # line 69: where the traceback points
        hypothesis_embed = self.embedding(batch.hypothesis)
        premise_proj = self.relu(self.projection(premise_embed))
        hypothesis_proj = self.relu(self.projection(hypothesis_embed))
        h0 = c0 = torch.zeros(self.num_layers * self.directions, batch.batch_size,
                              self.hidden_size, device=self.device)
        _, (premise_ht, _) = self.lstm(premise_proj, (h0, c0))
        _, (hypothesis_ht, _) = self.lstm(hypothesis_proj, (h0, c0))
        # keep the last layer's forward and backward hidden states
        premise = premise_ht[-2:].transpose(0, 1).contiguous().view(batch.batch_size, -1)
        hypothesis = hypothesis_ht[-2:].transpose(0, 1).contiguous().view(batch.batch_size, -1)
        combined = torch.cat((premise, hypothesis, torch.abs(premise - hypothesis), premise * hypothesis), 1)
        return self.out(combined)
Based on the outputs in the comments, the pretrained matrix and the embedding layer both have the expected size (4980 x 1024), so I cannot understand why the lookup is trying to access index 12209, which is far beyond the table.
Could someone please advise me on this?
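In case it is useful, this is the kind of check I can add just before the embedding lookup to confirm whether the batch really contains out-of-range indices (batch.premise and batch.hypothesis are the fields on my batch object; num_embeddings is the standard attribute of nn.Embedding):

    # inside forward(), before self.embedding(batch.premise)
    vocab_rows = self.embedding.num_embeddings  # 4980 in my case
    max_prem = int(batch.premise.max())
    max_hyp = int(batch.hypothesis.max())
    print('max premise index: {}, max hypothesis index: {}, table rows: {}'.format(
        max_prem, max_hyp, vocab_rows))
    assert max_prem < vocab_rows and max_hyp < vocab_rows, \
        'batch contains token indices outside the embedding table'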