Loss.backward() - IndexError: select(): index 1 out of range for tensor of size [1, 32, 100] at dimension 0

Nikos_Spanos · December 16, 2021, 10:49am

I have created the following NN using the PyTorch API (for NLP multi-class classification):

class MultiClassClassifer(nn.Module):
  #define all the layers used in model
  def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim):
    
    #Constructor
    super(MultiClassClassifer, self).__init__()

    #embedding layer
    self.embedding = nn.Embedding(vocab_size, embedding_dim)

    #dense layer
    self.hiddenLayer = nn.Linear(embedding_dim, hidden_dim)

    #Batch normalization layer
    self.batchnorm = nn.BatchNorm1d(hidden_dim)

    #output layer
    self.output = nn.Linear(hidden_dim, output_dim)

    #activation layer
    self.act = nn.Softmax(dim=1) #2d-tensor

    #initialize weights of embedding layer
    self.init_weights()

  def init_weights(self):

    initrange = 1.0
    
    self.embedding.weight.data.uniform_(-initrange, initrange)
  
  def forward(self, text, text_lengths):

    embedded = self.embedding(text)
    embedded = torch.mean(embedded, dim=1, keepdim=True)
    print(embedded.shape)

    #packed sequence
    packed_embedded = nn.utils.rnn.pack_padded_sequence(embedded, text_lengths, batch_first=True)

    tensor, batch_size = packed_embedded[0], packed_embedded[1]
    print(tensor.shape)

    hidden_1 = self.batchnorm(self.hiddenLayer(tensor))
    print(hidden_1.shape)

    output = self.act(self.output(hidden_1))
    print(output.shape)

    return output

Instantiating the model

INPUT_DIM = len(TEXT.vocab)
EMBEDDING_DIM = 100
HIDDEN_DIM = 64
OUTPUT_DIM = 3

model = MultiClassClassifer(INPUT_DIM, EMBEDDING_DIM, HIDDEN_DIM, OUTPUT_DIM)

To train the model I have created the following train() method:

def train(model, iterator, optimizer, criterion):
    
    epoch_loss = 0
    
    model.train()
    
    for batch in iterator:
        
        optimizer.zero_grad()

        text, text_lengths = batch.text_normalized_tweet
                
        predictions = model(text, text_lengths)
        
        loss = criterion(predictions, batch.label)
        print(loss)
        print(loss.item())

        loss.backward() #exception occurs here
      
        optimizer.step()
        
        epoch_loss += loss.item()
  
    return epoch_loss / len(iterator)

However, when I start the first epoch of training, I receive the following error during the loss.backward() step:

IndexError: select(): index 1 out of range for tensor of size [1, 32, 100] at dimension 0

Here is the output of the print() calls:

torch.Size([32, 1, 100])
torch.Size([32, 100])
torch.Size([32, 64])
torch.Size([32, 3])
tensor(1.1309, grad_fn=<NllLossBackward0>)
1.1309444904327393

Does the error have to do with the nn.Embedding layer of my network? If so, how can I fix this exception?

You can find my Colab notebook here.

mMagmer · December 16, 2021, 2:15pm

Hi,
I think the problem is with the pack_padded_sequence line in your forward().

I managed to reproduce the error with the following code:

import torch
import torch.nn as nn

total = 256 * 56 * 4 
x = torch.arange(0, total).view(256,56 , 4).float()
x.requires_grad = True
o= nn.utils.rnn.pack_padded_sequence(x, torch.tensor(256*[57]), batch_first=True)
o[0].mean().backward()

If any entry of lengths is greater than 56 (the sequence dimension of x), it raises the error.

Nikos_Spanos · December 16, 2021, 2:25pm

Indeed, since you reproduced the error, rnn.pack_padded_sequence() is where the problem is.

The thing is that I want to train my own embeddings, which is why I used the embedding layer. Afterwards, I want to train a second algorithm using pre-trained GloVe embeddings. To train custom embeddings with the nn.Embedding layer, is it necessary to use rnn.pack_padded_sequence(), or is there an alternative?

mMagmer · December 16, 2021, 2:48pm

¯\_(ツ)_/¯ I know nothing about that.

Nikos_Spanos · December 16, 2021, 2:51pm

It’s OK, I hope someone will find a workaround, because I don’t know how to debug this either. I am coming from TensorFlow and Keras, so I am a total newbie with PyTorch.

Nikos_Spanos · December 17, 2021, 1:35pm

mMagmer, do you know how to solve the error raised by o[0].mean().backward()?
Since you tracked it down, maybe you also know a workaround. I appreciate your help.

mMagmer · December 17, 2021, 1:40pm

I think you’re not supposed to have any length greater than the original sequence length.
Change 57 to 55:

import torch
import torch.nn as nn

total = 256 * 56 * 4 
x = torch.arange(0, total).view(256,56 , 4).float()
x.requires_grad = True
o= nn.utils.rnn.pack_padded_sequence(x, torch.tensor(256*[55]), batch_first=True)
o[0].mean().backward()

sagsriv · December 17, 2021, 1:46pm

The error, at least from

mMagmer:

import torch
import torch.nn as nn

total = 256 * 56 * 4 
x = torch.arange(0, total).view(256,56 , 4).float()
x.requires_grad = True
o= nn.utils.rnn.pack_padded_sequence(x, torch.tensor(256*[57]), batch_first=True)
o[0].mean().backward()

is pretty clear. pack_padded_sequence takes two args: input and lengths. The first is BxTxH (because you are using batch_first=True); the second is [length]*B, i.e., a B-dim vector of per-sequence lengths, each at most T. Here x is 256x56x4, so you are asking it to use a sequence length of 57 when the real sequence length T is 56.
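A minimal sketch of that contract, assuming a hypothetical helper named checked_pack (it is not part of PyTorch): it checks that lengths has one entry per batch element and that no entry exceeds T before calling pack_padded_sequence, so a bad length is reported at pack time instead of surfacing later in backward().

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence


def checked_pack(padded, lengths):
    """Hypothetical helper: validate lengths, then pack a [B, T, H] tensor."""
    lengths = torch.as_tensor(lengths, dtype=torch.long).cpu()
    batch, max_len = padded.size(0), padded.size(1)
    if lengths.numel() != batch:
        raise ValueError(f"expected {batch} lengths, got {lengths.numel()}")
    if int(lengths.max()) > max_len:
        raise ValueError(
            f"length {int(lengths.max())} exceeds sequence dimension {max_len}"
        )
    # enforce_sorted=False lets the lengths come in any order.
    return pack_padded_sequence(
        padded, lengths, batch_first=True, enforce_sorted=False
    )


# Same tensor as in the reproduction: B=256, T=56, H=4.
x = torch.arange(256 * 56 * 4).view(256, 56, 4).float().requires_grad_()

packed = checked_pack(x, 256 * [55])  # every length <= T, so this is accepted
packed.data.mean().backward()         # backward runs without the IndexError
```

Calling checked_pack(x, 256 * [57]) raises a ValueError from the helper itself, which points at the offending argument directly.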

sagsriv · December 17, 2021, 1:56pm

This line is weird:

embedded = torch.mean(embedded, dim=1, keepdim=True)

Do you really want to take the mean over the tokens (dim=1) before putting it into an RNN?

In any case,

packed_embedded = nn.utils.rnn.pack_padded_sequence(embedded, text_lengths, batch_first=True)

is going to be wrong, because batch_first means that from 32x1x100 you have a batch of size 1, so the second arg should be a vector of dim 1, and I expect that text_lengths does not have that dimension.

Nikos_Spanos · December 17, 2021, 3:53pm

mMagmer, how can I specify the value of torch.tensor(256*[55])? In my case I use text_lengths, which is a fixed number within each batch iteration. For example, rnn.pack_padded_sequence(32, 13,…) (batch 1), rnn.pack_padded_sequence(32, 7,…) (batch 2). As you can see, each batch has a different number of tokens per sentence.

mMagmer · December 17, 2021, 4:05pm

I’m not an expert in NLP, but based on the name of the function, pack_padded_sequence, I think you should first pad the sequences so that they all have the same length.
See this link.
The input to pack_padded_sequence is padded before it is packed.
Again, I know nothing about NLP.
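A minimal sketch of the pad-then-pack order, assuming three made-up sequences of lengths 5, 3 and 2 with feature size 4 (these values are illustrative only). pad_sequence pads them to a common length, and pack_padded_sequence then receives the true, pre-padding lengths:

```python
import torch
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

# Illustrative variable-length sequences, each of shape [T_i, H].
seqs = [torch.randn(5, 4), torch.randn(3, 4), torch.randn(2, 4)]
lengths = torch.tensor([s.size(0) for s in seqs])  # true lengths: 5, 3, 2

# Pad to the longest sequence: shape [B, T_max, H] = [3, 5, 4].
padded = pad_sequence(seqs, batch_first=True)
padded.requires_grad_(True)

# Pack using the true lengths; none of them exceeds T_max.
packed = pack_padded_sequence(
    padded, lengths, batch_first=True, enforce_sorted=False
)

print(packed.data.shape)   # 5 + 3 + 2 = 10 real time steps, 4 features each
print(packed.batch_sizes)  # number of sequences still active at each step

packed.data.mean().backward()
```

Because the lengths come from the sequences themselves, they can never be larger than the padded dimension, which is the condition the failing example violated.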

Nikos_Spanos · December 19, 2021, 7:02pm

Hmm, isn’t that strange? I mean, I have a training iterator whose batches have different shapes: the first batch has shape [32, 13], the next [32, 7], the next [32, 11], etc. So, in the solutions I followed that use rnn.pack_padded_sequence(), packed_embedded was a tensor of shape [32x11, 100], [32x13, 100], [32x7, 100], etc. That is why I used the .mean() method to bring the tensor dimensions down to 32x1 (averaged). This is the solution I followed. Do you know any alternative way to do multi-class text classification with my own embeddings? I would appreciate your help.
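One possible alternative, sketched under assumptions rather than taken from the thread: skip pack_padded_sequence entirely and average the token embeddings with a mask built from text_lengths, so padding tokens do not contribute. The class name MeanPoolClassifier and pad_idx=0 are hypothetical, and the sketch assumes the criterion is nn.CrossEntropyLoss, which expects raw logits, so there is no Softmax layer.

```python
import torch
import torch.nn as nn


class MeanPoolClassifier(nn.Module):
    """Hypothetical variant of the model: masked mean over tokens, no packing."""

    def __init__(self, vocab_size, embedding_dim, hidden_dim, output_dim, pad_idx=0):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=pad_idx)
        self.hidden = nn.Linear(embedding_dim, hidden_dim)
        self.batchnorm = nn.BatchNorm1d(hidden_dim)
        self.output = nn.Linear(hidden_dim, output_dim)

    def forward(self, text, text_lengths):
        # text: [B, T] token ids, text_lengths: [B] true lengths
        embedded = self.embedding(text)                              # [B, T, E]
        lengths = text_lengths.to(text.device)
        positions = torch.arange(text.size(1), device=text.device)  # [T]
        # mask[b, t] is 1 for real tokens and 0 for padding positions.
        mask = (positions.unsqueeze(0) < lengths.unsqueeze(1)).unsqueeze(-1)
        mask = mask.to(embedded.dtype)                               # [B, T, 1]
        summed = (embedded * mask).sum(dim=1)                        # [B, E]
        pooled = summed / lengths.clamp(min=1).unsqueeze(1).to(embedded.dtype)
        hidden = self.batchnorm(self.hidden(pooled))                 # [B, hidden]
        return self.output(hidden)                                   # logits [B, C]


torch.manual_seed(0)
model = MeanPoolClassifier(vocab_size=50, embedding_dim=100, hidden_dim=64, output_dim=3)
text = torch.randint(1, 50, (32, 13))             # one batch of shape [32, 13]
text_lengths = torch.tensor([13] * 16 + [7] * 16)
labels = torch.randint(0, 3, (32,))

logits = model(text, text_lengths)
loss = nn.CrossEntropyLoss()(logits, labels)
loss.backward()  # gradients reach the embedding weights
```

The batch shape can change from one iteration to the next ([32, 13], [32, 7], ...) because the mask is rebuilt from each batch's own T and text_lengths. Pre-trained vectors could later be loaded into the same embedding layer.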

Nikos_Spanos · December 21, 2021, 3:16pm

@sagsriv
Do you have any solution for this line, based on my implementation?

packed_embedded = nn.utils.rnn.pack_padded_sequence(embedded, text_lengths, batch_first=True)

I would really appreciate any help.

Nikos_Spanos · December 21, 2021, 4:55pm

@sagsriv This is not true. batch_first=False will throw the error you describe. With batch_first=True the network understands that batch_size=32 and not 1.
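A small check of the shape convention, using a random tensor with the same [32, 1, 100] shape that the model prints (the values themselves are made up). With batch_first=True the first dimension is the batch, so B=32 and T=1, and the only valid length for every element is 1:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

embedded = torch.randn(32, 1, 100)          # [B, T, H] with batch_first=True
lengths = torch.ones(32, dtype=torch.long)  # T == 1, so each length must be 1

packed = pack_padded_sequence(embedded, lengths, batch_first=True)

print(packed.data.shape)   # torch.Size([32, 100])
print(packed.batch_sizes)  # tensor([32])
```

Since the mean over dim=1 has already reduced the sequence dimension to 1, the original text_lengths (for example 13 or 7 per sentence) no longer describe this tensor, which matches the out-of-range index reported in backward().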

ljt2448152687 · November 4, 2023, 2:45pm

I also encountered the same problem. You can try to delete batch_first=True. After I deleted it, there was no error. I hope to hear about your good results.