I’m trying to convert a BERT-LSTM model to an XLM-R-LSTM model. The complete BERT-LSTM code worked fine, without any bugs. The forward function of the BERT-LSTM is as follows.
def forward(self, sents):
    sents_tensor, masks_tensor, sents_lengths = sents_to_tensor(self.tokenizer, sents, self.device)
    encoded_layers, pooled_output = self.bert(input_ids=sents_tensor, attention_mask=masks_tensor,
                                              output_all_encoded_layers=False)
    encoded_layers = encoded_layers.permute(1, 0, 2)
    enc_hiddens, (last_hidden, last_cell) = self.lstm(pack_padded_sequence(encoded_layers, sents_lengths))
    output_hidden = torch.cat((last_hidden[0], last_hidden[1]), dim=1)
    output_hidden = self.dropout(output_hidden)
    pre_softmax = self.hidden_to_softmax(output_hidden)
    return pre_softmax
When I tried to use the same forward function to train the XLM-R-LSTM model, I got the following error:

TypeError: forward() got an unexpected keyword argument 'output_all_encoded_layers'

So I removed output_all_encoded_layers=False from the self.bert(...) call.
This is the new forward function.
def forward(self, sents):
    sents_tensor, masks_tensor, sents_lengths = sents_to_tensor(self.tokenizer, sents, self.device)
    encoded_layers = self.bert(input_ids=sents_tensor, attention_mask=masks_tensor)
    encoded_layers = encoded_layers.permute(1, 0, 2)
    enc_hiddens, (last_hidden, last_cell) = self.lstm(pack_padded_sequence(encoded_layers, sents_lengths))
    output_hidden = torch.cat((last_hidden[0], last_hidden[1]), dim=1)
    output_hidden = self.dropout(output_hidden)
    pre_softmax = self.hidden_to_softmax(output_hidden)
    return pre_softmax
Now I get the following error:
AttributeError: 'tuple' object has no attribute 'permute'
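I suspect the XLM-R model now returns a tuple whose first element holds the hidden states, rather than the two separate values the old BERT call returned, so permute is being called on the tuple itself. Here is a minimal stand-in that reproduces what I think is happening (FakeHiddenState and FakeEncoder are hypothetical mocks, not the real XLM-R API; they just mimic the tuple-return behaviour):

```python
# Hypothetical stand-in for the hidden-state tensor: only permute() is mocked.
class FakeHiddenState:
    def permute(self, *dims):
        return f"permuted{dims}"

# Hypothetical stand-in for the encoder: returns a tuple, with the
# hidden states as its first element (my guess at what self.bert does now).
class FakeEncoder:
    def __call__(self, input_ids=None, attention_mask=None):
        return (FakeHiddenState(), "pooled_output")

encoder = FakeEncoder()
out = encoder(input_ids=None, attention_mask=None)

try:
    out.permute(1, 0, 2)            # what my new forward() does
except AttributeError as e:
    print(e)                        # → 'tuple' object has no attribute 'permute'

hidden = out[0]                     # taking the first element instead
print(hidden.permute(1, 0, 2))      # → permuted(1, 0, 2)
```

If this guess is right, indexing the returned value with [0] before calling permute would be the kind of change needed, but I’m not sure that is the correct way to get the hidden states from XLM-R.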
How can I solve this?