I’m trying to convert a BERT-LSTM model to an XLM-R-LSTM model. The complete BERT-LSTM code worked fine, without any bugs. The forward function of the BERT-LSTM is as follows.
def forward(self, sents):
    sents_tensor, masks_tensor, sents_lengths = sents_to_tensor(self.tokenizer, sents, self.device)
    encoded_layers, pooled_output = self.bert(input_ids=sents_tensor, attention_mask=masks_tensor,
                                              output_all_encoded_layers=False)
    encoded_layers = encoded_layers.permute(1, 0, 2)
    enc_hiddens, (last_hidden, last_cell) = self.lstm(pack_padded_sequence(encoded_layers, sents_lengths))
    output_hidden = torch.cat((last_hidden[0], last_hidden[1]), dim=1)
    output_hidden = self.dropout(output_hidden)
    pre_softmax = self.hidden_to_softmax(output_hidden)
    return pre_softmax
When I tried to use the same forward function to train the XLM-R-LSTM model, I got the following error:

TypeError: forward() got an unexpected keyword argument 'output_all_encoded_layers'

So I removed output_all_encoded_layers=False from the self.bert(...) call.
This is the new forward function.
def forward(self, sents):
    sents_tensor, masks_tensor, sents_lengths = sents_to_tensor(self.tokenizer, sents, self.device)
    encoded_layers = self.bert(input_ids=sents_tensor, attention_mask=masks_tensor)
    encoded_layers = encoded_layers.permute(1, 0, 2)
    enc_hiddens, (last_hidden, last_cell) = self.lstm(pack_padded_sequence(encoded_layers, sents_lengths))
    output_hidden = torch.cat((last_hidden[0], last_hidden[1]), dim=1)
    output_hidden = self.dropout(output_hidden)
    pre_softmax = self.hidden_to_softmax(output_hidden)
    return pre_softmax
Now I get the following error:
AttributeError: 'tuple' object has no attribute 'permute'
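I suspect the XLM-R model now returns a tuple whose first element holds the hidden states, rather than the two separate values the old BERT call returned, so permute is being called on the tuple itself. Here is a minimal stand-in that reproduces what I think is happening (FakeHiddenState and FakeEncoder are hypothetical mocks, not the real XLM-R API; they just mimic the tuple-return behaviour):

```python
# Hypothetical stand-in for the hidden-state tensor: only permute() is mocked.
class FakeHiddenState:
    def permute(self, *dims):
        return f"permuted{dims}"

# Hypothetical stand-in for the encoder: returns a tuple, with the
# hidden states as its first element (my guess at what self.bert does now).
class FakeEncoder:
    def __call__(self, input_ids=None, attention_mask=None):
        return (FakeHiddenState(), "pooled_output")

encoder = FakeEncoder()
out = encoder(input_ids=None, attention_mask=None)

try:
    out.permute(1, 0, 2)            # what my new forward() does
except AttributeError as e:
    print(e)                        # → 'tuple' object has no attribute 'permute'

hidden = out[0]                     # taking the first element instead
print(hidden.permute(1, 0, 2))      # → permuted(1, 0, 2)
```

If this guess is right, indexing the returned value with [0] before calling permute would be the kind of change needed, but I’m not sure that is the correct way to get the hidden states from XLM-R.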
How can I solve this?