I’ve posted this question on SO (https://stackoverflow.com/questions/56962116/bert-pytorch-to-onnx-conversion-lambda-error), but I thought this might be a better place for it. I’m trying to convert the PyTorch BERT implementation from https://github.com/codertimo/BERT-pytorch, but I’m hitting an error with the TransformerBlock:
builtins.ValueError: Auto nesting doesn't know how to process an input object of type bert_pytorch.model.transformer.TransformerBlock.forward.<locals>.<lambda>. Accepted types: Tensors, or lists/tuples of them
The code for the TransformerBlock is:
class TransformerBlock(nn.Module):
"""
Bidirectional Encoder = Transformer (self-attention)
Transformer = MultiHead_Attention + Feed_Forward with sublayer connection
"""
def __init__(self, hidden, attn_heads, feed_forward_hidden, dropout):
"""
:param hidden: hidden size of transformer
:param attn_heads: head sizes of multi-head attention
:param feed_forward_hidden: feed_forward_hidden, usually 4*hidden_size
:param dropout: dropout rate
"""
super().__init__()
self.attention = MultiHeadedAttention(h=attn_heads, d_model=hidden)
self.feed_forward = PositionwiseFeedForward(d_model=hidden, d_ff=feed_forward_hidden, dropout=dropout)
self.input_sublayer = SublayerConnection(size=hidden, dropout=dropout)
self.output_sublayer = SublayerConnection(size=hidden, dropout=dropout)
self.dropout = nn.Dropout(p=dropout)
def forward(self, x, mask):
x = self.input_sublayer(x, lambda _x: self.attention.forward(_x, _x, _x, mask=mask)) // <-- Error!
x = self.output_sublayer(x, self.feed_forward)
return self.dropout(x)
Any workarounds?