I’m trying to convert the model from the run_classifier.py example in the pytorch-pretrained-BERT repo to ONNX format, but I run into a tensor size mismatch. The details are, hopefully, covered in this case on Stack Overflow.
I notice that libtorch with the PyTorch JIT is another option… perhaps that is the path forward? I would like to get this into a C++ server-side component. I’ve tried asking in the jit category, but they appear to be focused solely on existing JIT issues.
Did you manage to get your model converted? I’ve been trying to convert the bert-base-uncased model to ONNX, but I’m hitting what appears to be a memory-related error:
builtins.RuntimeError: [enforce fail at CPUAllocator.cpp:56] posix_memalign(&data, gAlignment, nbytes) == 0. 12 vs 0. Looking into it with
free -m, I noticed that it happens with as little as 6GB of RAM in use (running on CPU), which seems very odd, since the machine has 24GB installed. Any help appreciated.
For reference, my conversion is super straightforward:
import os

import torch
from pytorch_pretrained_bert import BertModel

output_dir = '/home/james/src/pytorch-pretrained-BERT/outputs'
max_seq_length = 128
num_labels = 128
# Load pre-trained model (weights)
model = BertModel.from_pretrained('bert-base-uncased')
model_state_dict = os.path.join(output_dir, 'pytorch_bert.bin')
print('Model loaded: ', model_state_dict)
# Save ONNX
msl = max_seq_length
dummy_input = torch.randn(1, msl, msl, msl, num_labels).long()
output_onnx_file = os.path.join(output_dir, "bert_test.onnx")
torch.onnx.export(model, dummy_input, output_onnx_file)
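One thing that may explain the allocation failure (this is a guess, not a confirmed diagnosis): BertModel.forward takes input_ids of shape (batch_size, seq_len), but the dummy input above is a 5-D tensor of 1 × 128⁴ int64 values, which is 2 GiB for the indices alone, before the embedding lookup multiplies that by the hidden size. A quick sketch of the arithmetic, plus the 2-D dummy input I would expect the export to need:

```python
import torch

# Size of the 5-D dummy input above: 1 * 128 * 128 * 128 * 128 int64 values.
elements = 1 * 128 ** 4
print(elements * 8 / 2 ** 30)  # 2.0 (GiB), for the raw indices alone

# BertModel.forward expects token ids of shape (batch_size, seq_len),
# so a dummy input for a 128-token sequence would look like this:
max_seq_length = 128
dummy_input = torch.ones(1, max_seq_length, dtype=torch.long)
print(tuple(dummy_input.shape))  # (1, 128)
```

If that diagnosis is right, exporting with torch.onnx.export(model, dummy_input, output_onnx_file) using the 2-D dummy_input should stay well within memory.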