Model I am using (UniLM, MiniLM, LayoutLM …): LayoutLMv2
The problem arises when using:
- the official example scripts: following the NielsRogge demo for the implementation.
The model works fine when deployed with the standard torch save and load functions. However, to optimise inference time I am trying to compile the model with the AWS Torch Neuron SDK so it can be deployed on Inferentia, and this fails because the SDK compiles the model via torch.jit.trace().
Calling torch.jit.trace() directly in a plain PyTorch environment (without the Neuron SDK) raises the same issue.
Steps to reproduce the behavior:
Follow the NielsRogge demo (Google Colab) up to the point where the encoded inputs for inference are generated, then call torch.jit.trace() on the model.
Expected behavior: a traced model should be created so it can be stored and deployed to an AWS Inferentia machine for compilation.
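For comparison, torch.jit.trace() works as expected on an ordinary nn.Module. The sketch below uses a toy model (not LayoutLMv2; the model and shapes here are illustrative assumptions) to show the tracing pattern that the Neuron SDK relies on, and the kind of constraint that trips up LayoutLMv2, whose forward takes extra inputs (bbox, image) and returns a dict by default:

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    # Toy stand-in for illustration only. LayoutLMv2 additionally takes
    # bbox and image tensors and returns a ModelOutput dict, both of
    # which complicate tracing.
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 2)

    def forward(self, input_ids):
        return self.linear(input_ids)

model = TinyModel().eval()
example = torch.randn(1, 8)

# torch.jit.trace records the ops executed for this example input.
# Models with data-dependent control flow or dict outputs can fail
# to trace or trace incorrectly; transformers models generally need
# return_dict=False (tuple outputs) before tracing.
traced = torch.jit.trace(model, example)

# The traced module should reproduce the eager output exactly.
assert torch.allclose(traced(example), model(example))
```

If the trace succeeds, the traced module can be saved with traced.save(...) and handed to the Neuron compiler.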
- Python version: Python 3.7.13
- PyTorch version (GPU?): 1.12.0+cu113, GPU: Yes
If any other information is required, please drop a note.