GPT2 TracerWarnings

Hi,

I have a GPT2 model fine-tuned for a question generation task. I am now trying to convert it to TorchScript, but tracing produces the warnings below, and the traced model's results are not as good as the original model's. It would be great if someone could help me figure out the issue. The following package versions are installed:
torch: 1.7
transformers: 4.1.1 (also tried with 3.0.1 and 2.0)

/usr/local/lib/python3.6/dist-packages/transformers/models/gpt2/modeling_gpt2.py:168: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  w = w / (float(v.size(-1)) ** 0.5)
/usr/local/lib/python3.6/dist-packages/transformers/models/gpt2/modeling_gpt2.py:173: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  mask = self.bias[:, :, ns - nd : ns, :ns]
/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py:966: TracerWarning: Output nr 1. of the traced function does not match the corresponding output of the Python function. Detailed error:
With rtol=1e-05 and atol=1e-05, found 254695 element(s) (out of 27897075) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 5.1975250244140625e-05 (2.250950336456299 vs. 2.251002311706543), which occurred at index (0, 69, 37541).
  _module_class,
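
To put the numbers in the last warning in perspective, here is a small sketch of the arithmetic. It assumes the tracer's check uses the usual allclose-style criterion |a - b| <= atol + rtol * |b|; that formula is my assumption and is not stated in the output above. The variable names are only for illustration, and all values are copied from the warning.

```python
# Values copied from the last TracerWarning above.
a = 2.250950336456299      # one of the two compared outputs
b = 2.251002311706543      # the other compared output
rtol, atol = 1e-05, 1e-05  # tolerances reported in the warning
mismatched, total = 254695, 27897075

# Assumed allclose-style criterion: |a - b| <= atol + rtol * |b|
abs_diff = abs(a - b)
allowed = atol + rtol * abs(b)

# Relative size of the largest difference, and the share of elements flagged.
rel_diff = abs_diff / abs(b)
fraction = mismatched / total

print(f"largest absolute difference: {abs_diff:.3e}")
print(f"allowed under assumed check: {allowed:.3e}")
print(f"largest relative difference: {rel_diff:.3e}")
print(f"share of elements flagged:   {fraction:.2%}")
```

Under that assumption, the largest difference is above the allowed margin but still on the order of 1e-05 relative to the value, and under 1% of the elements are flagged. This only describes the mismatch reported for the example input used during tracing; it does not say anything about the second warning (the `ns - nd` slice), which concerns inputs of a different length.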

Thanks