Slow rCNN training, and torch.jit.script() fails on my model

Hello,
In the following gist you can find example code for my rCNN, which I use to analyze and train on chess games:

The data is structured as follows:
input: [sequence length, 12, 8, 8]
target: [sequence length, number of possible outputs]
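To make the shapes concrete, here is a minimal sketch with dummy tensors. The sizes `seq_len` and `num_moves` are made-up values for illustration only, not the ones from my dataset:

```python
import torch

# Hypothetical sizes, chosen only for illustration.
seq_len = 40       # number of positions in one game
num_moves = 1800   # number of possible outputs (assumed value)

# One game: a sequence of board positions, each with 12 planes of 8x8.
inputs = torch.zeros(seq_len, 12, 8, 8)
# One target row per position in the sequence.
targets = torch.zeros(seq_len, num_moves)

print(inputs.shape)   # torch.Size([40, 12, 8, 8])
print(targets.shape)  # torch.Size([40, 1800])
```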

The target tensor is one-hot encoded over all possible chess moves in my dataset.
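As a rough sketch of how such a one-hot target can be built: the four-move vocabulary and the UCI-style move strings below are assumptions for illustration, and the real vocabulary is much larger.

```python
import torch
import torch.nn.functional as F

# Hypothetical move vocabulary (UCI-style strings are an assumption).
move_vocab = ["e2e4", "d2d4", "g1f3", "c2c4"]
move_to_index = {move: i for i, move in enumerate(move_vocab)}

# Moves played in a short example sequence.
played = ["e2e4", "g1f3", "d2d4"]
indices = torch.tensor([move_to_index[m] for m in played])

# Shape: [sequence length, number of possible outputs]
targets = F.one_hot(indices, num_classes=len(move_vocab)).float()
print(targets.shape)  # torch.Size([3, 4])
```

Each row contains a single 1.0 at the index of the move that was played.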

The issue I am facing is that, despite several attempts to optimize the model (channels_last memory format, disabling debug APIs, etc.), training times are still very long. In addition, torch.jit.script() raises an error whenever I try to script my model like this:
model = torch.jit.script(model)
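For reference, a minimal sketch of the call pattern on a stripped-down stand-in. `TinyRCNN`, its layer sizes, and the choice of a GRU are all hypothetical; this is not the model from the gist, only an illustration of combining channels_last with scripting:

```python
import torch
import torch.nn as nn


class TinyRCNN(nn.Module):
    """Hypothetical stand-in: a conv layer followed by a GRU."""

    def __init__(self, num_moves: int = 16, hidden: int = 32) -> None:
        super().__init__()
        self.conv = nn.Conv2d(12, 8, kernel_size=3, padding=1)
        self.rnn = nn.GRU(8 * 8 * 8, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_moves)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [sequence length, 12, 8, 8]
        feats = torch.relu(self.conv(x))       # [seq, 8, 8, 8]
        feats = feats.flatten(1).unsqueeze(0)  # [1, seq, 512]
        out, _ = self.rnn(feats)               # [1, seq, hidden]
        return self.head(out.squeeze(0))       # [seq, num_moves]


# Only 4D parameters (the conv weights) are converted to channels_last.
model = TinyRCNN().to(memory_format=torch.channels_last)
scripted = torch.jit.script(model)

x = torch.randn(5, 12, 8, 8).contiguous(memory_format=torch.channels_last)
print(scripted(x).shape)  # torch.Size([5, 16])
```

The type annotations on `forward` and `__init__` matter here, since TorchScript needs statically typed code to compile the module.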
Any help would be greatly appreciated!