TorchScript function slower than non-JIT?

I have a simple function (an LSTM layer) that I’m converting to TorchScript and executing. In some initial experiments, the JIT version runs slower than the non-JIT version on both CPU and GPU. The relevant code is here: https://github.com/lmnt-com/haste/blob/master/frameworks/pytorch/lstm.py#L31-L64. Is this expected behavior?
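
For reference, here is a minimal sketch of how this kind of eager vs. TorchScript comparison can be timed. The `lstm_cell` function, the tensor sizes, and the `time_fn` helper are all hypothetical stand-ins, not the code from the linked file. The sketch runs warm-up iterations before timing, since the first few calls to a scripted function include profiling and compilation overhead, and it synchronizes the GPU so that asynchronous kernel launches are not mistaken for finished work.

```python
import time
import torch

def lstm_cell(x, h, c, w_ih, w_hh, b):
    # Hypothetical single LSTM step; a stand-in, NOT the haste implementation.
    gates = x @ w_ih.t() + h @ w_hh.t() + b
    i, f, g, o = gates.chunk(4, 1)
    c_next = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
    h_next = torch.sigmoid(o) * torch.tanh(c_next)
    return h_next, c_next

# Compile the same function with TorchScript (untyped args default to Tensor).
scripted_cell = torch.jit.script(lstm_cell)

def make_inputs(batch=32, input_size=128, hidden_size=256, device="cpu"):
    # Sizes are arbitrary assumptions chosen only for illustration.
    torch.manual_seed(0)
    x = torch.randn(batch, input_size, device=device)
    h = torch.randn(batch, hidden_size, device=device)
    c = torch.randn(batch, hidden_size, device=device)
    w_ih = torch.randn(4 * hidden_size, input_size, device=device)
    w_hh = torch.randn(4 * hidden_size, hidden_size, device=device)
    b = torch.randn(4 * hidden_size, device=device)
    return x, h, c, w_ih, w_hh, b

def time_fn(fn, args, warmup=10, iters=100):
    # Warm-up calls let the JIT finish profiling/optimizing before we measure.
    for _ in range(warmup):
        fn(*args)
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for pending GPU work before starting the clock
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    if torch.cuda.is_available():
        torch.cuda.synchronize()  # wait for timed GPU work to actually finish
    return (time.perf_counter() - start) / iters  # mean seconds per call

if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    args = make_inputs(device=device)
    with torch.no_grad():
        print("eager   :", time_fn(lstm_cell, args))
        print("scripted:", time_fn(scripted_cell, args))
```

Timings from a sketch like this depend heavily on tensor sizes, device, and PyTorch version, so it only shows the measurement method and says nothing about which version should be faster.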