Specifically, when writing TC-like loops in JIT-ed functions.
My issues are:
- I haven’t been able to get good performance out of `jit.script`. My use case might be a little too dynamic?
- When JIT-ing, I have no control over any kind of optimization. I can’t nudge the jitter to fuse a particular sequence of operations, for example, so I can’t use the jitter to eliminate OOM errors.
- Trying to do it with loops is generally slower than using existing maps and reductions with (memory-hungry) intermediates.
- `tensor_comprehensions` is not available on Windows, as far as I can see.
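
To make the loop-versus-intermediates trade-off concrete, here is a minimal sketch. It assumes a made-up Gaussian-kernel row sum as the workload; the function names, tensor shapes, and chunk size are all illustrative and not taken from the actual use case. The first version uses broadcasting and a reduction with a large intermediate, the second loops over row blocks to keep the intermediate small.

```python
import torch


def kernel_sum_full(x, y):
    # Broadcasting materializes an (N, M, D) intermediate:
    # a single map + reduction, but memory-hungry.
    diff = x[:, None, :] - y[None, :, :]
    return torch.exp(-(diff ** 2).sum(dim=-1)).sum(dim=1)


def kernel_sum_chunked(x, y, chunk_size=128):
    # Hypothetical loop-based variant: processes x in row blocks so the
    # largest intermediate is (chunk_size, M, D) instead of (N, M, D).
    out = torch.empty(x.shape[0], dtype=x.dtype, device=x.device)
    for start in range(0, x.shape[0], chunk_size):
        block = x[start:start + chunk_size]
        diff = block[:, None, :] - y[None, :, :]
        out[start:start + chunk_size] = torch.exp(-(diff ** 2).sum(dim=-1)).sum(dim=1)
    return out


torch.manual_seed(0)
x = 0.1 * torch.randn(1000, 16)  # illustrative sizes only
y = 0.1 * torch.randn(500, 16)

full = kernel_sum_full(x, y)
chunked = kernel_sum_chunked(x, y, chunk_size=128)
print(torch.allclose(full, chunked, rtol=1e-4))
```

The chunked version caps the largest intermediate at roughly `chunk_size * M * D` elements, at the cost of a Python-level loop; a fused kernel would avoid both, which is what the points above are asking for.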