Specifically, when writing TC-like loops in JIT-ed functions.
My issues are:
- I haven’t been able to get good performance out of jit.script. My use case might be a little too dynamic?
- When JIT-ing, I have no control over any kind of optimization. For example, I can’t nudge the JIT to fuse a particular sequence of operations, so I can’t use it to eliminate OOM errors.
- Writing the computation with explicit loops is generally slower than using the existing maps and reductions with their (memory-hungry) intermediates.
- tensor_comprehensions is not available on Windows, as far as I can see.
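
To make the loop-versus-intermediates trade-off concrete, here is a minimal sketch rather than my real code. The function names, the shapes, and the nearest-neighbour reduction are all made up for illustration. The first version relies on broadcasting and builds an (N, M, D) intermediate; the second is a scripted loop that only ever holds a (chunk, M, D) slice.

```python
from typing import Callable

import torch


def min_sq_dist_broadcast(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Hypothetical example: for each row of x, the smallest squared
    # distance to any row of y. The subtraction materializes an
    # (N, M, D) intermediate before the reductions run.
    diff = x.unsqueeze(1) - y.unsqueeze(0)
    return (diff * diff).sum(dim=2).min(dim=1)[0]


@torch.jit.script
def min_sq_dist_chunked(x: torch.Tensor, y: torch.Tensor, chunk: int) -> torch.Tensor:
    # Same result, but only `chunk` rows of x are expanded at a time,
    # so the largest intermediate is (chunk, M, D).
    n = x.size(0)
    out = torch.empty([n], dtype=x.dtype, device=x.device)
    for start in range(0, n, chunk):
        end = min(start + chunk, n)
        diff = x[start:end].unsqueeze(1) - y.unsqueeze(0)
        out[start:end] = (diff * diff).sum(dim=2).min(dim=1)[0]
    return out


if __name__ == "__main__":
    x = torch.randn(1000, 16)
    y = torch.randn(500, 16)
    full = min_sq_dist_broadcast(x, y)
    looped = min_sq_dist_chunked(x, y, 128)
    print(torch.allclose(full, looped))
```

The chunked version caps peak memory, but it still runs each chunk as separate elementwise and reduction ops. It is a manual workaround for the missing fusion, not a substitute for it, and with small chunks the loop overhead is where the slowdown shows up.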