I have a for-loop that performs many iterations; in each one, I perform certain calculations in `extern inline` functions and then assign the result to a new `torch::Tensor`.

I'm attempting to optimize my program, so I'd like to allocate memory for the tensor variables used in the for-loop ahead of time (each variable has the same size in every iteration), and then have the operations performed in the `extern inline` functions write their results straight into that pre-allocated memory.

Is this possible somehow? Note that each extern inline function is only one line long (just a handful of tensor operations), and I can replace them with `#define` macros if necessary.

On a somewhat unrelated side note: when working with small matrices and vectors, is it normal to see a ~100x slowdown when switching from Eigen to LibTorch, or is that a sign that I'm doing something very wrong? (I've already added `no_grad` guards and enabled inference mode.)