Sorry about the late reply, it took me some time to find the solution to the issue but your reproduction was really helpful!
Turns out the solution was to manually delete the model output with `del output` after every evaluation step. This seems to fix the memory leak. I still don’t understand why this isn’t necessary during training, though.
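
In case it helps anyone who finds this later, here is roughly what the change looks like. This is only a framework-agnostic sketch, not my actual code: `Output`, `model`, and `evaluate` are made-up stand-ins, and the `weakref` check is just there to show that the output object is really gone once `del output` has run (this relies on CPython's reference counting).

```python
import weakref


class Output:
    """Hypothetical stand-in for a (potentially large) model output."""

    def __init__(self, values):
        self.values = values


def model(batch):
    # Hypothetical forward pass: doubles every input value.
    return Output([x * 2.0 for x in batch])


def evaluate(batches):
    scores = []
    still_alive = []
    for batch in batches:
        output = model(batch)
        # Keep only a plain Python number, not the output object itself.
        scores.append(sum(output.values))
        ref = weakref.ref(output)
        # Drop the reference before the next step starts, so the old
        # output does not stay alive during the next forward pass.
        del output
        still_alive.append(ref() is not None)
    return scores, still_alive


scores, still_alive = evaluate([[1.0, 2.0], [3.0, 4.0]])
```

Without the `del`, the name `output` would keep the previous result alive until it is reassigned at the end of the next forward pass, so two outputs can briefly exist at once. Whether that fully explains the leak in my case is a guess.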
Anyway, thanks again!