I have a function to test some functionality of the training, and I want to repeat the testing for many times.
How can I do that efficiently so that it runs in parallel and fast on the GPUs?
For instance, how to replace this small block:
N = 100
for i in range(N):
result = test_training()