I have a quick question about lazy evaluation of long-running operations and whether they block (with separate answers for .device('cpu') and .device('cuda'), if necessary):
def MyFunc(inputTensor):
    x = SomeLongRunningFunction(inputTensor)
    y = AnotherLongRunningFunction(inputTensor)
    return x, y
Does the computation of x block the computation of y, even though they are independent of each other? If yes, is there a simple way to make the computations occur simultaneously?
I have a similar question about loops: is the loop unrolled so that the independent per-index computations can run simultaneously?
def MyLoopFunc(inputTensor, N, M, device):
    x = torch.empty((N, M), device=device)
    for idx in range(N):
        x[idx, :] = LongRunningOperation(inputTensor, idx)
    return x
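To make the question concrete, here is a minimal runnable sketch of both functions. The bodies of SomeLongRunningFunction, AnotherLongRunningFunction, and LongRunningOperation are hypothetical stand-ins (small matrix products and a reduction), not my real workload, and the tensor sizes are arbitrary.

```python
import torch


# Hypothetical stand-ins for the real long-running functions (assumptions).
def SomeLongRunningFunction(inputTensor):
    return inputTensor @ inputTensor


def AnotherLongRunningFunction(inputTensor):
    return (inputTensor * 2.0) @ inputTensor.T


def LongRunningOperation(inputTensor, idx):
    # Returns a vector of length M = inputTensor.shape[1].
    return (inputTensor * float(idx)).sum(dim=0)


def MyFunc(inputTensor):
    # x and y depend only on inputTensor, not on each other.
    x = SomeLongRunningFunction(inputTensor)
    y = AnotherLongRunningFunction(inputTensor)
    return x, y


def MyLoopFunc(inputTensor, N, M, device):
    x = torch.empty((N, M), device=device)
    for idx in range(N):
        # Each row depends only on inputTensor and idx.
        x[idx, :] = LongRunningOperation(inputTensor, idx)
    return x


# Use the GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inputTensor = torch.ones((4, 4), device=device)

x, y = MyFunc(inputTensor)
loop_out = MyLoopFunc(inputTensor, 3, 4, device)
```

In both cases the results are independent of one another, which is why I am asking whether they can be computed concurrently.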