I’m working on a problem involving sensitivity analysis and hoping to use PyTorch and its built-in operations instead of coding everything from scratch in CUDA.
I have a small example code (using a NN example, as most people here would be familiar with that) as follows, where the computations of `A` and `dA` are independent of each other once `Z` and `dZ` are available:
```python
import torch

def sensitive(d_inp, inp, param):
    Z = torch.matmul(inp, param.T)       # forward pre-activation
    dZ = torch.matmul(d_inp, param.T)    # sensitivity of the pre-activation
    A = torch.tanh(Z)
    # reuse A instead of recomputing tanh(Z); unsqueeze takes dim, not axis
    dA = torch.unsqueeze(1 - A**2, dim=1) * dZ
    return A, dA
```
I want to parallelise the code such that `Z` and `dZ` are computed in parallel, followed by the parallel evaluation of `A` and `dA`.
I was looking for a solution to this but couldn’t find anything. I hope someone can help me out here.
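For context, this is roughly the kind of thing I had in mind, sketched with CUDA streams (I haven’t verified that the kernels actually overlap in practice, and on CPU it just falls back to sequential execution; `sensitive_streams` is my own name for the variant):

```python
import torch

def sensitive_streams(d_inp, inp, param):
    if torch.cuda.is_available():
        # Launch the two independent matmuls on separate CUDA streams
        # so the GPU scheduler is free to overlap them.
        s1, s2 = torch.cuda.Stream(), torch.cuda.Stream()
        with torch.cuda.stream(s1):
            Z = torch.matmul(inp, param.T)
        with torch.cuda.stream(s2):
            dZ = torch.matmul(d_inp, param.T)
        # Wait for both streams before the dependent elementwise ops.
        torch.cuda.synchronize()
    else:
        # CPU fallback: plain sequential evaluation.
        Z = torch.matmul(inp, param.T)
        dZ = torch.matmul(d_inp, param.T)
    A = torch.tanh(Z)
    dA = torch.unsqueeze(1 - A**2, dim=1) * dZ
    return A, dA
```

Here `inp` is `(batch, in_features)`, `d_inp` is `(batch, n_params, in_features)`, and broadcasting handles the extra sensitivity dimension in the `dA` product.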