Backward pass takes a very long time after specifying the gradient of the output tensor

I use PyTorch to backpropagate a specified gradient from an output tensor with `im_val.backward(grad_tmp.detach().clone())`, where `im_val` and `grad_tmp` are both matrices. `im_val` is produced by `fdict_val, _, im_val, mask = generator.gen_fixed(im1.detach().clone(), init_logo)`.

My first question: I only need the gradient that flows back to `init_logo`, but I can't seem to get it with `grads_on_logo, = torch.autograd.grad(im_val, [init_logo])`, because `torch.autograd.grad` expects a scalar output (such as a loss) unless an output gradient is supplied via `grad_outputs`. Is there a PyTorch API that directly returns the gradient on `init_logo`, without also computing the gradients of the other tensors?

My second question: calling `im_val.backward(grad_tmp.detach().clone())` takes at least ten minutes, while the TensorFlow version of the same code finishes in about a minute. I don't know why.
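For reference, here is a minimal sketch of the pattern I am trying, with a trivial stand-in for `generator.gen_fixed` (which I can't post here); `grad_tmp` is replaced by a tensor of ones:

```python
import torch

# Toy stand-in for the generator: init_logo is the input whose gradient
# I want; im_val plays the role of the (non-scalar) generator output.
init_logo = torch.randn(4, 4, requires_grad=True)
im_val = init_logo * 3.0                 # placeholder for generator.gen_fixed
grad_tmp = torch.ones_like(im_val)       # placeholder for my external gradient

# torch.autograd.grad accepts a non-scalar output when grad_outputs is
# given; it then returns gradients only for the listed inputs, instead of
# populating .grad on every leaf the way backward() does.
grads_on_logo, = torch.autograd.grad(
    im_val, [init_logo], grad_outputs=grad_tmp.detach()
)

print(grads_on_logo[0, 0].item())  # 3.0 for this placeholder
```

In this toy case the call works and returns only the gradient on `init_logo`, so I'm unsure whether the scalar-output restriction I hit (and the slowness) comes from something inside my real generator.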