Suppose I have a network f that maps (x1, x2) to f(x1, x2). I want to minimize |df/dx1 + 2*df/dx2|, that is, a weighted sum of the derivatives of the network's output with respect to its inputs. Does anyone have an idea how to do this? Thank you.
For example, my network looks like this:

Thanks for your reply. My current f is the network; how can I specify x1 and x2?
I tried it this way, but I get “One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.”

You’re getting a warning when you do torch.tensor([x1, x2]), no?
This creates a new Tensor from the values of x1 and x2, but not in a differentiable way, which is why x1 and x2 appear not to have been used in the graph. torch.stack([x1, x2]) should work fine.
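
To make the torch.stack suggestion concrete, here is a minimal sketch. The original network is not shown in the thread, so `net` below is a hypothetical stand-in (a small MLP), and the names `inp`, `out` and `penalty` are only for illustration. It assumes the goal is to train the weights so that |df/dx1 + 2*df/dx2| becomes small at the given point.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the network f(x1, x2); the real model is not shown.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 8),
    torch.nn.Tanh(),
    torch.nn.Linear(8, 1),
)

x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(2.0, requires_grad=True)

# torch.stack is a differentiable op, so x1 and x2 stay in the autograd graph.
inp = torch.stack([x1, x2])
out = net(inp).squeeze()  # 0-dim (scalar) output

# create_graph=True keeps the graph of the derivatives themselves,
# so the penalty can be backpropagated to the network weights.
df_dx1, df_dx2 = torch.autograd.grad(out, (x1, x2), create_graph=True)

penalty = (df_dx1 + 2 * df_dx2).abs()
penalty.backward()  # fills .grad on the parameters that influence the penalty
```

After `penalty.backward()`, an optimizer step on `net.parameters()` would reduce the penalty; that step is left out here to keep the sketch short.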

Thanks for your reply. There is no warning when I do torch.tensor([x1, x2]), but torch.stack([x1, x2]) works. Do you have any idea how to do this for a batch of data, say [[1, 2], [3, 4]]? Thanks.

You can just pass to autograd.grad whatever you had as input to your function.
I mentioned stack here because you were using something like it to generate tt2 but you don’t have to use it.

As the error mentions, your output (tt3) is not a scalar, so you cannot get the gradients with a single plain call to autograd.grad. You need to either compute a scalar loss from it or provide grad_outputs, depending on what you want to do in your application.
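
For the batched case, both options can be sketched as follows. As before, `net` is a hypothetical stand-in for the real network, and `weights` is just an illustrative name for the coefficients (1, 2) from the question. This assumes each row of the batch only affects its own output (no batch norm or other cross-sample coupling), so summing the outputs still gives per-sample input gradients.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the network; the real model is not shown.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 8),
    torch.nn.Tanh(),
    torch.nn.Linear(8, 1),
)

# The batch is passed to the network and to autograd.grad directly.
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
out = net(x)  # shape (2, 1), not a scalar

# Option 1: reduce the output to a scalar first.
(g_sum,) = torch.autograd.grad(out.sum(), x, create_graph=True)

# Option 2: keep the non-scalar output and pass grad_outputs.
(g_vjp,) = torch.autograd.grad(
    out, x, grad_outputs=torch.ones_like(out), create_graph=True
)

# Both give df/dx per sample, shape (2, 2): column 0 is df/dx1, column 1 is df/dx2.
weights = torch.tensor([1.0, 2.0])
penalty = (g_sum @ weights).abs().mean()  # mean of |df/dx1 + 2*df/dx2| over the batch
penalty.backward()
```

With `grad_outputs` set to all ones, the two options compute the same thing; a different `grad_outputs` would weight the output entries differently.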