Optimize the derivatives of the output of the network

Suppose I have a network f that maps (x1, x2) to f(x1, x2). I want to minimize |df/dx1 + 2*df/dx2|, that is, a weighted sum of the derivatives of the network's output with respect to its inputs. Does anyone have an idea how to do this? Thank you.
For example, my network looks like this:

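``````
import torch
import torch.nn as nn

class test(nn.Module):
    def __init__(self):
        super().__init__()
        bo_b = False  # no bias term
        self.l1 = nn.Linear(2, 1, bias=bo_b)

    def forward(self, state):
        v = self.l1(state)
        return v

tt1 = test()
tt2 = torch.tensor([1.0, 2.0])  # float input; nn.Linear rejects integer tensors
tt3 = tt1(tt2)  # calling the module runs forward()
``````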

How can I build the criterion (d tt3/d x1 + 2 * d tt3/d x2) so that I can minimize it? Thanks.

Hi,

If you have f, x1 and x2 defined (with x1 and x2 requiring gradients), you can do:

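``````
out = f(x1, x2)
# create_graph=True keeps the gradients differentiable so the loss can be backpropagated
df_dx1, df_dx2 = torch.autograd.grad(out, (x1, x2), create_graph=True)
loss = some_criterion(df_dx1 + 2 * df_dx2)
loss.backward()
``````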

Note that `out` has to be a single value (a scalar) for this to work; otherwise you need the full matrix of all the gradients, which is much more expensive to compute.
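
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the derivatives are obtained with `torch.autograd.grad(..., create_graph=True)`. The function `f` and its trainable scalars `a` and `b` are hypothetical stand-ins for a real network, and the absolute value is just one possible criterion.

```python
import torch

# Hypothetical stand-in for the network: two trainable scalars a and b.
a = torch.tensor(0.5, requires_grad=True)
b = torch.tensor(-0.3, requires_grad=True)

def f(x1, x2):
    # Any differentiable function of x1 and x2 with a scalar output works here.
    return torch.tanh(a * x1 + b * x2)

# The inputs must require gradients so we can differentiate with respect to them.
x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(2.0, requires_grad=True)

out = f(x1, x2)  # scalar output

# create_graph=True records the gradient computation itself, so the
# derivatives can be differentiated again with respect to a and b.
df_dx1, df_dx2 = torch.autograd.grad(out, (x1, x2), create_graph=True)

# Criterion from the question: |df/dx1 + 2 * df/dx2|.
loss = (df_dx1 + 2 * df_dx2).abs()
loss.backward()

print(loss.item(), a.grad, b.grad)
```

After `loss.backward()`, `a.grad` and `b.grad` hold the gradients of the criterion with respect to the parameters, so an ordinary optimizer step can follow.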

Hi, Alban,

Thanks for your reply. My current f is the network; how can I specify x1 and x2?
I tried it this way, but I get “One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.”

``````
x1 = torch.tensor(1.0, requires_grad = True)
x2 = torch.tensor(2.0, requires_grad = True)
tt2 = torch.tensor([x1, x2])
tt3 = tt1.forward(tt2)
``````

Thanks.

You’re getting a warning when you do `torch.tensor([x1, x2])`, no?
This creates a new Tensor based on x1 and x2, but not in a differentiable way, which is why x1 and x2 appear not to have been used in the graph. `torch.stack([x1, x2])` should work fine.
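
A small sketch of the difference, in case it helps. The names `copied` and `stacked` are only for illustration, and depending on the PyTorch version the `torch.tensor` call may or may not emit a warning.

```python
import torch

x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(2.0, requires_grad=True)

# torch.tensor copies the values into a brand-new tensor with no link to x1, x2.
copied = torch.tensor([x1, x2])

# torch.stack is a differentiable op, so the result stays connected to x1, x2.
stacked = torch.stack([x1, x2])

print(copied.requires_grad, copied.grad_fn)
print(stacked.requires_grad, stacked.grad_fn)

# Gradients flow back through the stacked tensor to the original inputs.
g1, g2 = torch.autograd.grad(stacked.sum(), (x1, x2))
print(g1, g2)
```

Asking for gradients of anything built from `copied` with respect to `x1` or `x2` is what triggers the "not have been used in the graph" error.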

Hi, Alban,

Thanks for your reply. There is no warning when I do `torch.tensor([x1, x2])`, but `torch.stack([x1, x2])` works. Do you have any idea how I can do this if I have a batch of data, say [[1, 2], [3, 4]]? Thanks.

Hi,

I mentioned stack here because you were using something like it to generate `tt2`, but you don’t have to use it.
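
For a batch, one possible approach (a sketch, not the only way) is to make the whole input tensor require gradients and differentiate the summed output. This assumes each output row depends only on its own input row, which holds for a plain feed-forward network without batch-coupling layers. The small `net` below is a hypothetical example, and taking the mean of the absolute values is an assumed choice of criterion.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical small network: 2 inputs -> 1 output.
net = nn.Sequential(nn.Linear(2, 8), nn.Tanh(), nn.Linear(8, 1))

# The batch from the question: one row per sample, columns are (x1, x2).
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)

out = net(x)  # shape (2, 1)

# Rows are independent, so the gradient of the summed output with respect
# to x contains the per-sample derivatives: grads[i] = d out[i] / d x[i].
(grads,) = torch.autograd.grad(out.sum(), x, create_graph=True)  # shape (2, 2)

# Per-sample df/dx1 + 2 * df/dx2, reduced to a single scalar loss.
criterion = (grads[:, 0] + 2 * grads[:, 1]).abs().mean()
criterion.backward()

print(criterion.item())
```

Summing the output keeps the differentiated quantity a scalar, so a single `autograd.grad` call is enough and no full Jacobian is needed.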

Hi, Alban @albanD

Thanks for your reply. I still have a problem when the input is a batch. I have the following code:

``````
tt1 = test()
x1 = torch.tensor([[1.0], [2.0]], requires_grad = True)
x2 = torch.tensor([[2.0], [3.0]], requires_grad = True)
tt2 = torch.cat([x1, x2], dim = -1)

tt3 = tt1.forward(tt2)