How to create a custom loss that does not directly use the output of the network

Hello! I would like to create a custom loss that does not directly use the output of my network. Specifically, I need a loss that returns the difference between the result of a function f(x) (where x is the output of my network) and max(f(x)). Unfortunately my code doesn’t work and I don’t know how to proceed… Here is my code:

def forward(self, x, y, hidden):
    c_0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size))
    y = torch.reshape(y, (y.shape[0], 1, 1))
    tmp = torch.cat((x, y), 2)
    output, (hn, cn) = self.lstm(tmp, (hidden, c_0))
    out = self.fc(output)
    return out, hn

def _train(self):
    num_epochs = 10
    num_iteration = 10

    save_loss_global = []
    save_loss_epoch = []

    for epoch in range(num_epochs):
        print("NOUVELLE EPOCH")
        X_train, Y_train = donneesAleatoires()
        self.maxRes = 0
        self.hidden = Variable(torch.zeros(self.num_layers, 1, self.hidden_size))
        tabY = torch.Tensor()
        tabY = torch.cat((tabY, Y_train), 1)
        for iteration in range(num_iteration):
            x_i = X_train[0]
            x_i = torch.reshape(x_i, (x_i.shape[0], 1, x_i.shape[1]))
            y_i = Y_train[0]

            outputs, self.hidden = self(x_i, y_i, self.hidden)

            YiPlus1 = self.function(outputs.detach().numpy().reshape(1, -1))
            self.optimizer.zero_grad()
            Yadd = Variable(torch.Tensor(YiPlus1))
            tabY = torch.cat((tabY, Yadd), 1)

            loss = self.my_loss(tabY, iteration)

            if YiPlus1 > self.maxRes:
                self.maxRes = YiPlus1
            if y_i.detach().numpy() > self.maxRes:
                self.maxRes = y_i.detach().numpy()

            #loss = Variable(loss, requires_grad=True)
            loss.backward(retain_graph=True)

            X_train = outputs

            Y_train = YiPlus1
            Y_train = Variable(torch.Tensor(Y_train))

            self.optimizer.step()
            save_loss_global.append(loss.item())
            if iteration == num_iteration -1:
                save_loss_epoch.append(loss.item())
            print(X_train)


def my_loss(self, target, epoch):
    if isinstance(target, np.ndarray):
        target = Variable(torch.Tensor(target))
    tmp = self.maxRes
    loss = target[0][0] - tmp
    if epoch > 0:
        for i in range(1, epoch + 1):
            loss = loss + (target[0][i] - tmp)
    loss = -loss
    return loss / (epoch+1)

I don’t quite understand the title of this topic, since it seems you do want to use the output of the model.

In any case, detaching the output will cut the computation graph and thus no gradients will be calculated for the parameters used to create the output in:

YiPlus1 = self.function(outputs.detach().numpy().reshape(1, -1))
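
To make this concrete, here is a generic illustration (not your model) of how .detach() cuts the graph:

import torch

w = torch.ones(3, requires_grad=True)

attached = (w * 2).sum()              # still part of the graph
print(attached.requires_grad)         # True  -> backward() would populate w.grad

detached = (w * 2).detach().sum()     # the graph is cut at detach()
print(detached.requires_grad)         # False -> no gradient can ever reach w from here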

The following line also looks suspicious, as retaining the graph is often used as a workaround for another error:

loss.backward(retain_graph=True)

so check if this usage is indeed needed.
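
As a rough illustration (generic code, not your training loop), retain_graph=True is usually only needed if you call backward() more than once on the same graph:

import torch

w = torch.ones(3, requires_grad=True)
loss = (w * w).sum()                  # w * w saves intermediates for the backward pass

loss.backward(retain_graph=True)      # keep the saved intermediates alive
loss.backward()                       # second call works; without retain_graph above it
                                      # raises an error about backwarding through the
                                      # graph a second time
print(w.grad)                         # gradients from both calls are accumulated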

Re-wrapping tensors will also detach them from the computation graph:

Y_train = Variable(torch.Tensor(Y_train))

so also check if you really want to detach Y_train here (unless it was never attached to a computation graph).
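
Again a generic illustration of what re-wrapping does (mirroring the numpy round trip in your loop):

import torch

w = torch.ones(3, requires_grad=True)
y = w * 2
print(y.grad_fn)                      # <MulBackward0 ...> -> y is attached to the graph

y_np = y.detach().numpy()             # leave autograd; plain numpy array
y_back = torch.Tensor(y_np)           # brand-new leaf tensor
print(y_back.grad_fn, y_back.requires_grad)   # None False -> no path back to w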

Variables are also deprecated since PyTorch 0.4, so you can use plain tensors in newer releases.
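
A minimal before/after, with placeholder sizes instead of your num_layers and hidden_size:

import torch
from torch.autograd import Variable   # deprecated; only imported for the comparison

num_layers, hidden_size = 2, 8        # placeholder values

old_hidden = Variable(torch.zeros(num_layers, 1, hidden_size))   # pre-0.4 style
new_hidden = torch.zeros(num_layers, 1, hidden_size)             # same thing today

w = torch.zeros(hidden_size, requires_grad=True)   # if you need gradients for a new tensor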

Thanks! I do think that the line “YiPlus1 = self.function(outputs.detach().numpy().reshape(1, -1))” is problematic. Do you know how I can keep the gradient? I need to convert my outputs to a np.array, but I can’t find any documentation on how to do this without breaking the graph.

If you need to use another library, such as numpy, you would have to write a custom autograd.Function and implement the forward and backward methods manually, as described here.
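
In case a sketch helps, here is a minimal custom autograd.Function that uses numpy in its forward pass; the function (sin) is just a stand-in for your self.function, and the backward formula has to match whatever you actually compute:

import numpy as np
import torch

class NumpyF(torch.autograd.Function):
    """Stand-in f(x) = sin(x) computed with numpy; backward written by hand."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)                      # keep x for the backward pass
        x_np = x.detach().cpu().numpy()               # leaving autograd is fine inside a Function
        return torch.from_numpy(np.sin(x_np)).to(x)   # back to a tensor (dtype/device of x)

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        return grad_output * torch.cos(x)             # d/dx sin(x) = cos(x)

x = torch.randn(4, requires_grad=True)
out = NumpyF.apply(x).sum()
out.backward()
print(x.grad)   # equals torch.cos(x), so gradients flow through the numpy step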

I’ll take a look at that, thanks!