Error when computing multiple losses

Hi,
I am designing a NN controller that is trained against a neural-network state-space model (SSM) I have already built. The controller predicts the next N outputs, where N = 2. The SSM uses one input at a time to compute the next state, and so on for all N inputs. After each output I shift the input forward by one time step to account for this.

This is where the issue arises: making in-place updates to the tensor means autograd can no longer compute the gradient.
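
For example, a stripped-down version of what goes wrong (stand-in tensors here, not my real networks) looks like this:

import torch

w = torch.ones(3, requires_grad=True)
history = torch.zeros(3)
history[0] = w[0] * 2   # history now carries part of the autograd graph
y = history * history   # this multiply saves history for its backward pass
history[0] = 7.0        # in-place update bumps history's version counter
y.sum().backward()      # RuntimeError: one of the variables needed for gradient
                        # computation has been modified by an inplace operation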

Ideally, I would like a tensor of the N outputs produced by the N inputs, which I can compare against a desired tensor to compute the loss. Is there a way to do this?

Below, I have my code:

For context, X is the tensor of the N previous inputs and states.

for dd in range(0, len(x_train) - batch, batch):
    X = x_train[dd:dd + batch]
    optimizer.zero_grad()
    output = dpc(X)
    optimizer.zero_grad()

    history = torch.zeros((batch, 21))
    #fill this array with the current states
    for r in range(batch):
        for ii in range(20):
            history[r][ii] = X[r][ii]
        history[r][20] = output[r][0]
    # first prediction: run the current history through the SSM
    y = ss_model(history[0:4][0:21])
    y = (y * (max_a[1][0] - min_a[1][0])) + min_a[1][0]
    desired = torch.zeros((batch, 1))
    loss = (F.mse_loss(y, desired))
    print(y)

    # shift the history forward by one time step and append the first prediction
    for gg in range(batch):
        for ll in range(19):
            history[gg][ll] = history[gg][ll+2]
        history[gg][19] = history[gg][20]
        history[gg][20] = y[gg][0]

    y2 = ss_model(history[0:4][0:21])
    y2 = (y2 * (max_a[1][0] - min_a[1][0])) + min_a[1][0]
    print(y2)
    loss2 = (F.mse_loss(y2, desired))
    loss2.backward()
    optimizer.step()

Many thanks

Hi,

It would help if you could provide a code sample that we can run here, with dummy Modules/Optimizers if you don’t want to share the ones you use.
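
For example, stand-ins along these lines (I am guessing the shapes from your snippet) would be enough to make the loop below runnable:

import torch
import torch.nn as nn
import torch.nn.functional as F

# tiny stand-ins for the real controller and state-space model
dpc = nn.Linear(20, 1)        # controller: 20 previous inputs/states -> 1 control output
ss_model = nn.Linear(21, 1)   # SSM: 20 inputs/states + 1 control -> next output
optimizer = torch.optim.SGD(dpc.parameters(), lr=1e-3)

batch = 4
x_train = torch.randn(100, 20)   # dummy training data
max_a = torch.ones(2, 1)         # dummy scaling constants
min_a = torch.zeros(2, 1)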

Note that you can simplify your code as follows already:

for dd in range(0, len(x_train) - batch, batch):
    X = x_train[dd:dd + batch]
    optimizer.zero_grad()
    output = dpc(X)
    optimizer.zero_grad()

    history = torch.zeros((batch, 21))
    # columns 0-19 come from X, the last column is the controller output
    history.narrow(1, 0, 20).copy_(X)
    history.narrow(1, -1, 1).copy_(output)

    y = ss_model(history[0:4])
    y = (y * (max_a[1][0] - min_a[1][0])) + min_a[1][0]
    desired = torch.zeros((batch, 1))
    loss = (F.mse_loss(y, desired))
    print(y)

    # shift the stored history forward by one time step and write the first prediction y into the last column
    history.narrow(1, 0, 19).copy_(history.narrow(1, 2, 19))
    history.select(1, 19).copy_(history.select(1, 20))
    history.narrow(1, 20, 1).copy_(y)

    y2 = ss_model(history[0:4])
    y2 = (y2 * (max_a[1][0] - min_a[1][0])) + min_a[1][0]
    print(y2)
    loss2 = (F.mse_loss(y2, desired))
    loss2.backward()
    optimizer.step()

That should at least be faster. But if you want to keep the history around, you can just .clone() that Tensor whenever you need its value saved.
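
For example, a sketch on top of the loop above: instead of shifting history in place, you can build the second-step input on a clone, so the values autograd saved during the first ss_model call are left untouched, and (if that is what you are after) sum the two per-step losses before calling backward once:

    # work on a copy so the tensors saved for the first forward pass stay intact
    history2 = history.clone()
    history2.narrow(1, 0, 19).copy_(history.narrow(1, 2, 19))
    history2.select(1, 19).copy_(history.select(1, 20))
    history2.narrow(1, 20, 1).copy_(y)

    y2 = ss_model(history2)
    y2 = (y2 * (max_a[1][0] - min_a[1][0])) + min_a[1][0]
    loss2 = F.mse_loss(y2, desired)

    # one backward through both prediction steps
    (loss + loss2).backward()
    optimizer.step()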