First and second derivatives of the output with respect to the input inside a loss function

Tushar_Gautam · October 18, 2020, 4:36am

I have defined a custom loss function that depends on the first and second derivatives of the NN output with respect to its input.

def PDE(x,y):
    first_derivative = torch.autograd.grad(y, x, 
                                     grad_outputs=y.data.new(y.shape).fill_(1),
                                     create_graph=True, retain_graph=True)[0]
    print("dydx: \n", first_derivative)
    # We now have dy/dx
    
    second_derivative = torch.autograd.grad(first_derivative, x,
                                      grad_outputs=first_derivative.data.new(first_derivative.shape).fill_(1),
                                      create_graph=True, retain_graph=True)[0]
    print("d2ydx2: \n", second_derivative)
    # This computes d/dx(dy/dx) = d2y/dx2

    eq  = 4*first_derivative[:,1] - second_derivative[:,0]

    loss = torch.mean(eq**2)

    return loss

When I run loss.backward() I get an error saying

RuntimeError: leaf variable has been moved into the graph interior

I can’t understand why this is happening.

Here’s my complete code:

import torch
import torch.nn as nn

# Define the NN model to solve the problem
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.lin1 = nn.Linear(2,10)
        self.lin2 = nn.Linear(10,1)

    def forward(self, x):
        x = torch.sigmoid(self.lin1(x))
        x = self.lin2(x)
        return x

model = Model()

y = model(X_train)

loss = PDE(X_train, y)

loss.backward()
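A note on the derivative code above: passing a vector of ones over both columns of first_derivative makes second_derivative[:,0] equal to d2y/dx0^2 plus the mixed term d2y/dx0dx1, not the pure second derivative. Below is a minimal sketch of one way to isolate d2y/dx0^2, by differentiating only the first column. The helper name first_and_second_derivatives and the analytic test function are assumptions made for illustration, not part of the original code.

```python
import torch

def first_and_second_derivatives(y, x):
    """Return dy/dx (same shape as x) and d2y/dx0^2 (one value per row).

    Assumes each row of y depends only on the matching row of x.
    """
    # A vector of ones sums the outputs; since rows are independent,
    # this yields the per-row gradient dy_i/dx_i.
    dy_dx = torch.autograd.grad(
        y, x, grad_outputs=torch.ones_like(y), create_graph=True
    )[0]

    # Differentiate only the first column, so no mixed term leaks in.
    dy_dx0 = dy_dx[:, 0]
    d2y_dx0 = torch.autograd.grad(
        dy_dx0, x, grad_outputs=torch.ones_like(dy_dx0), create_graph=True
    )[0][:, 0]
    return dy_dx, d2y_dx0

# Hypothetical test function with known derivatives: y = sin(x0) * x1^2
x = torch.tensor([[0.5, 2.0], [1.0, 3.0]], requires_grad=True)
y = torch.sin(x[:, 0]) * x[:, 1] ** 2

dy_dx, d2y_dx0 = first_and_second_derivatives(y, x)

# Analytic values for comparison
expected_dy_dx0 = torch.cos(x[:, 0]) * x[:, 1] ** 2
expected_d2y_dx0 = -torch.sin(x[:, 0]) * x[:, 1] ** 2
```

Because both calls use create_graph=True, a loss built from these derivatives can still be backpropagated to the model parameters.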

Nikronic · October 18, 2020, 7:12am

Hi,

I think the error is related to the input’s leaf state (although I am not exactly sure, this may help).

You can try enabling gradients for the input x by adding

x = x.clone().detach().requires_grad_(True)

as the first line of the forward method.

Bests
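One caveat with this suggestion, sketched below under the assumption that the derivative is still taken with respect to the original input tensor: detaching inside forward creates a new leaf, so the output is no longer connected to the tensor passed in. The single nn.Linear layer is only a stand-in for the model in the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lin = nn.Linear(2, 1)                      # stand-in for the real model
X = torch.rand(4, 2, requires_grad=True)   # original input (a leaf)

# Case 1: the output is computed directly from X, so dy/dX exists.
y_direct = lin(X)
grad_direct = torch.autograd.grad(
    y_direct, X, grad_outputs=torch.ones_like(y_direct), allow_unused=True
)[0]

# Case 2: the input is cloned and detached first, as in the suggestion.
# X_copy is a new leaf; the graph of y_detached never touches X.
X_copy = X.clone().detach().requires_grad_(True)
y_detached = lin(X_copy)
grad_detached = torch.autograd.grad(
    y_detached, X, grad_outputs=torch.ones_like(y_detached), allow_unused=True
)[0]

print(grad_direct is None)    # False: X is part of the graph
print(grad_detached is None)  # True: X was cut out by detach()
```

Without allow_unused=True, the second call raises a RuntimeError instead of returning None.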

Tushar_Gautam · October 18, 2020, 8:28am

Hi, I tried what you said, but it’s not working. I’m getting an error that one of the inputs is not differentiable. I think it’s happening because the input doesn’t have requires_grad=True.
As I’m just using the NN to fit a PDE, I have to create X_train as follows:

x = np.arange(0.1,2.,0.1)
t = np.arange(0.1,20.,0.1)

X_train = torch.zeros((256,2), requires_grad=True)
for i in range(X_train.shape[0]):
    X_train[i,0]=x[np.random.randint(x.shape[0])]
    X_train[i,1]=t[np.random.randint(t.shape[0])]

I thought the problem might be due to the second derivative, so I removed it, but I get the same error. I’m attaching my complete code below (I think looking at it might help):

def PDE(X_train,model):
    y = model(X_train)
    first_derivative = torch.autograd.grad(y, X_train, 
                                     grad_outputs=y.data.new(y.shape).fill_(1),
                                     create_graph=True, retain_graph=True, allow_unused=True)[0]

    eq  = first_derivative[:,1] + first_derivative[:,0]

    bc1_inp1 = torch.zeros((X_train.shape))
    bc1_inp1[:,1] = X_train[:,1]
    bc1_inp2 = torch.ones((X_train.shape))*2.
    bc1_inp2[:,1] = X_train[:,1]
    bc1 = model(bc1_inp1) - model(bc1_inp2) 
    
    ic_inp = torch.zeros((X_train.shape))
    ic_inp[:,0] = X_train[:,0]
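    # model(ic_inp) has shape (N, 1); squeeze it so the subtraction
    # stays elementwise instead of broadcasting to (N, N)
    ic = model(ic_inp).squeeze(-1) - torch.sin(np.pi*X_train[:,0])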

    loss = torch.mean(eq**2) + torch.mean(bc1**2) + torch.mean(ic**2)

    return loss

opt = optim.Adam(model.parameters(),lr=0.1,amsgrad=True) # Got faster convergence with Adam using amsgrad

# Iterative learning
epochs = 1000
for epoch in range(epochs):
    opt.zero_grad()
    loss = PDE(X_train, model)
    #print("LOSS: ", loss)
    #print("X_train: ", X_train)
    loss.backward()
    opt.step()
    #print("One step done")
    
    if epoch % 10 == 0:
        print('epoch {}, loss {}'.format(epoch, loss.item()))
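A likely cause of the "leaf variable has been moved into the graph interior" error is the X_train construction earlier in this post: the tensor is created with requires_grad=True and then filled element by element, which is an in-place write on a leaf that requires gradients. Depending on the PyTorch version, this either fails at the assignment or later at backward(). A minimal sketch of one alternative, assuming the same grids, is to sample everything first and only then create the tensor. The seed and the use of NumPy's default_rng are choices made for illustration.

```python
import numpy as np
import torch

# Candidate grid values, as in the post above.
x = np.arange(0.1, 2.0, 0.1)
t = np.arange(0.1, 20.0, 0.1)

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
n_samples = 256

# Sample all rows with plain NumPy first; autograd is not involved yet.
samples = np.stack(
    [rng.choice(x, size=n_samples), rng.choice(t, size=n_samples)],
    axis=1,
)

# Create the tensor once and mark it as requiring gradients.
# It stays a leaf because it is never modified in place afterwards.
X_train = torch.tensor(samples, dtype=torch.float32, requires_grad=True)
```

With X_train built this way, torch.autograd.grad(y, X_train, ...) can differentiate through the model without touching a modified leaf.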

Nikronic · October 18, 2020, 10:32am

Everything looks fine to me.
Maybe the computation of the derivatives is causing the issue.

In this post I link to a notebook that solves the Poisson equation, which involves second-order PDEs. I hope it helps.

Twilight · November 17, 2022, 11:52am

"grad_outputs=y.data.new(y.shape).fill_(1)": this destroys the computational graph, making it impossible to compute the derivatives.
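For what it's worth, a quick check suggests that the grad_outputs tensor does not need to be part of the graph: it only supplies the weights of the vector-Jacobian product. The sketch below compares the legacy spelling from the question with torch.ones_like(y), which is the more idiomatic form, on a small analytic function chosen for illustration (y = x0^3 * x1 is an assumption, not from the thread).

```python
import torch

x = torch.tensor([[0.5, 2.0], [1.0, 3.0]], requires_grad=True)
y = (x[:, 0] ** 3 * x[:, 1]).unsqueeze(1)   # shape (2, 1), like a model output

ones_legacy = y.data.new(y.shape).fill_(1)  # spelling used in the question
ones_idiomatic = torch.ones_like(y)         # equivalent, easier to read

# create_graph=True keeps the graph alive, so both calls can reuse it.
g_legacy = torch.autograd.grad(
    y, x, grad_outputs=ones_legacy, create_graph=True
)[0]
g_idiomatic = torch.autograd.grad(
    y, x, grad_outputs=ones_idiomatic, create_graph=True
)[0]

# The first derivative still carries a graph, so a second derivative
# can be taken from it: d2y/dx0^2 = 6 * x0 * x1.
col0 = g_legacy[:, 0]
d2y_dx0 = torch.autograd.grad(
    col0, x, grad_outputs=torch.ones_like(col0), create_graph=True
)[0][:, 0]
expected = 6 * x[:, 0] * x[:, 1]
```

Both spellings give the same first derivative here, so switching to torch.ones_like is a readability improvement rather than a fix for the original error.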