Can I find the grad of the output of a NN w.r.t. the input?

class fun():
    def __init__(self, x):
        self.x = torch.tensor(x, requires_grad=True).double()
        self.loss_history = []
        self.model = nn.Sequential(
            nn.Linear(2,10),
            nn.ReLU(),
            nn.Linear(10,10),
            nn.ReLU(),
            nn.Linear(10,1)).double()

    def cost_function(self):
        output = self.model(self.x)
        temp = output*100
        dx = torch.autograd.grad(temp, self.x[:,0], torch.ones(self.x.size()[0], 1, device=device), create_graph=True, retain_graph=True, allow_unused=True)[0]
        loss = torch.sum(dx**2)
        
    def closure(self):
        self.optimizer.zero_grad()
        loss = self.cost_function()
        self.loss_history.append(loss)
        loss.backward(retain_graph=True)
        return loss

    # Training function
    def train(self, epochs, opt_func=torch.optim.Adam):
        torch.autograd.set_detect_anomaly(True)
        self.optimizer = opt_func(self.model.parameters())

        for epoch in range(epochs):
            self.optimizer.step(self.closure)

fun_model = fun(position_61x61)

fun_model.train(1000)

I am getting the following error:

TypeError: unsupported operand type(s) for ** or pow(): 'NoneType' and 'int'

Your dx is None. Could you print dx? If it's None, storing dx as a Variable might help!

As @swap said, your error comes from the fact that dx is None, and None ** 2 is undefined. The use of Variable() is deprecated and shouldn't be used. Also, when you define self.x as torch.tensor(x, requires_grad=True).double(), you're breaking your computation graph. Simply define it as:

self.x = x.double().requires_grad_()
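
As a quick sanity check (a minimal sketch with a dummy 1x2 input, not your actual data), you can see that the original definition leaves self.x as a non-leaf tensor sitting in the middle of a graph:

import torch

x = torch.tensor([[1.0, 2.0]], requires_grad=True).double()
print(x.is_leaf)   # False: .double() returned a new tensor derived from the leaf
print(x.grad_fn)   # a cast/copy backward node instead of None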

If you're trying to get the gradient of the output w.r.t. the inputs, you can just do dx = torch.autograd.grad(output, self.x, torch.ones_like(output), ...) (note that the grad_outputs tensor has to match the shape of output, not of self.x) and redefine the loss to be torch.sum((100*dx)**2). You can then use torch.einsum to reshape the result into whatever form you want.
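
Putting that together, something like this minimal sketch should work (the shapes and the small stand-in model here are assumptions, not your actual setup):

import torch
import torch.nn as nn

x = torch.randn(8, 2, dtype=torch.float64, requires_grad=True)   # stand-in input
model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1)).double()   # stand-in model

output = model(x)
# gradient of the output w.r.t. the full input tensor; shape matches x, i.e. (8, 2)
dx = torch.autograd.grad(output, x, torch.ones_like(output), create_graph=True)[0]
loss = torch.sum((100 * dx) ** 2)
loss.backward()

An optimizer step over model.parameters() then works as usual.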

Also, your cost_function doesn't return the loss, so that will be an error later as well. Add return loss, and be careful with retain_graph=True, as it can leak memory if used improperly.

Finally, do you need to have everything within a class? Because it might be significantly easier to just write it as a script!

x is a numpy array and needs to be converted to a tensor. So I tried this:

self.x = torch.from_numpy(x).double()
self.x.requires_grad = True

Again the grad returns None. Can you explain why I am getting None as the output of grad, or suggest a way around this problem? As far as I know, the output is a function of the input, so we should be able to find the grad of the output w.r.t. the input. Also, is reshaping an in-place operation?

Did you correct this mistake? The cost_function does not return anything and hence will default to None.

I corrected the cost_function but the error is the same.

Can you share an updated version of your code, so I know what you've changed?

class fun():
    def __init__(self, x):
        self.x = torch.from_numpy(x).double()
        self.x.requires_grad=True
        self.loss_history = []
        self.model = nn.Sequential(
            nn.Linear(2,10),
            nn.ReLU(),
            nn.Linear(10,10),
            nn.ReLU(),
            nn.Linear(10,1)).double()

    def cost_function(self):
        output = self.model(self.x)
        temp = output*100
        dx = torch.autograd.grad(temp, self.x[:,0], torch.ones_like(self.x[:,1].unsqueeze(-1)), create_graph=True, retain_graph=True, allow_unused=True)[0]
        loss = torch.sum(dx**2)
        return loss
        
    def closure(self):
        self.optimizer.zero_grad()
        loss = self.cost_function()
        self.loss_history.append(loss)
        loss.backward(retain_graph=True)
        return loss

    # Training function
    def train(self, epochs, opt_func=torch.optim.Adam):
        torch.autograd.set_detect_anomaly(True)
        self.optimizer = opt_func(self.model.parameters())

        for epoch in range(epochs):
            self.optimizer.step(self.closure)
            self.epoch_end(epoch, epochs, print_every=100)
    
    def epoch_end(self, epoch, epochs, print_every=100):
        if epoch == 0 or epoch == (epochs - 1) or epoch % print_every == 0 or print_every == 'all':
            print("Epoch [{}/{}], train_loss: {:.4f}".format(epoch, epochs, self.loss_history[epoch]))

This is the changed code.

Can you replace the line above with the one below, and print out its shape?

dx = torch.autograd.grad(temp, self.x, torch.ones_like(temp), create_graph=True, retain_graph=True)[0]
print(dx.shape)

dx is no longer None.

But I want to do the calculation for only one column; doing it for both unnecessarily increases the complexity. Can you suggest something? Also, can you explain why I was getting None earlier?

If I had to guess, you can't use torch.autograd.grad with respect to a part of an input like x[:,0]: slicing creates a new tensor, so x[:,0] isn't part of the graph that produced the output, but x is.

(Only a guess, so if someone does know better, please do correct me.)
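
For what it's worth, here is a minimal sketch (dummy input and stand-in model, only to illustrate the point) showing the failure and two ways around it if you only need the derivative w.r.t. the first column:

import torch
import torch.nn as nn

x = torch.randn(4, 2, dtype=torch.float64, requires_grad=True)
model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 1)).double()
out = model(x)

# x[:, 0] is a new tensor created by indexing; the forward pass never used it,
# so autograd finds no path from out back to it and returns None.
print(torch.autograd.grad(out, x[:, 0], torch.ones_like(out),
                          allow_unused=True, retain_graph=True)[0])   # None

# Workaround 1: differentiate w.r.t. the whole input, then slice the result.
dx = torch.autograd.grad(out, x, torch.ones_like(out))[0]
print(dx[:, 0].shape)   # torch.Size([4])

# Workaround 2: keep the column of interest as its own leaf and concatenate
# before the model, so autograd can reach it directly.
x0 = torch.randn(4, 1, dtype=torch.float64, requires_grad=True)
x1 = torch.randn(4, 1, dtype=torch.float64)
out2 = model(torch.cat([x0, x1], dim=1))
dx0 = torch.autograd.grad(out2, x0, torch.ones_like(out2))[0]
print(dx0.shape)   # torch.Size([4, 1])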