How to define a custom backward function for a net in libtorch? (I tested some code, but it failed.)

I’m using libtorch (the C++ frontend of PyTorch) to train a network. To monitor the gradients, I created a checker module to intercept the backward pass, but execution never enters its self-defined backward() function.

I compiled libtorch from source on Ubuntu 16.04.

The checkBP class is below.

#include <torch/torch.h>

#include <cstddef>
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>

class checkBP: public torch::nn::Module
{
public:
    checkBP(int show, std::string label):show(show), label(label) {std::cout<<" created checkBP"<<std::endl; }

    torch::Tensor forward(torch::Tensor inputs)
    {
        std::cout<<"\n\n\n\n checkBP forward passed\n\n\n\n"<<std::endl;
        return inputs.clone();
    }
    torch::Tensor backward(torch::Tensor grad_output){
        std::cout<<" In checkBP backward."<<std::endl;
        // item<float>() extracts the scalar from the 0-dim mean tensor.
        float grad_mean_float = grad_output.abs().mean().item<float>();

        if ( show == 1)
        {
            std::cout<<"!!!! checkBP grad of "<<label<<" is (show=1): "<<grad_mean_float<<std::endl;
        }

        return grad_output;
    }

public:
    int show;
    std::string label;
};

class Net_model: public torch::nn::Module
{
public:
    Net_model(int n_input,
              int n_hidden,
              int n_out){
        fc1 = register_module("fc1_net",torch::nn::Linear(n_input, n_hidden));
        fc2 = register_module("fc2_net",torch::nn::Linear(n_hidden, n_out));
    }

    torch::nn::Linear fc1{nullptr}, fc2{nullptr};
    torch::Tensor forward(torch::Tensor input){
        auto x = fc1->forward(input);
        x = checkBP(/*show*/1,"net_model_checkBP").forward(x);
        x = fc2->forward(x);
        return x;
    }
};

int main()
{

    int n_input = 10, n_hidden = 8, n_out = 2, n_sample =  3;
    torch::Tensor inputs = torch::ones({n_sample, n_input});
    torch::Tensor target = torch::zeros({n_sample, n_out});

    auto net = Net_model(n_input,n_hidden,n_out);
    std::cout<<net<<std::endl;
    torch::optim::SGD optimizer(
        net.parameters(), torch::optim::SGDOptions(0.01).momentum(0.5));
    for (int i = 0; i < 10; ++i)
    {
        auto res = net.forward(inputs);
        optimizer.zero_grad();
        auto loss = torch::mse_loss(res, target);
        loss.backward();
        optimizer.step();
    }
    return 0;
}

In the training loop, after executing “loss.backward()”, it seems that checkBP::backward() never executes. What, then, is the correct way to define backward() in libtorch?

Hi,

An ‘nn.Module’ only uses autograd to compute gradients.
If you want to specify gradients for an operation, you need to use a ‘Function’.

Thank you for the prompt reply.

I tested your suggestion in a Python environment by first creating a Function, as below.

#submodules.py
import numpy as np
import torch
from torch.autograd import Function
# Legacy-style autograd.Function (instance methods); newer PyTorch
# versions use static forward/backward methods with a ctx argument.
class CheckBP(Function):

    def __init__(self, label, show):
        super(CheckBP, self).__init__()
        self.label = label
        self.show = show

    def forward(self, input):
        # print(self.label + ': forward passed.')
        # print(input)
        return input.clone()

    def backward(self, grad_output):
        grad_mean = grad_output.abs().mean()
        if self.show == 1:
            print('grad_' + self.label + ': ' + str(grad_mean))
        return grad_output

Then a class wraps the Function, as below:

# modules
import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Variable
import submodules as F

class CheckBP(nn.Module):

    def __init__(self, label='a', show=1):
        super(CheckBP, self).__init__()
        self.check_bp = F.CheckBP(label, show)

    def forward(self, input):
        return self.check_bp(input)

Testing the CheckBP in Python works fine.

However, in C++ (libtorch), when I write:

class checkBP : public torch::autograd::Function
{
    ...(something)
};

I get this error: “error: invalid use of incomplete type ‘struct torch::autograd::Function’”.

I checked “torch::autograd::Function” in /torch/csrc/autograd/edge.h. There is no forward() or backward() to inherit.

So, how do I define the checkBP module in C++, with a self-defined backward(), as in Python?

Thanks again for your reply!

Hi,

Note that on the Python side, Function has changed slightly, as you can see in the tutorial.

For C++ it is a bit more complex. A Function only goes one way, and its “apply” method should be implemented. It is either implemented in pure autograd, by performing operations on Variables, or the output should be wrapped and the backward Function specified.
You will need two Functions if you want a custom backward. For example, here “DelayedError” is the forward Function and “Error” is the backward one.
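
For the narrower goal of just monitoring gradient magnitudes, a gradient hook may be enough, with no custom Function at all. A minimal sketch, assuming a libtorch build that exposes torch::Tensor::register_hook (the names here are illustrative):

#include <torch/torch.h>
#include <iostream>

int main() {
    torch::nn::Linear fc(10, 2);
    auto x = torch::ones({3, 10});
    auto y = fc->forward(x);  // requires grad via fc's parameters
    // The hook fires when the gradient w.r.t. y is computed during backward().
    y.register_hook([](torch::Tensor grad) {
        std::cout << "grad mean: " << grad.abs().mean().item<float>() << std::endl;
    });
    y.sum().backward();
    return 0;
}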

@Snade_Qian This is currently work in progress, tracked in https://github.com/pytorch/pytorch/pull/23020. We will provide documentation and a tutorial for this once the API is ready.
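
For anyone finding this thread later: the API that eventually came out of that work lets you subclass torch::autograd::Function<T> with static forward/backward methods. A rough sketch against a recent libtorch (check the official custom-autograd-function tutorial for the authoritative signatures):

#include <torch/torch.h>
#include <iostream>

using torch::autograd::AutogradContext;
using torch::autograd::variable_list;

struct CheckBPFunction : public torch::autograd::Function<CheckBPFunction> {
    // forward() takes an AutogradContext* first; the real inputs follow.
    static torch::Tensor forward(AutogradContext* ctx, torch::Tensor input) {
        return input.clone();
    }
    // backward() receives one gradient per forward() output and must
    // return one gradient per forward() input (excluding ctx).
    static variable_list backward(AutogradContext* ctx, variable_list grad_outputs) {
        auto grad = grad_outputs[0];
        std::cout << "checkBP grad mean: "
                  << grad.abs().mean().item<float>() << std::endl;
        return {grad};
    }
};

// Usage inside a module's forward():
//   x = CheckBPFunction::apply(x);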