Why do we need to implement the backward method?

PyTorch already provides the Variable class, which tracks gradients automatically. If we want to write a custom layer, we can just inherit from nn.Module and implement forward(); backward() is then handled automatically.
In which case do we need to implement backward(), as shown in the torch extension part of the official docs? As far as I understand, if you store your weights in a Variable, you do not have to do anything with respect to backward. Is there anything I am missing?


As you said, the Variable class tracks gradients. The gradient computation has already been defined for the built-in ops, which is why you only need to write the forward method, as long as your layer is composed only of those ops.
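For example, here is a minimal sketch of such a layer (the ScaledTanh name is made up, and I am using the newer API where Variable is merged into Tensor); because forward() uses only built-in ops, no backward is needed anywhere:

```python
import torch
import torch.nn as nn

# Sketch of a custom layer built only from existing autograd ops.
# Every operation in forward() already has a registered backward,
# so autograd can differentiate the whole layer with no extra code.
class ScaledTanh(nn.Module):
    def __init__(self, num_features):
        super(ScaledTanh, self).__init__()
        # Learnable scale, registered as a Parameter so autograd tracks it.
        self.scale = nn.Parameter(torch.ones(num_features))

    def forward(self, x):
        # tanh and * are built-in ops with predefined backward functions.
        return torch.tanh(x) * self.scale

layer = ScaledTanh(4)
out = layer(torch.randn(2, 4, requires_grad=True))
out.sum().backward()  # gradients for the input and self.scale come for free
```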

You can see here that the backward method has been defined for some of the ops that you use: https://github.com/pytorch/pytorch/blob/master/torch/autograd/_functions/pointwise.py

So if you are going to define a custom operation that cannot be composed of these predefined ops, you will need to write your backward function.
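Concretely, the Function API from the extension docs looks like the sketch below. The op itself (a straight-through sign, where the gradient I want is not the one autograd would derive from the predefined ops) is just something I made up to show where backward goes:

```python
import torch
from torch.autograd import Function

# Sketch of a custom op with a hand-written backward, following the
# torch.autograd.Function API from the extension docs.
class SignSTE(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.sign()

    @staticmethod
    def backward(ctx, grad_output):
        # The gradient is defined by hand: pass it straight through
        # wherever |input| <= 1 and block it elsewhere.
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input.abs() > 1] = 0
        return grad_input

x = torch.randn(5, requires_grad=True)
y = SignSTE.apply(x)
y.sum().backward()  # uses the backward defined above
```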


That makes sense, thanks for giving the reference to the defined ops.

At the Module level you don't need to, since autograd doesn't operate on that level. However, if you implement a Function, then you should supply a backward function, because each Function is a link in the computation graph.
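To make the distinction concrete, here is a sketch (reusing the made-up SignSTE Function from above): the Module below defines no backward anywhere; autograd only sees the Functions that its forward applies.

```python
import torch
import torch.nn as nn

# Sketch: the Module level needs no backward. Autograd builds the graph
# from the Functions used inside forward(), including the custom SignSTE
# sketched earlier (assumed to be in scope), whose backward we wrote by hand.
class BinarizedLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super(BinarizedLinear, self).__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))

    def forward(self, x):
        # SignSTE.apply inserts our custom Function into the graph;
        # matmul uses its predefined backward.
        return x.matmul(SignSTE.apply(self.weight).t())

m = BinarizedLinear(3, 2)
m(torch.randn(4, 3)).sum().backward()  # m.weight.grad is populated
```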
