Relationship among torch.nn, torch.nn.functional, and torch.autograd.Function?

I am new to PyTorch. While reading the tutorial and docs, I got rather confused about one thing: what is the relationship among torch.nn, torch.nn.functional, and torch.autograd.Function? What follows is my understanding; I hope somebody can tell me whether it is right. torch.nn consists of modules (layers). These modules are constructed using the operations provided by torch.nn.functional. Furthermore, these operations are constructed based on the rules specified by torch.autograd.Function. Is my understanding right?

torch.nn contains stateful network modules, i.e. layers that own their parameters. They are used as functors (callable objects), typically within a larger network module class, which is itself a functor and derives from nn.Module:

import torch.nn as nn

class MyNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        # create an nn.Linear functor (3 input features, 2 output features):
        self.h1 = nn.Linear(3, 2)

    def forward(self, x):
        # call the functor:
        x = self.h1(x)
        return x
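
To make the "stateful" part concrete, here is a minimal sketch that instantiates the network and lists the parameters it owns. The batch size of 5 and the variable names are just assumptions for illustration:

```python
import torch
import torch.nn as nn

class MyNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.h1 = nn.Linear(3, 2)

    def forward(self, x):
        return self.h1(x)

net = MyNetwork()

# Calling the module like a function invokes forward().
x = torch.randn(5, 3)   # hypothetical batch: 5 samples, 3 features each
out = net(x)

# The state lives inside the module: nn.Linear registered a weight and a bias.
param_names = [name for name, _ in net.named_parameters()]
print(out.shape)
print(param_names)
```

The parameter names are prefixed with the attribute name (h1) under which the submodule was stored.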

The functional versions are stateless and are called directly, e.g. softmax, which has no internal state:

x = torch.nn.functional.softmax(x, dim=-1)
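
As a rough illustration, the sketch below calls the functional softmax directly and compares it with the equivalent module form, nn.Softmax. The input values are made up for the example:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up input: 2 rows of 3 scores each.
x = torch.tensor([[1.0, 2.0, 3.0],
                  [1.0, 1.0, 1.0]])

# Functional form: a plain function call, nothing to construct first.
probs = F.softmax(x, dim=1)

# Module form: construct the object once, then call it.
probs_module = nn.Softmax(dim=1)(x)

# Softmax normalizes along dim=1, so each row sums to 1.
row_sums = probs.sum(dim=1)
print(row_sums)
```

Since softmax has no parameters, the two forms differ only in calling style.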

There are also functional versions of the various stateful network modules. In that case, you have to pass in the state yourself. For Linear, it would be something like:

x = torch.nn.functional.linear(x, weights_tensor)

(This function does exist: torch.nn.functional.linear takes the input, the weight tensor, and an optional bias tensor.)
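
Here is a small sketch of what "passing in the state yourself" looks like. It borrows the weight and bias from an nn.Linear module and feeds them to the functional version; the shapes and seed are arbitrary choices for the example:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

layer = nn.Linear(3, 2)   # stateful: owns a 2x3 weight and a length-2 bias
x = torch.randn(4, 3)     # arbitrary batch of 4 inputs

# Module call: uses the parameters stored inside the layer.
out_module = layer(x)

# Functional call: the same computation, with the parameters passed explicitly.
out_functional = F.linear(x, layer.weight, layer.bias)

same = torch.allclose(out_module, out_functional)
print(same)
```

Both calls compute x @ weight.T + bias; the only difference is who holds the tensors.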

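torch.autograd.Function is the lower-level piece: it defines an operation's forward computation together with its backward (gradient) computation, which is what makes an operation differentiable. You can ignore it for now; I haven't really looked at it myself. You would only need it if you wanted to create a new operation with a hand-written backward pass, and often not even then, since anything composed from existing differentiable operations gets its gradient automatically.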
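
For the curious, a minimal sketch of a custom torch.autograd.Function follows. The Square class is a hypothetical toy example (not something from the library), and in practice x * x would already be differentiable without any of this:

```python
import torch

class Square(torch.autograd.Function):
    """Toy operation y = x^2 with a hand-written gradient."""

    @staticmethod
    def forward(ctx, x):
        # Save the input; backward needs it to compute the gradient.
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Chain rule: dL/dx = dL/dy * dy/dx, with dy/dx = 2x.
        return grad_output * 2 * x

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Custom Functions are invoked through .apply, not by instantiating the class.
y = Square.apply(x).sum()
y.backward()

grad = x.grad
print(grad)
```

The gradient comes out as 2 * x, matching the rule written in backward.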

^^^ The above might not be entirely complete/correct, so someone else can patch up any holes/inaccuracies, but it's approximately correct, I think.
