NN layers and autograd.Function


I was wondering: does every nn layer have a corresponding torch.autograd.Function associated with it (on the Python side)?



No, they don’t. Even at the C++ level, some of them are implemented with a single elementary Function, but many of them are composed of several.
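One way to see this for yourself is to inspect the `grad_fn` chain of a module's output. A minimal sketch (the `Affine` module below is a hypothetical toy example, not a real PyTorch layer): its forward does two elementary ops, so autograd records two backward nodes rather than one per module.

```python
import torch
import torch.nn as nn

class Affine(nn.Module):
    # hypothetical toy module: its forward uses two elementary ops (mul, add)
    def forward(self, x):
        return x * 2 + 1

x = torch.randn(3, requires_grad=True)
y = Affine()(x)

# the top grad_fn records the last op (add); its parent records the mul,
# so this single module maps to two autograd Function nodes
print(type(y.grad_fn).__name__)                       # AddBackward0
print(type(y.grad_fn.next_functions[0][0]).__name__)  # MulBackward0
```

The exact class names (`AddBackward0`, `MulBackward0`) can vary across PyTorch versions, but the point is that one `nn.Module` forward can correspond to several backward nodes.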


Thanks @albanD.

BTW, I have been trying to figure out how exactly PyTorch constructs the backward graph.
Does it append to a tree data structure while going through the forward pass?
Or does each tensor hold metadata about how it was created (I'm guessing this could be grad_fn) and from which tensors, and backtrack until reaching a source node?

Yes it’s a tree data structure :slight_smile:
You can check this graph using this package: https://github.com/szagoruyko/pytorchviz/

Is it though? :slight_smile:

Because when I checked the code (https://github.com/szagoruyko/pytorchviz/blob/46add7f2c071b6d29fc3d56e9d2d21e1c0a3af1d/torchviz/dot.py#L56), it felt like there isn’t a data structure per se; rather, each tensor keeps a reference to the function that created it, or something like that? :thinking: Isn’t that how the graph is built in the pytorchviz package?

It is an implicit tree :smiley: We don’t have a centralized structure for it.
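This "implicit tree" can be walked by hand: starting from the output's `grad_fn`, each node exposes its parents via `next_functions`, and leaf tensors that require grad show up as `AccumulateGrad` nodes. A minimal sketch (the `walk` helper is my own, not a PyTorch API):

```python
import torch

x = torch.randn(2, requires_grad=True)
y = (x * 3).sum()

# walk the implicit autograd graph from the output's grad_fn;
# there is no centralized structure, just node-to-node links
def walk(fn, depth=0):
    if fn is None:
        return
    print("  " * depth + type(fn).__name__)
    for next_fn, _ in fn.next_functions:
        walk(next_fn, depth + 1)

walk(y.grad_fn)
# SumBackward0
#   MulBackward0
#     AccumulateGrad
```

This is essentially what pytorchviz does to draw the graph: it recursively follows `grad_fn` and `next_functions` from the outputs.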

Exactly! thanks! :slight_smile:

@albanD I have another follow up question.

I was wondering whether the forward activations are exposed at the nn.Module level.
I have seen that activations need to be explicitly saved to ctx when using autograd.Function. But I am wondering how PyTorch keeps the activations when someone uses the nn.Module API?

The autograd engine works below the nn.Module level.
So whether you use a regular Python function or an nn.Module, autograd sees no difference and will save the required values automatically.
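To illustrate the contrast: with a custom `torch.autograd.Function` you call `ctx.save_for_backward` yourself, while the equivalent built-in ops save the same activation internally with no user code. A minimal sketch showing both paths produce the same gradient:

```python
import torch

# explicit saving: a hand-written Function for y = x^2
class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # activation saved by hand
        return x * x

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_out

a = torch.randn(3, requires_grad=True)
b = a.detach().clone().requires_grad_(True)

Square.apply(a).sum().backward()  # uses our explicit ctx.save_for_backward
(b * b).sum().backward()          # built-in op: autograd saves x for us

print(torch.allclose(a.grad, b.grad))  # True
```

The second `backward()` works even though we never touched any `ctx`, because each built-in op's autograd node stashes whatever it needs when it runs in the forward pass.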