Both torch.nn and torch.nn.functional provide operations such as Conv2d, max pooling, ReLU, etc. However, much public code defines the Conv and Linear layers in a class's __init__ and then calls them together with ReLU and pooling in forward(). Is there a good reason for that?
My guess is that Conv and Linear hold learnable parameters (wrapping the corresponding functional calls), so they are defined in __init__ as members of the class, while ReLU and pooling, which have no learnable parameters, can simply be called in the forward() method. Is that the reason?
That also makes it easier to inspect the weight values at run time, e.g. net.conv1.weight.data.
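For reference, here is a minimal sketch of the pattern I am asking about (the layer names and sizes are just illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers with learnable parameters are defined as members in __init__
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)
        self.fc1 = nn.Linear(16 * 14 * 14, 10)

    def forward(self, x):
        # Parameter-free ops (ReLU, pooling) are called via the functional API
        x = F.max_pool2d(F.relu(self.conv1(x)), kernel_size=2)
        x = x.flatten(1)
        return self.fc1(x)

net = Net()
out = net(torch.randn(1, 1, 28, 28))   # -> shape (1, 10)
print(net.conv1.weight.data.shape)     # access the weights at run time
```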
Yes, you are spot on. The difference between torch.nn and torch.nn.functional is largely a matter of convenience and taste: torch.nn is more convenient for operations that have learnable parameters, since the module creates and registers those parameters for you.
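To make the trade-off concrete, here is a sketch of what using only the functional API for a parametered layer looks like (shapes and names are just illustrative): you have to create, register, and pass the weights yourself on every call.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FunctionalConvNet(nn.Module):
    def __init__(self):
        super().__init__()
        # With the functional API you create and register the parameters yourself
        self.conv_weight = nn.Parameter(torch.randn(16, 1, 3, 3) * 0.1)
        self.conv_bias = nn.Parameter(torch.zeros(16))

    def forward(self, x):
        # ...and pass them explicitly on every call
        x = F.conv2d(x, self.conv_weight, self.conv_bias, padding=1)
        return F.relu(x)

net = FunctionalConvNet()
print(net(torch.randn(1, 1, 28, 28)).shape)  # -> (1, 16, 28, 28)
```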
torch.nn is a namespace that contains the module classes as well as the functional API. torch.autograd.Function can be used to define a new operation with a custom forward and backward pass, as described here.
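As a rough illustration of that last point, here is a hand-rolled ReLU written as a torch.autograd.Function (purely to show the mechanics of a custom forward/backward, not something you would need in practice):

```python
import torch

class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        # Save the input so the backward pass can mask the gradient
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        (input,) = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

x = torch.randn(4, requires_grad=True)
y = MyReLU.apply(x).sum()
y.backward()
print(x.grad)   # gradient is 1 where x > 0, else 0
```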