Advantages of using an nn.Module

Hello, I am trying to understand conceptually what advantages, in terms of speed, come from using an nn.Module.

What is the difference between subclassing nn.Module and calling functions such as nn.Conv2d, nn.Linear, etc. directly on tensors? Assuming that all tensors are on a CUDA device, are the two equivalent in terms of speed, and is nn.Module then just a convenience class?



There is no benefit in terms of speed. nn.Module is mostly for convenience:

  • Easy to move many parameters to the right device at once
  • Easy to hold and work with these parameters
  • Easy to interface with the optimizer
  • etc
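
To make the points above concrete, here is a minimal sketch comparing the two styles. The TinyNet class, its layer sizes, and the functional_forward helper are made up for illustration; the functional version uses torch.nn.functional.linear, which is the same operation nn.Linear applies in its forward pass, so the two paths do the same math and differ only in who keeps track of the parameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical toy model, used only for illustration.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))

# Module style: one call moves every parameter, and the optimizer
# gets all of them through model.parameters().
model = TinyNet().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Functional style: we hold the tensors ourselves. Here they are
# copies of the module's parameters so the outputs can be compared.
w1 = model.fc1.weight.detach().clone().requires_grad_(True)
b1 = model.fc1.bias.detach().clone().requires_grad_(True)
w2 = model.fc2.weight.detach().clone().requires_grad_(True)
b2 = model.fc2.bias.detach().clone().requires_grad_(True)

def functional_forward(x):
    # Same operations as TinyNet.forward, with explicit weights.
    return F.linear(F.relu(F.linear(x, w1, b1)), w2, b2)

# Every tensor has to be listed by hand for the optimizer.
manual_optimizer = torch.optim.SGD([w1, b1, w2, b2], lr=0.1)

x = torch.randn(3, 4, device=device)
same_output = torch.allclose(model(x), functional_forward(x))
num_param_tensors = len(list(model.parameters()))
print(same_output, num_param_tensors)
```

Both versions run the same underlying kernels, so any difference is in bookkeeping rather than speed: with the module, .to(device), .parameters() and the optimizer setup are each one line no matter how many layers you add.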