Is nn.ReLU() more computationally heavy than F.relu()?

Huy_Ngo · January 9, 2020, 1:35pm

As far as I understand, nn.ReLU() is a layer that has weights and a bias, whereas F.relu is just an activation function. Doesn’t that make nn.ReLU() a bit more computationally heavy than F.relu, because the optimizer has to update the redundant weights and bias for that layer too?

albanD · January 9, 2020, 3:10pm

Hi,

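nn.ReLU() is a layer, but it has no weights or bias, so there is nothing for the optimizer to update.
The two compute exactly the same thing: the forward of nn.ReLU simply calls F.relu.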
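
A minimal sketch to check both claims yourself, assuming a working PyTorch install (the variable names are only illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

relu_layer = nn.ReLU()

# Count the learnable parameters of the module version.
# nn.ReLU registers none, so the optimizer has nothing to update for it.
num_params = sum(p.numel() for p in relu_layer.parameters())
print(num_params)

# Apply both versions to the same input and compare element by element.
x = torch.randn(4, 3)
same_output = torch.equal(relu_layer(x), F.relu(x))
print(same_output)
```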

The nn.Module version is convenient when you want to add it directly into an nn.Sequential() construct, for example.
The functional version is useful when you write a custom forward and just want to apply a relu.
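
Here is a rough sketch of the two styles side by side. The layer sizes and the TinyNet class are made up for illustration; the weights are copied across only so the outputs can be compared:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Module style: nn.ReLU() is dropped straight into nn.Sequential.
seq_model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))


# Functional style: F.relu is applied inside a custom forward.
class TinyNet(nn.Module):  # hypothetical example network
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.fc2 = nn.Linear(16, 2)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))


custom_model = TinyNet()

# Copy the Linear weights so both models hold identical parameters.
custom_model.fc1.load_state_dict(seq_model[0].state_dict())
custom_model.fc2.load_state_dict(seq_model[2].state_dict())

x = torch.randn(5, 8)
outputs_match = torch.allclose(seq_model(x), custom_model(x))
print(outputs_match)
```

Both models also have the same number of parameters, since the ReLU contributes none in either style.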

Huy_Ngo · January 9, 2020, 5:17pm

Thank you for your answer.

gtimothee · October 29, 2020, 12:45pm

Be warned: for my search “nn.ReLU or F.relu?”, Google shows the snippet "relu() is a layer that has weights and bias whereas F.relu is just a activation function." rather than the answer to @Huy_Ngo's question, so anyone who does not click through to the page will see a wrong answer.