nn.Linear default weight initialisation assumes leaky ReLU activation

The intent of the original code isn't to assume leaky ReLU; it does initialization according to the paper Efficient Backprop (LeCun et al., 1998). It just so happens that the initialization from that paper can also be implemented as a Kaiming uniform draw with a leaky ReLU assumption.
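A minimal sketch of why the two coincide, assuming the standard Kaiming gain formula for leaky ReLU, `sqrt(2 / (1 + a^2))`: with negative slope `a = sqrt(5)` the Kaiming-uniform bound `gain * sqrt(3 / fan_in)` collapses to the LeCun-style bound `1 / sqrt(fan_in)`. The `fan_in` value below is arbitrary, chosen just for illustration.

```python
import math
import torch
import torch.nn as nn

fan_in = 64
linear = nn.Linear(fan_in, 32)

# Kaiming gain for leaky ReLU with negative slope a = sqrt(5):
# sqrt(2 / (1 + 5)) = sqrt(1/3), so the uniform bound
# gain * sqrt(3 / fan_in) reduces to 1 / sqrt(fan_in),
# i.e. the LeCun-style uniform bound from Efficient Backprop.
a = math.sqrt(5)
gain = math.sqrt(2.0 / (1.0 + a ** 2))
kaiming_bound = gain * math.sqrt(3.0 / fan_in)
lecun_bound = 1.0 / math.sqrt(fan_in)
assert math.isclose(kaiming_bound, lecun_bound)

# This mirrors the kaiming_uniform_ call that nn.Linear's
# reset_parameters uses for the weight tensor:
nn.init.kaiming_uniform_(linear.weight, a=math.sqrt(5))
print(linear.weight.abs().max().item() <= lecun_bound)  # True
```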

The context is in the original GitHub PR, but I suppose it wouldn't have hurt to write a GitHub comment. Sorry for the misunderstanding.
