How to use torch.nn.init.calculate_gain?

Hi all,

The official documentation describes `torch.nn.init.calculate_gain(nonlinearity, param=None)`, and I am quite confused about how to choose the `nonlinearity` parameter:

  1. If I have a network structure like this:
    (Conv2D -> BN -> LeakyReLU) -> (Conv2D -> BN -> LeakyReLU) -> (Conv2D -> BN -> LeakyReLU)

    How do I choose which option to use? Should it be:

    nn.init.xavier_normal_(m.weight.data, gain=nn.init.calculate_gain('conv2d'))
    

    or

    nn.init.xavier_normal_(m.weight.data, gain=nn.init.calculate_gain('leaky_relu'))
    

    ?

  2. What about if I have RNN layers in my network (maybe GRU or LSTM)?
    Should I use “sigmoid”, since the outputs of GRU & LSTM are passed through a sigmoid function?

    nn.init.xavier_normal_(m.weight.data, gain=nn.init.calculate_gain('sigmoid'))
    

    Or would “tanh” be the better choice, as follows?

    nn.init.xavier_normal_(m.weight.data, gain=nn.init.calculate_gain('tanh'))
    
  3. Is it good enough to use the default gain for all kinds of layers?
    (regardless of Conv{1, 2, 3}D, RNN, etc.)

Many thanks!

  1. `conv2d` is linear; `leaky_relu` is your nonlinearity in this case, so use `calculate_gain('leaky_relu')`.
  2. Choose based on the nonlinear activation that follows the layer. If that is sigmoid, use `'sigmoid'`.
  3. Initialization matters. It matters more in certain tasks and less in others.
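For case 1, a minimal sketch of initializing the conv weights with the `leaky_relu` gain (assuming PyTorch's default negative slope of 0.01, which you can pass as `calculate_gain`'s second argument):

```python
import torch
import torch.nn as nn

def init_weights(m):
    # Use the gain of the nonlinearity that follows the layer (LeakyReLU here),
    # not 'conv2d'. 0.01 is the assumed LeakyReLU negative slope.
    if isinstance(m, nn.Conv2d):
        gain = nn.init.calculate_gain('leaky_relu', 0.01)
        nn.init.xavier_normal_(m.weight, gain=gain)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

model = nn.Sequential(
    nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.LeakyReLU(),
    nn.Conv2d(16, 16, 3), nn.BatchNorm2d(16), nn.LeakyReLU(),
)
model.apply(init_weights)  # applies init_weights recursively to every submodule
```

Note that `m.weight` can be passed directly to the init functions; going through `.data` as in the snippets above is no longer needed.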
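For case 2, one common convention (a sketch, not a definitive rule) is to apply the `tanh` gain to all weight matrices of the recurrent layer, since tanh is the activation applied to the candidate hidden state in GRU/LSTM:

```python
import torch
import torch.nn as nn

gru = nn.GRU(input_size=10, hidden_size=20, num_layers=2)

for name, param in gru.named_parameters():
    if 'weight' in name:
        # weight_ih_l* / weight_hh_l* are 2-D, so xavier_normal_ applies.
        # Using the tanh gain for all of them is a convention, not a rule:
        # the gates themselves use sigmoid.
        nn.init.xavier_normal_(param, gain=nn.init.calculate_gain('tanh'))
    elif 'bias' in name:
        nn.init.zeros_(param)
```

The same loop works for `nn.LSTM`; its parameters follow the same `weight_*`/`bias_*` naming scheme.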