For the convolution (https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/conv.py), the implementation is:

```
def reset_parameters(self):
    n = self.in_channels
    for k in self.kernel_size:
        n *= k
    stdv = 1. / math.sqrt(n)
    self.weight.data.uniform_(-stdv, stdv)
    if self.bias is not None:
        self.bias.data.uniform_(-stdv, stdv)
```

which I take to mean `n = nb_chan * k_1 * k_2`. However, why isn’t it `n = nb_chan + k_1 + k_2`? What is wrong with the sum?
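To make the two candidates concrete, here is a quick sketch (the layer sizes are my own example values, not from the PyTorch source) of what each formula gives for a small conv layer:

```python
import math

# Hypothetical example: a 2-D conv with 3 input channels and a 5x5 kernel.
in_channels = 3
kernel_size = (5, 5)

# The product, as computed by the loop in reset_parameters: one weight
# per input channel per kernel position feeds each output unit.
n_product = in_channels
for k in kernel_size:
    n_product *= k                      # 3 * 5 * 5 = 75

# The sum I would have expected instead:
n_sum = in_channels + sum(kernel_size)  # 3 + 5 + 5 = 13

print(1. / math.sqrt(n_product))  # ~0.1155
print(1. / math.sqrt(n_sum))      # ~0.2774
```

So the two choices give quite different bounds for the uniform init.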

My question is based on the fact that for the linear layer, `n` seems to be the total number of “in units”:

```
def __init__(self, in_features, out_features, bias=True):
    super(Linear, self).__init__()
    self.in_features = in_features
    self.out_features = out_features
    self.weight = Parameter(torch.Tensor(out_features, in_features))
    if bias:
        self.bias = Parameter(torch.Tensor(out_features))
    else:
        self.register_parameter('bias', None)
    self.reset_parameters()
```

but my notion of “total” seems to be captured better by a sum than by a product…
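For the linear layer the two readings happen to coincide: the weight matrix has shape `(out_features, in_features)`, so `in_features` is both the “total number of in units” and the count of weight entries feeding each output unit. A tiny sketch (shapes are my own choice, not from the source):

```python
import math

# Hypothetical example: Linear(in_features=100, out_features=4).
in_features, out_features = 100, 4

# Weight shape is (out_features, in_features), so each output unit
# is fed by exactly in_features weights.
weights_per_output_unit = in_features

stdv = 1. / math.sqrt(weights_per_output_unit)
print(stdv)  # 1/sqrt(100) = 0.1
```

It is only for the conv layer, where the weight count per output unit and my naive “total” diverge, that the question arises.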