For example,

```
import torch.nn as nn

layer = nn.Linear(10, 20)  # in_features=10, out_features=20
weight = layer.weight      # shape: (20, 10)
print(weight)
```

What distribution do the weights follow?

Is it a normal distribution?

Let’s check the source code.

```
def __init__(self, in_features, out_features, bias=True):
    ...
    self.weight = Parameter(torch.Tensor(out_features, in_features))

def reset_parameters(self):
    # weight has shape (out_features, in_features), so size(1) is in_features
    stdv = 1. / math.sqrt(self.weight.size(1))
    self.weight.data.uniform_(-stdv, stdv)
    ...
```

So the answer is: the weights are drawn from a uniform distribution between `-1/sqrt(in_features)` and `1/sqrt(in_features)`, not a normal distribution.
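
If you want to confirm this empirically, a quick check along the following lines should work. It is only a sketch: the seed and the small tolerance are arbitrary choices made here, and it assumes your installed PyTorch version still initializes `nn.Linear` weights within this bound. It compares the weights against `1/sqrt(in_features)` and against the standard deviation of a uniform distribution on `[-b, b]`, which is `b/sqrt(3)`.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

in_features, out_features = 10, 20
layer = nn.Linear(in_features, out_features)
weight = layer.weight.detach()  # shape: (out_features, in_features)

# Bound taken from reset_parameters: 1 / sqrt(in_features)
bound = 1.0 / math.sqrt(in_features)

# Every weight should lie inside [-bound, bound]
# (a tiny tolerance is added here for float32 rounding).
within_bound = bool((weight.abs() <= bound + 1e-6).all())

# For U(-b, b) the standard deviation is b / sqrt(3).
# With only 200 weights the sample value matches only approximately.
expected_std = bound / math.sqrt(3)
sample_std = weight.std().item()

print(f"bound        = {bound:.4f}")
print(f"min, max     = {weight.min().item():.4f}, {weight.max().item():.4f}")
print(f"within bound = {within_bound}")
print(f"sample std   = {sample_std:.4f} (expected about {expected_std:.4f})")
```

A normal distribution would occasionally produce values outside any fixed bound, so a hard cutoff at `±bound` together with a standard deviation near `bound/sqrt(3)` is consistent with uniform sampling.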