Weight initialization variance (CNN) different from expected?

I wanted to measure the variance of the single weight of a Conv2d layer with in_channels=1, out_channels=1, and kernel_size=1:

    import torch.nn as nn

    samplesize = 10000
    variance = 0.0
    for i in range(samplesize):
        # each fresh Conv2d(1, 1, kernel_size=1) has exactly one weight
        layer = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=1)
        # the initialization has zero mean, so E[w^2] estimates the variance
        variance += layer.weight.item() ** 2

    print(variance / samplesize)

But this seems to converge to 1/3.
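
(For reference, a slightly more direct way to see the same number, as a quick sketch: collect the single weights from many freshly initialized layers into one tensor and call .var().)

    import torch
    import torch.nn as nn

    # stack the single weight from 10,000 freshly initialized layers, then take the variance
    weights = torch.stack([
        nn.Conv2d(in_channels=1, out_channels=1, kernel_size=1).weight.detach().flatten()
        for _ in range(10_000)
    ])
    print(weights.var().item())  # also ~0.33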

In the reset_parameters method of torch/nn/modules/conv.py, the weight is initialized with init.kaiming_uniform_(self.weight, a=math.sqrt(5)), and kaiming_uniform_ sets nonlinearity to 'leaky_relu' by default, so it ends up running this code:

    gain = calculate_gain(nonlinearity, a)
    std = gain / math.sqrt(fan)
    bound = math.sqrt(3.0) * std  # Calculate uniform bounds from standard deviation
    with torch.no_grad():
        return tensor.uniform_(-bound, bound, generator=generator)

I think calculate_gain returns ~sqrt(2), and fan = 1, so bound = sqrt(6). But the variance of a uniform distribution U(-sqrt(6), sqrt(6)) should be (sqrt(6))^2 / 3 = 6/3 = 2, right? I'm confused about why I'm getting 1/3 for the variance. Did I possibly make a math or code mistake somewhere?
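
(As a sanity check on just that formula, here is a quick sketch: sampling a large uniform tensor on (-sqrt(6), sqrt(6)) does give a variance of about 2, so the uniform-variance formula itself is not where things go wrong.)

    import math
    import torch

    # variance of U(-b, b) should be b^2 / 3; with b = sqrt(6) that is 2
    b = math.sqrt(6)
    samples = torch.empty(1_000_000).uniform_(-b, b)
    print(samples.var().item())  # ~2.0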

I also noticed that when reset_parameters initializes the bias, it ultimately uses a uniform distribution U(-1, 1), and the resulting variance matches (1^2)/3 = 1/3. But I get this 1/3 result for the variance of both layer.weight and layer.bias, so I wasn't sure what was going on. Thanks!
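
(To check the bias side, here is a sketch using the private helper torch.nn.init._calculate_fan_in_and_fan_out, which is roughly what reset_parameters uses to pick the bias bound 1/sqrt(fan_in); for this layer fan_in = in_channels * kernel_height * kernel_width = 1.)

    import math
    import torch.nn as nn
    from torch.nn.init import _calculate_fan_in_and_fan_out  # private helper, may change between versions

    layer = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=1)
    fan_in, _ = _calculate_fan_in_and_fan_out(layer.weight)
    print(fan_in)                  # 1
    print(1 / math.sqrt(fan_in))   # 1.0, so bias ~ U(-1, 1) with variance 1/3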

(I realized the answer: when a=sqrt(5) is passed into kaiming_uniform_, nonlinearity is set to 'leaky_relu', and the first elif in the leaky_relu branch of calculate_gain sets negative_slope = sqrt(5), so the gain is sqrt(2 / (1 + 5)) = sqrt(1/3). That makes std = sqrt(1/3) / sqrt(1) and bound = sqrt(3) * sqrt(1/3) = 1, i.e. the weight is actually drawn from U(-1, 1), whose variance is 1/3.)
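
(A quick way to verify that, as a sketch calling torch.nn.init.calculate_gain directly with the same arguments kaiming_uniform_ passes:)

    import math
    from torch.nn.init import calculate_gain

    gain = calculate_gain('leaky_relu', math.sqrt(5))  # sqrt(2 / (1 + 5)) = sqrt(1/3), about 0.577
    std = gain / math.sqrt(1)                          # fan_in = 1 for this layer
    bound = math.sqrt(3.0) * std                       # 1.0
    print(gain, bound)                                 # weight ~ U(-1, 1), variance 1/3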