How are the weights initialized in torch.nn.Conv2d?

Hi, I am new to PyTorch.
When I create a layer with torch.nn.Conv2d, I see that its weight tensor is already initialized in some way; its values do not look like those of an uninitialized tensor (see the attached screenshot).

Could you explain how these weights are initialized? I could not find any hint in the docs…


If you create a tensor with torch.Tensor(4, 4), it is not initialized; it just contains whatever garbage happened to be in memory. If you want an actual random initialization, use torch.rand:

>>> torch.rand(4,4)
tensor([[0.8693, 0.5824, 0.3661, 0.1016],
        [0.4629, 0.7107, 0.1525, 0.9696],
        [0.0603, 0.8134, 0.3207, 0.8813],
        [0.0799, 0.1383, 0.3611, 0.3585]])
>>> torch.Tensor(4,4)
tensor([[-0.0000,  0.0000, -0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  0.0000],
        [-0.0000,  0.0000, -0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  1.3564]])
>>> torch.Tensor(4,4)
tensor([[-0.0000,  0.0000, -0.0000,  0.0000],
        [ 0.0000,  0.0000, -0.0000,  0.0000],
        [-0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0799,  0.1383,  0.3611,  0.3585]])
>>> torch.Tensor(4,4)
tensor([[-0.0000,  0.0000, -0.0000,  0.0000],
        [ 0.0000,  0.0000, -0.0000,  0.0000],
        [-0.0000,  0.0000,  0.0000,  0.0000],
        [ 0.0000,  0.0000,  0.0000,  1.3564]])
>>> torch.rand(4,4)
tensor([[0.9820, 0.4046, 0.1679, 0.9906],
        [0.3627, 0.8029, 0.2323, 0.5281],
        [0.2631, 0.0574, 0.6574, 0.3545],
        [0.8812, 0.9527, 0.6162, 0.3369]])

But I can see why you are confused: the values of a fresh Conv2d layer do look as if they were randomly initialized.

This blog has a good writeup:

The docs usually don’t mention the initialization method, but if you look at PyTorch’s source code (https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/conv.py#L41), you can see the weights are initialized with Kaiming uniform initialization.
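To make this concrete, here is a minimal sketch (the layer sizes are arbitrary, and the exact default arguments of the Kaiming scheme can vary between PyTorch versions): a freshly constructed nn.Conv2d already comes with initialized weight and bias tensors, and you can always override the defaults yourself with the functions in torch.nn.init.

import torch
import torch.nn as nn

# A freshly constructed Conv2d already has initialized weights
# (Kaiming uniform by default), unlike a raw torch.Tensor(...),
# which is just uninitialized memory.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
print(conv.weight.shape)                                # torch.Size([16, 3, 3, 3])
print(conv.weight.min().item(), conv.weight.max().item())

# If you prefer a different scheme, re-initialize the parameters
# yourself with torch.nn.init, e.g. Xavier/Glorot uniform:
nn.init.xavier_uniform_(conv.weight)
if conv.bias is not None:
    nn.init.zeros_(conv.bias)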


Your answer is exactly what I was wondering about. Thanks!!!
