Assume I have a network with 2 conv layers (cv1, cv2) and 1 batch norm layer (bn), connected in the following way:
cv1 --> cv2 --> cv3 and cv1 --> cv3
cv1 has 64 output channels, cv2 has 32 output channels, and bn has 64 + 32 = 96 input channels.
Can I say that the weight with index 63 is applied to channel 64 of cv1, and that the weight with index 64 is applied to channel 1 of cv2?
You are not using bn in your figure, so I assume cv3 should be the batchnorm layer?
The affine parameters (weight and bias) of the batchnorm layer are applied per input channel, so your assumption should be correct, as shown in this code snippet:
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(32 + 64).eval()
# change the weight and bias to see an effect
# use case 1
# note: here x1 has 32 channels and x2 has 64, i.e. the order is reversed
# relative to your question, but the per-channel mapping works the same way
x1 = torch.randn(1, 32, 4, 4)
x2 = torch.randn(1, 64, 4, 4)
x = torch.cat((x1, x2), dim=1)
out = bn(x)
# manual approach
out1 = x1 * bn.weight[:32][None, :, None, None] + bn.bias[:32][None, :, None, None]
out2 = x2 * bn.weight[32:][None, :, None, None] + bn.bias[32:][None, :, None, None]
out_manual = torch.cat((out1, out2), dim=1)
# print absolute error
print((out - out_manual).abs().max())
> tensor(3.3379e-05, grad_fn=<MaxBackward1>)
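The small residual error comes from skipping the normalization step: in eval mode a freshly created BatchNorm2d normalizes with its running statistics (running_mean = 0, running_var = 1 at initialization), so the manual version above differs from the module output by the 1/sqrt(1 + eps) factor. Including the running statistics reproduces the output almost exactly (a minimal sketch with an assumed random input):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(96).eval()

x = torch.randn(1, 96, 4, 4)
out = bn(x)

# eval-mode batchnorm normalizes with the running statistics before
# applying the per-channel affine transform
mean = bn.running_mean[None, :, None, None]
var = bn.running_var[None, :, None, None]
out_manual = (x - mean) / (var + bn.eps).sqrt() * bn.weight[None, :, None, None] \
    + bn.bias[None, :, None, None]

print((out - out_manual).abs().max())  # only float rounding error remains
```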
Yes, that's right, cv3 should be bn.
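To double-check the channel mapping under that wiring, here is a small sketch (the input size and kernel sizes are assumed, the channel counts are taken from the question): zeroing bn.weight[63] flattens only the last cv1 channel, while channel 64 (the first cv2 channel) is unaffected.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
cv1 = nn.Conv2d(3, 64, 3, padding=1)   # 64 output channels
cv2 = nn.Conv2d(64, 32, 3, padding=1)  # 32 output channels
bn = nn.BatchNorm2d(96).eval()

x = torch.randn(1, 3, 8, 8)
a = cv1(x)                       # 64 channels
b = cv2(a)                       # 32 channels
cat = torch.cat((a, b), dim=1)   # cv1 channels occupy indices 0..63

with torch.no_grad():
    bn.weight[63] = 0.0  # zero the scale of the last cv1 channel

out = bn(cat)
# channel 63 is now constant (equal to bn.bias[63]); channel 64,
# the first cv2 channel, still varies
print(out[0, 63].std(), out[0, 64].std())
```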
Alright, thank you!