Assume I have a network with 2 conv layers (cv1, cv2) and 1 batch norm layer (bn), connected in the following way:
cv1 --> cv2 --> cv3 and cv1 --> cv3
cv1 has 64 output channels, cv2 has 32 output channels, and bn has 64 + 32 = 96 input channels.
Can I say that the weight with index 63 is applied to channel 64 of cv1, and that the weight with index 64 is applied to channel 1 of cv2?
You are not using bn in your figure, so I assume cv3 should be the batchnorm layer?
The affine parameters (weight and bias) of the batchnorm layer are applied per input channel, so your assumption should be correct, as shown in this code snippet:
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(32 + 64).eval()
# change the weight and bias to see an effect
# use case 1
# note: here x1 has 32 channels and x2 has 64, i.e. the order is reversed
# relative to your question, but the per-channel mapping works the same way
x1 = torch.randn(1, 32, 4, 4)
x2 = torch.randn(1, 64, 4, 4)
x = torch.cat((x1, x2), dim=1)
out = bn(x)
# manual approach
out1 = x1 * bn.weight[:32][None, :, None, None] + bn.bias[:32][None, :, None, None]
out2 = x2 * bn.weight[32:][None, :, None, None] + bn.bias[32:][None, :, None, None]
out_manual = torch.cat((out1, out2), dim=1)
# print absolute error
print((out - out_manual).abs().max())
> tensor(3.3379e-05, grad_fn=<MaxBackward1>)
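The small residual error comes from skipping the normalization step: in eval mode a freshly created BatchNorm2d normalizes with its running statistics (running_mean = 0, running_var = 1 at initialization), so the manual version above differs from the module output by the 1/sqrt(1 + eps) factor. Including the running statistics reproduces the output almost exactly (a minimal sketch with an assumed random input):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm2d(96).eval()

x = torch.randn(1, 96, 4, 4)
out = bn(x)

# eval-mode batchnorm normalizes with the running statistics before
# applying the per-channel affine transform
mean = bn.running_mean[None, :, None, None]
var = bn.running_var[None, :, None, None]
out_manual = (x - mean) / (var + bn.eps).sqrt() * bn.weight[None, :, None, None] \
    + bn.bias[None, :, None, None]

print((out - out_manual).abs().max())  # only float rounding error remains
```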
Yes, that's right, cv3 should be bn.
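To double-check the channel mapping under that wiring, here is a small sketch (the input size and kernel sizes are assumed, the channel counts are taken from the question): zeroing bn.weight[63] flattens only the last cv1 channel, while channel 64 (the first cv2 channel) is unaffected.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
cv1 = nn.Conv2d(3, 64, 3, padding=1)   # 64 output channels
cv2 = nn.Conv2d(64, 32, 3, padding=1)  # 32 output channels
bn = nn.BatchNorm2d(96).eval()

x = torch.randn(1, 3, 8, 8)
a = cv1(x)                       # 64 channels
b = cv2(a)                       # 32 channels
cat = torch.cat((a, b), dim=1)   # cv1 channels occupy indices 0..63

with torch.no_grad():
    bn.weight[63] = 0.0  # zero the scale of the last cv1 channel

out = bn(cat)
# channel 63 is now constant (equal to bn.bias[63]); channel 64,
# the first cv2 channel, still varies
print(out[0, 63].std(), out[0, 64].std())
```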
Alright, thank you!