Number of activations for Linear and Conv2d layer: comparison

How do you calculate the number of activations for:

import torch
import torch.nn as nn

l = nn.Linear(10, 100)
c = nn.Conv2d(1, 2, kernel_size=(3, 4))

I would say 1000 for l, if this is the number of weights, but I am not sure.

The linear layer is pretty straightforward, as the output activation will have the shape [batch_size, *, out_features].
The activation of the conv layer is a bit trickier, as its shape depends on the spatial size of your input:

x = torch.randn(2, 10)
out = l(x)
print(out.nelement())  # batch_size * out_features
> 200

x = torch.randn(2, 1, 24, 24)
out = c(x)
print(out.nelement())  # batch_size * out_channels * (h - (kH - 1)) * (w - (kW - 1))
> 1848
> torch.Size([2, 2, 22, 21])
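
The two counts above can also be reproduced with plain arithmetic (a sketch assuming the same shapes as in the snippets: batch size 2, Linear(10, 100), and Conv2d(1, 2, kernel_size=(3, 4)) on a 1x24x24 input, with no padding, stride, or dilation):

```python
batch_size = 2

# Linear: each sample produces out_features output values
out_features = 100
lin_acts = batch_size * out_features
print(lin_acts)  # 200

# Conv2d without padding/stride/dilation: each spatial dim shrinks by (kernel - 1)
out_channels, kH, kW = 2, 3, 4
h, w = 24, 24
h_out = h - (kH - 1)  # 22
w_out = w - (kW - 1)  # 21
conv_acts = batch_size * out_channels * h_out * w_out
print(conv_acts)  # 1848
```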

Unbelievable, I never thought this calculation would be so interesting.
It shows that once we involve padding, stride, and dilation it really becomes complex:

c = nn.Conv2d(1, 2, kernel_size=(3,4), padding=1, stride=2, dilation=2)
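
For the general case, the docs give the full output-size formula per spatial dimension. A sketch applying it to this layer (the 24x24 input size is an assumption, carried over from the earlier snippet):

```python
import math

def conv_out_size(size, kernel, padding=0, stride=1, dilation=1):
    # General Conv2d output formula for one spatial dimension:
    # floor((size + 2*padding - dilation*(kernel - 1) - 1) / stride + 1)
    return math.floor((size + 2 * padding - dilation * (kernel - 1) - 1) / stride + 1)

# Conv2d(1, 2, kernel_size=(3, 4), padding=1, stride=2, dilation=2) on a 24x24 input
h_out = conv_out_size(24, 3, padding=1, stride=2, dilation=2)
w_out = conv_out_size(24, 4, padding=1, stride=2, dilation=2)
print(h_out, w_out)  # 11 10
```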

I see now that the batch size plays a key role, but I wonder why only the weight w is taken into account.
Why is the bias b ignored?

(I know that in some cases b can be ignored.)

You can find the actual computation for the output shape in the docs.

I'm a bit confused. Are you looking for the number of parameters (weight + bias) or the output activation shape?

I was unaware of what the number of activations means until now. :frowning:

In case you would like to count the number of parameters, you could use:

# Sum the element counts of all parameter tensors (weights + biases)
lin_params = sum(param.nelement() for param in l.parameters())
conv_params = sum(param.nelement() for param in c.parameters())

Tested and it works great. Thanks.