# How to collapse one convolution and a dense layer into just one linear layer?

Hello,

I would like to collapse a 2d convolution and a dense layer into a single linear layer.

For example:

Given the network:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 5, 5, 2, 1)
        self.conv2 = nn.Conv2d(5, 50, 5, (2, 2), 0)
        self.fc1 = nn.Linear(1250, 100)
        self.fc2 = nn.Linear(100, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        return x
```

how to generate a network:

```python
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 5, 5, 2, 1)
        self.fcnew = nn.Linear(845, 100)
        self.fc2 = nn.Linear(100, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = torch.flatten(x, 1)
        x = self.fcnew(x)
        x = F.relu(x)
        x = self.fc2(x)
        return x
```

where the `fcnew` layer consists of the `conv2` and `fc1` layers collapsed together. How do I calculate the weights for the `fcnew` layer?

Thanks

Based strictly on what you have above, you would need to know the input image size.

https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

This formula would give you the spatial output size of the Conv2d layer:

`H_out = floor((H_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)`
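As a sketch, the formula can be checked against an actual `Conv2d`. The helper name `conv2d_out_size` is mine, and the 28x28 input is an assumption (it is the input size, e.g. MNIST, that makes the question's numbers 1250 and 845 work out):

```python
import math

import torch


def conv2d_out_size(h_in, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size of a Conv2d, per the formula in the docs."""
    return math.floor(
        (h_in + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1
    )


# conv1 from the question: Conv2d(1, 5, 5, 2, 1), assuming a 28x28 input
h1 = conv2d_out_size(28, kernel_size=5, stride=2, padding=1)   # 13
# conv2 from the question: Conv2d(5, 50, 5, (2, 2), 0)
h2 = conv2d_out_size(h1, kernel_size=5, stride=2)              # 5

print(50 * h2 * h2)   # 1250, the in_features of fc1
print(5 * h1 * h1)    # 845, the in_features of fcnew

# cross-check the formula against PyTorch itself
y = torch.nn.Conv2d(1, 5, 5, 2, 1)(torch.zeros(1, 1, 28, 28))
print(y.shape)        # torch.Size([1, 5, 13, 13])
```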
But if you want something more flexible, you can use an `nn.AdaptiveAvgPool2d` before flattening, so that the flattened size depends only on the number of `out_channels` and not on the input image size.

Based on your new edits, it’s still not possible to determine the `fcnew` size because you haven’t specified the image input size.

Hi Bernardo!

First, for context, yes, because there is no intervening nonlinearity between
`conv2` and `fc`, the two may be replaced by (collapsed into) a single `Linear`.
This is because a `Linear` is the most general linear (technically speaking,
affine) transformation that has the given numbers of input and output
variables. So whatever net linear (technically affine) transformation `conv2`
and `fc1` generate when applied sequentially (without an intervening
nonlinearity), this transformation may be exactly reproduced (up to some
numerical round-off error) by a single `Linear`.
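This affinity is easy to check numerically: treating `conv2` followed by `fc1` as one map `f`, subtracting off `f(0)` (the net bias term) should leave an exactly linear map. A quick sketch, using the shapes from the question (randomly initialized layers, since only linearity is being tested):

```python
import torch

torch.manual_seed(0)

conv2 = torch.nn.Conv2d(5, 50, 5, 2)
fc1 = torch.nn.Linear(1250, 100)


def f(x):
    """conv2 followed by fc1, with no nonlinearity in between."""
    return fc1(torch.flatten(conv2(x), 1))


x = torch.randn(1, 5, 13, 13)
y = torch.randn(1, 5, 13, 13)
b = f(torch.zeros(1, 5, 13, 13))   # the composed affine map's bias term

# linearity check: f(x + y) - b == (f(x) - b) + (f(y) - b)
lhs = f(x + y) - b
rhs = (f(x) - b) + (f(y) - b)
print(torch.allclose(lhs, rhs, atol=1.e-5))   # True
```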

Second, why do you even want to start with a separate `conv2` and `fc1`?
Your desired `fcnew` contains fewer individual parameters than do `conv2`
and `fc1` together (largely because of the relatively large number of
`out_channels` in `conv2` that then just get combined back together by `fc1`).
My intuition is that the single `fcnew` layer will train more efficiently than
`conv2` and `fc1`, so why not just train `fcnew` from scratch, rather than build
`fcnew` from the weights of some pre-existing `conv2` and `fc1`?
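The parameter counts behind this claim can be verified directly, a quick sketch using the layer shapes from the question:

```python
import torch

conv2 = torch.nn.Conv2d(5, 50, 5, 2)    # 50*5*5*5 + 50    =   6,300 params
fc1 = torch.nn.Linear(1250, 100)        # 1250*100 + 100   = 125,100 params
fcnew = torch.nn.Linear(845, 100)       # 845*100 + 100    =  84,600 params


def count(m):
    return sum(p.numel() for p in m.parameters())


print(count(conv2) + count(fc1))   # 131400
print(count(fcnew))                # 84600
```

So the collapsed layer has roughly 36% fewer parameters than the two layers it replaces.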

Having said that, probably the easiest way to collapse `conv2` and `fc1`
together – that is, to compute the weights of your `fcnew` layer – will be
to pass a series of “single-pixel” images through `conv2` and `fc1`. The
values of the pixels of the “output” images so obtained will be, roughly
speaking, the individual values in `fcnew`’s `.weight` tensor.

Consider:

```python
>>> import torch
>>> torch.__version__
'1.13.0'
>>>
>>> _ = torch.manual_seed (2023)
>>>
>>> # shape of intermediate "input" image
>>> in_channels = 5
>>> h = 13
>>> w = h
>>>
>>> # convolution parameters
>>> out_channels = 50
>>> kernel = 5
>>> stride = 2
>>>
>>> # fully-connected parameters
>>> in_features = int (out_channels * (((h - kernel) / stride) + 1)**2)
>>> out_features = 100
>>>
>>> # create layers to collapse
>>> conv2 = torch.nn.Conv2d (in_channels, out_channels, kernel, stride)
>>> fc1 = torch.nn.Linear (in_features, out_features)
>>>
>>> # create collapsed bias from conv2 and fc1
>>> bias = fc1 (torch.flatten (conv2 (torch.zeros (1, in_channels, h, w))))
>>>
>>> # create collapsed weight from conv2 and fc1 (and bias)
>>> # batch of images, each with only a single pixel turned on
>>> n_pixels = in_channels * h * w   # number of pixels (including channels) in input image
>>> pixel_batch = torch.eye (n_pixels).reshape (n_pixels, in_channels, h, w)
>>> weight = (fc1 (torch.flatten (conv2 (pixel_batch), 1)) - bias).T
>>>
>>> # create collapsed Linear
>>> fcnew = torch.nn.Linear (n_pixels, out_features)   # Linear of correct shape
>>> # copy in collapsed weight and bias
...     _ = fcnew.weight.copy_ (weight)
...     _ = fcnew.bias.copy_ (bias)
...
>>> # check on example batch of images
>>> input = torch.randn (5, in_channels, h, w)
>>> out_two_layer = fc1 (torch.flatten (conv2 (input), 1))
>>> out_collapsed = fcnew (torch.flatten (input, 1))
>>> torch.allclose (out_collapsed, out_two_layer, atol = 1.e-6)
True
```

Best.

K. Frank
