For example, say my starting network is
Conv2d(3, 256, 1) -> Conv2d(256, 512, 4, 2, 1)
and I want to grow it on the left (input) side, so the next stage's structure should be
Conv2d(3, 128, 1) -> Conv2d(128, 256, 4, 2, 1) -> Conv2d(256, 512, 4, 2, 1)
In this case, should I slice the old conv layer out of the old network, or copy over its state_dict? Currently I have a working model where I slice off the old layer, destroy the old network, put the 2 new layers plus the kept old layer in a list, and unpack that list into nn.Sequential() to make a new network. Finally, I grab the new network's params and feed them to a new optimizer. My model works and produces good results, but I am not sure whether I am following best practice.
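
A minimal sketch of the workflow described above, assuming plain nn.Sequential models. The variable names, the choice of Adam, and the learning rate are placeholders for illustration and are not taken from the actual model. It shows both options side by side: reusing the old layer object directly, and copying its weights into a fresh layer through state_dict.

```python
import torch
import torch.nn as nn

# Starting network from the example above.
old_net = nn.Sequential(
    nn.Conv2d(3, 256, 1),
    nn.Conv2d(256, 512, 4, 2, 1),
)

# Option 1: slice off the layer to keep. Indexing a Sequential returns the
# same module object, so its trained weights come along without copying.
kept = old_net[1]

# Two fresh layers on the input side, followed by the kept layer.
new_layers = [nn.Conv2d(3, 128, 1), nn.Conv2d(128, 256, 4, 2, 1)]
new_net = nn.Sequential(*new_layers, kept)

# Option 2: build a fresh layer of the same shape and copy the weights in.
copied = nn.Conv2d(256, 512, 4, 2, 1)
copied.load_state_dict(old_net[1].state_dict())

# New optimizer over the grown network's parameters.
# Adam and lr=1e-4 are placeholder choices.
optimizer = torch.optim.Adam(new_net.parameters(), lr=1e-4)

# Quick shape check with a dummy 16x16 RGB batch.
x = torch.randn(1, 3, 16, 16)
out = new_net(x)
```

With either option the kept layer starts from the same weights. One difference worth keeping in mind is that a freshly created optimizer starts with empty state, so any per-parameter statistics the old optimizer had accumulated for the kept layer (for example Adam's moment estimates) are not carried over.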