Before using regex, I had a complicated setup with replace and split, but things still weren’t working. Hopefully I can make my code a lot simpler and easier to understand now!
But I want to save the kernel at the end of training, i.e. only the 3x3 weights, so that I can initialize the layer with this kernel in another network with requires_grad=False.
Based on the setup of nn.Conv2d, you would use 128 kernels, each with a shape of [in_channels=128, height=3, width=3]. Your layer won’t have a single 3x3 kernel, so you should instead save the layer.weight tensor (or, better, save layer.state_dict()).
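Here is a minimal sketch of that save-and-reload workflow. It assumes the layer is `nn.Conv2d(128, 128, kernel_size=3)`, which matches the shapes mentioned above but may not be your exact setup. The names `layer` and `frozen_layer` are made up for illustration, and an in-memory buffer stands in for a real checkpoint file:

```python
import io

import torch
import torch.nn as nn

# Assumed configuration: 128 input channels, 128 output channels, 3x3 kernels.
layer = nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3)
weight_shape = tuple(layer.weight.shape)  # [out_channels, in_channels, height, width]

# Save weight and bias together via the state_dict.
# A BytesIO buffer is used here in place of a file path such as "conv.pt".
buffer = io.BytesIO()
torch.save(layer.state_dict(), buffer)
buffer.seek(0)

# In the other network: create a layer with the same configuration
# and load the saved parameters into it.
frozen_layer = nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3)
frozen_layer.load_state_dict(torch.load(buffer))

# Freeze the loaded parameters so they are not updated during training.
for param in frozen_layer.parameters():
    param.requires_grad_(False)
```

If you really only want the weight tensor, saving `layer.weight.detach().clone()` and copying it into the new layer under `torch.no_grad()` should also work, but the `state_dict` route keeps the bias as well.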