Adding input to each conv layer in resnet pre-trained model

Does someone know how we can give more than one input to each conv layer, i.e., not only the tensor?

What should the additional input look like?
If it has the same shape as the “first” input, you could just append it to the batch.

It is a scalar, not the same shape.

If the conv layer has a kernel shape of 1x1 and expects in_channels=1, you could reshape the scalar value into a tensor with the shape [batch_size, 1, 1, 1] and feed it directly to the conv layer.
If that’s not the case, you won’t be able to directly feed a scalar value to it and would need to either add/copy/subtract/etc. it to another input or use another layer.
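The reshaping idea above can be sketched like this (a minimal example; the names `batch_size` and `scalar_value` are illustrative, not from the original post):

```python
import torch
import torch.nn as nn

# Hypothetical setup: a scalar "side input" fed to a 1x1 conv with in_channels=1
batch_size = 4
scalar_value = 2.5

# Expand the scalar into a tensor of shape [batch_size, 1, 1, 1]
scalar_input = torch.full((batch_size, 1, 1, 1), scalar_value)

conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=1)
out = conv(scalar_input)
print(out.shape)  # torch.Size([4, 8, 1, 1])
```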

Thanks a lot!
Actually, I have the known resnet network built from Sequential and BasicBlock modules. I just want to add a parameter that will change during training, say the padding, and I want to control it from the beginning. For example, assume that at each epoch I want to change the padding for a specific layer in the network. It looks like I need to break the network up into modules, right?

You can change some attributes such as the padding directly by setting it in the module as seen here:

model = models.resnet18()
model.conv1.register_forward_hook(lambda m, input, output: print(output.shape))

print(model.conv1.padding)
> (3, 3)

x = torch.randn(1, 3, 224, 224)
out = model(x)
> torch.Size([1, 64, 112, 112])

model.conv1.padding = (0, 0)
print(model.conv1.padding)
> (0, 0)

out = model(x)
> torch.Size([1, 64, 109, 109])

Note however, that changing the padding will also change the output activation shape as seen in the example.
While this approach works in resnet18 due to the usage of adaptive pooling layers before applying the linear layer, you might run into shape mismatches for other models.
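This per-epoch attribute assignment can be sketched with a minimal stand-in model following the same conv → adaptive pool → linear pattern as resnet (the padding schedule below is purely illustrative):

```python
import torch
import torch.nn as nn

# Minimal stand-in for the resnet pattern: conv -> adaptive pool -> linear.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=7, stride=2, padding=3),
    nn.AdaptiveAvgPool2d((1, 1)),
    nn.Flatten(),
    nn.Linear(16, 10),
)

x = torch.randn(1, 3, 224, 224)
# Hypothetical schedule: change the conv padding each "epoch"
for epoch, pad in enumerate([(3, 3), (2, 2), (0, 0)]):
    model[0].padding = pad  # plain attribute assignment, no rebuilding needed
    out = model(x)
    # the adaptive pooling layer keeps the final output shape fixed
    print(epoch, model[0].padding, out.shape)
```

The final output stays `[1, 10]` at every step, because the adaptive pooling layer normalizes the spatial size before the linear layer, mirroring why this works in resnet18.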

And what if that was a parameter? How can I change it and keep working with the new number, updating it during training? As I see it, when choosing the new number it gets stuck there. Do I need to write it as model.padding = nn.Parameter(3.0)?

No, padding cannot be a trainable parameter, if I’m not mistaken.
If you are running into shape mismatches after changing the model, you would have to change the architecture e.g. by applying an adaptive pooling layer (as done in the resnet) or by using different linear layers later in the model.
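A small sketch of the mismatch and the adaptive-pooling fix (the layer sizes are made up for illustration):

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
x = torch.randn(1, 3, 8, 8)

# "rigid" head assumes a fixed 8x8 activation; "flexible" head absorbs any size
rigid = nn.Linear(8 * 8 * 8, 10)
flexible = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))

out = conv(x)                 # [1, 8, 8, 8] -> rigid head works
rigid(out.flatten(1))

conv.padding = (0, 0)
out = conv(x)                 # now [1, 8, 6, 6] -> rigid head breaks
try:
    rigid(out.flatten(1))
except RuntimeError as e:
    print("shape mismatch:", e)

print(flexible(out).shape)    # still works: torch.Size([1, 10])
```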

The reason for the mismatch is that you are changing the shape of the intermediate activations, while the linear layers need a fixed input shape to work properly (set by in_features).


Assuming that we solve the mismatch issue, is there any chance to adapt the padding during training? Say, given the loss and a backward function for each layer, if the “padding parameter” had to be enlarged, set +1 to the padding in a given layer.

No, I don’t think the padding is trainable, at least I’m not aware of a method to learn it.
You might want to look into meta-learning, which can be used to “learn” new architectures and might work for your use case. However, I don’t think you can directly use gradient descent to optimize the padding.

Another related issue: if we set some parameter using nn.Parameter, PyTorch automatically calculates its gradients. When do we need to define a backward function?

Autograd will calculate the gradients of the loss w.r.t. all parameters used to calculate this loss during the backward() call.
Without it, there won’t be any gradients.
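A minimal sketch of this: any nn.Parameter that participates in the loss receives a .grad after backward() (the values below are just for illustration).

```python
import torch
import torch.nn as nn

w = nn.Parameter(torch.tensor(3.0))
x = torch.tensor(2.0)

loss = (w * x) ** 2   # loss = (w*x)^2 = 36
loss.backward()       # autograd computes d(loss)/dw = 2*(w*x)*x = 24
print(w.grad)         # tensor(24.)
```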

So in fact, if I set the padding as an nn.Parameter I do get gradients, and then I set it in the conv layer. What are we learning in that way?

How did you achieve it? Could you post your approach, please?

I didn’t try it, I'm just trying to understand. In general, I can set an nn.Parameter and then use it as the padding; since it is still an nn.Parameter, you will get gradients.

No, that won’t work. The padding parameter is (a tuple of) integer(s), which is not trainable, and I also don’t see how Autograd could even calculate gradients for it at the moment.