Adding an input to each conv layer in a pre-trained ResNet model

Hi,
Does anyone know how we can give more than one input to each conv layer, i.e., not only the tensor?
Thanks!

What should the additional input look like?
If it has the same shape as the “first” input, you could just append it to the batch.

It is a scalar, not the same shape.

If the conv layer has a kernel shape of 1x1 and expects in_channels=1, you could reshape the scalar value into a tensor with the shape [batch_size, 1, 1, 1] and feed it directly to the conv layer.
If that’s not the case, you won’t be able to directly feed a scalar value to it and would need to either add/copy/subtract/etc. it to another input or use another layer.
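
For the 1x1 case, a minimal sketch (made-up batch size and channel numbers) could look like this:

import torch
import torch.nn as nn

batch_size = 4
# one scalar value per sample, reshaped to [batch_size, 1, 1, 1]
scalar = torch.randn(batch_size).view(batch_size, 1, 1, 1)

conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=1)
out = conv(scalar)
print(out.shape)
> torch.Size([4, 8, 1, 1])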

Thanks a lot!
Actually, I’m working with the well-known ResNet network, built from Sequential and BasicBlock modules. I just want to add a parameter that will change during training, say the padding, and I want to control it from the beginning. For example, assume that at each epoch I want to change the padding of a specific layer in the network. It looks like I need to unpack the network into its modules, right?

You can change some attributes such as the padding directly by setting it in the module as seen here:

import torch
from torchvision import models

model = models.resnet18()
# print the activation shape after conv1 via a forward hook
model.conv1.register_forward_hook(lambda m, input, output: print(output.shape))
print(model.conv1.padding)
> (3, 3)

x = torch.randn(1, 3, 224, 224)
out = model(x)
> torch.Size([1, 64, 112, 112])

model.conv1.padding = (0, 0)
print(model.conv1.padding)
> (0, 0)

out = model(x)
> torch.Size([1, 64, 109, 109])

Note, however, that changing the padding will also change the output activation shape, as seen in the example.
While this approach works in resnet18 due to the adaptive pooling layer applied before the linear layer, you might run into shape mismatches for other models.

And what if that was a parameter? How can I change it, keep working with the new value, and keep updating it during training? As far as I can see, once you choose a new value it gets stuck there. Would I need to write it as model.padding = nn.Parameter(3.0)?

No, padding cannot be a trainable parameter, if I’m not mistaken.
If you are running into shape mismatches after changing the model, you would have to change the architecture e.g. by applying an adaptive pooling layer (as done in the resnet) or by using different linear layers later in the model.

The reason for the mismatch is that you are changing the shape of the intermediate activations, while the linear layers need a fixed input shape to work properly (set by in_features).
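
A small sketch (using a hypothetical toy model, not your actual architecture) showing how an adaptive pooling layer keeps the linear layer’s in_features valid even after the padding was changed:

import torch
import torch.nn as nn

class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
        # pools any spatial size down to 1x1, so fc always sees 16 features
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(16, 10)

    def forward(self, x):
        x = self.pool(self.conv(x))
        return self.fc(x.flatten(1))

model = SmallNet()
x = torch.randn(2, 3, 32, 32)
print(model(x).shape)
> torch.Size([2, 10])

model.conv.padding = (0, 0)  # spatial size shrinks, but the pooling absorbs it
print(model(x).shape)
> torch.Size([2, 10])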


Assuming we solve the mismatch issue, is there any chance to adjust the padding during training? Say, given the loss and a custom backward function for each layer, if the “padding parameter” had to be enlarged, we would add +1 to the padding in a given layer.

No, I don’t think the padding is trainable; at least I’m not aware of a method to learn it.
You might want to look into meta-learning, which can be used to “learn” new architectures and might work for your use case. However, I don’t think you can directly use gradient descent to optimize the padding.

Another related question: if we set some parameter using nn.Parameter, PyTorch automatically calculates its gradients. When do we need to define a backward function?

Autograd will calculate the gradients of the loss w.r.t. all parameters used to calculate this loss during the backward() call.
Without that call, there won’t be any gradients.
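
A minimal example (toy values) showing that the .grad attribute is only populated by the backward() call:

import torch
import torch.nn as nn

w = nn.Parameter(torch.tensor(2.0))
x = torch.tensor(3.0)
loss = (w * x - 1.0) ** 2

print(w.grad)
> None

loss.backward()
# d(loss)/dw = 2 * (w*x - 1) * x = 2 * 5 * 3
print(w.grad)
> tensor(30.)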

So in fact, if I set the padding as an nn.Parameter I do get gradients, and then I set it in the conv layer. What are we learning in that way?

How did you achieve it? Could you post your approach, please?

I didn’t try it; I’m just trying to understand. In general, I could create an nn.Parameter and then use it as the padding; since it is still an nn.Parameter, you would get gradients.

No, that won’t work. The padding argument is (a tuple of) integer(s), which is not trainable, and I also don’t see how Autograd could calculate gradients for it at the moment.
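
To illustrate with a sketch (made-up shapes): even if you store the value in an nn.Parameter, the functional conv only accepts plain Python integers for the padding, and converting the tensor to an int detaches it from the computation graph, so no gradient will ever reach it:

import torch
import torch.nn as nn
import torch.nn.functional as F

pad = nn.Parameter(torch.tensor(3.0))  # the “padding parameter”
x = torch.randn(1, 3, 8, 8)
weight = torch.randn(4, 3, 3, 3, requires_grad=True)

# int(pad.item()) returns a plain Python int, which autograd cannot track
out = F.conv2d(x, weight, padding=int(pad.item()))
out.sum().backward()

print(weight.grad is None)  # the weight received a gradient
> False
print(pad.grad is None)     # the padding parameter did not
> True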