I’m trying to have a linear combination of activations, which I do via the following function (this is probably the most stupid way to do it, but I haven’t yet completely understood the matrix manipulations in PyTorch):

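import torch

def LinearFilterCombination(activations, A):
    # activations: (batch_size, channels, width, height); A: (out_channels, channels)
    raw_outputs = []
    for i in range(A.size(0)):
        # Start with the first input channel, then accumulate the rest.
        this_tensor = A[i,0]*activations[:,0,:,:]
        for j in range(1, A.size(1)):
            this_tensor += A[i,j]*activations[:,j,:,:]
        raw_outputs.append(this_tensor)
    # Stack the per-output-channel results along the channel dimension.
    return torch.stack(raw_outputs, dim=1)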

Here, for a given set of activations (batch_size * channels * width * height) I want to create a new set of activations where each output channel is a linear combination of the input channels for each sample, producing activations of the size (batch_size * out_channels * width * height), for a matrix A of dimensions (out_channels * channels). There are two issues that I’m facing:

Even after initializing A to Variable(torch.randn(out_channels, channels), requires_grad=True), my A does not seem to be changing at all.

This for-loop based technique is super slow. I’m sure there’s a faster way to do it, but I’m not yet comfortable enough with tensor manipulations in PyTorch to figure it out myself.

Any suggestions would be immensely helpful. Thanks!

The thing is that the only parameters that get updated are those passed to the optimizer. The usual method of passing parameters to the optimizer is via mymodel.parameters(), like this, for example…

You can also add parameters manually using add_param_group like this…

optimizer.add_param_group({'params': A}) # taking A from your example
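To make that concrete, here is a minimal sketch of the manual route. The shapes, the stand-in `other` module, the learning rate and the loss are all assumptions picked for illustration; only `A` and the `add_param_group` call come from the thread. It uses a plain tensor with `requires_grad=True`, which in current PyTorch plays the role of the old `Variable`.

```python
import torch

# Hypothetical shapes, chosen only for illustration.
out_channels, channels = 3, 4
A = torch.randn(out_channels, channels, requires_grad=True)
other = torch.nn.Linear(2, 2)  # stand-in for the rest of a model (assumption)

optimizer = torch.optim.SGD(other.parameters(), lr=0.1)
optimizer.add_param_group({'params': A})  # A is now tracked by the optimizer

activations = torch.randn(2, channels, 5, 5)
A_before = A.detach().clone()

# Arbitrary loss that depends on A, just to produce a gradient.
out = torch.einsum('oc,bchw->bohw', A, activations)
loss = out.pow(2).mean()

optimizer.zero_grad()
loss.backward()
optimizer.step()

# After the step, A should differ from its initial value.
a_changed = not torch.equal(A_before, A.detach())
```

If `A` is left out of the optimizer, `loss.backward()` still fills `A.grad`, but nothing ever applies that gradient, which would match the symptom described in the question.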

By wrapping the LinearCombo in a class, I have ensured that if I use it in my model in the usual way its parameters will be included in mymodel.parameters().
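The class itself is not reproduced here, so the following is only a sketch of what such a wrapper might look like, not the original code. The constructor arguments, the attribute name `A`, the use of `torch.einsum` in `forward`, and the surrounding `nn.Sequential` are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class LinearCombo(nn.Module):
    """Hypothetical sketch: each output channel is a linear combination of input channels."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        # nn.Parameter registers A so that it shows up in .parameters().
        self.A = nn.Parameter(torch.randn(out_channels, in_channels))

    def forward(self, activations):
        # out[b, o, h, w] = sum over c of A[o, c] * activations[b, c, h, w]
        return torch.einsum('oc,bchw->bohw', self.A, activations)

# Used like any other layer; the layer sizes here are arbitrary.
mymodel = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    LinearCombo(4, 2),
)
param_names = [name for name, _ in mymodel.named_parameters()]
optimizer = torch.optim.SGD(mymodel.parameters(), lr=0.1)

y = mymodel(torch.randn(2, 3, 8, 8))
```

Because `A` is an `nn.Parameter` attribute of a module, `mymodel.parameters()` picks it up automatically and no `add_param_group` call is needed.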

That said, if you prefer to use torch.nn.functional to build your model, you will have to organise the code a little differently; for an example, see the thread “Are torch.nn.Functional layers learnable?”
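On the speed question, one possible functional-style sketch: a per-pixel linear combination of channels is the same operation as a 1x1 convolution, so the double loop can be replaced by a single `F.conv2d` call. The function name and the shapes below are assumptions for illustration; `A` still has to be handed to the optimizer as described above.

```python
import torch
import torch.nn.functional as F

def linear_filter_combination(activations, A):
    # activations: (batch_size, channels, H, W); A: (out_channels, channels).
    # Reshape A to a conv weight of shape (out_channels, channels, 1, 1).
    weight = A.view(A.size(0), A.size(1), 1, 1)
    return F.conv2d(activations, weight)

# Hypothetical shapes, chosen only for illustration.
A = torch.randn(3, 4, requires_grad=True)
activations = torch.randn(2, 4, 5, 6)

out = linear_filter_combination(activations, A)
out.sum().backward()  # gradients flow back to A
```

`torch.einsum('oc,bchw->bohw', A, activations)` should give the same result if you prefer to keep the index pattern explicit.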