Learnable parameter as int not tensor

Hi, I just made a learnable parameter in PyTorch.
The tensor's name is self.z.
Here is my neural network's definition.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTM(nn.Module):
    def __init__(self, n_lstm=2048, n_reduced=128, n_hidden=32):
        super(LSTM, self).__init__()
        self.z = nn.Parameter(torch.tensor(1.0), requires_grad=True)

In my case, I want to learn the best kernel size (1 to 12) of max_pool1d, as below, by training the learnable parameter z.

def forward(self, input):
    output = F.max_pool1d(torch.cat((min_tensor, input), 2), self.z, stride=1)

z is in the range 1 to 12.
So I could train 12 times, changing z from 1 to 12, and choose the z value with the best validation loss.
However, I want to learn and decide z via backpropagation.
Because z must be an int, not a tensor, I cast z from tensor to int as follows.
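That "train 12 times and compare" loop can be sketched as a simple grid search. The snippet below is a self-contained toy version: instead of a real training run, it scores each candidate kernel size against synthetic data generated with a known "true" kernel (5), standing in for the validation loss of a trained model.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(4, 1, 32)                           # synthetic batch (N, C, L)
target = F.max_pool1d(x, kernel_size=5, stride=1)   # pretend the "true" kernel is 5

best_z, best_loss = None, float('inf')
for z in range(1, 13):                              # candidate kernel sizes 1..12
    out = F.max_pool1d(x, kernel_size=z, stride=1)
    # outputs have different lengths per kernel size; compare the overlap only
    L = min(out.size(-1), target.size(-1))
    loss = F.mse_loss(out[..., :L], target[..., :L]).item()
    if loss < best_loss:
        best_z, best_loss = z, loss
```

In a real setup, the body of the loop would train a fresh model with that kernel size and `loss` would be its validation loss.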

    z = int(self.z.detach().cpu().item())
    F.max_pool1d(torch.cat((min_tensor, -QoS), 2), z, stride=1)

However, the z value does not change… it always stays 1.0; it is static…
I suspect the cast from tensor to int puts it in an unlearnable state,
but I want to find the best z value via backpropagation.
Is that possible? What should I edit in my code?
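A minimal check of my suspicion that the cast detaches z from autograd (the variable names here are just for this toy example):

```python
import torch
import torch.nn.functional as F

z_param = torch.nn.Parameter(torch.tensor(1.0))
z = int(z_param.detach().cpu().item())   # plain Python int, no autograd history

x = torch.randn(1, 1, 8, requires_grad=True)
out = F.max_pool1d(x, kernel_size=z, stride=1)
out.sum().backward()

print(z_param.grad)   # None: no gradient ever reaches the parameter
```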


The intent here looks like a form of differentiable neural architecture search (e.g., DARTS: Differentiable Architecture Search – Google Research).

However, shape parameters of layers like max_pool1d are not differentiable; the output of the model doesn't depend on them in a differentiable way. It is similar to writing y = a + b + c and expecting to take a derivative with respect to how many variables appear in the equation.

One workaround (as shown in the DARTS paper) is to build a model with several weighted branches, where each branch uses a different max_pool1d configuration and the branch weights are continuous, learnable parameters. The branch weights are updated during the architecture search training, and afterwards the branch(es) with the highest weight(s) are chosen.
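A minimal sketch of that weighted-branch idea, assuming 1D inputs of shape (N, C, L). Each branch applies max_pool1d with a different kernel size (padded with -inf so all branches keep the same output length), and a softmax over learnable architecture weights mixes them; after training, the kernel size with the largest weight is picked. This is a simplified illustration of the DARTS idea, not the full bi-level algorithm, and the class/method names are invented for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftPoolSearch(nn.Module):
    """Learnable mixture of max_pool1d branches with kernel sizes 1..max_kernel."""
    def __init__(self, max_kernel=12):
        super().__init__()
        self.kernel_sizes = list(range(1, max_kernel + 1))
        # one architecture weight per candidate kernel size
        self.alpha = nn.Parameter(torch.zeros(len(self.kernel_sizes)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        out = 0
        for w, k in zip(weights, self.kernel_sizes):
            # right-pad with -inf so every branch outputs length L
            padded = F.pad(x, (0, k - 1), value=float('-inf'))
            pooled = F.max_pool1d(padded, kernel_size=k, stride=1)
            out = out + w * pooled
        return out

    def best_kernel(self):
        # after training, select the branch with the highest weight
        return self.kernel_sizes[int(self.alpha.argmax())]
```

Because the output is a differentiable function of `alpha`, gradients from the task loss update the architecture weights alongside the regular model weights, which is exactly what the int cast of z prevented.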
