Trainable parameter to select training frame in input (attention mechanism?)


I am trying something that is new to me, and I don’t know whether what I want to do makes sense or can be trained with PyTorch and autograd.

Basically, I would like a float scalar parameter that defines the start of the input data frame
on which the forward function is applied (I guess this is a kind of attention mechanism?).

I want this scalar parameter to be trained together with the other parameters, so I define it as a torch.nn.Parameter.

To simplify, let’s say this parameter should select the start index of the frame in [0, 100], and that I manage to keep the parameter itself in the interval [0, 1].

So I just multiply the start parameter by 100 to get the beginning of my frame.

Then I simply apply the rest of my forward function on this particular frame of the input.

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(...)
        # ... normal MLP network

        # scalar meant to stay in [0, 1]; scaled by 100 to get the frame index
        self.start_param = torch.nn.Parameter(torch.tensor(0.1))

    def forward(self, x):
        # start_param does not seem to be updated
        start_index = torch.floor(self.start_param * 100).long()
        self.frame = range(0, start_index)
        x = x[:, self.frame]
        # continue and apply forward logic to the selected frame
        # ...

However, it seems my parameter self.start_param is not updated or trained: its grad is always None.

It seems to be outside the graph.
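A minimal reproduction of this behaviour might look like the sketch below. The class name `HardFrameNet`, the fixed frame length `frame_len`, and the input shape `(4, 200)` are assumptions made only for illustration; they are not part of my real network. The sketch keeps the intermediate tensors so that `requires_grad` can be inspected before and after the integer cast.

```python
import torch
import torch.nn as nn


class HardFrameNet(nn.Module):
    """Hypothetical stand-in for the network above, with a hard (integer) frame start."""

    def __init__(self, frame_len=10):
        super().__init__()
        self.frame_len = frame_len                      # assumed fixed frame length
        self.layer1 = nn.Linear(frame_len, 1)
        self.start_param = nn.Parameter(torch.tensor(0.1))

    def forward(self, x):
        scaled = torch.floor(self.start_param * 100)    # float tensor, still tracked
        start_index = scaled.long()                     # integer tensor, not tracked
        # Keep both tensors around so they can be inspected after the forward pass.
        self.last_scaled, self.last_index = scaled, start_index
        i = int(start_index)
        return self.layer1(x[:, i:i + self.frame_len])


torch.manual_seed(0)
net = HardFrameNet()
x = torch.randn(4, 200)                 # assumed input: batch of 4, 200 samples each
net(x).sum().backward()

print(net.last_scaled.requires_grad)    # tracked before the integer cast
print(net.last_index.requires_grad)     # not tracked after the cast
print(net.start_param.grad)             # stays None: no path from the loss to start_param
print(net.layer1.weight.grad is not None)
```

In this sketch the linear layer receives gradients while `start_param` does not, which matches what I observe in my real model.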

  1. I am not sure how to check whether a given parameter is part of the graph. How can I check that a given parameter (here, my scalar parameter) is tracked during the forward pass?

  2. I am not an expert in autograd and PyTorch, but I suspect that what I’m doing cannot be handled by autograd. Because it is based on frames, I guess the frame selection is not tracked by the graph, so the frame parameter won’t be updated during backward. Could you clarify this for me?

  3. Is there another way to do what I want to do?
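
Regarding question 3, one direction I could imagine is to replace the hard slice with a smooth mask over all input positions, so that the start parameter enters the computation through differentiable operations only. The sketch below is an assumption on my side, not a verified solution: the class name `SoftFrameNet`, the product-of-sigmoids mask, and the values of `input_len`, `frame_len`, and `temperature` are all hypothetical choices for illustration.

```python
import torch
import torch.nn as nn


class SoftFrameNet(nn.Module):
    """Hypothetical variant: a soft window instead of integer indexing."""

    def __init__(self, input_len=200, frame_len=10.0, temperature=2.0):
        super().__init__()
        self.frame_len = frame_len          # assumed width of the soft window
        self.temperature = temperature      # assumed edge sharpness (smaller = sharper)
        # Positions 0 .. input_len - 1, stored as a non-trainable buffer.
        self.register_buffer("positions", torch.arange(input_len, dtype=torch.float32))
        self.layer1 = nn.Linear(input_len, 1)
        self.start_param = nn.Parameter(torch.tensor(0.1))

    def soft_mask(self):
        start = self.start_param * 100
        # Rises near `start` and falls near `start + frame_len`.
        left = torch.sigmoid((self.positions - start) / self.temperature)
        right = torch.sigmoid((start + self.frame_len - self.positions) / self.temperature)
        return left * right

    def forward(self, x):
        # Weight every input position instead of slicing, so the graph stays connected.
        return self.layer1(x * self.soft_mask())


torch.manual_seed(0)
net = SoftFrameNet()
x = torch.randn(4, 200)                     # assumed input: batch of 4, 200 samples each
net(x).sum().backward()

print(net.soft_mask().shape)
print(net.start_param.grad)                 # a tensor rather than None in this sketch
```

The trade-off in this sketch is that the network always sees the full-length input, with positions outside the window scaled towards zero, rather than a shorter slice.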