Why do nn.Parameter variables seem unchanged during training?

I have a simple 2-layer MLP and I’m trying to optimise some hyperparameters that govern the network weights, but they seem to remain unchanged during training.


model = MLP(fan_in=2, hidden=50, fan_out=3)
# hyper-params to optimise
a = torch.nn.Parameter(torch.tensor([1.3]))
b = torch.nn.Parameter(torch.tensor([2.4]))
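# pass by keyword: the 4th positional argument of SGD is dampening, not weight_decay
opt = torch.optim.SGD([a, b], lr=lr, momentum=momentum, weight_decay=weight_decay)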

for epochs
  for minibatch
      for p in model.parameters():
          p.data = p.data * a + b
      logits = model(X)
      loss = cross_entropy(logits, y)

In every iteration I print a and b, and they still contain their initial values, i.e. 1.3 and 2.4 respectively.

Any idea what I’m missing?

Why do you only perform one optimizer step every epoch and not every minibatch?

That is just an indentation typo from copy-pasting. I’ll fix it in the original question, but the point remains that the params (a, b) defined through nn.Parameter are not changing. Why?

It seems that the values of a and b do change if they are incorporated into the input to the loss, since loss.backward() then computes gradients for them and the optimizer step updates their values accordingly.

logits = model(X)
logits += a * b
loss = cross_entropy(logits, y)
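
A quick way to check this observation, as a minimal sketch rather than the original setup: assignments through `.data` are not recorded by autograd, so `a.grad` stays `None` after `backward()`, while using `a` and `b` in a tracked expression gives them gradients. The small `nn.Linear` model and the random `X`, `y` are stand-ins assumed for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Linear(2, 3)  # small stand-in for the MLP (assumption)
X = torch.randn(8, 2)          # dummy inputs
y = torch.randint(0, 3, (8,))  # dummy labels
a = torch.nn.Parameter(torch.tensor([1.3]))
b = torch.nn.Parameter(torch.tensor([2.4]))

# Case 1: weights rewritten through .data, which autograd does not record.
for p in model.parameters():
    p.data = p.data * a + b
F.cross_entropy(model(X), y).backward()
grad_via_data = a.grad  # None: a never entered the graph of the loss

# Case 2: a and b used in an ordinary tracked expression.
logits = model(X) * a + b
F.cross_entropy(logits, y).backward()
grad_via_graph = a.grad  # a tensor: a is now part of the graph
print(grad_via_data, grad_via_graph, b.grad)
```

One caveat: a scalar added equally to every logit, as with `b` here or `a * b` in `logits += a * b`, leaves the softmax unchanged, so its gradient is zero up to rounding. In that situation any drift in the values would most likely come from weight decay rather than from the loss.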

But what if we want to optimise over params that affect the loss only implicitly and are not a direct input to the loss function?
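
One possible approach, sketched under some assumptions: build the transformed weights as ordinary tensors inside the graph and run the model with them via `torch.func.functional_call` (available in PyTorch 2.0 and later), instead of writing into `p.data`. The `nn.Sequential` model, the dummy batch, and the learning rate below are hypothetical stand-ins, and the transform is applied to a frozen copy of the base weights at each step rather than compounded across minibatches as in the question.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

torch.manual_seed(0)

# Stand-in for the MLP in the question (its definition is not shown there).
model = nn.Sequential(nn.Linear(2, 50), nn.ReLU(), nn.Linear(50, 3))
X = torch.randn(16, 2)          # dummy minibatch inputs
y = torch.randint(0, 3, (16,))  # dummy minibatch labels

a = nn.Parameter(torch.tensor([1.3]))
b = nn.Parameter(torch.tensor([2.4]))
opt = torch.optim.SGD([a, b], lr=0.01)  # hypothetical learning rate

# Frozen copy of the base weights; a and b act on these at every step.
base = {name: p.detach().clone() for name, p in model.named_parameters()}

for step in range(5):
    # Transformed weights are built inside the graph (no .data involved).
    new_params = {name: w * a + b for name, w in base.items()}
    # Run the model with these tensors in place of its own parameters.
    logits = functional_call(model, new_params, (X,))
    loss = F.cross_entropy(logits, y)
    opt.zero_grad()
    loss.backward()  # gradients now reach a and b through the weights
    opt.step()
    print(step, loss.item(), a.item(), b.item())
```

Because `new_params` is recomputed from `a` and `b` on every step, the loss depends on them through the weights, which is the implicit dependence asked about.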