I have a simple 2-layer MLP, and I'm trying to optimise two hyperparameters that transform the network weights, but they seem to remain unchanged during training.
```python
import torch
import torch.nn.functional as F

model = MLP(fan_in=2, hidden=50, fan_out=3)

# hyper-params to optimise
a = torch.nn.Parameter(torch.tensor([1.3]))
b = torch.nn.Parameter(torch.tensor([2.4]))

opt = torch.optim.SGD([a, b], lr=lr, momentum=momentum, weight_decay=weight_decay)

for epoch in range(num_epochs):
    for X, y in loader:
        # transform the network weights with a and b
        for p in model.parameters():
            p.data = p.data * a + b
        logits = model(X)
        loss = F.cross_entropy(logits, y)
        opt.zero_grad()
        loss.backward()
        opt.step()
```
In every iteration I print `a` and `b`, and they still contain their initial values, i.e. 1.3 and 2.4 respectively.
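For reference, this is roughly the check I run (a minimal sketch; the exact print formatting is illustrative):

```python
# inside the minibatch loop, after opt.step()
print(a.item(), b.item())  # always prints 1.3 2.4
```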
Any idea what I’m missing?