I have a simple 2-layer MLP and I'm trying to optimise two hyperparameters that scale and shift the network weights, but they never change during training. Here's a simplified version of what I'm doing:
import torch
import torch.nn.functional as F

model = MLP(fan_in=2, hidden=50, fan_out=3)

# hyper-params to optimise
a = torch.nn.Parameter(torch.tensor([1.3]))
b = torch.nn.Parameter(torch.tensor([2.4]))
# lr / momentum / weight_decay are defined elsewhere
opt = torch.optim.SGD([a, b], lr=lr, momentum=momentum, weight_decay=weight_decay)

for epoch in range(num_epochs):
    for X, y in loader:
        # scale and shift every network weight by the hyper-params
        for p in model.parameters():
            p.data = p.data * a + b
        logits = model(X)
        loss = F.cross_entropy(logits, y)
        opt.zero_grad()
        loss.backward()
        opt.step()
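For reference, the MLP itself is nothing fancy; a minimal stand-in would look like this (the exact definition shouldn't matter, but just in case):

import torch.nn as nn

class MLP(nn.Module):
    # two linear layers with a ReLU in between
    def __init__(self, fan_in, hidden, fan_out):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(fan_in, hidden),
            nn.ReLU(),
            nn.Linear(hidden, fan_out),
        )

    def forward(self, x):
        return self.net(x)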
On every iteration I print a and b, and they still contain their initial values, i.e. 1.3 and 2.4 respectively.
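Concretely, the logging is just this (a sketch of what I run right after opt.step()):

print(a.item(), b.item())   # always prints 1.3 2.4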
Any idea what I’m missing?