Why is PyTorch Maximizing the Loss?

I have a simple training loop that looks like this.

optim = torch.optim.Adam(verify_net.parameters())
# The training is designed to continue until timeout or the goal is reached
while True:
        # Forward pass
        outputs = verify_net(inputs)

        # Compute how much larger each output can be
        # than the true label's output in the worst case
        losses = torch.stack(
            [upper_bound(outputs[i] - outputs[true_label])
                for i in range(len(outputs)) if i != true_label])

        if (losses < 0).all():
            return True

        loss = torch.sum(losses)

        optim.zero_grad()
        loss.backward()
        optim.step()
        print(loss)

The optimization goal is to make outputs[true_label] greater than all other outputs in a worst case that is computed by the upper_bound function:

upper_bound = lambda x: x[0] + torch.sum(torch.abs(x[1:]))
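In plain Python, with toy numbers purely for illustration, the bound works like this: x[0] is the center of the difference and x[1:] are error terms whose signs are unknown, so the worst case flips every error term positive.

```python
# Torch-free illustration of the bound (toy numbers, not my real data):
# worst case = center term plus all error terms at their worst sign.
def upper_bound(x):
    return x[0] + sum(abs(v) for v in x[1:])

print(upper_bound([1.0, -2.0, 0.5]))  # 1.0 + 2.0 + 0.5 = 3.5
```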

If, after any step, all upper bounds satisfy losses[i] < 0, then the loop may break and we are happy. Until then, the optimizer should keep doing its job. The output I get by printing the loss after each step is this:

tensor(632.8606, grad_fn=<SumBackward0>)
tensor(668.1876, grad_fn=<SumBackward0>)
tensor(698.2267, grad_fn=<SumBackward0>)
tensor(733.7394, grad_fn=<SumBackward0>)
tensor(764.9390, grad_fn=<SumBackward0>)
tensor(799.8583, grad_fn=<SumBackward0>)
tensor(834.7762, grad_fn=<SumBackward0>)
tensor(861.3510, grad_fn=<SumBackward0>)
tensor(895.9908, grad_fn=<SumBackward0>)
tensor(930.6293, grad_fn=<SumBackward0>)
tensor(965.2657, grad_fn=<SumBackward0>)
tensor(1000.8009, grad_fn=<SumBackward0>)

It seems like Adam is doing a fantastic job at maximizing the loss, but this really isn’t the behavior I’d expect or want. My first attempt was to change loss into -loss, but I had no luck: Adam then started minimizing the now-negated loss :frowning:

Any suggestions would be much appreciated. Thanks


Have you tried reducing the learning rate, or switching to plain SGD?
High learning rates can sometimes cause the network to diverge.

Even with lr=0.000001 I get the same behaviour with both Adam and SGD.

Could you try to simplify the problem and see if you still get the same behavior?

  • Change the model to something very basic (e.g. a single Linear layer)
  • Change the loss to something simpler (e.g. remove the absolute values)
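You can even check the sign convention without PyTorch at all. Here is a hand-rolled sketch (all names and numbers are made up, and gradients are taken by finite differences rather than autograd) that applies the same worst-case bound to three scalar scores and steps against the gradient; the summed bound should go down, not up:

```python
# Torch-free sanity check of the setup (toy numbers, purely illustrative):
# three scalar scores w stand in for the network outputs, each paired with
# one error term half its size. Gradients come from central finite
# differences; stepping AGAINST the gradient must drive the bound down.

def upper_bound(x):
    # x[0] is the center term, x[1:] are error terms of unknown sign
    return x[0] + sum(abs(v) for v in x[1:])

def loss(w, true_label=0):
    # toy "outputs": score w[i] with one error term half its size
    outputs = [[wi, 0.5 * wi] for wi in w]
    diffs = [[a - b for a, b in zip(outputs[i], outputs[true_label])]
             for i in range(len(outputs)) if i != true_label]
    return sum(upper_bound(d) for d in diffs)

def num_grad(f, w, eps=1e-6):
    # central finite-difference gradient, one coordinate at a time
    g = []
    for i in range(len(w)):
        up, down = w[:], w[:]
        up[i] += eps
        down[i] -= eps
        g.append((f(up) - f(down)) / (2 * eps))
    return g

w = [1.0, 2.0, 3.0]
start = loss(w)
for _ in range(50):
    g = num_grad(loss, w)
    w = [wi - 0.1 * gi for wi, gi in zip(w, g)]  # descend, not ascend

print(start, loss(w))  # the loss should have decreased
```

If this toy version behaves (loss goes down), the sign convention in the bound itself is fine and the problem is somewhere in the real training loop or the autograd graph.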