Hi.
I am working on a project where the update rule is parameter = parameter + learning_rate * gradient, rather than the usual rule that subtracts the gradient. I tried setting the learning rate to a negative value to get this behaviour, but negative learning rates are forbidden. So my question is: how can I add the gradient instead of subtracting it during the update?
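For reference, this is roughly what I tried (a minimal sketch with a placeholder nn.Linear model; a standard torch.optim.SGD optimizer rejects the negative learning rate when it is constructed):

import torch
import torch.nn as nn

net = nn.Linear(4, 1)  # placeholder model
# torch.optim.SGD validates its arguments and raises ValueError for lr < 0
optimizer = torch.optim.SGD(net.parameters(), lr=-0.01)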
Besides flipping the sign of the loss, you can implement the update rule yourself by looping over the network's parameters and applying the gradient step manually. This gives you full control over the sign of the update.
learning_rate = 0.01
for f in net.parameters():
    # adding the gradient is equivalent to subtracting (-1) * gradient: ascent instead of descent
    f.data.add_(f.grad.data * learning_rate)
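If you would rather keep a standard optimizer, the loss-flipping route mentioned above is usually simpler: negate the loss before calling backward(), and the built-in descent step then moves the parameters in the direction that increases the original objective. A minimal sketch, using a placeholder linear model, random data, and MSELoss purely for illustration:

import torch
import torch.nn as nn

net = nn.Linear(4, 1)                         # placeholder model
x, y = torch.randn(8, 4), torch.randn(8, 1)   # placeholder data
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

optimizer.zero_grad()
loss = criterion(net(x), y)
(-loss).backward()   # gradients of -loss are the negatives of the gradients of loss
optimizer.step()     # the usual descent step now performs ascent on the original loss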