How to implement TD(λ)


(Tyler Walker) #1

I have a vanilla NN that looks like this:

self.model = torch.nn.Sequential(
    torch.nn.Linear(self.INPUT_SIZE, self.HIDDEN_SIZE),
    torch.nn.Sigmoid(),
    torch.nn.Linear(self.HIDDEN_SIZE, self.OUTPUT_SIZE),
    torch.nn.Sigmoid())

I want to implement TD(λ) between steps. λ is a constant between 0 and 1 that sets the lifespan of a gradient trace.
Pseudo-code for this is:

loss.backward()
model.gradients = model.gradients + λ * model.previous_gradients
optimizer.step()
model.previous_gradients = model.gradients

I think I might be able to accomplish this using gradient hooks, but I'm unsure how to do that. I can't shake the feeling that it might be easier than that. SGD's momentum is mathematically similar, but I can't tell if it's identical.
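For what it's worth, one way the hook idea could work is to keep the trace inside the hook's closure and return it, so it replaces the freshly computed gradient. This is just a sketch; `attach_trace_hook`, the `Linear` stand-in model, and the λ value are all placeholders:

```python
import torch

lam = 0.9  # trace-decay constant (the λ above); value is illustrative

model = torch.nn.Linear(4, 1)  # stand-in for the Sequential above

def attach_trace_hook(p, lam=lam):
    trace = torch.zeros_like(p)
    def hook(grad):
        nonlocal trace
        trace = lam * trace + grad  # decayed running sum of gradients
        return trace                # replaces the freshly computed grad
    p.register_hook(hook)

for p in model.parameters():
    attach_trace_hook(p)

# with this approach you keep calling optimizer.zero_grad() as usual,
# because the trace lives inside the hook, not in p.grad
```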


(Alexis David Jacq) #2

As far as I understand, using a gradient hook should only allow you to modify the new gradient. But you want to modify the previous one, so I don't see a simpler way than:

# decay the previous gradient by lambda
# ('lambda' is a reserved word in Python, so use another name):
lam = 0.9  # your λ constant
for p in model.parameters():
    if p.grad is not None:  # grads don't exist before the first backward
        p.grad *= lam
loss.backward()  # accumulates the new gradient on top
optimizer.step()
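Spelled out as a complete toy loop, it might look like the following. The model shape, the MSE loss, the random data, and the `lam` value are all placeholders:

```python
import torch

lam = 0.9  # λ, the trace-decay constant; illustrative value
model = torch.nn.Sequential(
    torch.nn.Linear(4, 8), torch.nn.Sigmoid(),
    torch.nn.Linear(8, 1), torch.nn.Sigmoid())
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for step in range(50):
    x = torch.randn(1, 4)      # placeholder input
    target = torch.rand(1, 1)  # placeholder target
    # decay the accumulated gradient instead of zeroing it
    for p in model.parameters():
        if p.grad is not None:
            p.grad *= lam
    loss = torch.nn.functional.mse_loss(model(x), target)
    loss.backward()            # adds the new gradient onto λ·(old)
    optimizer.step()
    # no optimizer.zero_grad() here, by design
```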

(Tyler Walker) #3

Oh, that's interesting! So then you'd never call optimizer.zero_grad().
Thanks.
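On the momentum question from post #1: for vanilla SGD (default dampening, no Nesterov), decaying p.grad in place by λ before each backward() does seem to give the same updates as torch.optim.SGD(momentum=λ) with the usual zero_grad(), since both compute the same running sum λ·buf + grad. A quick check (all values illustrative):

```python
import torch

lam, lr = 0.7, 0.1

p1 = torch.nn.Parameter(torch.tensor([1.0]))  # manual trace decay
p2 = torch.nn.Parameter(torch.tensor([1.0]))  # built-in momentum
opt1 = torch.optim.SGD([p1], lr=lr)
opt2 = torch.optim.SGD([p2], lr=lr, momentum=lam)

for _ in range(5):
    # manual scheme: decay the accumulated gradient, never zero it
    if p1.grad is not None:
        p1.grad *= lam
    (p1 ** 2).sum().backward()  # toy quadratic loss
    opt1.step()

    # momentum scheme: zero as usual, the optimizer keeps the buffer
    opt2.zero_grad()
    (p2 ** 2).sum().backward()
    opt2.step()

print(torch.allclose(p1, p2))  # prints True
```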