Using optimizers with torch.autograd.grad()?

My team is trying to use torch.autograd.grad() together with the torch optimizers, but optimizers only consume the .grad attributes of their parameters, not arbitrary lists of gradients. To work around this, we manually assign the gradients we compute with grad() to each parameter's .grad attribute before calling step().
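For concreteness, here is a minimal sketch of the workaround described above (the toy model, loss, and variable names are illustrative, not from the original post):

```python
import torch

# Toy model and optimizer for illustration.
model = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)
loss = model(x).pow(2).mean()

params = list(model.parameters())

# Compute gradients explicitly; this does NOT populate .grad.
grads = torch.autograd.grad(loss, params)

# Workaround: manually copy each computed gradient into the
# corresponding parameter's .grad slot so the optimizer can see it.
for p, g in zip(params, grads):
    p.grad = g

opt.step()       # consumes the .grad attributes we just set
opt.zero_grad()  # clear them afterwards as usual
```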

Is there a better pattern?

I think the optimizers should take an optional "grads" argument, either in step() or at construction time, to facilitate this. Worth opening a discussion / RFC?