Hi,
I am trying to perform optimization on the backward computational graph. Put another way, I want to treat the backward computational graph as a forward path. This way, I would have two sets of gradients for each parameter: one calculated from the regular forward loss, and the other from a backward loss defined on the backward graph's output.
Here is some pseudo-code:
output1 = model(inputs)
loss1 = criterion(output1, targets1)
loss1.backward()  # regular forward loss -> first set of .grad
output2 = backward_comp_graph(output1.detach())  # backward graph treated as a forward path
loss2 = criterion(output2, targets2)
loss2.backward()  # backward loss -> second set of gradients
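One way to sketch this (an assumption on my part, not your exact setup, since `backward_comp_graph` is a placeholder): compute the first gradients with `torch.autograd.grad(..., create_graph=True)` so the backward pass itself stays on the autograd graph, then define a second loss on those gradients and call `backward()` on it. Here `output2` is simply the flattened gradients, and `loss2` is a gradient-penalty-style loss standing in for `criterion(output2, targets2)`:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 3)
criterion = nn.MSELoss()
inputs = torch.randn(8, 4)
targets1 = torch.randn(8, 3)

# Forward pass and the regular forward loss.
output1 = model(inputs)
loss1 = criterion(output1, targets1)

# First gradients, kept differentiable (create_graph=True) so the
# backward computational graph can be traversed as a forward path.
grads = torch.autograd.grad(loss1, model.parameters(), create_graph=True)

# "output2": the backward graph's output, here the gradients
# flattened into one vector (a stand-in for backward_comp_graph).
output2 = torch.cat([g.flatten() for g in grads])

# Second loss defined on the backward graph's output; backward() on it
# backpropagates *through* the first backward pass (double backward).
loss2 = output2.pow(2).sum()  # e.g. a gradient-penalty-style loss
loss2.backward()

# Each parameter's .grad now holds gradients of loss2 w.r.t. it.
for p in model.parameters():
    assert p.grad is not None
```

Note that using `torch.autograd.grad` for the first loss (instead of `loss1.backward()`) keeps the two gradient sets separate: the `grads` tuple holds the forward-loss gradients, while `.grad` ends up holding only the backward-loss gradients, so you avoid the two accumulating into the same `.grad` buffer.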
I would appreciate your suggestions on the nicest implementation strategy!
Thanks,
ttoosi