Optimizing the backward computational graph

Hi,

I am trying to perform optimization on the backward computational graph. Put another way, I want to treat the backward computational graph as a forward path. This way, each parameter would have two .grad values: one computed from the regular forward loss, and the other from a loss defined on the backward path.
Here is some pseudo-code:

output1 = model(inputs)
loss1 = criterion(output1, targets1)
loss1.backward()  # first gradient: from the regular forward loss

# backward_comp_graph: the backward graph of `model`, treated as a forward path
output2 = backward_comp_graph(output1.detach())
loss2 = criterion(output2, targets2)
loss2.backward()  # second gradient: from the loss on the backward path
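
The closest built-in mechanism I have found is double backward: calling torch.autograd.grad with create_graph=True records the backward pass itself as a differentiable graph. Below is a minimal sketch of what I mean, where I take the backward graph's output to be the gradient with respect to the input; the model, sizes, criterion, and targets are all placeholders for mine:

import torch
import torch.nn as nn

# Illustrative stand-ins; Tanh is used because it is twice differentiable.
model = nn.Sequential(nn.Linear(10, 10), nn.Tanh(), nn.Linear(10, 10))
criterion = nn.MSELoss()

inputs = torch.randn(4, 10, requires_grad=True)  # requires_grad so we can take d(loss1)/d(inputs)
targets1 = torch.randn(4, 10)
targets2 = torch.randn(4, 10)
params = list(model.parameters())

# Regular forward pass and loss.
output1 = model(inputs)
loss1 = criterion(output1, targets1)

# create_graph=True records the backward pass itself as a differentiable
# graph, which is what lets it be treated as a forward path.
grad_input, *grad_params = torch.autograd.grad(loss1, [inputs] + params, create_graph=True)

# First set of gradients (from the forward loss), stashed separately so
# that loss2.backward() below does not accumulate into the same .grad fields.
forward_grads = [g.detach() for g in grad_params]

# Treat the backward graph's output (here: the gradient w.r.t. the input)
# as a second "forward" output and put a loss on it.
output2 = grad_input
loss2 = criterion(output2, targets2)
loss2.backward()  # double backward: differentiates through the backward graph

# Each parameter now has two gradients:
#   forward_grads[i]  - from loss1 (regular forward loss)
#   params[i].grad    - from loss2 (loss on the backward path)

Note that, unlike the detach() in my pseudo-code above, gradients from loss2 here flow through both the forward and the backward graphs. If the backward path needs to be fully isolated, I suppose it would have to be rebuilt as an explicit forward module (e.g. linear layers using the transposed weights), but I am not sure that is the nicest way.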

I would appreciate your suggestions on the nicest implementation strategy!

Thanks,
ttoosi