How to implement an architecture which has 2 different branches and losses?

Hi all,

I’m trying to implement the paper “Unsupervised Domain Adaptation by Backpropagation” by Ganin et al.

The architecture is as follows -

How do we implement something like this in PyTorch? More specifically:

  1. An architecture which has 2 different outputs that are trying to classify different things.
  2. Run back-propagation taking into account the individual losses from both “branches”.
  3. In this paper, the authors multiply the gradients by -1 in the gradient reversal layer during back-propagation. How do we do something like that in PyTorch?

It’s a bit abstract but simple.

Gradients accumulate in PyTorch: each backward() call adds to the existing .grad buffers, so unless you call optimizer.zero_grad() to reset them, gradients from successive backward passes keep adding up.
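Here is a tiny self-contained demonstration of that accumulation behavior (the numbers are just illustrative):

```python
import torch

# Gradients accumulate across backward() calls until explicitly zeroed.
w = torch.ones(1, requires_grad=True)

loss1 = (2 * w).sum()
loss1.backward()
grad_after_first = w.grad.item()   # d(2w)/dw = 2

loss2 = (3 * w).sum()
loss2.backward()
grad_after_second = w.grad.item()  # accumulates: 2 + 3 = 5

w.grad.zero_()                     # what optimizer.zero_grad() does per parameter
grad_after_zero = w.grad.item()    # back to 0
```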

You need to run both branches forward, then call backward() on the pink (domain) loss with retain_graph=True. Then iterate over the shared feature extractor’s gradients and flip their sign (for p in feature_extractor.parameters(): p.grad *= -1); negating every parameter in the whole model would also flip the domain head’s own gradients, which you don’t want.
Then call backward() on the blue (label) loss, followed by optimizer.step() and optimizer.zero_grad().
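The ordering above can be sketched as one training step. The module names, layer sizes, loss choices, and SGD settings are placeholder assumptions, not from the paper; note that only the shared feature extractor’s gradients are negated, so the domain head still descends its own loss:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder modules standing in for the paper's three blocks.
feature_extractor = nn.Linear(4, 8)
label_head = nn.Linear(8, 3)    # "blue" branch: label classifier
domain_head = nn.Linear(8, 2)   # "pink" branch: domain classifier

params = (list(feature_extractor.parameters())
          + list(label_head.parameters())
          + list(domain_head.parameters()))
optimizer = torch.optim.SGD(params, lr=0.1)
criterion = nn.CrossEntropyLoss()

# Dummy batch: 5 samples, 4 features, with label and domain targets.
x = torch.randn(5, 4)
y_label = torch.randint(0, 3, (5,))
y_domain = torch.randint(0, 2, (5,))

features = feature_extractor(x)
label_loss = criterion(label_head(features), y_label)      # blue loss
domain_loss = criterion(domain_head(features), y_domain)   # pink loss

optimizer.zero_grad()
domain_loss.backward(retain_graph=True)  # pink loss first
for p in feature_extractor.parameters():
    p.grad *= -1                         # reverse the shared-feature gradients
label_loss.backward()                    # blue loss adds its gradients unchanged
optimizer.step()                         # zero_grad() again before the next batch
```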

Note: the ordering matters. If you backprop the blue loss first, its gradients would get multiplied by -1 as well.
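Alternatively, question 3 can be answered directly with a custom torch.autograd.Function: identity in the forward pass, gradient scaled by -lambda in the backward pass. This is a common sketch of the paper’s gradient reversal layer (the lambd factor stands in for the paper’s adaptation weight); insert it between the feature extractor and the domain classifier, and then a single backward() on the summed losses works without retain_graph or manual gradient flipping:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales the gradient by -lambd in backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # No gradient for lambd itself, hence the trailing None.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Gradients flowing back through grad_reverse come out negated.
x = torch.ones(3, requires_grad=True)
grad_reverse(x, lambd=1.0).sum().backward()
```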