How to implement the combined loss (Supervised + Unsupervised)?

If you look at this post: How to combine multiple criterions to a loss function? - #16 by ElleryL, the code from your second point is in line with what has been said above. From the tutorials, when you call backward() on the loss, it does the following:

The backward function receives the gradient of the output Tensors with respect to some scalar value, and computes the gradient of the input Tensors with respect to that same scalar value.
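For concreteness, here is a minimal sketch of a combined supervised + unsupervised loss, assuming a toy linear model and an entropy term as the unsupervised criterion (the model, data shapes, and weighting factor `lam` are placeholders, not from the linked post):

```python
import torch
import torch.nn as nn

# Hypothetical setup: a tiny model, one labelled batch, one unlabelled batch.
model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()

x_labelled = torch.randn(8, 10)
y_labelled = torch.randint(0, 2, (8,))
x_unlabelled = torch.randn(8, 10)

sup_loss = criterion(model(x_labelled), y_labelled)

# Illustrative unsupervised term: entropy of predictions on unlabelled data.
probs = torch.softmax(model(x_unlabelled), dim=1)
unsup_loss = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()

lam = 0.5                           # assumed weighting between the two terms
loss = sup_loss + lam * unsup_loss  # one scalar combining both criterions

loss.backward()  # gradients from both terms accumulate in model.parameters()
```

Because the two terms are summed into a single scalar, one backward() call propagates gradients from both criterions into the shared parameters.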

When you call step() on the optimiser, it does the following:

Calling the step function on an Optimizer makes an update to its parameters
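Continuing the sketch above, the optimiser step wraps around the combined backward() call; the forward pass is re-run each iteration, and the SGD optimiser and its learning rate are placeholder choices, not from the quoted docs:

```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

optimizer.zero_grad()  # clear gradients accumulated by the previous step
sup_loss = criterion(model(x_labelled), y_labelled)
probs = torch.softmax(model(x_unlabelled), dim=1)
unsup_loss = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()

loss = sup_loss + lam * unsup_loss
loss.backward()        # one backward pass for the combined scalar
optimizer.step()       # parameter update uses the combined gradients
```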
