I have a vector of different loss functions [loss1(model, inputs), …, lossN(model, inputs)]
I want to get the gradients of each loss function separately w.r.t. model parameters.
From looking at similar questions, I see that one option is to call backward() on each loss in turn, storing the resulting gradients after each call (and zeroing them before the next), once per loss function.
The problem is that my training approach relies on minibatch updates, i.e. I split a batch of data into several minibatches, and for each minibatch I compute the loss, compute the gradients, update, rinse and repeat. The final gradients for a given loss function are accumulated over all minibatches.
Which means I'd have to repeat the entire minibatch loop once per loss function, which is inefficient, so I wanted to see if there is a better way.
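For reference, here is a minimal sketch of the setup I mean. The model, the two toy losses, and the batch shapes are placeholders, not my real code; the point is the structure: one gradient accumulator per loss, summed over minibatches, using torch.autograd.grad so the per-loss gradients never get mixed together in p.grad:

```python
import torch

# Toy stand-ins for my real model and loss functions (hypothetical)
model = torch.nn.Linear(4, 1)
params = list(model.parameters())

def loss1(model, x):
    return model(x).pow(2).mean()

def loss2(model, x):
    return model(x).abs().mean()

losses = [loss1, loss2]

batch = torch.randn(8, 4)
minibatches = batch.split(2)  # e.g. 4 minibatches of size 2

# One gradient accumulator per loss, matching each parameter's shape
grads = [[torch.zeros_like(p) for p in params] for _ in losses]

for mb in minibatches:
    for i, loss_fn in enumerate(losses):
        # torch.autograd.grad returns the gradients directly instead of
        # writing into p.grad, so each loss's accumulator stays separate
        g = torch.autograd.grad(loss_fn(model, mb), params)
        for acc, gi in zip(grads[i], g):
            acc += gi

# grads[i] now holds the gradient of loss i accumulated over all minibatches
```

Note this already does the forward pass once per loss per minibatch, which is the redundancy I'd like to avoid.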