Hi all,
I am trying to calculate and manually modify gradients for a resnet50 model that outputs predictions for two classification tasks (a primary task that I care about and an auxiliary task). The model is modified so that the forward pass returns the predictions for both tasks, and each task is trained with its own cross-entropy loss.
With PyTorch's default backprop, my current training loop works and looks like this:
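optimizer.zero_grad()
outputs = net(inputs)  # returns two outputs: (primary predictions, auxiliary predictions)
loss_primary = train_gt_criterion(outputs[0], gt)
loss_aux = train_bin_criterion(outputs[1], bin)
loss_primary.backward(retain_graph=True)  # keep the shared graph alive for the second backward pass
loss_aux.backward()  # accumulates into .grad on top of the primary gradients
optimizer.step()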
However, I would like to modify the gradients of the auxiliary task with a function of both gradients (e.g. a weighted cosine similarity), and I was wondering how I could do this in PyTorch.
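To make the idea concrete, here is a rough sketch of one possible reading of "weighted cosine": scale the auxiliary gradient by its cosine similarity with the primary gradient, clipped at zero so that a conflicting auxiliary gradient is dropped. The name `weighted_cosine` and the clipping rule are assumptions for illustration only; the actual function may well differ.

```python
import torch
import torch.nn.functional as F


def weighted_cosine(aux_grad, primary_grad):
    """Hypothetical helper (an assumption, not a fixed definition).

    Scales aux_grad by max(0, cos(aux_grad, primary_grad)), where the
    cosine similarity is computed over the flattened tensors.
    """
    cos = F.cosine_similarity(aux_grad.flatten(), primary_grad.flatten(), dim=0)
    # Clip negative similarities to zero so opposing gradients are ignored.
    return aux_grad * torch.clamp(cos, min=0.0)


# Small usage example with hand-picked gradients.
primary = torch.tensor([1.0, 0.0])
aligned = weighted_cosine(torch.tensor([2.0, 0.0]), primary)    # same direction: kept as-is
opposed = weighted_cosine(torch.tensor([-1.0, 0.0]), primary)   # opposite direction: zeroed
```

The helper works per tensor, so it could be applied parameter by parameter or to one concatenated gradient vector, depending on which granularity is wanted.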
I know that in TensorFlow you can do something like this to pass a modified gradient:
import tensorflow as tf

with tf.GradientTape(persistent=True) as tape:  # persistent because gradient() is called twice
    primary_loss = primary_function(x)
    auxiliary_loss = auxiliary_function(x)
primary_grad = tape.gradient(primary_loss, x)
auxiliary_grad = tape.gradient(auxiliary_loss, x)
# Dummy function that combines the gradients from my primary and auxiliary tasks
new_grad = modify_gradient(auxiliary_grad, primary_grad)
optimizer.apply_gradients([(primary_grad + lam * new_grad, x)])
Is it possible to do something along these lines in PyTorch? Specifically, I want to take the gradients computed from the primary and auxiliary losses separately and use both of them to modify the auxiliary gradient before the optimizer step. Right now I don't see how to do this with the way I've written the code, since each call to backward() accumulates directly into the parameters' .grad.
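A minimal sketch of what this could look like in PyTorch, assuming `torch.autograd.grad` is used to obtain the two sets of gradients separately instead of calling `backward()` twice. The `TwoHeadNet` stand-in, the placeholder `modify_gradient`, the `train_step` wrapper, and the value of `lam` are all hypothetical and only there to make the example self-contained; they are not the actual resnet50 setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TwoHeadNet(nn.Module):
    """Tiny stand-in for the modified resnet50: shared body, two heads."""

    def __init__(self):
        super().__init__()
        self.body = nn.Linear(8, 16)
        self.head_primary = nn.Linear(16, 5)
        self.head_aux = nn.Linear(16, 2)

    def forward(self, x):
        h = torch.relu(self.body(x))
        return self.head_primary(h), self.head_aux(h)


def modify_gradient(aux_grad, primary_grad):
    # Placeholder (assumption): returns the auxiliary gradient unchanged.
    # A weighted-cosine rule or similar would go here.
    return aux_grad


def train_step(net, optimizer, criterion, inputs, gt, bin_labels, lam):
    params = [p for p in net.parameters() if p.requires_grad]
    optimizer.zero_grad()
    out_primary, out_aux = net(inputs)
    loss_primary = criterion(out_primary, gt)
    loss_aux = criterion(out_aux, bin_labels)

    # Per-loss gradients, returned as tuples instead of accumulated in .grad.
    # retain_graph=True keeps the shared graph for the second call;
    # allow_unused=True because each head only affects one of the losses.
    primary_grads = torch.autograd.grad(
        loss_primary, params, retain_graph=True, allow_unused=True
    )
    aux_grads = torch.autograd.grad(loss_aux, params, allow_unused=True)

    for p, g_p, g_a in zip(params, primary_grads, aux_grads):
        # Parameters not reached by a loss get None; treat that as zero.
        g_p = torch.zeros_like(p) if g_p is None else g_p
        g_a = torch.zeros_like(p) if g_a is None else g_a
        # Write the combined gradient where the optimizer expects it.
        p.grad = g_p + lam * modify_gradient(g_a, g_p)

    optimizer.step()
    return loss_primary.item(), loss_aux.item()


net = TwoHeadNet()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()
lam = 0.5  # hypothetical weight on the modified auxiliary gradient

inputs = torch.randn(4, 8)
gt = torch.randint(0, 5, (4,))
bin_labels = torch.randint(0, 2, (4,))

losses = train_step(net, optimizer, criterion, inputs, gt, bin_labels, lam)
```

With the identity placeholder, this step is equivalent to calling `backward()` on `loss_primary + lam * loss_aux`; the point of the structure is that `modify_gradient` sees both gradients per parameter before anything is written to `.grad`.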
Thank you for your time and help!
jschen