I would like to implement the paper [2001.06782] Gradient Surgery for Multi-Task Learning.
In summary, I am working in a multi-task learning setting, so there is more than one loss function. I need to compute a separate gradient for each of these losses and then manipulate the gradients depending on their cosine similarity.
In standard PyTorch I can do this using torch.autograd.grad; however, after computing and manipulating these gradients I would like to pass them directly to the optimizer (without recomputing them). How can I do this in PyTorch Lightning? Can I call manual_backward for each loss within a single training_step, or can I pass the summed, pre-computed gradients to the optimizer?
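For reference, here is roughly what I mean in plain PyTorch. This is only a minimal sketch with a toy linear model and two made-up losses (all names here are hypothetical, not from the paper's code); the projection is a simplified one-directional version of the PCGrad idea, just to illustrate computing per-loss gradients, modifying them, and writing the result into `.grad` so the optimizer can step without another backward pass:

```python
import torch

# Toy model with shared parameters and two task losses (hypothetical setup).
model = torch.nn.Linear(4, 2)
params = list(model.parameters())

x = torch.randn(8, 4)
out = model(x)
loss_a = out[:, 0].pow(2).mean()  # task-A loss (made up)
loss_b = out[:, 1].abs().mean()   # task-B loss (made up)

# Per-task gradients without touching .grad; retain_graph=True so the
# shared graph survives for the second autograd.grad call.
grads_a = torch.autograd.grad(loss_a, params, retain_graph=True)
grads_b = torch.autograd.grad(loss_b, params)

# Gradient surgery (simplified, one direction only): if the flattened
# gradients conflict (negative dot product, i.e. negative cosine
# similarity), project grads_a onto the normal plane of grads_b.
flat_a = torch.cat([g.reshape(-1) for g in grads_a])
flat_b = torch.cat([g.reshape(-1) for g in grads_b])
dot = torch.dot(flat_a, flat_b)
if dot < 0:
    flat_a = flat_a - (dot / flat_b.pow(2).sum()) * flat_b

# Write the combined gradient back into each param's .grad and step the
# optimizer -- no extra backward pass needed.
merged = flat_a + flat_b
optimizer = torch.optim.SGD(params, lr=0.1)
optimizer.zero_grad()
offset = 0
for p in params:
    n = p.numel()
    p.grad = merged[offset:offset + n].view_as(p).clone()
    offset += n
optimizer.step()
```

My question is essentially how to express the last part (filling in `.grad` myself and stepping) cleanly inside a LightningModule.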
Thanks for the help