Applying the gradient of a task onto another task for identifying task groupings for multitask learning

Hello,

I am a complete newbie at PyTorch. I am trying to implement the “Efficiently Identifying Task Groupings for Multi-Task Learning” paper for a bioinformatics task. As far as I understand, it requires applying the gradient of one task onto the others and calculating an inter-task affinity score based on how that applied gradient changes each other task's loss.

My model consists of a hard-parameter-sharing backbone that splits into 30-40 individual sub-modules, one per task.

Can I apply the gradient of one of the tasks onto another? And if so how?

Thank you in advance

My model is like this:

import torch
import torch.nn as nn

class NN(nn.Module):
    def __init__(self, input_size, num_classes):
        super().__init__()
        self.num_classes = num_classes
        # Shared backbone (hard parameter sharing)
        self.shared_network = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(input_size, 256),
            nn.BatchNorm1d(num_features=256),
            nn.ReLU(),
            nn.Dropout(0.7),
        )
        # One small head per task
        self.sub_networks = nn.ModuleList()
        for _ in range(self.num_classes):
            self.sub_networks.append(nn.Sequential(
                nn.Linear(256, 32),
                nn.Linear(32, 1),
                nn.Sigmoid(),
            ))

    def forward(self, x):
        outputs = []
        representation = self.shared_network(x)
        for sub_network in self.sub_networks:
            outputs.append(sub_network(representation))
        return torch.cat(outputs, dim=1)
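To answer my own question partially: here is a minimal sketch of how I understand the paper's lookahead idea, assuming a smaller stand-in backbone and per-task sigmoid heads like the model above (the `task_affinity` helper, the layer sizes, and the BCE loss are my assumptions, not the paper's exact code). The gradient of task i's loss is taken with respect to the shared parameters only, a one-step update is applied, task j's loss is re-evaluated, and the step is undone:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical small stand-in for the shared backbone + per-task heads above.
class NN(nn.Module):
    def __init__(self, input_size, num_classes):
        super().__init__()
        self.shared_network = nn.Sequential(nn.Linear(input_size, 16), nn.ReLU())
        self.sub_networks = nn.ModuleList(
            nn.Sequential(nn.Linear(16, 1), nn.Sigmoid()) for _ in range(num_classes)
        )

    def forward(self, x):
        representation = self.shared_network(x)
        return torch.cat([head(representation) for head in self.sub_networks], dim=1)

def task_affinity(model, x, y, i, j, lr=1e-2):
    """Affinity of task i onto task j: how a one-step lookahead update of the
    *shared* parameters, using task i's gradient, changes task j's loss."""
    out = model(x)
    loss_i = F.binary_cross_entropy(out[:, i], y[:, i])
    loss_j_before = F.binary_cross_entropy(out[:, j], y[:, j]).item()

    shared_params = list(model.shared_network.parameters())
    grads = torch.autograd.grad(loss_i, shared_params)

    with torch.no_grad():
        # Lookahead: apply task i's gradient step to the shared parameters only.
        for p, g in zip(shared_params, grads):
            p -= lr * g
        loss_j_after = F.binary_cross_entropy(model(x)[:, j], y[:, j]).item()
        # Undo the lookahead step so training state is unchanged.
        for p, g in zip(shared_params, grads):
            p += lr * g

    # Positive when task i's update helped task j (loss went down).
    return 1.0 - loss_j_after / loss_j_before
```

With dropout and batch norm in the real backbone, I would call `model.eval()` around the affinity computation so the two forward passes are comparable; averaging this score over training steps gives the inter-task affinity matrix used for grouping.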