Multitask learning in a self-supervised context

catt_ale · November 19, 2020, 2:10pm

Hello everyone!

My base model is a CycleGAN, and I am reusing some of the generator's layers to feed learned features to a classifier. Training the two networks separately is not a problem. My question is how to build the cost function so that the CycleGAN is optimized based on how well the classifier performs.

Do you think this pseudo-code is the right approach?
Both loss functions optimize the base model w.r.t. the classification result. Which one is better?

import itertools

import torch

basemodel = Mybasemodel()
basemodel.load_state_dict(torch.load('...'))
basemodel.train()

classifier = Classifier()
classifier.train()

loss_basemodel_fn = torch.nn...()
loss_classifier_fn = torch.nn...()

opt_basemodel = torch.optim...(basemodel.parameters())

# create an optimizer that optimizes both networks
param_chain = itertools.chain(classifier.parameters(), basemodel.parameters())
opt_classifier = torch.optim...(param_chain)

for i, data in enumerate(dataset):

    # load the data
    input = data['input']
    label_basemodel = data['label_basemodel']
    label_classifier = data['label_classifier']

    # use part of the base model to get a feature map
    feature_map = basemodel.model[first_layer:N_layer](input)
    # get the prediction
    pred = classifier(feature_map)

    output_base_model = basemodel.model(input)

    opt_basemodel.zero_grad()
    opt_classifier.zero_grad()

    # WILL BOTH OF THESE LOSSES OPTIMIZE MY BASE MODEL? WHICH ONE IS BETTER?
    loss_classifier = loss_classifier_fn(pred, label_classifier)
    # combine the two loss functions so that the base model optimizer
    # takes the classifier's error into account when it optimizes
    # the base model
    loss_basemodel = loss_basemodel_fn(output_base_model, label_basemodel) + loss_classifier

    loss_basemodel.backward()
    loss_classifier.backward()

    opt_basemodel.step()
    opt_classifier.step()
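
One thing to note about the pseudo-code above: loss_classifier is already part of loss_basemodel, so calling backward() on both would either fail on the second call (the graph is freed after the first backward unless retain_graph=True is passed) or count the classifier gradient twice. A possible alternative is a single summed loss, one backward pass, and one optimizer over both parameter sets. The following is only a minimal sketch under stated assumptions, not the actual CycleGAN setup: TinyBase, the layer sizes, the MSE and cross-entropy losses, the SGD optimizer, the random data, and the weight lambda_cls are all hypothetical stand-ins.

```python
import itertools

import torch
from torch import nn

torch.manual_seed(0)


class TinyBase(nn.Module):
    """Hypothetical stand-in for the base model (not a real CycleGAN)."""

    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(8, 16),  # shared feature layers (indices 0 and 1)
            nn.ReLU(),
            nn.Linear(16, 8),  # base-model head (index 2)
        )

    def forward(self, x):
        return self.model(x)


basemodel = TinyBase()
classifier = nn.Linear(16, 3)  # hypothetical classifier on the shared features

loss_basemodel_fn = nn.MSELoss()            # assumed base-model loss
loss_classifier_fn = nn.CrossEntropyLoss()  # assumed classifier loss
lambda_cls = 1.0                            # assumed weight of the classifier term

# One optimizer over both networks, built the same way as param_chain above.
optimizer = torch.optim.SGD(
    itertools.chain(basemodel.parameters(), classifier.parameters()), lr=0.1
)

# Random data standing in for one batch of the real dataset.
x = torch.randn(4, 8)
label_basemodel = torch.randn(4, 8)
label_classifier = torch.randint(0, 3, (4,))

shared_before = basemodel.model[0].weight.detach().clone()

# Run the shared layers once and reuse the result for both heads.
feature_map = basemodel.model[:2](x)
pred = classifier(feature_map)
output_base_model = basemodel.model[2:](feature_map)

loss_classifier = loss_classifier_fn(pred, label_classifier)
loss_total = (
    loss_basemodel_fn(output_base_model, label_basemodel)
    + lambda_cls * loss_classifier
)

optimizer.zero_grad()
loss_total.backward()  # a single backward pass covers both loss terms
optimizer.step()

# The shared layers receive gradients from both terms, so their weights move.
shared_changed = not torch.equal(shared_before, basemodel.model[0].weight)
```

With this layout the classifier term reaches the shared layers through feature_map, and lambda_cls controls how strongly the classification error influences the base model; its value here is arbitrary and would need tuning.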