Weighted L1 loss in parallel training

I define a weighted L1 loss and want to train a model on multiple GPUs.
The weighted L1 loss code is:

import torch
import torch.nn as nn

class Crit_L1_weight_model(nn.Module):
    def __init__(self):
        super(Crit_L1_weight_model, self).__init__()

    def forward(self, weight1, real, fake):
        """
        :param weight1: element-wise weight applied to the L1 difference
        :param real: real_img
        :param fake: fake_img
        :return: scalar weighted L1 loss
        """
        out1 = torch.abs(real - fake)  # element-wise L1 difference
        out1 = out1 * weight1          # apply the per-element weights
        loss = out1.mean()             # reduce to a single scalar
        return loss
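
For reference, on a single GPU this criterion returns a single scalar, so backward() works on it directly. A minimal sanity check (the shapes below are made up purely for illustration):

crit = Crit_L1_weight_model()
real_img = torch.randn(4, 3, 64, 64)                      # hypothetical shapes
fake_img = torch.randn(4, 3, 64, 64, requires_grad=True)
weight = torch.ones(4, 3, 64, 64)

loss = crit(weight, real_img, fake_img)
print(loss.size())   # torch.Size([]), i.e. a single scalar
loss.backward()      # fine, because the loss is a scalar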

I create the criterion and wrap it in DataParallel:

crit_L1_for_input = Crit_L1_weight_model()
crit_L1_for_input = nn.DataParallel(crit_L1_for_input, device_ids=range(opt.num_gpus), output_device=opt.num_gpus-1)

When I run the forward pass of the loss,

g_input = crit_L1_for_input(Weight1, fake_input, real_input) # Weight1, fake_input, real_input are three variables.

I got the error: “RuntimeError: grad can be implicitly created only for scalar outputs”.

Can you give me some suggestions?

Are you sure this happens in the forward? It should happen in the backward, and it occurs because you might be calling x.backward() where x is not a 1-element Variable but has more elements.
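
In this setup the non-scalar value most likely comes from nn.DataParallel itself: each replica returns its own loss, and the gathered result has one element per GPU. A minimal sketch of one way to handle it, reusing the names from the question above, is to reduce the gathered losses to a single scalar before calling backward():

g_input = crit_L1_for_input(Weight1, fake_input, real_input)
g_input = g_input.mean()   # collapse the per-GPU losses into one scalar
g_input.backward()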

Yes, you are right. I have fixed the issue. Thanks very much!