Weighted L1 loss in parallel training

I define a weighted L1 loss and want to train a model on multiple GPUs.
The weighted L1 loss code is:

import torch
import torch.nn as nn

class Crit_L1_weight_model(nn.Module):
    def __init__(self):
        super(Crit_L1_weight_model, self).__init__()

    def forward(self, weight1, real, fake):
        """
        :param weight1: element-wise weight applied to the L1 difference
        :param real: real_img
        :param fake: fake_img
        :return: scalar weighted L1 loss
        """
        out1 = torch.abs(real - fake)  # element-wise L1 difference
        out1 = out1 * weight1          # apply the per-element weights
        loss = out1.mean()             # reduce to a single scalar
        return loss
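
For reference, on a single GPU this criterion returns a single scalar, so backward() works on it directly. A minimal sanity check (the shapes below are made up purely for illustration):

crit = Crit_L1_weight_model()
real_img = torch.randn(4, 3, 64, 64)                      # hypothetical shapes
fake_img = torch.randn(4, 3, 64, 64, requires_grad=True)
weight = torch.ones(4, 3, 64, 64)

loss = crit(weight, real_img, fake_img)
print(loss.size())   # torch.Size([]), i.e. a single scalar
loss.backward()      # fine, because the loss is a scalar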

I create the criterion and wrap it in DataParallel:

crit_L1_for_input = Crit_L1_weight_model()
crit_L1_for_input = nn.DataParallel(crit_L1_for_input, device_ids=range(opt.num_gpus), output_device=opt.num_gpus-1)

When I run the forward pass of the loss,

g_input = crit_L1_for_input(Weight1, fake_input, real_input) # Weight1, fake_input, real_input are three variables.

I got the error: “RuntimeError: grad can be implicitly created only for scalar outputs”.

Can you give me some suggestions?

Are you sure this happens in the forward? It should happen in the backward, and it occurs because you might be calling x.backward() where x is not a 1-element Variable but has more elements.
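
In this setup the non-scalar value most likely comes from nn.DataParallel itself: each replica returns its own loss, and the gathered result has one element per GPU. A minimal sketch of one way to handle it, reusing the names from the question above, is to reduce the gathered losses to a single scalar before calling backward():

g_input = crit_L1_for_input(Weight1, fake_input, real_input)
g_input = g_input.mean()   # collapse the per-GPU losses into one scalar
g_input.backward()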

Yes, you are right. I have fixed the issue. Thanks very much!