Problem with DataParallel outputs

When I use DataParallel, I find that the first dimension of the outputs is batch_size * gpu_nums, and when I calculate the loss, this error comes up:

ValueError: Expected input batch_size (32) to match target batch_size (16).

My code is:

from torch.nn import DataParallel

# replicate the model across `gpus`; outputs are gathered on gpus[0]
model = DataParallel(model, device_ids=gpus, output_device=gpus[0])
model.to(config.cuda_id)
outputs = model(input_ids, token_type_ids, attention_mask)
loss = loss_fct(outputs, labels.cuda(config.cuda_id))

I think the first dimension of the outputs should be batch_size.
How can I fix this? Can anybody help? Thanks!

Thanks! Based on some other reports of this error, such as "ValueError: Expected input batch_size (1) to match target batch_size (64)", https://stackoverflow.com/questions/56719867/pytorch-expected-input-batch-size-12-to-match-target-batch-size-64, and "ValueError: Expected input batch_size (324) to match target batch_size (4)", there is likely a bug in how you've defined the shapes in the implementation of your forward pass.
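
A common way this shape bug shows up (a hypothetical sketch, not code from your model) is a view/reshape with hardcoded trailing dimensions, which silently folds the extra elements into the batch axis:

import torch

x = torch.randn(16, 8, 4)  # (batch, seq_len, hidden) as seen by one replica

# Buggy: the hardcoded trailing dim folds seq/hidden into the batch axis,
# turning (16, 8, 4) into (32, 16) -- the batch appears to double.
bad = x.view(-1, 16)
print(bad.shape)  # torch.Size([32, 16])

# Safer: derive the batch dimension from the input itself.
good = x.view(x.size(0), -1)
print(good.shape)  # torch.Size([16, 32])

Note how the buggy version reproduces exactly your mismatch: the loss would see an input batch of 32 against a target batch of 16.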

If your model works without DataParallel but breaks with it, the cause is likely that your model implicitly hardcodes the batch size it expects, probably near the beginning of the forward pass (maybe somewhere around the call to self.bert()).
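
As a hedged sketch of the usual fix (the Classifier class, self.fc, and the pooling step here are assumptions for illustration, not your actual code): read the batch size off the incoming tensor instead of off a config value, because DataParallel gives each replica only batch_size / len(gpus) examples:

import torch
import torch.nn as nn

class Classifier(nn.Module):
    # Hypothetical head on top of an encoder such as self.bert().
    def __init__(self, hidden=768, num_labels=2):
        super().__init__()
        self.fc = nn.Linear(hidden, num_labels)

    def forward(self, hidden_states):
        # hidden_states: (per_gpu_batch, seq_len, hidden)
        # Wrong under DataParallel: hidden_states.view(config.batch_size, -1),
        # since each replica only sees a fraction of the full batch.
        pooled = hidden_states[:, 0]            # first-token ([CLS]) pooling
        batch = hidden_states.size(0)           # derive batch from the input
        return self.fc(pooled.view(batch, -1))  # (per_gpu_batch, num_labels)

Once each replica returns the correct per-GPU shape, DataParallel gathers the outputs on output_device, so the first dimension of outputs matches the full batch_size again and the loss computes cleanly.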