Training on 4 GPUs: batch size seen by the loss becomes batch_size * 4

My code:

batch_size = 64
model = nn.DataParallel(model, device_ids=[0, 1, 2, 3], dim=0)
criterion = nn.BCEWithLogitsLoss()
criterion = criterion.cuda()

for s, t in loader:
    logits = model(s, t)
    loss = model.module.compute_loss(logits, t, criterion)

When compute_loss runs, it raises a ValueError saying that logits.shape is (256, num_classes) but t.shape is (64, num_classes). Why does this happen?


I think your code is missing some important bits.
Could you give a small code sample that we can run that shows the issue please?

Thank you very much :grinning:, I have solved this problem.
The problem was in my dataloader function. When loading data I pad the sequences, but I forgot to convert the padded lists into tensors, so nn.DataParallel could not split the data correctly along the batch dimension. :sweat_smile:
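
For anyone who lands here with the same shape mismatch, here is a minimal sketch of the kind of fix described above. As far as I understand, nn.DataParallel only chunks tensors along `dim`; plain Python lists of numbers are handed to every replica unchanged, which would explain each of the 4 GPUs producing 64 rows (4 * 64 = 256). The function name `pad_collate`, the `(token_ids, target)` sample layout, and the toy data are assumptions for illustration, not the original dataloader.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def pad_collate(batch, pad_value=0):
    """Hypothetical collate_fn: pad variable-length inputs and return tensors.

    Assumes each sample is a (token_ids, target) pair, where token_ids is a
    variable-length list of ints and target is a fixed-length list of floats.
    """
    seqs = [torch.tensor(ids, dtype=torch.long) for ids, _ in batch]
    targets = [torch.tensor(tgt, dtype=torch.float) for _, tgt in batch]
    # Pad to the longest sequence in the batch; result is (batch, max_len).
    s = pad_sequence(seqs, batch_first=True, padding_value=pad_value)
    # Stack targets into a single (batch, num_classes) tensor.
    t = torch.stack(targets)
    return s, t


# Toy batch of 3 samples with 2 classes each (made-up data).
batch = [([1, 2, 3], [1.0, 0.0]), ([4, 5], [0.0, 1.0]), ([6], [1.0, 1.0])]
s, t = pad_collate(batch)
print(s.shape, t.shape)
```

It would be passed as `DataLoader(dataset, batch_size=64, collate_fn=pad_collate)`. With both `s` and `t` as tensors whose first dimension is the batch, DataParallel can split them along dim 0 and the gathered logits should line up with the targets again.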
