UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
  warnings.warn('Was asked to gather along dimension 0, but all '
You are probably using DataParallel but returning a scalar in the network. You should return a batched output.
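To illustrate what triggers the warning, here is a minimal sketch (the ToyModel class and shapes are hypothetical, and it assumes a CUDA machine): when the loss is computed inside forward and reduced to a scalar, each DataParallel replica hands back a 0-dim tensor, and gather has to unsqueeze them into a vector of length num_gpus.

import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(8, 2)
        self.criterion = nn.CrossEntropyLoss()  # reduces to a 0-dim scalar

    def forward(self, x, labels):
        logits = self.linear(x)
        return self.criterion(logits, labels)   # scalar loss per replica

model = nn.DataParallel(ToyModel()).cuda()
x = torch.randn(16, 8).cuda()
labels = torch.randint(0, 2, (16,)).cuda()
loss = model(x, labels)
print(loss.shape)  # with multiple GPUs: (num_gpus,), after unsqueeze-and-gather

Returning a "batched output" means returning something that still has a batch dimension, e.g. per-sample losses (reduction='none') or the raw logits, and doing the reduction outside the DataParallel wrapper.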
I faced the same issue, and you're right, I am using DataParallel. Could you please elaborate on how to tackle this?
Hey Simon. Could you please share how to return a batched output? Here's my fine_tune function:
import os
import torch
from tqdm import tqdm

# Step 4: Fine-tuning
def fine_tune(model, train_loader, optimizer, num_epochs, save_dir):
    model.train()
    for epoch in range(num_epochs):
        progress_bar = tqdm(enumerate(train_loader), total=len(train_loader),
                            desc=f"Epoch {epoch+1}/{num_epochs}", unit="batch")
        for batch_idx, batch in progress_bar:
            optimizer.zero_grad()
            # `device` is defined elsewhere, e.g. torch.device("cuda")
            input_ids = batch['input_ids'].to(device)
            attention_mask = batch['attention_mask'].to(device)
            labels = batch['label'].to(device)
            outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
            loss = outputs.loss
            loss.backward()
            optimizer.step()
            # Update progress bar
            progress_bar.set_postfix({'loss': loss.item()}, refresh=True)
        # Save model weights after each epoch
        save_path = os.path.join(save_dir, f"model_epoch_{epoch+1}.pt")
        torch.save(model.state_dict(), save_path)
I'm aware I need to do loss.sum().backward() to combine the losses gathered from all GPUs.
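As a sketch against the variables in the loop above (assuming model is wrapped in nn.DataParallel and outputs.loss comes back as a vector with one entry per GPU), the loss lines could become the following. Note the method is backward(), not backwards(), and mean() is usually preferable to sum() so the gradient scale doesn't grow with the number of GPUs.

outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
loss = outputs.loss
# Under DataParallel the gathered loss can be a vector (one entry per GPU);
# reduce it to a 0-dim scalar before calling backward().
if loss.dim() > 0:
    loss = loss.mean()  # or loss.sum(), if you really want the summed loss
loss.backward()
optimizer.step()

After the reduction, loss.item() in set_postfix keeps working unchanged, since loss is a scalar again.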