Hi, I’m trying to calculate the MSELoss between two tensors whose shapes print as torch.Size(), torch.Size(), as follows:
```python
criterion = nn.MSELoss(reduction='sum')

for epoch in range(5):
    total_loss = 0
    n_batches = 0
    for batch_idx, data in enumerate(train_dataloader):
        s1, s2, labels, s1_length, s2_length = data
        optimizer.zero_grad()
        outputs, att1, att2 = model(s1, s2, s1_length, s2_length)
        print(outputs.shape)  # torch.Size()
        print(labels.shape)   # torch.Size()
        loss = criterion(outputs, labels) + pen_s1 + pen_s2
        print(loss)
        loss.backward()
```
I keep getting this error: grad can be implicitly created only for scalar outputs, despite passing the reduction='sum' argument to nn.MSELoss. Could someone tell me where I am going wrong?
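For what it’s worth, here is a minimal sketch of what I understand so far (the shapes are made up, since mine are elided above): with reduction='sum', MSELoss by itself returns a 0-dim scalar, so backward() works on it. But if one of the added penalty terms (like pen_s1 or pen_s2) is not a scalar, the combined loss becomes non-scalar and backward() raises exactly this error:

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss(reduction='sum')

# Hypothetical shapes, standing in for the elided torch.Size() prints above.
outputs = torch.randn(4, 3, requires_grad=True)
labels = torch.randn(4, 3)

# MSELoss with reduction='sum' alone gives a 0-dim scalar.
loss = criterion(outputs, labels)
print(loss.shape)  # torch.Size([]) -> scalar, backward() is fine
loss.backward()

# A hypothetical non-scalar penalty term makes the sum non-scalar:
# the scalar loss broadcasts against pen, giving shape (4,).
pen = torch.randn(4, requires_grad=True)
bad_loss = criterion(outputs, labels) + pen
print(bad_loss.shape)  # torch.Size([4])
# bad_loss.backward()  # raises: grad can be implicitly created only
#                      # for scalar outputs
```

So my suspicion is that the reduction is fine and one of the penalty tensors needs to be reduced (e.g. with .sum()) before being added, but I’m not sure.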