Calculating MSELoss over 2 tensors - errors

Hi, I’m trying to calculate MSELoss between two tensors, each of size torch.Size([64]), as follows:

    import torch.nn as nn

    criterion = nn.MSELoss(reduction='sum')

    for epoch in range(5):
        total_loss = 0
        n_batches = 0
        for batch_idx, data in enumerate(train_dataloader):
            s1, s2, labels, s1_length, s2_length = data
            optimizer.zero_grad()
            outputs, att1, att2 = model(s1, s2, s1_length, s2_length)
            print(outputs.shape)  # torch.Size([64])
            print(labels.shape)   # torch.Size([64])
            # pen_s1, pen_s2 are penalty terms computed elsewhere
            loss = criterion(outputs, labels) + pen_s1 + pen_s2
            print(loss)
            loss.backward()

I keep getting this error: grad can be implicitly created only for scalar outputs, despite passing the reduction='sum' argument to nn.MSELoss. Could someone tell me where I am going wrong with this?
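
For reference, this minimal sketch (independent of my model, just plain tensors) produces the same message:

    import torch

    t = torch.randn(3, requires_grad=True)
    out = t * 2     # out has 3 elements, not a scalar
    out.backward()  # RuntimeError: grad can be implicitly created only for scalar outputs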

This line makes me wonder:

    loss = criterion(outputs, labels) + pen_s1 + pen_s2

Could you elaborate on pen_s1 and pen_s2?

I’m asking because loss.backward() works when loss contains a single element, so something must be happening there :wink:

Does loss.sum().backward() work?
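
As a rough sketch of why that can help (standalone tensors again, not your model):

    import torch

    t = torch.randn(3, requires_grad=True)
    loss = t * 2           # non-scalar: loss.backward() would raise the error above
    loss.sum().backward()  # works: .sum() reduces the loss to a scalar first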


Hey, it turns out my pen_s1 implementation was returning a matrix instead of a scalar. It worked once I used pen_s1.mean().
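
In case it helps anyone else, here is a small sketch of what was happening (the shapes are made up for illustration):

    import torch

    mse = torch.tensor(0.5)     # scalar, like MSELoss(reduction='sum') returns
    pen_s1 = torch.randn(4, 4)  # penalty accidentally returned as a matrix
    loss = mse + pen_s1         # broadcasting makes loss a 4x4 matrix, not a scalar
    print(loss.shape)           # torch.Size([4, 4])

    loss = mse + pen_s1.mean()  # reducing the penalty first keeps loss scalar
    print(loss.shape)           # torch.Size([])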

Thanks for the reply, made me look again!
