L1_loss error RuntimeError: cudaEventSynchronize in future::wait: device-side assert triggered

This error suddenly appeared, and I have checked the shapes of the l1_loss parameters. Can anyone help? @ptrblck
My code is as follows:

class RegL1Loss(nn.Module):
    def __init__(self):
        super(RegL1Loss, self).__init__()

    def forward(self, output, mask, ind, target):
        pred = _transpose_and_gather_feat(output, ind)
        mask = mask.unsqueeze(2).expand_as(pred).float()

        # size_average is deprecated; reduction='sum' is the equivalent
        loss = F.l1_loss(pred * mask, target * mask, reduction='sum')

        # debug output
        print('pred:', pred)
        print('mask:', mask)
        print('target:', target)
        print(pred.shape, mask.shape, target.shape)
        temp1 = pred * mask
        temp2 = target * mask
        print(temp1, temp2)
        print(temp1.shape, temp2.shape)

        loss = loss / (mask.sum() + 1e-4)
        return loss

The error points to an invalid index in gather, most likely thrown by _transpose_and_gather_feat.

Assertion `indexValue >= 0 && indexValue < src.sizes[dim]` failed
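A quick way to catch this before the opaque device-side assert fires is to validate the indices on the host. Below is a minimal sketch, assuming the CenterNet-style layout where the feature map has been flattened to (B, H*W, C) and `ind` holds spatial indices into the H*W dimension; `gather_feat` here is a hypothetical stand-in for the gather step inside `_transpose_and_gather_feat`, not the actual implementation:

```python
import torch

def gather_feat(feat, ind):
    # feat: (B, H*W, C), ind: (B, K) -- assumed shapes.
    # Check index bounds on the CPU/host, where the error message is readable;
    # on CUDA an out-of-range index only surfaces as a device-side assert.
    assert ind.min() >= 0 and ind.max() < feat.size(1), (
        f"index out of bounds: max={ind.max().item()}, limit={feat.size(1)}"
    )
    # Expand ind to (B, K, C) so gather picks whole channel vectors.
    ind = ind.unsqueeze(2).expand(ind.size(0), ind.size(1), feat.size(2))
    return feat.gather(1, ind)

feat = torch.arange(12.0).view(1, 6, 2)        # B=1, H*W=6, C=2
ok = gather_feat(feat, torch.tensor([[0, 5]]))  # valid indices
print(ok.shape)                                 # torch.Size([1, 2, 2])
```

Running the same call with an index of 6 or more would trip the assertion immediately, pointing straight at the bad value instead of at a later, unrelated CUDA call.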

Do you mean the gather may go out of bounds? I will check it, thanks very much.

@ptrblck Thank you very much, I have solved this problem. As you said, I found an out-of-bounds issue in the _transpose_and_gather_feat function and fixed it. Thanks again :grinning:
