IndexError: index -9223372036854775808 is out of bounds for dimension 1 with size 2

Hello,

I am trying to train a Siamese network with a binary cross-entropy loss.

I get the following error in train_epoch:

y_true_2[range(y_true_2.shape[0]), y_true.long()] = 1
IndexError: index -9223372036854775808 is out of bounds for dimension 1 with size 2

Here is the relevant code snippet for reference:

import gc

import numpy as np
import torch
import torch.nn as nn


def train_epoch(train_loader, model, loss_fn, optimizer, cuda, log_interval, metrics, logging):
    for metric in metrics:
        metric.reset()

    model.train()
    losses = []
    total_loss = 0

    for batch_idx, ((x0, x1), y) in enumerate(train_loader):

        x0, x1, y_true = x0.cpu(), x1.cpu(), y.cpu()
        gc.collect()
        optimizer.zero_grad()
        output1, output2 = model(x0, x1)

        # Distance metric: pairwise (Euclidean) distance between the two embeddings
        p_dist = torch.nn.PairwiseDistance(keepdim=True)

        dy = p_dist(output1, output2)
        dy = torch.nan_to_num(dy)
        y_true = torch.nan_to_num(y_true)

        # Normalize dy and y_true to the [0, 1] range by dividing by their max values

        maximum_dy = torch.max(dy)
        maximum_dy = torch.nan_to_num(maximum_dy)
        dy = dy / maximum_dy

        maximum_y_true = torch.max(y_true)
        maximum_y_true = torch.nan_to_num(maximum_y_true)

        y_true = y_true / maximum_y_true

        dy = torch.squeeze(dy, 1)

        # Build a [batch_size, 2] prediction tensor and a matching one-hot target for the BCE loss
        input_dy = torch.empty(dy.size(0), 2)
        input_dy[:, 0] = 1 - dy
        input_dy[:, 1] = dy

        # One-hot encode the labels; this is the line that raises the IndexError
        y_true_2 = torch.zeros(dy.size(0), 2)
        y_true_2[range(y_true_2.shape[0]), y_true.long()] = 1

        # Apply a sigmoid to the predictions before the BCE loss
        m = nn.Sigmoid()
        loss = loss_fn(m(input_dy), y_true_2)

        loss.backward()
        optimizer.step()

        losses.append(loss.item())
        total_loss += loss.item()

        # Round the predictions for the accuracy metrics
        input_dy_metric = torch.round(input_dy)

        for metric in metrics:
            metric(input_dy_metric, y_true_2)
            metric.total += y_true_2.shape[0]

        if batch_idx % log_interval == 0:
            message = 'Train: [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                batch_idx, len(train_loader),
                100. * batch_idx / len(train_loader), np.mean(losses))
            for metric in metrics:
                message += '\t{}: {}'.format(metric.name(), metric.value())

            print(message)
            losses = []

    total_loss /= (batch_idx + 1)
    return total_loss, metrics
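
For reference, the one-hot target construction works as expected when the labels are clean. Below is a minimal sketch with a hypothetical batch of 4 binary labels (not the actual data):

import torch

# Hypothetical clean labels for a batch of 4 pairs (0 = dissimilar, 1 = similar)
y_true = torch.tensor([0., 1., 1., 0.])

# Same construction as in train_epoch: scatter 1s into a [batch_size, 2] one-hot tensor
y_true_2 = torch.zeros(y_true.size(0), 2)
y_true_2[range(y_true_2.shape[0]), y_true.long()] = 1
print(y_true_2)
# tensor([[1., 0.],
#         [0., 1.],
#         [0., 1.],
#         [1., 0.]])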

Please help me find a possible solution.
Thanks in advance.

Based on the error message it seems that the target might be uninitialized. Which PyTorch version are you using, and if it's an older one, could you update it to the latest release (1.9.0) or the nightly binary?
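
As a quick check of that theory (a minimal sketch, not part of the original reply): a NaN in a float target tensor is cast by .long() to exactly the reported index on a typical setup:

import torch

nan_target = torch.tensor([float('nan')])
print(nan_target.long())
# tensor([-9223372036854775808]); the cast is undefined for NaN, but this is the usual result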

My torch version is already 1.9.0. Is there any other possible solution for this? @ptrblck

Thanks for the update. Could you try to create a minimal, executable code snippet that reproduces this issue, so that we can debug it further?
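
For example, something along these lines would already narrow it down (a sketch with made-up values; the shapes follow the snippet above, and an all-zero label batch makes the y_true / max step produce NaNs):

import torch

# Hypothetical batch where every pair is labeled 0
y_true = torch.zeros(4)

# Normalization from train_epoch: the max is 0 here, so 0 / 0 yields NaN
maximum_y_true = torch.max(y_true)
y_true = y_true / maximum_y_true

y_true_2 = torch.zeros(4, 2)
y_true_2[range(y_true_2.shape[0]), y_true.long()] = 1
# IndexError: index -9223372036854775808 is out of bounds for dimension 1 with size 2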

I think the entire code is needed to debug this issue. If you need any further information, please let me know. @ptrblck