Why RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed

I implemented the code as shown below:

import torch
import torch.nn as nn
from torch.autograd import Variable

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2))

        self.dense = nn.Sequential(
            nn.Linear(16*16*128, 1024),
            nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(1024, 10)
        )

    def forward(self, input):
        output = self.conv(input)
        output = output.view(-1, 16*16*128)
        output = self.dense(output)

        return output

model = Model()
print(model)

cost = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

n_epochs = 5

for epoch in range(n_epochs):
    running_loss = 0.0
    running_correct = 0
    print("Epoch {}/{}".format(epoch, n_epochs))
    print("-" * 10)
    for data in data_loader_train:
        images, labels = data
        images, labels = Variable(images), Variable(labels)
        outputs = model(images)
        _,pred = torch.max(outputs.data, 1)
        optimizer.zero_grad()
        loss = cost(outputs, labels)
        print(loss)

        loss.backward()
        optimizer.step()
        running_loss += loss.data.item()
        running_correct += torch.sum(pred == labels.data)
    
    testing_correct = 0
    for data in data_loader_test:
        images, labels = data
        images, labels = Variable(images), Variable(labels)
        outputs = model(images)
        _,pred = torch.max(outputs.data, 1)
        testing_correct += torch.sum(pred == labels.data)
    print("Loss is:{:.4f}, Train Accuracy is:{:.4f}%, Test Accuracy is:{:.4f}".format(running_loss/len(data_train), 100*running_correct/len(data_train), 100*testing_correct/len(data_test)))

But I got an error like this (line 94 is loss = cost(outputs, labels)):

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed.  at C:\w\1\s\tmp_conda_3.7_055457\conda\conda-bld\pytorch_1565416617654\work\aten\src\THNN/generic/ClassNLLCriterion.c:94

How can I solve this problem? Could you help me? Thank you!

Could you check the values of your labels?
They should contain class indices in the range [0, nb_classes - 1], which is [0, 9] based on your output layer.
Apparently some batches contain label indices outside of this range.

Thanks for your solution! Sorry, I'm a newcomer to PyTorch; how can I check the values of the labels and make sure all batches contain label indices within this range?

It depends a bit on how you’ve created the Dataset or the targets in particular.
E.g., if you are dealing with image data stored in one folder per class, torchvision.datasets.ImageFolder will read the data and create the appropriate targets.
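
For reference, a minimal sketch of that case, assuming the images live under a root directory with one subfolder per class (the path below is a placeholder):

from torchvision import datasets, transforms

# ImageFolder assigns target indices 0 .. num_classes - 1 based on the
# alphabetically sorted class subfolder names.
dataset = datasets.ImageFolder(
    root="./my_images/",              # hypothetical path
    transform=transforms.ToTensor())
print(dataset.classes)                # list of class folder names
print(dataset.class_to_idx)           # mapping: folder name -> target index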

You could check the range of the labels by printing them in the DataLoader loop, e.g. simply by:

print(labels.min())
print(labels.max())

If you could post your Dataset, we could have a look at it here.
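
As a quick sanity check you could also scan the targets of the whole dataset once, before training. A minimal sketch, assuming your dataset object is called data_train and exposes its labels via a .targets attribute (as the torchvision datasets do):

import torch

targets = torch.as_tensor(data_train.targets)  # all training labels as a tensor
print(targets.min(), targets.max())            # should be 0 and nb_classes - 1
print(targets.unique().numel())                # number of distinct classes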

Thanks for your help. I checked the range of the labels: tensor(2), tensor(98). It looks like the labels are far beyond the expected range.
I implemented my Dataset as shown below:

import torch
from torchvision import datasets, transforms

train_transformations = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

data_train = datasets.CIFAR100(root="./data/", transform=train_transformations, train=True, download=True)

test_transformations = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

data_test = datasets.CIFAR100(root="./data/", transform=test_transformations, train=False)
data_loader_train = torch.utils.data.DataLoader(dataset=data_train, batch_size=64, shuffle=True)
data_loader_test = torch.utils.data.DataLoader(dataset=data_test, batch_size=64, shuffle=True)

Thank you!

CIFAR100 contains images for 100 classes (that's what the 100 stands for), so you need to change the number of output units in your last linear layer to 100. If you want to use just 10 classes, you should use CIFAR10 instead.
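
A minimal sketch of the corrected classifier head (only the final layer changes, the rest of your model stays the same):

self.dense = nn.Sequential(
    nn.Linear(16 * 16 * 128, 1024),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(1024, 100)  # 100 output units, one per CIFAR-100 class
)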


It works! I really appreciate your help! I will try harder to learn PyTorch!!


Happy to help, and don't hesitate to ask questions here in case you get stuck somewhere. 😉

Hi, can you please show code for how to shift labels so they start at 0 instead of 1? Mine start at 1.
Thanks.

Sure, you can just subtract 1 from the tensor:

x = torch.randint(1, 10, (20,))
x = x - 1
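
Alternatively, you could shift the labels on the fly inside the training loop, so every batch already arrives zero-based before it reaches the criterion. A minimal sketch (the variable names follow the training loop earlier in this thread):

for images, labels in data_loader_train:
    labels = labels - 1           # shift 1-based labels into [0, nb_classes - 1]
    outputs = model(images)
    loss = cost(outputs, labels)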