CNN classification for 4D data: Large number of classes (Training accuracy always zero)

Input Shape: 6 * 512 * 512

The input data has 6 channels.

Number of samples = 50000
classes = 2000

import torch
import torch.nn as nn
import torch.utils.data as D

batch_size = 32
torch.manual_seed(0)

data_loader = D.DataLoader(ds, batch_size=batch_size, shuffle=True, num_workers=8)
val_loader = D.DataLoader(ds_val, batch_size=batch_size, shuffle=True, num_workers=8)
tloader = D.DataLoader(ds_test, batch_size=batch_size, shuffle=False, num_workers=0)

classes = 2000
keep_prob = 0.5  # defined but unused below; nn.Dropout() defaults to p=0.5 anyway

class CNN(torch.nn.Module):

    def __init__(self):
        super(CNN, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((15, 15))
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 15 * 15, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x

model = CNN().cuda(0)

print('Training the Deep Learning network ...')

learning_rate = 0.01
print(learning_rate)
criterion = torch.nn.CrossEntropyLoss().cuda(0)    # Softmax is internally computed.
optimizer = torch.optim.Adam(params=model.parameters(), lr=learning_rate)

train_cost = []
train_accu = []
batch_size = 32
training_epochs = 300
total_batch = df_train.shape[0] // batch_size

print('Size of the training dataset is {}'.format(df_train.shape[0]))
print('Size of the testing dataset is {}'.format(df_test.shape[0]))
print('Batch size is : {}'.format(batch_size))
print('Total number of batches is : {0:2.0f}'.format(total_batch))
print('\nTotal number of epochs is : {0:2.0f}'.format(training_epochs))

def compute_accuracy(Y_target, hypothesis):
    Y_prediction = hypothesis.data.max(dim=1)[1]
    accuracy = ((Y_prediction.data == Y_target.data).float().mean())    
    return accuracy.item()


for epoch in range(training_epochs):
    avg_cost = 0
    for i, (batch_X, batch_Y) in enumerate(data_loader):
        
        # Select a minibatch
        X = batch_X.cuda(0)    # input batch of shape [32, 6, 512, 512]
        Y = batch_Y.cuda(0)    # labels are class indices, not one-hot encoded
        
        # initialization of the gradients
        optimizer.zero_grad()
        
        # Forward propagation: compute the output
        hypothesis = model(X)
        
        # Computation of the cost J
        cost = criterion(hypothesis, Y) # <= compute the loss function
        
        # Backward propagation
        cost.backward() # <= compute the gradients
        
        # Update parameters (weights and biais)
        optimizer.step()
        
        # Print some performance to monitor the training
        train_accu.append(compute_accuracy(Y, hypothesis))
        train_cost.append(cost.item())   
        if i % 200 == 0:
            print("Epoch= {},\t batch = {},\t cost = {:2.4f},\t accuracy = {}".format(epoch+1, i, train_cost[-1], train_accu[-1]))
        
        avg_cost += cost.data / total_batch

    print("[Epoch: {:>4}], averaged cost = {:>.9}".format(epoch + 1, avg_cost.item()))


print('Learning Finished!')

After 80 epochs, the training accuracy is still zero.

Can anyone please check whether I am doing something wrong here? Why is there no improvement in the accuracy?

Update:
The input data has 6 channels.

For batch size 32, the training data shape is:

[32, 6, 512, 512]

I am using Conv2d for this 4D data. Maybe this is wrong.

Can you please tell me how to fix it?
I have no idea how to handle 4D data with a CNN.

What about the cost function? Is it decreasing properly?

I would advise you to use TensorBoard logging to better visualize these values; it's quite easy to use and scales better in the long run. Check the tutorial, and see the sketch below.
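A minimal sketch of what that logging could look like, assuming torch.utils.tensorboard is available (the log directory and tag names here are placeholders):

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir='runs/cnn_2000_classes')  # hypothetical log directory

# Inside the training loop, after computing cost and accuracy:
global_step = epoch * total_batch + i
writer.add_scalar('train/loss', cost.item(), global_step)
writer.add_scalar('train/accuracy', train_accu[-1], global_step)

writer.close()  # flush pending events once training is done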

@alex.veuthey, the loss value always stays in the same range (7.9 to 7.01).

I don't see model.train() in your code; that could explain it. Since you have no evaluation/validation protocol, you can call model.train() at the top of your first for loop (the one iterating over epochs), as in the sketch below.
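For reference, a minimal sketch of where that call would go in the posted loop:

for epoch in range(training_epochs):
    model.train()  # make sure dropout layers are active during training
    avg_cost = 0
    for i, (batch_X, batch_Y) in enumerate(data_loader):
        ...  # training step exactly as in the original code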

Models are in train mode by default, so I don't think that's the issue.

How large are your training and testing datasets? 2000 classes is not that many, but depending on the data, you'll need a dataset of matching size. Is the accuracy exactly 0, or near the random baseline (0.0005)?

The code looks good to me at first glance.
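Those baselines are easy to check: with 2000 balanced classes, random guessing gives an accuracy of 1/2000 = 0.0005 and a cross-entropy loss of ln(2000) ≈ 7.6, which matches the 7.0 to 7.9 range reported above. A quick sanity check:

import math

num_classes = 2000
print(1 / num_classes)         # 0.0005: accuracy of random guessing
print(math.log(num_classes))   # ~7.6: cross-entropy loss of a uniform prediction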


Shouldn't you be doing:

def compute_accuracy(Y_target, hypothesis):
    Y_prediction = hypothesis.data.max(dim=1)[1]
    accuracy = torch.mean((Y_prediction.data == Y_target.data).float())
    return accuracy.item()

Also, I guess you should increase your batch size; your batch size is very small for such a large number of classes.
I was training on ImageNet with a batch size of 10 and this happened to me; I increased the batch size to 64 and all was good again!
Give that a try for sure.

Also, you have very little data for that huge number of classes! 50K samples for 2K classes means you have only 25 samples per class, and you are not finetuning either! You are also discarding lots of information in the earlier layers and have a relatively shallow network; the earlier layers only get you primitive filters. Instead, decrease the input size but downsample much less frequently with a deeper net, add finetuning, and you should get better results imho (see the sketch below).
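A minimal sketch of that finetuning idea, assuming torchvision is available; the 6-channel first conv and the 2000-class head are adaptations for the data described above, and initializing the extra channels by repeating the pretrained filters is just one common heuristic, not the only option:

import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)  # deeper than the posted net, downsamples more gradually

# Adapt the first conv from 3 to 6 input channels, reusing the pretrained
# filters for initialization (repeated and halved to keep activations scaled).
old_conv = model.conv1
new_conv = nn.Conv2d(6, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    new_conv.weight.copy_(old_conv.weight.repeat(1, 2, 1, 1) / 2)
model.conv1 = new_conv

# Replace the classification head for 2000 classes.
model.fc = nn.Linear(model.fc.in_features, 2000)
model = model.cuda(0)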

@alex.veuthey, @mmisiur, @Shisho_Sama

Maybe I gave you the wrong information.

I just forgot to mention that my data is 4D.

For batch size 32, the training data shape is:

[32, 6, 512, 512]

I am using Conv2d for this 4D data. Maybe this is wrong.

Can you please tell me how to fix it?
I have no idea how to handle 4D data with a CNN.

Please help!

Your data is 4D, not 6D. A batched input of shape [batch, channels, height, width] is exactly what Conv2d expects; the 6 is just the number of input channels, so using Conv2d here is correct.
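A quick check that the posted shapes work with Conv2d as-is:

import torch
import torch.nn as nn

x = torch.randn(32, 6, 512, 512)  # [batch, channels, height, width]
conv = nn.Conv2d(6, 64, kernel_size=11, stride=4, padding=2)  # first layer of the posted model
print(conv(x).shape)  # torch.Size([32, 64, 127, 127])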