Help with Multiclassification CNN

Hi all,
I currently have a set of over 10,000 1-D images, each with 3832 pixels. I’d like to create a neural network that will categorize each one into one of 136 classes. I have created a CNN model and a training loop. I selected cross entropy for my loss, and since my labels are one-hot encoded, I wrote a function, one_hot_ce_loss, that converts the one-hot labels into the input format CrossEntropyLoss expects. When I run this, my loss does not decrease. I also calculate accuracy at the end of the code snippet, and that stays at 0%. Any suggestions on what might be going wrong would be appreciated!

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.act1 = nn.ReLU()
        self.pool1 = nn.MaxPool1d(kernel_size=5)
        self.conv2 = nn.Conv1d(3832, 766, kernel_size=5, stride=1, padding=1)
        self.act2 = nn.ReLU()
        self.pool2 = nn.MaxPool1d(kernel_size=5)
        self.flat = nn.Flatten(0,1)
        self.fc3 = nn.Linear(116432, 136) 
        self.act3 = nn.ReLU()
        self.fc4 = nn.Linear(136, 136)
        self.act4 = nn.ReLU()

    def forward(self, x):
        x = self.act1(nn.Conv1d(x.size()[0], 3832, kernel_size=5, stride=1, padding=1)(x))
        x = self.pool1(x)
        x = self.act2(self.conv2(x))
        x = self.pool2(x)
        x = self.flat(x)
        x = self.act3(self.fc3(x))
        x = self.act4(self.fc4(x))
        return x

model = CNN()
param = model.parameters()
def one_hot_ce_loss(outputs, targets):
    criterion = nn.CrossEntropyLoss()
    _, labels = torch.max(targets, dim=0)
    return criterion(outputs, labels)
loss_fn = one_hot_ce_loss
optimizer = torch.optim.SGD(param, lr=0.01, momentum=0.9)

device = "cuda" if torch.cuda.is_available() else "cpu"
next(model.parameters()).device
model.to(device)

train_size = int(0.8 * len(full_dataset)) #80% training data
test_size = len(full_dataset) - train_size
trainset, testset = torch.utils.data.random_split(full_dataset, [train_size, test_size])

batch_size = 1024
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=True)

n_epochs = 500
epoch_count = [] 
train_loss_values = []
test_loss_values = []
acc_values = []
for epoch in range(n_epochs):
    model.train()
    for inputs, labels in trainloader:
        # forward, backward, and then weight update
        y_pred = model(inputs)
        loss = loss_fn(y_pred, labels[0])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
    model.eval()
    acc = 0
    count = 0
    for inputs, labels in testloader:
        test_pred = model(inputs)
        test_loss = loss_fn(test_pred, labels[0])
        acc += (torch.argmax(nn.LogSoftmax(test_pred).dim) == torch.argmax(labels[0])).float()
        count += len(labels[0])
    acc /= count
    if epoch % 10 == 0:
        print(f"Epoch: {epoch} | Train loss: {loss} | Test loss: {test_loss}")
        print("Epoch %d: model accuracy %.2f%%" % (epoch, acc*100))
    train_loss_values.append(loss.detach().numpy())
    test_loss_values.append(test_loss.detach().numpy())
    epoch_count.append(epoch)
    acc_values.append(acc)

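One potential issue I see is that you are reinitializing your first conv layer on every forward pass. Because that nn.Conv1d is constructed inside forward, it gets fresh random weights on each call, and its parameters never appear in model.parameters(), so the optimizer cannot update them. From my intuition and also this post, that is causing a learning issue.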
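
To illustrate, here is a minimal sketch (not your model; the class names and layer sizes are made up for the example) comparing a layer built inside forward with one registered in __init__. Only the registered layer shows up in parameters(), which is what the optimizer receives.

```python
import torch
import torch.nn as nn


class ConvInForward(nn.Module):
    """Hypothetical module that builds its conv layer inside forward."""

    def forward(self, x):
        # A brand-new layer with fresh random weights on every call.
        conv = nn.Conv1d(1, 4, kernel_size=5, padding=1)
        return conv(x)


class ConvInInit(nn.Module):
    """Hypothetical module that registers its conv layer once."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(1, 4, kernel_size=5, padding=1)

    def forward(self, x):
        return self.conv(x)


x = torch.randn(2, 1, 32)  # [batch, channels, length]

bad, good = ConvInForward(), ConvInInit()

# Number of parameter tensors the optimizer would receive.
n_bad = len(list(bad.parameters()))
n_good = len(list(good.parameters()))
print(n_bad, n_good)

# Same input, two calls: the unregistered layer changes between calls,
# the registered layer does not.
bad_repeatable = torch.equal(bad(x), bad(x))
good_repeatable = torch.equal(good(x), good(x))
print(bad_repeatable, good_repeatable)
```

The first module exposes no parameters at all, so nothing in it can be trained no matter how the loss and optimizer are set up.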

I have also tried something like this, if that’s what you’re suggesting, and it didn’t change anything. If it’s not, could you clarify what you’re suggesting?

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv1d(1360, 3832, kernel_size=5, stride=1, padding=1)
        self.act1 = nn.ReLU()
        self.pool1 = nn.MaxPool1d(kernel_size=5)
        self.conv2 = nn.Conv1d(3832, 766, kernel_size=5, stride=1, padding=1)
        self.act2 = nn.ReLU()
        self.pool2 = nn.MaxPool1d(kernel_size=5)
        self.flat = nn.Flatten(0,1)
        self.fc3 = nn.Linear(116432, 136) 
        self.act3 = nn.ReLU()
        self.fc4 = nn.Linear(136, 136)
        self.act4 = nn.ReLU()

    def forward(self, x):
        x = self.act1(self.conv1(x))
        x = self.pool1(x)
        x = self.act2(self.conv2(x))
        x = self.pool2(x)
        x = self.flat(x)
        x = self.act3(self.fc3(x))
        x = self.act4(self.fc4(x))
        return x


What you changed did address the issue I was referring to; however, if the model is still not learning, there is more to fix. Perhaps I am confused, but with the first conv layer:

self.conv1 = nn.Conv1d(1360, 3832, kernel_size=5, stride=1, padding=1)

Do your images have 1360 input channels?

I admit, I was a bit confused about that as well. I found that the code wouldn’t run unless that number was set to the number of images in the batch. So if the batches include 1360 images, I set that number to 1360. Perhaps I am using the batches wrong?

(My images only have one channel, as far as I understand.)

If I try with 1:

RuntimeError: Given groups=1, weight of size [3832, 1, 5], expected input[1, 1360, 3832] to have 1 channels, but got 1360 channels instead

Ah, I looked into it and it seems like conv1d needs the shape [batch, 1, 3832]. I changed that, and now it’s saying I don’t have enough memory, lol.
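
For anyone hitting the same error, here is a small sketch of that shape fix. It assumes each batch arrives as a [batch, 3832] tensor; the 16 output channels and the batch of 8 are arbitrary values chosen for illustration, not taken from my model.

```python
import torch
import torch.nn as nn

batch = torch.randn(8, 3832)       # assumed layout: [batch, length]

# Conv1d expects [batch, channels, length]; add a channel dim of size 1.
x = batch.unsqueeze(1)             # now [8, 1, 3832]

# in_channels is the number of channels per image (1 here), not the batch size.
conv = nn.Conv1d(in_channels=1, out_channels=16, kernel_size=5, padding=1)
out = conv(x)

print(x.shape, out.shape)
```

With in_channels=1 the layer no longer depends on the batch size, so the same model works for any batch.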

Seems you are moving in the right direction, glad you solved your bug. FYI, reducing the batch size can help with memory issues.
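
As a rough back-of-the-envelope check on why memory runs out, the sketch below estimates the size of the first conv layer's output alone. It assumes that layer is nn.Conv1d(1, 3832, kernel_size=5, padding=1) on float32 inputs of length 3832, which is what the weight shape in your error message suggests. The helper names are made up, and real usage will be higher once gradients and the other layers are counted.

```python
def conv1d_out_len(length, kernel_size, stride=1, padding=0):
    """Output length of a 1-D convolution (dilation 1)."""
    return (length + 2 * padding - kernel_size) // stride + 1


def activation_bytes(batch_size, channels, length, bytes_per_elem=4):
    """Bytes needed to store one activation tensor (float32 by default)."""
    return batch_size * channels * length * bytes_per_elem


out_len = conv1d_out_len(3832, kernel_size=5, padding=1)

# Compare the batch size from the original snippet with smaller ones.
for batch_size in (1024, 64, 8):
    gib = activation_bytes(batch_size, 3832, out_len) / 2**30
    print(f"batch_size={batch_size:5d}: about {gib:.1f} GiB for conv1 output")
```

The estimate scales linearly with batch size and also with the number of output channels, so a first layer with far fewer than 3832 output channels would cut memory as well.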

Thanks for helping me work through potential issues, and for the note on batch size. I will give it a try!

You might also want to consider removing the last activation to return the raw logits.
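
A minimal sketch of what that looks like at the loss, assuming the model outputs a [batch, 136] tensor and the one-hot labels have the same shape. The random logits and the four class ids are placeholders, not your data.

```python
import torch
import torch.nn as nn

num_classes = 136
batch_size = 4                       # small hypothetical batch

# Stand-in for the output of the last Linear layer, with no ReLU after it.
logits = torch.randn(batch_size, num_classes)

# Hypothetical one-hot labels, one row per sample: [batch, num_classes].
class_ids = torch.tensor([3, 135, 0, 42])
one_hot = nn.functional.one_hot(class_ids, num_classes).float()

# CrossEntropyLoss applies log-softmax internally, so it takes raw logits
# together with class indices. argmax over dim=1 recovers one index per row.
targets = one_hot.argmax(dim=1)
loss = nn.CrossEntropyLoss()(logits, targets)

# Accuracy compares predicted and true class indices per sample.
accuracy = (logits.argmax(dim=1) == targets).float().mean()
print(loss.item(), accuracy.item())
```

A ReLU on the final layer clamps every negative score to zero, which limits what the loss can push down. Note also that with [batch, 136] labels the argmax has to run over dim=1; over dim=0 it would return one index per class rather than one per sample.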

Thanks for the suggestion! I’ll try that as well.