Sound event detection model not learning

Hello everyone,

I am fine-tuning a model for sound event detection, taken from, on the Urbansed dataset. In this task, the model should predict a [batch, n_classes, time_steps] matrix, where a value of one indicates that an event of that class is present at that time step.
However, my network does not seem to train. Specifically, after about the first 10 epochs, my loss stops decreasing. If I check the predictions of the model, the output consists entirely of 0.5s.
I have tried:

  • Changing the amount of L2 regularization, even turning it off completely
  • Changing the learning rate
  • Doing a mock training run with only 2 samples to see if the network could learn that simple problem. The result was the same matrix of 0.5s.
  • Different optimizers (Adam and SGD so far)
  • BCEWithLogitsLoss with reduction='mean' and reduction='sum' (I am using this loss because, in theory, multiple classes can be active at the same time)
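
A quick check on the 0.5 symptom: if those values are probabilities (that is, taken after a sigmoid), they correspond to logits of exactly zero, and binary cross-entropy then equals ln 2 ≈ 0.693 per element regardless of the labels. The sketch below uses only the standard library; the tensor sizes are made-up numbers for illustration, not values from this post.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bce(p, y):
    # Binary cross-entropy for one predicted probability p and one 0/1 label y.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

p = sigmoid(0.0)              # a zero logit gives probability 0.5
per_element = bce(p, 1.0)     # the same value results for y = 0 and y = 1

# Hypothetical sizes, for illustration only.
batch, n_classes, time_steps = 2, 10, 100
plateau_sum = per_element * batch * n_classes * time_steps

print(p, per_element, plateau_sum)
```

If the plateau loss with reduction='sum' is close to 0.693 times the number of elements in a batch, that would suggest the network is emitting zero logits everywhere and is not being updated at all, rather than having settled on the class frequencies.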

My loss and optimizer:

    criterion = nn.BCEWithLogitsLoss(reduction='sum')
    optimizer = optim.SGD(model.parameters(), lr=0.001)
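
One thing worth ruling out with this loss: nn.BCEWithLogitsLoss applies the sigmoid internally, so it expects raw logits. Whether the model in question already ends in a sigmoid is not stated here, so the following is only a sketch of what the mismatch would look like, using random tensors with hypothetical shapes.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical shapes: [batch, n_classes, time_steps].
logits = torch.randn(2, 3, 5)
labels = (torch.rand(2, 3, 5) > 0.5).float()   # targets must be float

with_logits = nn.BCEWithLogitsLoss(reduction='sum')
plain_bce = nn.BCELoss(reduction='sum')

# Correct pairing: raw logits go into BCEWithLogitsLoss ...
loss_a = with_logits(logits, labels)
# ... which matches an explicit sigmoid followed by BCELoss.
loss_b = plain_bce(torch.sigmoid(logits), labels)

# Mismatch: if the model already outputs probabilities, the loss applies a
# second sigmoid, which squeezes every value into a narrow band above 0.5.
double = torch.sigmoid(torch.sigmoid(logits))
print(loss_a.item(), loss_b.item(), double.min().item(), double.max().item())
```

If the network does end in a sigmoid, either removing it or switching to nn.BCELoss would keep the pairing consistent.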

My training loop:

    for i, data in enumerate(dataloader_train):
        inputs, labels = data
        inputs = inputs.type(torch.FloatTensor)
        outputs = model(inputs).cpu()
        loss = 0
        loss = criterion(outputs, labels)
        running_loss += loss.item()

I can’t figure out what’s wrong. Any thoughts?
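
A note on the loop as posted: it computes the loss but shows no optimizer.zero_grad(), loss.backward() or optimizer.step(). If those calls are missing from the real code too, the weights are never updated, which would be consistent with a loss that stays flat. Below is a minimal sketch of a complete loop on two fake samples; the Conv1d model, the sizes, and the choice of Adam with lr=0.01 are assumptions for illustration, not the original network or settings.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Hypothetical stand-in for the real network: maps [batch, n_mels, time_steps]
# to raw logits of shape [batch, n_classes, time_steps].
n_mels, n_classes, time_steps = 8, 3, 20
model = nn.Conv1d(n_mels, n_classes, kernel_size=3, padding=1)

# Two fake samples, mirroring the two-sample overfitting test.
inputs = torch.randn(2, n_mels, time_steps)
labels = (torch.rand(2, n_classes, time_steps) > 0.7).float()

criterion = nn.BCEWithLogitsLoss(reduction='mean')
optimizer = optim.Adam(model.parameters(), lr=0.01)

losses = []
model.train()
for step in range(200):
    optimizer.zero_grad()               # clear gradients from the previous step
    outputs = model(inputs)             # raw logits, no sigmoid here
    loss = criterion(outputs, labels)
    loss.backward()                     # compute gradients
    optimizer.step()                    # update the weights
    losses.append(loss.item())

print(losses[0], losses[-1])
```

On a tiny problem like this the loss should drop steadily; if the same loop wrapped around the real model still stays flat, the cause is more likely in the model or the labels than in the optimizer settings.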

Try viewing your data to see what the model is actually being trained on.

Thank you for your suggestion. I tried visualizing the data and nothing seems out of place; the spectrograms appear correct.
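
Besides the spectrograms, it may be worth checking the labels that actually reach the loss. The helper below is a hypothetical sketch (check_batch is not part of any library) that verifies shape, dtype, and 0/1 values for one batch, and reports how many cells are active.

```python
import torch

def check_batch(outputs, labels):
    """Basic checks on one (outputs, labels) pair before computing the loss."""
    # Both tensors should be [batch, n_classes, time_steps].
    assert outputs.shape == labels.shape, (outputs.shape, labels.shape)
    # BCEWithLogitsLoss needs floating-point targets.
    assert labels.dtype == torch.float32, labels.dtype
    # Labels should contain only 0 and 1.
    assert bool(((labels == 0) | (labels == 1)).all())
    # Fraction of active (class, time step) cells.
    return labels.mean().item()

# Fake batch with hypothetical sizes: [batch, n_classes, time_steps].
outputs = torch.zeros(2, 3, 4)
labels = torch.zeros(2, 3, 4)
labels[0, 1, 2] = 1.0            # one active cell out of 24
active = check_batch(outputs, labels)
print(active)
```

An active fraction of exactly zero, or labels whose axes are in a different order from the outputs, would be worth fixing before looking at the model itself.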

I think you should manually initialize your network's weights and try again.
By the way, what are your loss function, learning rate and optimizer?

Thanks, I will try that, although that would mean I cannot do transfer learning. As for the loss function, optimizer and learning rate, I’m using:

    criterion = nn.BCEWithLogitsLoss(reduction='sum')
    optimizer = optim.SGD(model.parameters(), lr=0.001)

I have also tried Adam, different learning rates, and different L2 regularization values.

Well, try using the Adam optimizer together with the weight initialization (the weight values should be close to zero, but not too small, and uniformly distributed).
If it still does not learn, then try changing the network architecture.
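
On the transfer learning concern raised earlier: re-initialization does not have to touch the pretrained layers. A minimal sketch, assuming the model can be split into a pretrained backbone and a new output layer (both are hypothetical stand-ins here), re-initializes only the output layer with nn.init.xavier_uniform_ and then checks that every parameter receives a gradient after one backward pass.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical model: a "pretrained" backbone followed by a new output layer.
backbone = nn.Conv1d(8, 16, kernel_size=3, padding=1)
head = nn.Conv1d(16, 3, kernel_size=1)
model = nn.Sequential(backbone, nn.ReLU(), head)

backbone_before = backbone.weight.detach().clone()

# Re-initialize only the head; the backbone keeps its weights.
nn.init.xavier_uniform_(head.weight)
nn.init.zeros_(head.bias)

# After one backward pass, every trainable parameter should have a gradient.
x = torch.randn(2, 8, 20)
y = (torch.rand(2, 3, 20) > 0.7).float()
loss = nn.BCEWithLogitsLoss()(model(x), y)
loss.backward()
grad_norms = {name: p.grad.norm().item() for name, p in model.named_parameters()}
print(grad_norms)
```

Gradient norms of zero (or parameters with requires_grad set to False) on the layers that are supposed to be fine-tuned would point to frozen weights or a broken graph rather than to the initialization.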