I’m trying to train a neural network on a problem that I framed as multi-label classification. After reading a lot about the problem, everything is almost working. However, I got bad results when evaluating the model, and I realized that the training loss is very low (almost zero) starting from the second training iteration.
- Loss function: BCEWithLogitsLoss with a linear output layer
- Optimizer: Adam
- Note 1: The dataset is very sparse: I have over 14k labels but only about 10 true labels in every sample. Does this affect the result?
- Note 2: As I was getting a memory error when loading the whole dataset, I had to train on part of the data, save the model and optimizer, and re-load them to train on the next part. (I’ve tried all the other solutions in these topics, but none of them worked.)
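For reference, my loss setup looks roughly like this (model and dimensions simplified from my actual code; the input size, batch size, and random targets here are just placeholders):

```python
import torch
import torch.nn as nn

num_labels = 14000  # over 14k labels in my real data
batch = torch.randn(4, 128)            # placeholder input batch
model = nn.Linear(128, num_labels)     # linear output layer, no sigmoid
criterion = nn.BCEWithLogitsLoss()     # applies the sigmoid internally

# Each sample has ~10 positive labels out of 14k, so the multi-hot
# target matrix is extremely sparse.
targets = torch.zeros(4, num_labels)
for i in range(4):
    pos = torch.randperm(num_labels)[:10]
    targets[i, pos] = 1.0

loss = criterion(model(batch), targets)
loss.backward()
```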
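And this is a simplified sketch of how I save and restore the model and optimizer between the two parts of the data (file name and model simplified; my real model is larger):

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 14000)  # simplified stand-in for my actual model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# After training on the first part of the data:
torch.save(
    {"model": model.state_dict(), "optimizer": optimizer.state_dict()},
    "checkpoint.pt",
)

# Before training on the next part:
ckpt = torch.load("checkpoint.pt")
model.load_state_dict(ckpt["model"])
optimizer.load_state_dict(ckpt["optimizer"])
```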
Any ideas would be much appreciated!