ResNet for binary classification

I have modified a resnet18 network as follows:

model = torchvision.models.resnet18()
model.conv1 = nn.Conv2d(num_input_channel, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.avgpool = nn.AdaptiveAvgPool2d(1)
model.fc = nn.Linear(512 * torchvision.models.resnet.BasicBlock.expansion,2)

I use nn.CrossEntropyLoss() as the loss function and provide the labels as class indices (0 or 1), but the performance is very poor (worse than a dummy classifier). I would like to make sure that the ResNet modification is correct for binary classification.

Are you using a pretrained model?
Are you modifying the net after loading the weights?
Have you trained the model after modifying it?
Why are you changing model.conv1?

No, I don't use pretrained models, so the training is from scratch.
I have modified model.conv1 to have a single channel input.
I have trained the model with these modifications, but the predicted labels always favor one of the classes, so it cannot go beyond 50% accuracy; since my training and test data are balanced, the classifier effectively does nothing.

I have also seen the issue "Binary Classification on Resnet?" on the pytorch/vision GitHub, but the solution is not clear to me.

how do you initialize the weights of the layers you added?
Do you normalize the inputs?
How many examples do you have?
Do you shuffle your training set?

I didn't do any specific initialization; I just use resnet18(), and I think it handles the weight initialization itself.
I didn't do any normalization since my inputs are not actual images; they are very sparse 784x162 matrices. Almost all values are zero except a few, which hold real values. Could that be a reason?
I have 2400 samples for training, which is probably very small for such networks, but the results are quite far from what I expected.
I have shuffled the training data.
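On the normalization point: even for non-image inputs, standardizing the values can help optimization. A minimal sketch (the sparse 784x162 matrices below are synthetic stand-ins for the real data):

```python
import torch

# Synthetic stand-in: mostly-zero 784x162 matrices with a few real values,
# mimicking the sparse inputs described above.
x = torch.zeros(8, 1, 784, 162)
mask = torch.rand_like(x) < 0.001
x[mask] = torch.randn(int(mask.sum()))

# Standardize with statistics computed over the training data.
# The small eps guards against division by zero.
mean, std = x.mean(), x.std()
x_norm = (x - mean) / (std + 1e-8)
```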

  1. Please verify that the weights of the layers you edited (conv1 and fc) are indeed initialized to non-zero values.
  2. 2,400 examples is very little data for training a large model from scratch. Can you obtain more data?

I have checked the values with model.fc.weight and model.conv1.weight, and they are initialized to non-zero values.

I am asking for more data, but the data generation process is somewhat expensive, I think. The other problem is the sparseness of the matrices: do you think ResNet works well with sparse data?
Thank you

The only modification you really need is the linear layer, which you have already changed, so that should be fine. Maybe it's an issue with your dataset?

I have implemented ResNet-34 (as well as 50, 101, and 152) with some slight modifications from there, and it works fine for binary classification, so I don't think it's an issue with the architecture. I have an example here (for binary classification on gender labels, getting ~97% accuracy):

I suggest trying your implementation on a different dataset where you know you should get good results, to see if there is an implementation bug.

You might need to put the ResNet's batch norm layers into eval mode. This made a massive difference for me when using ResNet as a feature extractor.

I don’t think using eval mode for BatchNorm when training from scratch is a good idea.

No, certainly not. I missed the fact that you were training from scratch, my mistake.

Yes, I think the problem is the size of the dataset. Thanks for your suggestion; I'll try another standard dataset.

I just saw above that you only have 2400 examples, which could be the main reason, as you suggest.

Almost all values being 0 could be a problem, but it's probably not the main reason; MNIST images also contain lots of 0's. Another thing, besides the small dataset size, is that 784x162 is very large for a convnet (typically, even for images, standard ResNets for e.g. face recognition operate on images between ~60x60 and ~200x200).

Since you are mentioning that these are not images, I wonder if it is a tabular dataset, in which case you might be better off using a network with only a few (e.g., 1-3) fully connected layers with dropout.
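The suggestion above might look like this minimal sketch (the hidden sizes and dropout rate are arbitrary choices; the 784*162 input features and 2 output classes match the problem in the thread):

```python
import torch
import torch.nn as nn

# A small fully connected network with dropout for tabular data.
model = nn.Sequential(
    nn.Flatten(),                  # 1x784x162 -> 127008 features
    nn.Linear(784 * 162, 256),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(256, 64),
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(64, 2),              # two-class logits for CrossEntropyLoss
)

x = torch.randn(4, 1, 784, 162)
out = model(x)
print(out.shape)  # torch.Size([4, 2])
```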


Thank you so much for your suggestions. Yes, my data is tabular with integer values.
I have switched to using 1 or 2 CNN layers followed by 1-3 fully connected layers. Thank you.

Why do you say this? I am training a DenseNet from scratch, and I check the validation accuracy by switching the model to .eval().

The DenseNet implementation is here, and it has batch norm layers in it (which are also active during training, I believe):

This is not the issue here; clearly one must use eval mode for validation/testing.
The issue here is using eval mode for training as well: it is common practice to use eval mode when fine-tuning, but not when training from scratch.

Yes, for some architectures it might not matter, but ResNet has BatchNorm layers, so the model should be set to train() during training so that these layers update their statistics correctly, and to eval() for testing so that they are not updated on the test set.


There are other layers besides BatchNorm that behave differently in train and eval mode; for instance, dropout layers or layers with spectral norm. Therefore it's very good practice to always set the model to eval() when testing and to train() when training.
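The pattern described here, in minimal form (Dropout is used as the example layer; the same switching applies to BatchNorm, spectral norm, etc.):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 10), nn.Dropout(0.5), nn.Linear(10, 2))
x = torch.ones(1, 10)

model.train()            # training mode: dropout is active (stochastic output)
train_out = model(x)

model.eval()             # eval mode: dropout is a no-op (deterministic output)
with torch.no_grad():    # also skip autograd bookkeeping at test time
    out1 = model(x)
    out2 = model(x)
# In eval mode, repeated forward passes give identical outputs.
```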

Yes of course. BatchNorm was just an example specific to ResNet