ROC-AUC is high but PR-AUC value is very low

ina · December 17, 2019, 7:47pm

Hello,

I am working on DNA sequences data and using CNN. My dataset is hugely imbalanced.
positive class samples (~500)
negative class samples (~150,000)

So I am using WeightedRandomSampler to oversample and balance the classes before feeding the data to the DataLoader.

I use 5-fold cross-validation. In a few test runs I got a decent ROC-AUC, but the PR-AUC is very low.

For fold 1:
roc auc 0.9667848699763594
precision auc 0.055329116326074484

For fold 2:
roc auc 0.8476321207961566
precision auc 0.03307627288669479

For fold 3:
roc auc 0.9528898540612085
precision auc 0.05020178518546394
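
For context, a classifier that ranks samples at random has an expected PR-AUC of roughly the fraction of positives, so these values are better read against that baseline than against 0.5. A small sketch of the arithmetic, assuming each test fold keeps about the same class ratio as the full dataset (the counts are the approximate ones quoted above):

```python
# Approximate class counts quoted in the post.
n_pos = 500
n_neg = 150_000

# Expected PR-AUC of a random ranking is about the positive prevalence.
baseline = n_pos / (n_pos + n_neg)

# PR-AUC values reported for the three folds.
fold_pr_auc = [0.055329116326074484, 0.03307627288669479, 0.05020178518546394]

# How many times better than the random baseline each fold is.
lift = [score / baseline for score in fold_pr_auc]

print(f"baseline PR-AUC ~ {baseline:.4f}")
print("lift over baseline:", [round(x, 1) for x in lift])
```

Under that assumption the folds sit roughly 10 to 17 times above the random baseline, which is still low in absolute terms but not at chance level.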

I suspect that there are a lot of false negatives. Since there are far fewer positive samples (~500) than negative samples (~150,000), the model learns the negative class better and predicts most of the test samples as negative.

I tried weighing the positive class using
weight = [50.0]
class_weight = torch.FloatTensor(weight).to(device)
criterion = nn.BCEWithLogitsLoss(pos_weight=class_weight)
By doing this, almost all samples are predicted as positive.
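
pos_weight multiplies the loss of every positive sample. One plausible reason for the all-positive predictions is that WeightedRandomSampler already presents the classes in roughly equal proportion, so an extra factor of 50 tilts the loss heavily toward the positive class. That is an assumption about the setup, not something confirmed in the post. A minimal sketch of the scaling, using made-up logits for a single-output model:

```python
import torch
import torch.nn as nn

# Made-up single-logit outputs for three positive samples (illustrative only).
logits = torch.tensor([0.3, -1.2, 2.0])
targets = torch.ones(3)

# Same samples, with and without the positive-class weight from the post.
plain = nn.BCEWithLogitsLoss()(logits, targets)
weighted = nn.BCEWithLogitsLoss(pos_weight=torch.tensor([50.0]))(logits, targets)

# For positive targets the weighted loss is pos_weight times the plain loss.
ratio = (weighted / plain).item()
print(plain.item(), weighted.item(), ratio)
```

If the batches are already balanced by the sampler, a pos_weight close to 1 may be a more sensible starting point.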

I tried adaptive learning rates as well, but the precision-recall values do not improve.
Can someone suggest ideas for improving the precision and recall?

Thanks!

ptrblck · December 18, 2019, 3:13am

That’s generally a tough problem.
Could you post the confusion matrix, so that we get a feeling about the predictions?

ina · December 18, 2019, 2:47pm

I understand.
Since the imbalance is so high, the model predicts almost all samples as negative (class 0).
A sample of my confusion matrices:
[[1023 0]
[ 1 0]]

[[1022 0]
[ 2 0]]

[[1018 0]
[ 6 0]]
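
Confusion matrices like these depend on the decision threshold (0.5, or the argmax over two classes), while ROC-AUC and PR-AUC depend only on how the scores rank the samples. A minimal sketch with synthetic scores (not the thread's data), assuming scikit-learn is available, of how a model can rank every positive above every negative and still predict no positives at 0.5:

```python
import numpy as np
from sklearn.metrics import average_precision_score, confusion_matrix, roc_auc_score

# Synthetic batch: 1020 negatives and 4 positives (illustrative, not real data).
y_true = np.array([0] * 1020 + [1] * 4)
rng = np.random.default_rng(0)
scores = np.concatenate([
    rng.uniform(0.00, 0.10, size=1020),  # negatives receive low scores
    rng.uniform(0.15, 0.30, size=4),     # positives score higher, but below 0.5
])

# Threshold-free ranking metrics.
roc = roc_auc_score(y_true, scores)
ap = average_precision_score(y_true, scores)

# Threshold-dependent confusion matrices (rows: true class, columns: predicted).
cm_default = confusion_matrix(y_true, (scores >= 0.5).astype(int), labels=[0, 1])
cm_lowered = confusion_matrix(y_true, (scores >= 0.12).astype(int), labels=[0, 1])

print(roc, ap)
print(cm_default)
print(cm_lowered)
```

Choosing the threshold on a validation split, for example from the precision-recall curve, is one way to turn a reasonable ranking into usable class predictions.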

ptrblck · December 18, 2019, 7:16pm

I would try to lower the impact of the imbalance a bit and subsample the negative samples before training to e.g. 15000 samples (or even lower), and then retry the WeightedRandomSampler.

Btw. could you post the code snippet you are using to create the class counts, weights, and WeightedRandomSampler?
I would like to make sure nothing goes wrong there, as your model still overfits badly on the negative class.
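
The subsampling step suggested above could be sketched as follows. `subsample_negatives` is a hypothetical helper, the toy arrays stand in for the training split, and the labels are assumed to be 0 for negative and 1 for positive:

```python
import numpy as np

def subsample_negatives(X, y, n_neg_keep, seed=0):
    """Keep all positives and a random subset of negatives (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    n_keep = min(n_neg_keep, len(neg_idx))
    keep_neg = rng.choice(neg_idx, size=n_keep, replace=False)
    # Shuffle so positives and negatives are interleaved.
    keep = rng.permutation(np.concatenate([pos_idx, keep_neg]))
    return X[keep], y[keep]

# Toy data: 1000 samples with 10 positives.
X = np.arange(2000, dtype=np.float32).reshape(1000, 2)
y = np.zeros(1000, dtype=np.int64)
y[:10] = 1

X_sub, y_sub = subsample_negatives(X, y, n_neg_keep=100)
print(X_sub.shape, np.bincount(y_sub))
```

This would be applied to the training split only; the validation and test folds should keep the original class ratio so that ROC-AUC and PR-AUC stay comparable.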

ina · December 18, 2019, 7:23pm

Yeah sure.

class_sample_count = np.array([len(np.where(Ytr == t)[0]) for t in np.unique(Ytr)])
weight = 1. / class_sample_count
samples_weight = np.array([weight[t] for t in Ytr])
samples_weight = torch.from_numpy(samples_weight)
sampler = torch.utils.data.sampler.WeightedRandomSampler(samples_weight.type('torch.DoubleTensor'), len(samples_weight))
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, num_workers=1, sampler=sampler)

Maybe I will try to under sample negative sets and try to see how it behaves.
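
One way to sanity-check a sampler like the one above is to count the labels it actually yields over an epoch. A minimal sketch on toy data (the tensors and sizes are made up), using the same inverse-class-frequency weighting:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

torch.manual_seed(0)

# Toy imbalanced labels: 9900 negatives, 100 positives.
y = np.array([0] * 9900 + [1] * 100)
X = torch.randn(len(y), 4)

# Per-sample weight = 1 / (count of that sample's class).
class_count = np.bincount(y)
sample_weight = torch.from_numpy(1.0 / class_count[y])

sampler = WeightedRandomSampler(sample_weight, num_samples=len(sample_weight), replacement=True)
loader = DataLoader(TensorDataset(X, torch.from_numpy(y)), batch_size=500, sampler=sampler)

# Count how often each class is drawn during one pass over the loader.
seen = torch.zeros(2, dtype=torch.long)
for _, labels in loader:
    seen += torch.bincount(labels, minlength=2)

pos_fraction = seen[1].item() / seen.sum().item()
print(seen.tolist(), round(pos_fraction, 3))
```

A positive fraction close to 0.5 indicates the sampler is balancing the batches as intended.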

ptrblck · December 18, 2019, 7:28pm

The code looks alright. Let me know, how the experiments worked out.

ina · December 18, 2019, 10:52pm

Sure. Thank you for your response.

ina · December 20, 2019, 8:49pm

Hello!

I tried downsampling too. The results are still the same.

ptrblck · December 20, 2019, 9:41pm

Ok, thanks for the information.
Let’s maybe try to scale down the problem and have a minimal working version.
Could you post the model definition, so that we could use it as a starter?
If that’s not possible due to licensing etc., feel free to create a “similar” dummy model.

ina · December 20, 2019, 11:38pm

Yeah sure. Thanks for your help.
Here it is

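import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset


class ClDataset(Dataset):
    def __init__(self, X, Y):
        # Store features as float64; the model casts to float32 in forward().
        self.x = torch.from_numpy(np.asarray(X, np.float64))
        self.y = torch.from_numpy(np.asarray(Y))

    def __getitem__(self, index):
        return self.x[index], self.y[index]

    def __len__(self):
        return len(self.y)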


class ClCNN(nn.Module):

    def __init__(self):
        super(ClCNN, self).__init__()
        # convolutional layer
        self.layer11 = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=(4, 3), stride=1, padding=(2, 2)),
            nn.BatchNorm2d(8),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))

        self.layer12 = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=(4, 5), stride=1, padding=(2, 2)),
            nn.BatchNorm2d(8),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2))

        self.layer21 = nn.Sequential(
            nn.Conv2d(8, 16, kernel_size=(3, 2), stride=1, padding=(2, 2)),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=1, stride=1))

        self.layer22 = nn.Sequential(
            nn.Conv2d(8, 16, kernel_size=(3, 3), stride=1, padding=(2, 2)),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=1, stride=1))

        self.fc1 = nn.Linear(6944, 512)
        self.fc2 = nn.Linear(512, 2)

    def forward(self, x):
        x = x.float()
        out11 = self.layer11(x)
        out12 = self.layer12(x)

        out21 = self.layer21(out11)
        out21 = out21.reshape(out21.size(0), -1)
        out22 = self.layer22(out12)
        out22 = out22.reshape(out22.size(0), -1)
        
        out = torch.cat((out21, out22), 1)
        out = self.fc1(out)
        out = F.dropout(out, p=0.5, training=self.training)
        out = F.log_softmax(self.fc2(out), dim=-1)

        return out

ina · December 20, 2019, 11:40pm

I am using SGD optimizer and CrossEntropyLoss with lr=0.001.

Maybe I should try to tune my hyperparameters. Could you share an example of how to use Hyperopt or Hypersearch with a PyTorch CNN?

ptrblck · December 21, 2019, 12:06am

Thanks for the code!
nn.CrossEntropyLoss applies F.log_softmax and nn.NLLLoss internally, so could you please remove the F.log_softmax from your model or use nn.NLLLoss as the criterion, and rerun the experiment?
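
The two pairings mentioned here can be checked numerically. A minimal sketch with random tensors standing in for the output of the last linear layer: either return raw logits and use nn.CrossEntropyLoss, or return log-probabilities and use nn.NLLLoss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Random stand-ins for the raw outputs of fc2 and the class labels.
logits = torch.randn(8, 2)
target = torch.randint(0, 2, (8,))

# Option 1: raw logits + CrossEntropyLoss.
loss_ce = nn.CrossEntropyLoss()(logits, target)

# Option 2: log-probabilities + NLLLoss.
loss_nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), target)

# Positive-class probability, usable as the score for ROC and PR curves.
pos_score = F.softmax(logits, dim=1)[:, 1]

print(loss_ce.item(), loss_nll.item())
```

Both options give the same loss value, so the choice is mostly about keeping the model output and the criterion consistent.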

ina · December 21, 2019, 1:02am

Aah, OK. I am running it now and will update you in a while. Thank you for your suggestions so far.

ina · December 21, 2019, 1:49am

I did 2 runs so far and the results are definitely better:

roc auc 0.8597332329593886
precision auc 0.1152646061940882

roc auc 0.8950204024201491
precision auc 0.29831894063835274

Thank you so much!!

ptrblck · December 21, 2019, 1:50am

Puh, I was running out of ideas and couldn’t believe that the model is overfitting that much even with weighted sampling.
Feel free to post updates on your experiments, as I’m always interested in these imbalanced cases.

ina · December 21, 2019, 1:52am

Sure.
I got this using undersampling. I will also try WeightedRandomSampler and let you know. Thank you once again!

ina · December 21, 2019, 1:55am

Do you think a hyperparameter optimization technique could improve the results? I am reading about techniques like Hypersearch and Hyperopt, but since I am new to DL I have not been able to get a grip on them. Could you post a sample implementation of Hypersearch or Hyperopt for a PyTorch CNN?

ptrblck · December 21, 2019, 1:57am

These techniques might help, but I’m unfortunately inexperienced in this topic and just played around with some architecture search methods. So let’s better wait for some experts and their opinion.
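
As a starting point while waiting for better answers, here is a plain random search sketch. It is not Hyperopt or Hypersearch, only the simplest baseline that those libraries improve on. `train_and_score` is a hypothetical stand-in that should train the CNN with the given settings and return the validation PR-AUC; here it returns a dummy value so the sketch runs. The search ranges are illustrative assumptions as well.

```python
import math
import random

def sample_config(rng):
    """Draw one hyperparameter configuration (ranges are illustrative assumptions)."""
    return {
        "lr": 10 ** rng.uniform(-4, -1),              # log-uniform learning rate
        "batch_size": rng.choice([32, 64, 128, 256]),
        "dropout": rng.uniform(0.1, 0.6),
    }

def train_and_score(config):
    """Hypothetical stand-in: train the model with `config` and return the
    validation PR-AUC. A dummy formula is used here so the example executes."""
    return -abs(math.log10(config["lr"]) + 3) - abs(config["dropout"] - 0.5)

def random_search(n_trials, seed=0):
    """Evaluate n_trials random configurations and return the best one."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        config = sample_config(rng)
        trials.append((train_and_score(config), config))
    best_score, best_config = max(trials, key=lambda t: t[0])
    return best_score, best_config, trials

best_score, best_config, trials = random_search(n_trials=20)
print(best_config)
```

With a real `train_and_score`, the score should come from a validation fold that keeps the original class ratio, not from the resampled training data.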

ina · December 21, 2019, 1:59am

Oh, okay, sure.

ina · December 25, 2019, 7:49am

Hey! Merry Christmas, I hope your holidays are going well!

In case you are working this week, I would like to ask a question. I managed to get the PR-AUC to around 23%. Do you have any ideas for increasing it further? For example:

making the network more complex or simpler,
or increasing the batch size, etc.