@ptrblck it seems I was too quick to judge that this method works. When I used it now, all val_preds are stuck at 0 instead.
Here’s the code:
class Classifier(nn.Module):
    def __init__(self, n_class, batch_size):
        super(Classifier, self).__init__()
        self.batch_size = batch_size
        self.transformer = VisionTransformer()
        self.criterion = nn.BCEWithLogitsLoss(reduce=False)  # weighted loss

    def forward(self, X, labels, mask):
        out = self.transformer(X)
        labels = torch.tensor(labels, dtype=torch.float32)  # we need float labels for BCEWithLogitsLoss
        weight = torch.tensor([0.2, 0.8])  # is this correct assignment of weights?
        weight_ = weight[labels.data.view(-1).long()].view_as(labels)
        m = nn.Sigmoid()
        with torch.cuda.amp.autocast():
            loss = self.criterion(m(out[:,1] - out[:,0]), labels.cuda())
            loss_class_weighted = loss * weight_.cuda()
            loss_class_weighted = loss_class_weighted.mean()
            loss = loss_class_weighted
        pred_labels = out.data.max(1)[1]
        # pred_labels = out.argmax(dim=1)
        labels = labels.int()
        return pred_labels, labels, loss
Do you know what accounts for all val_preds getting stuck at 0 or previously at 1?
Also:
- Are the weights I have selected correct if class 0 is 20% of the data and class 1 is 80% of the data? (See the first sketch after this list.)
weight = torch.tensor([0.2, 0.8])
- I am not exactly sure what the logic is behind
out[:,1] - out[:,0]
proposed by mMagmer. (See the second sketch after this list.)
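To double-check the weight indexing for myself, here is a minimal sketch (with made-up labels, not my real data) of what the indexing in forward() does:

import torch

# made-up mini-batch of labels, just to see which weight each class picks up
labels = torch.tensor([0., 1., 1., 0., 1.])
weight = torch.tensor([0.2, 0.8])

# same indexing as in forward(): weight[label] for every sample
weight_ = weight[labels.view(-1).long()].view_as(labels)
print(weight_)  # tensor([0.2000, 0.8000, 0.8000, 0.2000, 0.8000])

So with weight = torch.tensor([0.2, 0.8]), the majority class (class 1, 80% of my data) ends up with the larger per-sample weight.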
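As for out[:,1] - out[:,0]: my understanding (which may be wrong) is that it collapses the two logits into a single logit for class 1, since sigmoid(z1 - z0) equals the softmax probability of class 1 for a two-class output. A quick check with random logits (not my real outputs):

import torch

out = torch.randn(4, 2)  # random 2-class logits, only for the check
p_from_diff = torch.sigmoid(out[:, 1] - out[:, 0])
p_from_softmax = torch.softmax(out, dim=1)[:, 1]
print(torch.allclose(p_from_diff, p_from_softmax))  # True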
Also, here’s an example of out from the transformer when my batch size is 16:
transformer out: tensor([[ 0.5873, -0.5521],
[ 0.6407, -0.6954],
[ 0.1806, -0.3317],
[-0.1862, -0.1044],
[ 0.0688, -0.7443],
[-0.1022, -0.3273],
[ 0.3243, -0.5698],
[ 0.1828, -0.3642],
[ 0.0833, -1.0877],
[ 0.0405, -0.1679],
[ 0.2729, -0.3107],
[ 0.2521, -0.7700],
[ 0.3601, -0.4803],
[-0.0508, -0.4775],
[ 0.2773, -0.6211],
[ 0.1521, -0.6477]], device='cuda:0', grad_fn=<AddmmBackward0>)
labels: tensor([1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1], device='cuda:0',
dtype=torch.int32)
pred labels: tensor([0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0')
loss: tensor(0.3672, device='cuda:0', grad_fn=<MeanBackward0>)
epoch is 0
train accuracy: 0.19
train micro precision: 0.19
train micro recall: 0.19
train micro F1-score: 0.19
train macro precision: 0.59
train macro recall: 0.51
train macro F1-score: 0.17
As you can see, in the train phase not all train_preds are stuck at zero or one, but in the validation phase everything is stuck at 1 using the weighted BCEWithLogitsLoss.
val epoch preds: [tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1), tensor(1)]
evaluator.get_scores 0.8088235294117647
Here’s an example of the transformer out from the evaluation phase:
transformer out: tensor([[-0.1766, 1.3507],
[-0.1280, 1.2671],
[ 0.0400, 1.4123],
[-0.1593, 1.4637],
[-0.2360, 1.3756],
[-0.2181, 1.3562],
[-0.1042, 1.3980],
[-0.0483, 1.4103],
[-0.2289, 1.2945],
[-0.0376, 1.4060],
[-0.2179, 1.2876],
[-0.1700, 1.3776],
[ 0.1045, 1.4502],
[-0.1199, 1.3978],
[-0.1731, 1.3738],
[-0.1940, 1.2998]], device='cuda:0')
labels: tensor([1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1], device='cuda:0',
dtype=torch.int32)
pred labels: tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], device='cuda:0')
loss: tensor(0.2717, device='cuda:0')
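For reference, pred_labels is computed with out.data.max(1)[1], which as far as I can tell is the same as thresholding out[:,1] - out[:,0] at 0; with the validation outputs above (column 1 around 1.3-1.5, column 0 near 0), every row therefore comes out as class 1:

import torch

# first three rows of the validation out shown above
out = torch.tensor([[-0.1766, 1.3507],
                    [-0.1280, 1.2671],
                    [ 0.0400, 1.4123]])

pred_from_argmax = out.max(1)[1]                     # what forward() returns
pred_from_diff = (out[:, 1] - out[:, 0] > 0).long()  # thresholding the single logit at 0
print(pred_from_argmax, pred_from_diff)              # both: tensor([1, 1, 1])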
Do you know what could be fixed?