Low mAP in multi-label classification

Hi, everyone.
I have a problem with multi-label classification.
I have 20 classes. Each picture may contain several categories at the same time. For example, the label [0,1,1,0…] means that the picture contains both a dog and a cat.
However, when I evaluate my model with mAP, I get a very low score, and I can't figure out what is wrong with my code.

Here is my code:
The loss function is nn.MultiLabelSoftMarginLoss().
After computing logits = classifier(images.to(device)), I apply a sigmoid and threshold the result.
Can anyone help me understand why I get such low accuracy?

classifier = resnet101().to(device)
optimizer = torch.optim.Adam(classifier.parameters(), lr=args.lr, weight_decay=5e-4)
criterion = nn.MultiLabelSoftMarginLoss()
for epoch in range(1, NUM_EPOCHS+1):
    print("Starting epoch number " + str(epoch))
    train_loss = train_classifier(train_loader, classifier, criterion, optimizer)
    print("Loss for Training on Epoch " + str(epoch) + " is " + str(train_loss) + " lr:" + str(args.lr))

def train_classifier(train_loader, classifier, criterion, optimizer):
    train_loss = 0
    correct = 0
    total = 0
    losses = []
    y_true = np.zeros((0,20))
    y_score = np.zeros((0,20))
    for i, (images, labels, _) in enumerate(train_loader,0):
        images, labels = images.to(device), labels.to(device)
        #zero the parameter gradients
        logits = classifier(images.to(device))
        predict = nn.functional.sigmoid(logits)
        zero = torch.zeros_like(logits)
        one = torch.ones_like(logits)
        predict = torch.where(logits >= 0.5, one, predict)
        predict = torch.where(logits < 0.5, zero, predict)
        y_true = np.concatenate((y_true, labels.cpu().detach().numpy()), axis=0)
        y_score = np.concatenate((y_score, predict.cpu().detach().numpy()), axis=0)
        loss = criterion(predict, labels)
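
One thing that may be worth checking in the loop above: nn.MultiLabelSoftMarginLoss applies the sigmoid internally, so it expects raw logits, and a loss computed on hard 0/1 values produced by torch.where passes no useful gradient back to the network. Below is a minimal sketch of a single training step under those assumptions, not a drop-in fix. The helper name train_step, the small linear model, and the random batch are hypothetical stand-ins for resnet101 and train_loader.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 20  # from the post: 20 categories


def train_step(classifier, criterion, optimizer, images, labels):
    """Run one optimisation step; return the loss value and per-class scores."""
    optimizer.zero_grad()                 # zero the parameter gradients
    logits = classifier(images)           # raw scores, shape (batch, NUM_CLASSES)
    loss = criterion(logits, labels)      # loss on logits, not on thresholded values
    loss.backward()
    optimizer.step()
    # Continuous probabilities, kept for ranking metrics such as AP.
    scores = torch.sigmoid(logits).detach().cpu().numpy()
    return loss.item(), scores


if __name__ == "__main__":
    torch.manual_seed(0)
    # Hypothetical stand-ins for resnet101 and one batch from train_loader.
    classifier = nn.Linear(32, NUM_CLASSES)
    criterion = nn.MultiLabelSoftMarginLoss()
    optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3, weight_decay=5e-4)
    images = torch.randn(8, 32)
    labels = torch.randint(0, 2, (8, NUM_CLASSES)).float()
    loss_value, scores = train_step(classifier, criterion, optimizer, images, labels)
    print(loss_value, scores.shape)
```

The sigmoid outputs are returned unthresholded so that they can be accumulated into y_score; a 0.5 cut-off, if needed at all, would then be applied to these probabilities rather than to the logits.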

The output is:
------- Class: aeroplane AP: 0.0452 -------
------- Class: bicycle AP: 0.0488 -------

mAP: 0.0806
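
Average precision is a ranking metric, so it is normally computed from continuous scores (for example sigmoid outputs) rather than from thresholded 0/1 predictions. The sketch below is a plain NumPy illustration of per-class AP and mAP under that assumption. The function names average_precision and mean_average_precision and the toy arrays are hypothetical, and the evaluation code behind the numbers above (not shown in the post) may use a different AP definition, such as an interpolated variant.

```python
import numpy as np


def average_precision(y_true, y_score):
    """Non-interpolated AP for one class: mean precision at each positive, ranked by score."""
    order = np.argsort(-y_score, kind="stable")   # highest score first
    hits = y_true[order]
    n_pos = hits.sum()
    if n_pos == 0:
        return float("nan")                       # AP is undefined without positives
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_k * hits).sum() / n_pos)


def mean_average_precision(y_true, y_score):
    """Average the per-class APs over the columns, skipping classes without positives."""
    aps = [average_precision(y_true[:, c], y_score[:, c]) for c in range(y_true.shape[1])]
    return float(np.nanmean(aps))


# Toy data: 4 images, 2 classes (multi-hot labels as in the post).
y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]], dtype=float)
y_prob = np.array([[0.9, 0.2], [0.3, 0.8], [0.6, 0.7], [0.1, 0.4]])
print(mean_average_precision(y_true, y_prob))
```

With hard 0/1 scores, many images tie within each class and their relative order becomes arbitrary, which is one way AP can end up misleadingly low.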