Hello everyone,
I am working on semantic segmentation, where I end up comparing very large tensors of labels against the predictions (each containing more than 5.6 million pixels). At the moment, I convert the labels and predictions into Python lists and use sklearn to calculate the Matthews correlation coefficient (MCC), but this is extremely slow. PyTorch runs the epochs quickly, but everything slows down as soon as I call sklearn. I would really appreciate any suggestions for speeding this up.
Here is the relevant portion of my code:
import torch
from sklearn.metrics import matthews_corrcoef

y_true = []
y_pred = []
with torch.no_grad():
    for data in data_loader:
        images, labels = data
        images = images.to(device)
        # labels are still on the CPU here, so no device-to-host copy is needed
        y_true = y_true + torch.flatten(labels).tolist()
        labels = labels.to(device)
        outputs = net(images)
        # index of the highest-scoring class for each pixel
        _, predicted = torch.max(outputs.data, 1)
        predicted = predicted.to('cpu')
        y_pred = y_pred + torch.flatten(predicted).tolist()

mcc = matthews_corrcoef(y_true, y_pred)  # The line that makes the whole code slow
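One direction that might avoid the Python lists entirely is to accumulate a confusion matrix as a tensor for each batch and compute MCC from it once at the end. The following is only a rough sketch under assumptions, not a tested replacement: `update_confusion`, `mcc_from_confusion` and `num_classes` are hypothetical names, every label is assumed to be an integer in `[0, num_classes)` (an ignore index such as 255 would have to be masked out first), and `labels` and `predicted` are assumed to sit on the same device as the matrix.

```python
import torch

def update_confusion(conf, labels, predicted, num_classes):
    """Add one batch to a (num_classes x num_classes) confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    # Encode each (true, predicted) pair as one integer, then count occurrences.
    idx = labels.flatten().long() * num_classes + predicted.flatten().long()
    counts = torch.bincount(idx, minlength=num_classes ** 2)
    conf += counts.reshape(num_classes, num_classes)
    return conf

def mcc_from_confusion(conf):
    """Multiclass Matthews correlation coefficient from a confusion matrix."""
    c = conf.double()
    t = c.sum(dim=1)      # how often each class truly occurs
    p = c.sum(dim=0)      # how often each class was predicted
    n = c.sum()           # total number of pixels
    correct = c.trace()   # correctly classified pixels
    cov_tp = correct * n - (t * p).sum()
    cov_tt = n * n - (t * t).sum()
    cov_pp = n * n - (p * p).sum()
    denom = torch.sqrt(cov_tt * cov_pp)
    # Return 0.0 when the coefficient is undefined (zero denominator).
    return (cov_tp / denom).item() if denom > 0 else 0.0
```

In the loop above, this would mean creating `conf = torch.zeros(num_classes, num_classes, dtype=torch.long, device=device)` before iterating, calling `update_confusion(conf, labels, predicted, num_classes)` after `labels` has been moved to `device`, and calling `mcc_from_confusion(conf)` once after the loop, so that no per-pixel data is copied into Python lists.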
Any help regarding this will be highly appreciated.