Predictions from DataLoader: RuntimeError if using cpu(), TypeError if using cuda()

I’m learning PyTorch basics. I have successfully trained a ResNet50-based model in a Jupyter Notebook using cuda. It achieves around 85% accuracy, and I’d like to explore whether there are patterns in which classes it struggles with. To obtain the predictions and labels so that I can build a confusion matrix and begin that exploration, I adapted code from a Microsoft Azure tutorial:

#Pytorch doesn't have a built-in confusion matrix metric, so we'll use SciKit-Learn
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
%matplotlib inline

# Set the model to evaluate mode
model.eval()

# Get predictions for the test data and convert to numpy arrays for use with SciKit-Learn
print("Getting predictions from test set...")
truelabels = []
predictions = []
#probabilities = []
for data, target in test_loader:
    for label in target.cpu().data.numpy():
        truelabels.append(label)
    for prediction in model.cpu()(data).data.numpy().argmax(1):
        predictions.append(prediction)

The problem I’m encountering is that when I use target.cpu() and model.cpu() in the above code, it starts running very slowly then throws the following error:

RuntimeError: [enforce fail at …\c10\core\CPUAllocator.cpp:79] data. DefaultCPUAllocator: not enough memory: you tried to allocate 62914560 bytes.

Or it crashes the Google Chrome tab that the Jupyter Notebook is open in.

I trained the model using cuda, but if I switch to target.cuda() and model.cuda() it throws the following error:

TypeError: can’t convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Since I adapted this from a tutorial that used a much smaller dataset as a proof of concept, I’m not sure if this is even a proper way to explore the results. Can someone please advise me on best practice in this regard, or recommend a workaround for the issues I’m encountering?

Update: I used the new pre-trained vit-b-16 model and did not get the RuntimeError from the posted code snippet with that model. I’m still interested in using cuda() if possible to speed up the process, and in hearing whether there is a better way to accomplish my objective.

The RuntimeError from the CPUAllocator is raised when you run out of host RAM; you would need to reduce memory usage, e.g. by decreasing the batch size of the DataLoader.

The TypeError is raised because you are trying to convert a CUDA tensor to a numpy array, which doesn’t work without pushing the tensor to the CPU first, as described in the error message.
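For example, a minimal sketch (the availability check is only there so it also runs on a CPU-only machine):

```python
import torch

t = torch.ones(3)
if torch.cuda.is_available():
    t = t.cuda()       # on a CUDA tensor, t.numpy() raises the TypeError above
arr = t.cpu().numpy()  # .cpu() is a no-op if the tensor is already on the host
print(arr)             # [1. 1. 1.]
```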

I tried pushing the tensor to the CPU and it didn’t work at first, but then I found that, for whatever reason, the inputs from the loader and the model weights had different tensor types (i.e. they were on different devices). After explicitly sending the data to cuda before prediction and then explicitly pushing the predictions back to the cpu, it now works as expected. Thanks for the help.
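Roughly, the device-aware version of the loop looks like this (with a tiny stand-in model and loader in place of my real ResNet50 and test_loader, and `torch.no_grad()` added to keep memory flat during inference):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from sklearn.metrics import confusion_matrix

# Stand-ins for illustration only -- swap in your trained model and real test_loader
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(10, 3).to(device)
test_data = TensorDataset(torch.randn(32, 10), torch.randint(0, 3, (32,)))
test_loader = DataLoader(test_data, batch_size=8)

model.eval()                      # evaluation mode (fixes dropout/batchnorm behavior)
truelabels = []
predictions = []
with torch.no_grad():             # skip autograd bookkeeping during inference
    for data, target in test_loader:
        data = data.to(device)    # move the batch to the model's device
        output = model(data)
        predictions.extend(output.argmax(1).cpu().numpy())  # back to host for numpy
        truelabels.extend(target.numpy())                   # targets stayed on the CPU

cm = confusion_matrix(truelabels, predictions)
print(cm.sum())                   # total number of test samples
```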