How to fix: 'can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.'

Hi guys,

I’m trying to make a plot and I get this error. Here’s a summary of my code:

### Visualize

x_arr = np.arange(len(model_sum[0])) + 1
fig = plt.figure(figsize=(12, 4))
ax = fig.add_subplot(1, 2, 1)
ax.plot(x_arr, model_sum[0], '-o', label='Train Loss')
ax.plot(x_arr, model_sum[1], '--<', label='Validation Loss')
ax.legend(fontsize=15)
ax = fig.add_subplot(1, 2, 2)
ax.plot(x_arr, model_sum[2], '-o', label='Train Acc.').unsqueeze(0).cuda().cpu()
ax.plot(x_arr, model_sum[3], '-<', label='Validation Acc.')
ax.legend(fontsize=15)
ax.set_xlabel('Epoch', size=15)
ax.set_ylabel('Accuracy', size=15)
plt.show()

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [49], in <cell line: 11>()
      9 ax.legend(fontsize=15)
     10 ax = fig.add_subplot(1, 2, 2)
---> 11 ax.plot(x_arr, model_sum[2], '-o', label='Train Acc.').unsqueeze(0).cuda().cpu()
     12 ax.plot(x_arr, model_sum[3], '-<', label='Validation Acc.')
     13 ax.legend(fontsize=15)

File ~\anaconda3\envs\myai\lib\site-packages\matplotlib\axes\_axes.py:1635, in Axes.plot(self, scalex, scaley, data, *args, **kwargs)
   1393 """
   1394 Plot y versus x as lines and/or markers.
   1395 
   (...)
   1632 (``'green'``) or hex strings (``'#008000'``).
   1633 """
   1634 kwargs = cbook.normalize_kwargs(kwargs, mlines.Line2D)
-> 1635 lines = [*self._get_lines(*args, data=data, **kwargs)]
   1636 for line in lines:
   1637     self.add_line(line)

File ~\anaconda3\envs\myai\lib\site-packages\matplotlib\axes\_base.py:312, in _process_plot_var_args.__call__(self, data, *args, **kwargs)
    310     this += args[0],
    311     args = args[1:]
--> 312 yield from self._plot_args(this, kwargs)

File ~\anaconda3\envs\myai\lib\site-packages\matplotlib\axes\_base.py:488, in _process_plot_var_args._plot_args(self, tup, kwargs, return_kwargs)
    486 if len(xy) == 2:
    487     x = _check_1d(xy[0])
--> 488     y = _check_1d(xy[1])
    489 else:
    490     x, y = index_of(xy[-1])

File ~\anaconda3\envs\myai\lib\site-packages\matplotlib\cbook\__init__.py:1311, in _check_1d(x)
   1305 # plot requires `shape` and `ndim`.  If passed an
   1306 # object that doesn't provide them, then force to numpy array.
   1307 # Note this will strip unit information.
   1308 if (not hasattr(x, 'shape') or
   1309         not hasattr(x, 'ndim') or
   1310         len(x.shape) < 1):
-> 1311     return np.atleast_1d(x)
   1312 else:
   1313     return x

File <__array_function__ internals>:180, in atleast_1d(*args, **kwargs)

File ~\anaconda3\envs\myai\lib\site-packages\numpy\core\shape_base.py:65, in atleast_1d(*arys)
     63 res = []
     64 for ary in arys:
---> 65     ary = asanyarray(ary)
     66     if ary.ndim == 0:
     67         result = ary.reshape(1)

File ~\anaconda3\envs\myai\lib\site-packages\torch\_tensor.py:757, in Tensor.__array__(self, dtype)
    755     return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
    756 if dtype is None:
--> 757     return self.numpy()
    758 else:
    759     return self.numpy().astype(dtype, copy=False)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

I’ve tried different methods suggested by forum members, but none of them seem to work.

Thanks

Hi @demmzy419,

Could you try,

ax.plot(x_arr, model_sum[0].cpu().detach().numpy(), '-o', label='Train Loss')

When your data is on the GPU and you pass it to a function that performs a numpy operation, you first need to move the Tensor to the CPU, detach it from the autograd graph, and convert it to numpy via .cpu().detach().numpy(), as numpy is a CPU-only Python package.
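As a rough sketch of that conversion (the tensor `t` and the `device` fallback are illustrative names of my own, not taken from the code above, and the fallback is only there so the snippet also runs on a machine without a GPU):

```python
import numpy as np
import torch

# Assumption: use the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# A tensor that tracks gradients and lives on `device`
t = torch.tensor([0.5, 0.75, 1.0], device=device, requires_grad=True)

# .cpu() copies the data to host memory (it is a no-op if already there),
# .detach() drops the autograd graph, .numpy() returns a numpy array
arr = t.cpu().detach().numpy()

print(type(arr))
print(arr.shape)
```

Calling t.numpy() directly would fail here, either because the tensor is on the GPU or because it requires grad, which is why both extra calls are usually chained in.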

Thanks for your response. After changing I got an attribute error:

AttributeError: 'list' object has no attribute 'cpu'

So, model_sum[0] is a list, which you might need to unpack further via model_sum[0][0], but that depends on how model_sum is created. Can you share the code that creates model_sum?

In short, you just need to extract a 1-D array so that you can plot it via matplotlib.

My model_sum is the output of my training function. Here’s the code:

# Train the Model:

def train(model, num_epochs, train_dl, valid_dl):
    loss_hist_train = [0]*num_epochs
    accuracy_hist_train = [0]*num_epochs
    loss_hist_valid = [0]*num_epochs
    accuracy_hist_valid = [0]*num_epochs
    
    for epoch in range(num_epochs):
        model.train()
        for x_batch, y_batch in train_dl:
            x_batch = x_batch.to(device)
            pred = model(x_batch)
            y_batch = y_batch.to(device)
            loss = loss_func(pred, y_batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
            loss_hist_train[epoch] += loss.item()*y_batch.size(0)
            is_correct = (torch.argmax(pred, dim=1)==y_batch).float()
            accuracy_hist_train[epoch] += is_correct.sum()
        loss_hist_train[epoch] /= len(train_dl.dataset)
        accuracy_hist_train[epoch] /= len(train_dl.dataset)
        
        model.eval()
        
        with torch.no_grad():
            for x_batch, y_batch in valid_dl:
                x_batch = x_batch.to(device)
                pred = model(x_batch)
                y_batch = y_batch.to(device)
                loss = loss_func(pred, y_batch)
                loss_hist_valid[epoch] += loss.item()*y_batch.size(0)
                is_correct = (torch.argmax(pred, dim=1)==y_batch).float()
                accuracy_hist_valid[epoch] += is_correct.sum()
            loss_hist_valid[epoch] /= len(valid_dl.dataset)
            accuracy_hist_valid[epoch] /= len(valid_dl.dataset)
            
            print(f'Epoch {epoch+1} accuracy: '
                  f'{accuracy_hist_train[epoch]:.4f} val_accuracy: '
                  f'{accuracy_hist_valid[epoch]:.4f}')
    return loss_hist_train, loss_hist_valid, accuracy_hist_train, accuracy_hist_valid

then:

model_sum = train(model, num_epochs, train_dataset, valid_dataset)

Thanks

So, the issue isn’t with model_sum[0], it’s with model_sum[2]. Ignore the .cpu().detach().numpy() suggestion I made earlier and replace,

accuracy_hist_valid[epoch] += is_correct.sum()

with,

accuracy_hist_valid[epoch] += is_correct.sum().item()

is_correct.sum() is a Tensor on the GPU, but matplotlib needs a value on the CPU. You can get one via .item() because the value is a scalar (in general it’s .cpu().detach().numpy()). The stack trace shows this: the error is raised in Tensor.__array__, when matplotlib hands the value to numpy.

(Also remove the .unsqueeze(0).cuda().cpu(); that isn’t applicable to matplotlib.)
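To see why .item() matters when accumulating into the history lists, here is a small sketch. The toy `pred` and `y_batch` values and the two `running_*` variables are made up for illustration, and the device falls back to the CPU if no GPU is present:

```python
import torch

# Assumption: use the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Toy batch: 3 samples, 2 classes (values invented for this example)
pred = torch.tensor([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]], device=device)
y_batch = torch.tensor([1, 0, 0], device=device)

is_correct = (torch.argmax(pred, dim=1) == y_batch).float()

# Without .item(): adding a tensor to 0 turns the accumulator into a tensor on `device`
running_tensor = 0
running_tensor += is_correct.sum()

# With .item(): the accumulator stays a plain Python float on the host
running_float = 0
running_float += is_correct.sum().item()

print(type(running_tensor), type(running_float))
```

The first accumulator is what ends up in a list like accuracy_hist_train when .item() is missing, and it is what numpy later refuses to convert.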

I changed it but my error looked the same:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [56], in <cell line: 11>()
      9 ax.legend(fontsize=15)
     10 ax = fig.add_subplot(1, 2, 2)
---> 11 ax.plot(x_arr, model_sum[2], '-o', label='Train Loss')
     12 ax.plot(x_arr, model_sum[3], '-<', label='Validation Acc.')
     13 ax.legend(fontsize=15)

File ~\anaconda3\envs\myai\lib\site-packages\matplotlib\axes\_axes.py:1635, in Axes.plot(self, scalex, scaley, data, *args, **kwargs)
   1393 """
   1394 Plot y versus x as lines and/or markers.
   1395 
   (...)
   1632 (``'green'``) or hex strings (``'#008000'``).
   1633 """
   1634 kwargs = cbook.normalize_kwargs(kwargs, mlines.Line2D)
-> 1635 lines = [*self._get_lines(*args, data=data, **kwargs)]
   1636 for line in lines:
   1637     self.add_line(line)

File ~\anaconda3\envs\myai\lib\site-packages\matplotlib\axes\_base.py:312, in _process_plot_var_args.__call__(self, data, *args, **kwargs)
    310     this += args[0],
    311     args = args[1:]
--> 312 yield from self._plot_args(this, kwargs)

File ~\anaconda3\envs\myai\lib\site-packages\matplotlib\axes\_base.py:488, in _process_plot_var_args._plot_args(self, tup, kwargs, return_kwargs)
    486 if len(xy) == 2:
    487     x = _check_1d(xy[0])
--> 488     y = _check_1d(xy[1])
    489 else:
    490     x, y = index_of(xy[-1])

File ~\anaconda3\envs\myai\lib\site-packages\matplotlib\cbook\__init__.py:1311, in _check_1d(x)
   1305 # plot requires `shape` and `ndim`.  If passed an
   1306 # object that doesn't provide them, then force to numpy array.
   1307 # Note this will strip unit information.
   1308 if (not hasattr(x, 'shape') or
   1309         not hasattr(x, 'ndim') or
   1310         len(x.shape) < 1):
-> 1311     return np.atleast_1d(x)
   1312 else:
   1313     return x

File <__array_function__ internals>:180, in atleast_1d(*args, **kwargs)

File ~\anaconda3\envs\myai\lib\site-packages\numpy\core\shape_base.py:65, in atleast_1d(*arys)
     63 res = []
     64 for ary in arys:
---> 65     ary = asanyarray(ary)
     66     if ary.ndim == 0:
     67         result = ary.reshape(1)

File ~\anaconda3\envs\myai\lib\site-packages\torch\_tensor.py:757, in Tensor.__array__(self, dtype)
    755     return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
    756 if dtype is None:
--> 757     return self.numpy()
    758 else:
    759     return self.numpy().astype(dtype, copy=False)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Is x_arr stored on the GPU too?

Check that both x_arr and model_sum[2] are numpy arrays via print(type(x_arr)) (and same for model_sum[2])

x_arr outputted <class 'numpy.ndarray'> and model_sum[2] <class 'list'>

I did train on the GPU

Can you do the same for accuracy_hist_train too?

You currently have a list of Tensors on the GPU whereas matplotlib will require a list of scalars on the CPU.
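If you would rather convert an existing list than retrain, something like the following might work. The helper name `to_plottable` and the sample `history` list are hypothetical, not part of the code in this thread:

```python
import torch

# Assumption: use the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


def to_plottable(history):
    """Convert a list of 0-d tensors and/or plain numbers to Python floats."""
    # .item() copies a single-element tensor to the host and returns a Python number
    return [float(v.item()) if torch.is_tensor(v) else float(v) for v in history]


# Hypothetical history mixing tensors and a float, like the lists in this thread
history = [torch.tensor(0.5, device=device), 0.75, torch.tensor(1.0, device=device)]
plottable = to_plottable(history)
print(plottable)
```

You could then pass to_plottable(model_sum[2]) to ax.plot in place of model_sum[2].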

My accuracy_hist_train and accuracy_hist_valid are both <class 'list'>

Yes, but they are lists of GPU tensors; do print(accuracy_hist_train[0])

For example,

import torch

mylist = [torch.randn(1, device='cuda') for _ in range(10)]

print(type(mylist))      # prints <class 'list'>
print(type(mylist[0]))   # prints <class 'torch.Tensor'>
print(mylist[0].device)  # prints cuda:0 <-------- a problem with matplotlib
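Along the same lines, a small helper can report the type and device of every element, which makes a mixed list easy to spot. The name `describe` and the sample list are assumptions for illustration only:

```python
import torch

# Assumption: use the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


def describe(history):
    """Return (type name, device or None) for each element of a list."""
    return [
        (type(v).__name__, str(v.device) if torch.is_tensor(v) else None)
        for v in history
    ]


# Hypothetical list with one tensor and one plain float
report = describe([torch.tensor(1.0, device=device), 2.0])
print(report)
```

Any entry whose device starts with cuda still needs .item() or .cpu() before it goes to matplotlib.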

Ahhhh, accuracy_hist_train prints <class 'torch.Tensor'> whilst accuracy_hist_valid prints <class 'float'>

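So apply the same .item() change to the accuracy_hist_train line, i.e. accuracy_hist_train[epoch] += is_correct.sum().item(), then give this a go and see if it solves your problem.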

Thanks for the help, it worked now!

Cheers!