Tensor to numpy on pytorch 1.8

shuuny-matrix · March 22, 2021, 11:42am

Hi, I tried to train a YOLOv4 PyTorch model with reference to this repo:

Then I ran into an error like this:

Traceback (most recent call last):
  File "train.py", line 438, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 301, in train
    results, maps, times = test.test(opt.data,
  File "/home/kamal/Documents/PyTorch_YOLOv4/test.py", line 206, in test
    plot_images(img, output_to_target(output, width, height), paths, str(f), names)  # predictions
  File "/home/kamal/Documents/PyTorch_YOLOv4/utils/general.py", line 907, in output_to_target
    return np.array(targets)
  File "/home/kamal/.virtualenvs/pytorch-yolov4/lib/python3.8/site-packages/torch/tensor.py", line 621, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

My PyTorch version is 1.8.0 and CUDA is 11.2.
Can somebody look into this?

Dwight_Foster · March 22, 2021, 1:06pm

In test.py just change this line

    plot_images(img, output_to_target(output, width, height), paths, str(f), names)  # predictions

to this

    plot_images(img, output_to_target(output.cpu(), width, height), paths, str(f), names)  # predictions

Alexey_Demyanchuk · March 22, 2021, 1:25pm

They actually do the conversion inside the output_to_target function when the output argument is a tensor. A CUDA tensor is a torch.Tensor as well, so this part of the code should move it to the CPU and convert it to NumPy. Are you sure you are using the latest version of their GitHub repo?

def output_to_target(output, width, height):
    # Convert model output to target format [batch_id, class_id, x, y, w, h, conf]
    if isinstance(output, torch.Tensor):
        output = output.cpu().numpy()

shuuny-matrix · March 22, 2021, 10:19pm

Actually, this was the first thing I tried, but it results in an error like this:

Starting training for 300 epochs...

     Epoch   gpu_mem      GIoU       obj       cls     total   targets  img_size
     0/299     8.92G   0.03446    0.1387         0    0.1732         2      1280
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95:   0%|                                                                                 | 0/124 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 438, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 301, in train
    results, maps, times = test.test(opt.data,
  File "/home/kamal/Documents/PyTorch_YOLOv4/test.py", line 206, in test
    plot_images(img, output_to_target(output.cpu(), width, height), paths, str(f), names)  # predictions
AttributeError: 'list' object has no attribute 'cpu'

It starts training, but when it calls the test function, it fails to load the tensor.

shuuny-matrix · March 22, 2021, 10:21pm

Yes, I am using the latest version of their repo, and yes, they have this output_to_target function, but somehow it fails. It starts training but fails as soon as it calls the test function.

Alexey_Demyanchuk · March 23, 2021, 7:45am

Based on both of the errors you provided, the check if isinstance(output, torch.Tensor): doesn't work because output has type list. I don't know how this is supposed to work, but the output coming from the non_max_suppression function in test.py is a list. If you check the non_max_suppression function, its return value is also supposed to be a list, and they never convert it to a tensor.

You should probably dig deeper into the code and rewrite it a bit while preserving the logic. You may want to move the outputs of the model to the CPU before the non_max_suppression call in test.py, or do the same in output_to_target by iterating over the output variable.
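
For the second option, a minimal sketch might look like the following. It is untested against the repo itself, and detections_to_cpu is a hypothetical helper name, not something the repo provides. It assumes the result of non_max_suppression is a list with one entry per image, each entry being either a tensor or None when there are no detections. The helper is duck-typed (anything with a .cpu() method counts as a tensor), so it does not need to import torch.

```python
def detections_to_cpu(output):
    """Return `output` with every tensor-like entry copied to host memory.

    Hypothetical helper. Duck-typed on purpose: anything that has a
    .cpu() method is treated as a tensor.
    """
    if hasattr(output, "cpu"):
        # A single tensor rather than a list of per-image detections.
        return output.cpu()
    # Assumed structure: one entry per image, either a tensor or None.
    # Entries without a .cpu() method (such as None) pass through unchanged.
    return [det.cpu() if hasattr(det, "cpu") else det for det in output]
```

At the failing call in test.py this would be used as plot_images(img, output_to_target(detections_to_cpu(output), width, height), paths, str(f), names). Converting only at the plotting call, rather than right after non_max_suppression, leaves the rest of the evaluation code working on the original device.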