torch.from_numpy(im.copy()).to(device) takes time

Hi,

I’ve been using the YOLOv5s model for several years.
I converted the model to a TensorRT engine to increase detection speed,
and it works: the inference time on the GPU was halved.
However, converting an image array im (640×640×3) for the GPU with the line

im = torch.from_numpy(im.copy()).to(device)

before feeding the input to the model takes 5-6 milliseconds.
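For reference, this conversion can be timed in isolation with a sketch like the one below (an illustration, not the actual benchmark code; `torch.cuda.synchronize()` is needed because `.to(device)` can return before the copy has actually finished):

```python
import time

import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
im = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)

# Warm-up so CUDA context creation and allocator start-up are not counted.
for _ in range(10):
    torch.from_numpy(im.copy()).to(device)

if device.type == "cuda":
    torch.cuda.synchronize()
start = time.perf_counter()
n = 200
for _ in range(n):
    t = torch.from_numpy(im.copy()).to(device)
if device.type == "cuda":
    torch.cuda.synchronize()
print(f"avg conversion time: {(time.perf_counter() - start) / n * 1000:.3f} ms")
```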

Is this normal for a PC with the following specs?

CPU: 11th Gen Intel(R) Core™ i7-1185G7 @ 3.00 GHz
RAM: 16.0 GB
OS: Win10 + Anaconda3
Python 3.9.18
PyTorch 1.11.0
cudatoolkit 11.3.1
eGPU: RTX 3060 Ti in a Razer Core X

Is there a way to shorten this pre-processing time?

Thanks in advance.

Hi @swharaday,

Another way to do this would be to initialize the tensor on the GPU directly (rather than creating it on the CPU and then moving it to the GPU). You could try something like:

im = torch.as_tensor(im, device=device)

Is there a reason why you copy the im array before moving it to the GPU?
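If the copy itself is the bottleneck, another option worth trying (a sketch, assuming a CUDA device; not guaranteed to help over an eGPU link) is to stage the frame in a pre-allocated pinned (page-locked) host buffer, which allows faster and asynchronous host-to-device transfers:

```python
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
im = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)

# Allocate a pinned (page-locked) staging buffer once, outside the per-frame loop.
staging = torch.empty(im.shape, dtype=torch.uint8)
if device.type == "cuda":
    staging = staging.pin_memory()

# Per frame: copy into the pinned buffer, then transfer asynchronously.
staging.copy_(torch.from_numpy(im))
gpu_im = staging.to(device, non_blocking=True)
```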


Hi AlphaBetaGamma96,

Thank you for your reply.
I’ll try what you suggest and let you know the results.

There is no particular reason for the copy; I was just following the author’s example.

Best Regards, Yuji

Hi AlphaBetaGamma96,

Just tried it and got the error below; I’ll try to find the reason.

im = torch.as_tensor(im, device=model.device)

ValueError: At least one stride in the given numpy array is negative, and tensors with negative strides are not currently supported. (You can probably work around this by making a copy of your array with array.copy().)
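A likely cause (an assumption, since the preprocessing code isn’t shown here): YOLOv5’s preprocessing reverses an axis to convert BGR to RGB (e.g. `im[..., ::-1]`), and NumPy implements that reversal as a view with a negative stride, which torch.as_tensor cannot wrap without a copy:

```python
import numpy as np
import torch

im_bgr = np.zeros((640, 640, 3), dtype=np.uint8)
im_rgb = im_bgr[..., ::-1]          # BGR -> RGB: a view with a negative stride
assert im_rgb.strides[2] < 0

try:
    torch.as_tensor(im_rgb)         # raises ValueError: negative strides unsupported
except ValueError:
    pass

t = torch.as_tensor(im_rgb.copy())  # copy() yields a contiguous, positive-stride array
```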

Best Regards, Yuji

Try,

im = torch.as_tensor(im.copy(), device=model.device)

Hi AlphaBetaGamma96,

Thank you for your response.

Here are the results of a trial with 2,000 iterations:

  • “from_numpy”: avg. 5.509 ms, stdev. 0.797
  • “as_tensor”: avg. 5.562 ms, stdev. 0.762

Unfortunately, there seems to be no significant difference.
If you have any more ideas, please advise.

Best Regards, Yuji