torch.from_numpy(im.copy()).to(device) takes time

Hi,

I’ve been using the YOLOv5s model for several years.
I converted the model to a TensorRT engine to increase detection speed,
and it works: the inference time on the GPU was halved.
However, converting an image array im (640×640×3) for the GPU with the line

im = torch.from_numpy(im.copy()).to(device)

before feeding the input to the model takes 5-6 milliseconds.
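For reference, this conversion can be timed in isolation with a sketch like the one below (an illustration, not the actual benchmark code; `torch.cuda.synchronize()` is needed because `.to(device)` can return before the copy has actually finished):

```python
import time

import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
im = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)

# Warm-up so CUDA context creation and allocator start-up are not counted.
for _ in range(10):
    torch.from_numpy(im.copy()).to(device)

if device.type == "cuda":
    torch.cuda.synchronize()
start = time.perf_counter()
n = 200
for _ in range(n):
    t = torch.from_numpy(im.copy()).to(device)
if device.type == "cuda":
    torch.cuda.synchronize()
print(f"avg conversion time: {(time.perf_counter() - start) / n * 1000:.3f} ms")
```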

Is this normal for a PC with the following specs?

CPU: 11th Gen Intel(R) Core™ i7-1185G7 @ 3.00 GHz
RAM: 16.0 GB
OS: Win10 + Anaconda3
Python 3.9.18
PyTorch 1.11.0
cudatoolkit 11.3.1
eGPU: RTX 3060 Ti in a Razer Core X

Is there a way to shorten this pre-processing time?

Thanks in advance.

Hi @swharaday,

Another way to do this would be to initialize the tensor on the GPU directly (rather than creating it on the CPU and then moving it to the GPU). You could try something like:

im = torch.as_tensor(im, device=device)

Is there a reason why you copy the im array before moving it to the GPU?
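If the copy itself is the bottleneck, another option worth trying (a sketch, assuming a CUDA device; not guaranteed to help over an eGPU link) is to stage the frame in a pre-allocated pinned (page-locked) host buffer, which allows faster and asynchronous host-to-device transfers:

```python
import numpy as np
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
im = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)

# Allocate a pinned (page-locked) staging buffer once, outside the per-frame loop.
staging = torch.empty(im.shape, dtype=torch.uint8)
if device.type == "cuda":
    staging = staging.pin_memory()

# Per frame: copy into the pinned buffer, then transfer asynchronously.
staging.copy_(torch.from_numpy(im))
gpu_im = staging.to(device, non_blocking=True)
```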


Hi AlphaBetaGamma96,

Thank you for your reply.
I’ll try what you suggest and let you know the results.

There is no particular reason for the copy; I was just following the author’s example.

Best Regards, Yuji

Hi AlphaBetaGamma96,

Just tried it and got the error below; I’ll try to find the reason.

im = torch.as_tensor(im, device=model.device)

ValueError: At least one stride in the given numpy array is negative, and tensors with negative strides are not currently supported. (You can probably work around this by making a copy of your array with array.copy().)
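A likely cause (an assumption, since the preprocessing code isn’t shown here): YOLOv5’s preprocessing reverses an axis to convert BGR to RGB (e.g. `im[..., ::-1]`), and NumPy implements that reversal as a view with a negative stride, which torch.as_tensor cannot wrap without a copy:

```python
import numpy as np
import torch

im_bgr = np.zeros((640, 640, 3), dtype=np.uint8)
im_rgb = im_bgr[..., ::-1]          # BGR -> RGB: a view with a negative stride
assert im_rgb.strides[2] < 0

try:
    torch.as_tensor(im_rgb)         # raises ValueError: negative strides unsupported
except ValueError:
    pass

t = torch.as_tensor(im_rgb.copy())  # copy() yields a contiguous, positive-stride array
```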

Best Regards, Yuji

Try,

im = torch.as_tensor(im.copy(), device=model.device)

Hi AlphaBetaGamma96,

Thank you for your response.

Here are the results of a trial with 2,000 iterations:

  • “from_numpy”: avg. 5.509 ms, stdev. 0.797
  • “as_tensor”: avg. 5.562 ms, stdev. 0.762

Unfortunately, there seems to be no significant difference.
If you have any more ideas, please advise.

Best Regards, Yuji