Hi all. I am trying to copy two of my models onto the GPU simultaneously using threads. When I transfer the models one at a time, each takes around 30 ms. However, when I transfer them in parallel, each takes approximately 60-65 ms. Why is this, and how can I solve it? Please help! I have also tried multiprocessing, but even that didn't help.
Have you checked the theoretical lower-bound time, based on the size of the data and the bandwidth available between your main memory and the GPU's on-card memory?
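A rough way to do that check is sketched below. The model size (100M float32 parameters) and the 16 GB/s link bandwidth are made-up numbers for illustration, not values from your setup, and `transfer_lower_bound_ms` is just a hypothetical helper name. It only computes bytes divided by bandwidth, so it ignores latency and driver overhead.

```python
def transfer_lower_bound_ms(num_bytes: int, bandwidth_gb_per_s: float) -> float:
    """Best-case time in milliseconds to move num_bytes over a link.

    bandwidth_gb_per_s is in GB/s, with 1 GB taken as 1e9 bytes.
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    seconds = num_bytes / (bandwidth_gb_per_s * 1e9)
    return seconds * 1e3


# Assumed values, for illustration only.
model_bytes = 100_000_000 * 4   # 100M float32 parameters, 4 bytes each
link_bandwidth = 16.0           # GB/s, hypothetical host-to-GPU bandwidth

# Best case for copying one model on its own.
single_ms = transfer_lower_bound_ms(model_bytes, link_bandwidth)

# Best case for moving both models when the copies share the same link:
# the total number of bytes doubles while the bandwidth stays the same.
both_ms = transfer_lower_bound_ms(2 * model_bytes, link_bandwidth)

print(f"one model:  {single_ms:.1f} ms")
print(f"two models: {both_ms:.1f} ms")
```

Plugging in your real model sizes and the measured bandwidth of your system gives the figure to compare against both the 30 ms and the 60-65 ms timings.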
Yes… Everything looks fine except when I do the parallel copy using threads.