I am running my code in PyTorch on the GPU, which produces a matrix that is then fed to MATLAB code where the rest of the logic is implemented.
The code works correctly, but one epoch takes 10 to 15 minutes even with 2 GPUs. The probable reason is the data transfer: the PyTorch CUDA tensor is copied to a MATLAB variable, the MATLAB code runs on the CPU, and the result is copied back into a PyTorch tensor.
I am curious whether there is any way to use the PyTorch CUDA tensor directly in the MATLAB code, i.e., force MATLAB to run on the GPU with that tensor without round-tripping through the CPU.
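For reference, here is a minimal sketch of how the device-to-host copy alone could be timed, to confirm how much of the epoch time the transfer accounts for (the 4096×4096 shape is a placeholder; substitute the actual matrix size):

```python
import time
import torch

# Placeholder shape; replace with the size of the matrix fed to MATLAB.
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(4096, 4096, device=device)

# Wait for pending GPU kernels so the timing measures only the copy.
if device == "cuda":
    torch.cuda.synchronize()

start = time.perf_counter()
x_np = x.cpu().numpy()  # device-to-host copy, then a zero-copy NumPy view
elapsed = time.perf_counter() - start

print(f"transfer of {x_np.nbytes / 1e6:.1f} MB took {elapsed * 1000:.2f} ms")
```

If this per-batch copy (plus the reverse `torch.from_numpy(...).to(device)` step) is small relative to the epoch time, the bottleneck is more likely the MATLAB computation itself running on the CPU.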
Any help is highly appreciated.