Hello, PyTorch users!
I am implementing multi-agent reinforcement learning and have finished testing it,
and now I am converting my CPU tensors to CUDA tensors myself.
However, it became slower than before, and I don't know what the problem is. It runs without errors, just very slowly.
I believe it is running on the GPU as I intended, but I suspect that copying data between CPU tensors and CUDA tensors costs more time than the GPU saves. Is there any way to get better performance?
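
To illustrate the pattern I suspect, here is a minimal sketch. It is not my actual training code: the function names, tensor sizes, and step counts are made up for illustration. The first function creates a tensor on the CPU every step, copies it to the device, and calls `.item()` each time, which forces a synchronization and a copy back to the host. The second keeps everything on the device and reads the result back only once.

```python
import time
import torch

# Falls back to the CPU when no GPU is present, so the sketch still runs.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def run_with_per_step_transfer(n_steps=100, size=256):
    """Hypothetical slow pattern: host-to-device and device-to-host copies every step."""
    w = torch.randn(size, size, device=device)
    total = 0.0
    start = time.perf_counter()
    for _ in range(n_steps):
        x = torch.randn(size, size)      # created on the CPU
        y = (x.to(device) @ w).sum()     # copied to the device each step
        total += y.item()                # forces a sync and a copy back to the host
    return time.perf_counter() - start, total


def run_on_device(n_steps=100, size=256):
    """Hypothetical faster pattern: data stays on the device, one read-back at the end."""
    w = torch.randn(size, size, device=device)
    total = torch.zeros((), device=device)
    start = time.perf_counter()
    for _ in range(n_steps):
        x = torch.randn(size, size, device=device)  # created directly on the device
        total += (x @ w).sum()                      # accumulates on the device
    if device.type == "cuda":
        torch.cuda.synchronize()         # wait for queued GPU work before timing
    result = total.item()                # single copy back to the host
    return time.perf_counter() - start, result


if __name__ == "__main__":
    slow_time, _ = run_with_per_step_transfer()
    fast_time, _ = run_on_device()
    print(f"per-step transfer: {slow_time:.4f}s, on-device: {fast_time:.4f}s")
```

If my real loop looks like the first function (for example, calling `.cpu()`, `.numpy()`, or `.item()` on every environment step), that might explain the slowdown, especially since the tensors in my setup may be too small for the GPU to pay off.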