Hi Folks,
I'm having a strange issue that I've already spent two days on :). I guess it's related to the example given in the docs on sharing CUDA tensors:
```python
x = queue.get()
x_clone = x.clone()
queue_2.put(x_clone)
```
So if tensor x is on the GPU, as far as I understand, you need to clone it.
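For context, here is the complete pattern I'm referring to, as a minimal sketch (the `producer`/`consumer` split and the spawn start method are my assumptions, not from the docs):

```python
import torch
import torch.multiprocessing as mp

def consumer(queue):
    x = queue.get()
    x_clone = x.clone()  # clone as soon as possible, per the docs
    print(x_clone.sum())

if __name__ == "__main__":
    mp.set_start_method("spawn")  # needed when CUDA is involved
    queue = mp.Queue()
    p = mp.Process(target=consumer, args=(queue,))
    p.start()
    t = torch.ones(4, device="cuda")
    queue.put(t)
    p.join()  # keep t alive until the consumer is done with it
```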
First question: if I have an object A which holds references to four tensors, what is the right way to clone the entire object with all its tensors? I.e., do I need to first move them to the CPU, put the object on the queue, and then move them back to the GPU? I ask because if I detach().clone() while the tensors are on the GPU, the received data is always zeros.
```python
class A:
    def __init__(self):
        self._observations = None  # tensor
        self._actions = None       # tensor

    def clone(self, dst):
        # detach from the graph, copy, and move to the CPU before queueing
        dst._observations = self._observations.detach().clone().cpu()
        dst._actions = self._actions.detach().clone().cpu()
        return dst
```
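On the consumer side I would then move everything back, something like this (a sketch; the device name is hard-coded for illustration):

```python
# consumer: read the CPU copies and move them back to the GPU
a = queue.get()
a._observations = a._observations.to("cuda")
a._actions = a._actions.to("cuda")
```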
This is the variant I use:
```python
with network_lock:
    x = A()
    network.forward(x)
    new_x = A()
    # new_x is a fresh object; x.clone(new_x) fills it with detached
    # copies moved to the CPU and returns it.
    # Note: I've tried detach() and clone() in various combinations but
    # can't get it to work.
    # NOTE: when the device is set to CPU, everything works.
    queue.put(x.clone(new_x))
```
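On the consumer side I just read the object back and check the values (sketch; the `.abs().sum()` checks are only for debugging):

```python
item = queue.get()
# with device="cpu" this prints the real values;
# with the GPU clone() variant below it prints 0
print(item._observations.abs().sum())
print(item._actions.abs().sum())
```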
My second question: if the tensors are on the GPU, what is the right way to put an object A that holds a bunch of GPU tensors onto a queue?
```python
def clone(self, dst):
    # same as above, but the copies stay on the GPU
    dst._observations = self._observations.detach().clone()
    dst._actions = self._actions.detach().clone()
    return dst
```
With this second variant, I notice that as soon as the object goes through a queue, any tensor that was on the GPU arrives as zeros: all the values are zero when the consumer picks it up from the queue.
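My current suspicion, from re-reading the sharing-CUDA-tensors docs, is that the producer has to keep the original tensors alive until the consumer has cloned them. Roughly what I plan to try next (a sketch; the `Event` handshake is my own addition):

```python
import torch
import torch.multiprocessing as mp

def consumer(queue, done):
    a = queue.get()
    obs = a._observations.clone()  # clone before signalling the producer
    act = a._actions.clone()
    done.set()
    print(obs.sum(), act.sum())

if __name__ == "__main__":
    mp.set_start_method("spawn")
    queue, done = mp.Queue(), mp.Event()
    p = mp.Process(target=consumer, args=(queue, done))
    p.start()
    src, dst = A(), A()
    src._observations = torch.ones(4, device="cuda")
    src._actions = torch.ones(4, device="cuda")
    queue.put(src.clone(dst))  # GPU clone() variant from above
    done.wait()  # keep src/dst referenced until the consumer has cloned
    p.join()
```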