PyTorch multiprocessing with CUDA sets tensors to 0

I have the same issue on Windows. However, even when I just make a deepcopy of the network and give each process its own copy, the parameters still get set to 0.
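
A minimal sketch of the setup described above, meant as a way to check the symptom and not as a fix. It is not the actual code from this thread: the `Linear` model, `param_abs_sum`, `worker`, and `check_copies` are hypothetical stand-ins, and the sketch assumes the "spawn" start method (the only one available on Windows) with a fallback to CPU when CUDA is not available.

```python
import copy

import torch
import torch.multiprocessing as mp


def param_abs_sum(model):
    """Sum of absolute parameter values; 0.0 means every parameter is zero."""
    return sum(p.detach().abs().sum().item() for p in model.parameters())


def worker(rank, model, queue):
    # Report what this process actually sees in its copy of the network.
    queue.put((rank, param_abs_sum(model)))


def check_copies(num_workers=2, device=None):
    # Assumption: use CUDA when present, otherwise fall back to CPU.
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Linear(4, 2).to(device)  # hypothetical stand-in network
    expected = param_abs_sum(model)

    ctx = mp.get_context("spawn")
    queue = ctx.Queue()
    procs = []
    for rank in range(num_workers):
        # Each process receives its own deepcopy of the network.
        p = ctx.Process(target=worker, args=(rank, copy.deepcopy(model), queue))
        p.start()
        procs.append(p)
    # Read results before joining; the timeout avoids hanging if a worker dies.
    results = sorted(queue.get(timeout=60) for _ in procs)
    for p in procs:
        p.join()
    return expected, results


# Usage (the guard is required with the spawn start method):
# if __name__ == "__main__":
#     print(check_copies())
```

If the sums reported by the workers match the parent's value with `device="cpu"` but come back as 0 with `device="cuda"`, that would suggest the problem is in handing CUDA tensors to another process and not in `deepcopy` itself.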

Has anyone managed to solve this?