The code below blocks when creating x1. If I remove the line print("finish create\n", x), it works fine. This really bothers me. Thanks to anyone who can help.
import torch
import torch.multiprocessing as multiprocessing

def run():
    print("run")
    device = torch.device("cpu")
    x1 = torch.rand(720, 1280, dtype=torch.float32, device=device)
    print('a', x1)
    x2 = x1.to(dtype=torch.float64)
    print('b', x2.dtype)
    x = torch.rand(1080, 1920, device=torch.device("cpu"))
    print("finish create\n", x)

thread = multiprocessing.Process(target=run)
thread.start()
print("wait")
thread.join()
print("finish")
@MoriartyShan
The issue was with this line:
    print('a', x1)
If you convert it to
    print('a', x1.detach().numpy())
it works. I am not sure why this is happening; we need to get to the bottom of it. But printing tensors inside the multiprocessing function is what caused your issue.
Thanks for your advice. But after changing the code as you suggested, it blocks at
    x2 = x1.to(dtype=torch.float64)
and the line after it never executes. I think some setting in PyTorch may be causing this. Also, if I change device in run to 'cuda:0', the problem disappears, but that is really not what I want.
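One thing that may be worth trying (an assumption on my part, not something verified in this thread): the default fork start method is known to interact badly with the OpenMP/MKL thread pools that PyTorch's CPU ops use, and a forked child can deadlock on state inherited from the parent. Starting the child with the spawn method gives it a fresh interpreter and often sidesteps this. A minimal sketch:

```python
import torch
import torch.multiprocessing as multiprocessing

# Ask for the "spawn" start method locally, without changing the
# global default via set_start_method().
ctx = multiprocessing.get_context("spawn")

def run():
    device = torch.device("cpu")
    x1 = torch.rand(720, 1280, dtype=torch.float32, device=device)
    print('a', x1.shape)
    x2 = x1.to(dtype=torch.float64)
    print('b', x2.dtype)

if __name__ == "__main__":
    # spawn re-imports this module in the child, so the Process
    # creation must be guarded by __main__.
    thread = ctx.Process(target=run)
    thread.start()
    thread.join()
    print("finish, exit code:", thread.exitcode)
```

Using get_context keeps the choice of start method local to this script, which matters if other code in the process relies on the default.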
@MoriartyShan you are correct. This is weird behavior. For the time being, the following is a workaround, but I will try to get to the bottom of this. It will be a good exercise.
import torch
import torch.multiprocessing as multiprocessing
import numpy as np

def run():
    print("run")
    device = torch.device("cpu")
    print(torch.__version__)
    x1 = torch.rand(720, 1280, dtype=torch.float32, device=device)
    x2 = torch.from_numpy(x1.detach().numpy().astype(np.float64))
    print('b', x2.dtype)
    x = torch.rand(1080, 1920, device=torch.device("cpu"))
    print("finish create\n", x)

thread = multiprocessing.Process(target=run)
thread.start()
print("wait")
thread.join()
print("finish")
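In the same spirit, another workaround sometimes reported for fork-related CPU deadlocks (again an assumption, not confirmed in this thread) is to disable intra-op parallelism in the child before it touches any tensor, so the forked process never re-enters the thread pool inherited from the parent. A sketch:

```python
import torch
import torch.multiprocessing as multiprocessing

def run():
    # Assumption: restricting the child to a single intra-op thread
    # avoids re-using OpenMP state inherited across fork().
    torch.set_num_threads(1)
    x1 = torch.rand(720, 1280, dtype=torch.float32, device=torch.device("cpu"))
    x2 = x1.to(dtype=torch.float64)
    print('b', x2.dtype)

if __name__ == "__main__":
    p = multiprocessing.Process(target=run)
    p.start()
    p.join()
    print("finish")
```

This keeps everything in PyTorch (no numpy round-trip), at the cost of single-threaded CPU ops inside the child.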
Thanks. The example is simplified code from my work. Converting the tensor to a numpy array does solve the problem above. I hope there is some way to work with PyTorch tensors directly in the child process. I will try to figure it out as well.