Hi,
I am trying to convert my A3C code from the CPU version to a CUDA version, but I am running into the error shown in the stack trace below.
I have already set the start method to "spawn":
from torch.multiprocessing import Process, set_start_method
try:
    set_start_method('spawn')
except RuntimeError:
    pass
and moved the model to the device:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = ActorCritic(params).to(device)
and I start the workers with:
for rank in range(params.num_processes):
    try:
        p = Process(target=train, args=(rank, params, model, optimizer, indices, sc, device, filename))
        jobs.append(p)
        p.start()
    except Exception as e:
        print(e)
        traceback.print_exc()
        var = traceback.format_exc()
        f.write("exception:\n" + str(var))
for p in jobs:
    p.join()
Error stack trace:
p.start()
File "C:\Users\Granth\anaconda3\lib\multiprocessing\process.py", line 112, in start
self._popen = self._Popen(self)
File "C:\Users\Granth\anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\Granth\anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\Granth\anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\Granth\anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
File "C:\Users\Granth\anaconda3\lib\site-packages\torch\multiprocessing\reductions.py", line 240, in reduce_tensor
event_sync_required) = storage._share_cuda_()
RuntimeError: cuda runtime error (801) : operation not supported at …\torch/csrc/generic/StorageSharing.cpp:247
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Granth\anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\Granth\anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
I am not sure what I am missing here. Also, some posts I found online say that you cannot share CUDA tensors between processes.
If that is the case, how do we do multiprocessing with CUDA?
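For context: CUDA IPC, which PyTorch uses to share CUDA tensors between processes, is not supported on Windows, which is why pickling a CUDA model to a child process raises error 801. The usual A3C workaround is to keep the shared model on CPU (via `share_memory()` in PyTorch) and have each worker build its own local copy and move only that copy to the GPU. Below is a stdlib-only sketch of that mechanics, with a shared CPU array standing in for the shared model's parameters; the names (`train`, `run_workers`) and the commented-out PyTorch calls are illustrative, not your actual code:

```python
import multiprocessing as mp

def train(rank, shared_params):
    # In real A3C code each worker would do roughly:
    #   local_model = ActorCritic(params)
    #   local_model.load_state_dict(shared_model.state_dict())
    #   local_model.to(torch.device("cuda"))
    # i.e. the CUDA transfer happens *inside* the child process,
    # so nothing CUDA-backed is ever pickled across the process boundary.
    # Here each worker just writes to its slot in the shared CPU buffer.
    with shared_params.get_lock():
        shared_params[rank] += 1.0

def run_workers(num_workers):
    # 'd' = double-precision floats living in shared (CPU) memory,
    # standing in for shared_model.share_memory() on the parent side.
    shared_params = mp.Array("d", num_workers)
    jobs = []
    for rank in range(num_workers):
        p = mp.Process(target=train, args=(rank, shared_params))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
    return list(shared_params)

if __name__ == "__main__":
    # "spawn" matches the start method Windows forces on you.
    mp.set_start_method("spawn", force=True)
    print(run_workers(4))
```

The key point is that only CPU-shareable state crosses the process boundary; each worker owns its GPU copy privately and periodically syncs gradients/weights back through the shared CPU model.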