I'm attempting to save a large number of PyTorch Geometric Data objects to disk in parallel with the following block:
import multiprocessing as mp

import torch
from tqdm import tqdm

def to_disk(r):
    path = '/data/protein_data_dir/raw/' + r['ID'] + '.pt'
    g = convert_nx_to_pyg(create_graph(r))
    torch.save(g, path)
    return g

NUM_CORE = 35
with mp.Pool(NUM_CORE) as pool:
    out = list(tqdm(pool.imap_unordered(to_disk, rows), total=len(rows)))
(create_graph, convert_nx_to_pyg, and rows are defined elsewhere; rows is a list of dicts with an 'ID' key.)

This works for some number of iterations and then fails with:

MaybeEncodingError: Error sending result: 'Data(x=[51, 163], edge_index=[2, 92], edge_attr=[92, 10], y=1.0)'. Reason: 'RuntimeError('unable to write to file </torch_29804_423365969_110>: No space left on device (28)')'

If I restart the loop, it saves some more objects before hitting the same error again. Running on a single process does not trigger the error, but is slow. There are several terabytes of free space on the disk these objects are being saved to. Anyone know what's going on?