Hello, I've experienced serious data corruption while saving tensors to a pickle file (about 1 gigabyte of data). Probably I tried to write to it again before the previous write was complete? It's a Google Colab script that saves data to its Google Drive environment, and each write operation came a minute or so after the previous one… no multithreading involved, just a save operation in a while loop. How can I make sure this doesn't happen again? Is there a way to check if there's a file lock, or if a process is still writing to the file?
Thanks
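For reference, one pattern I'm thinking of trying as a fix is to save to a temporary file in the same directory and then atomically rename it over the target, so the real filename never points at a half-written file (the helper name here is just illustrative, not anything from torch):

```python
import os
import tempfile

import torch


def atomic_torch_save(obj, path):
    """Save to a temp file in the target's directory, then atomically
    replace the target, so `path` is never observed half-written."""
    dir_name = os.path.dirname(os.path.abspath(path))
    # Temp file must live on the same filesystem as the target,
    # because a rename is only atomic within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            torch.save(obj, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic on POSIX
    except BaseException:
        os.remove(tmp_path)  # don't leave a stray temp file behind
        raise
```

That way, even if an iteration is killed mid-save, the previous good checkpoint is still intact under the original name.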
It saves like this:
torch.save(colonne_out, '0.pt')
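To answer my own "is there a file lock" question in part: there's no lock by default, but the save call above could be wrapped in an advisory lock on a sidecar file, so a second iteration blocks until the previous write has finished (this is a sketch using the stdlib `fcntl`, which is POSIX-only but fine on Colab; the wrapper name is mine):

```python
import fcntl


def locked_save(save_fn, path):
    """Hold an exclusive advisory lock on a sidecar .lock file while
    saving; a concurrent caller blocks until the lock is released."""
    with open(path + ".lock", "w") as lock_f:
        fcntl.flock(lock_f, fcntl.LOCK_EX)  # blocks if another holder exists
        try:
            save_fn(path)  # e.g. lambda p: torch.save(colonne_out, p)
        finally:
            fcntl.flock(lock_f, fcntl.LOCK_UN)
```

Note this only guards against overlapping writers in the same environment; it wouldn't help if the corruption happens later in Drive's own sync.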
After saving the file many times, I tried to load it but it gave an error:
colonne_out = torch.load(pickle_file, map_location=torch.device(device))
RuntimeError                              Traceback (most recent call last)
in ()
      1 # if I'm offline, load the output columns file
----> 2 colonne_out = torch.load(pickle_file, map_location=torch.device(device))

1 frames
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in __init__(self, name_or_buffer)
    240 class _open_zipfile_reader(_opener):
    241     def __init__(self, name_or_buffer) -> None:
--> 242         super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
    243
    244

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
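In case it helps anyone with the same error: since recent `torch.save` writes a zip archive, "failed finding central directory" means the file is truncated (the central directory lives at the end of a zip). So a file can be sanity-checked right after saving, without torch at all, using only the stdlib `zipfile` (the function name is just mine):

```python
import zipfile


def looks_like_valid_checkpoint(path):
    """Return True if `path` is a structurally intact zip archive.
    A truncated upload loses the central directory at the end of the
    file, so zipfile refuses to open it."""
    if not zipfile.is_zipfile(path):  # looks for the end-of-central-directory record
        return False
    try:
        with zipfile.ZipFile(path) as zf:
            return zf.testzip() is None  # CRC-checks every stored entry
    except zipfile.BadZipFile:
        return False
```

Running this immediately after each save (and before overwriting any backup copy) would at least catch the corruption at write time instead of days later at load time.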