Constructing feature vectors for my current dataset of Portable Executable files is rather slow, so my current workaround is to compute the feature vector for each file once, save it with pickle, and load the pickles through a custom PyTorch Dataset. I noticed that a significant bottleneck is moving these vectors from CPU to GPU every time a batch is loaded through a PyTorch DataLoader.
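For context, here is a stripped-down sketch of my setup; the directory layout, file names, and feature dimension are made up for illustration:

```python
import os
import pickle
import tempfile

import torch
from torch.utils.data import DataLoader, Dataset


class PEFeatureDataset(Dataset):
    """Loads one pre-pickled feature vector per PE file."""

    def __init__(self, feature_dir):
        self.paths = sorted(
            os.path.join(feature_dir, f) for f in os.listdir(feature_dir)
        )

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Load the precomputed CPU tensor; the host-to-device copy
        # happens later, once per batch, in the training loop.
        with open(self.paths[idx], "rb") as fh:
            return pickle.load(fh)


# Demo with synthetic vectors standing in for real PE features.
tmp = tempfile.mkdtemp()
for i in range(4):
    with open(os.path.join(tmp, f"{i}.pkl"), "wb") as fh:
        pickle.dump(torch.randn(8), fh)

loader = DataLoader(PEFeatureDataset(tmp), batch_size=2)
device = "cuda" if torch.cuda.is_available() else "cpu"
for batch in loader:
    batch = batch.to(device)  # the per-batch transfer that is the bottleneck
```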
I tried saving and loading the CUDA tensors instead but got the error:

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
I did some searching and found that `torch.multiprocessing.set_start_method('spawn')` is the recommended way to handle CUDA with multiprocessing. However, if I insert this line into my code or the DataLoader code, I always get the following error:
RuntimeError: context has already been set
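For reference, here is a minimal sketch of the variant I would expect to sidestep that second error, assuming the relevant line is `torch.multiprocessing.set_start_method('spawn')`: the call is placed under a `__main__` guard and passed `force=True` so it does not raise if a start method was already set:

```python
import torch.multiprocessing as mp

if __name__ == "__main__":
    # set_start_method raises "context has already been set" if the start
    # method was fixed earlier in the process; force=True overrides it, and
    # the __main__ guard keeps spawned workers from re-running this line.
    mp.set_start_method("spawn", force=True)
```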
So my questions are:
- Is there a way to make PyTorch DataLoaders work with serialized CUDA tensors?
- Am I fundamentally misunderstanding something?