Segment Anything Model as pre-processing

TeDataPro · June 23, 2023, 9:34am

For pre-processing my images, I use the Segment Anything Model (SAM, GitHub: facebookresearch/segment-anything) to work with masks from the original image.

In the __getitem__ of my CustomDataset class, the original image is read, then sent to the SAM model, and then pre-processed using the resulting masks before being returned.

However, when I iterate over the loaders, I get the following error:

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

I solved the problem by adding the line

torch.multiprocessing.set_start_method('spawn')

However, after this, iteration over my DataLoader only works with num_workers=0. When num_workers is strictly positive, the iteration never completes and does not even show an error.

I have absolutely no idea where this comes from. Do you have any idea?

Thank you

ptrblck · June 23, 2023, 4:30pm

Based on the error message, you are trying to initialize a CUDA context in the worker processes, which happens when data is moved to the GPU inside __getitem__.
Using the spawn start method should work; alternatively, you could keep the data on the CPU and move the entire batch to the GPU inside the DataLoader loop.
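
A minimal sketch of the second option (CPU-only work in __getitem__, batch moved to the GPU in the loop) might look like the following. CpuOnlyDataset and run_batches are hypothetical names used only for illustration, the tensors stand in for real images, and the comment marking where the GPU-side step would go is an assumption about how the SAM call could be relocated, not the original code.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class CpuOnlyDataset(Dataset):
    """Hypothetical stand-in for CustomDataset: only CPU work per sample."""

    def __init__(self, num_samples=8, image_shape=(3, 16, 16)):
        self.num_samples = num_samples
        self.image_shape = image_shape

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        # Stand-in for reading an image from disk. Nothing here touches
        # CUDA, so worker processes never need to create a CUDA context.
        image = torch.full(self.image_shape, float(index))
        return image, index


def run_batches(num_workers=0, batch_size=4):
    """Iterate the loader and move each whole batch to the GPU if one exists."""
    use_cuda = torch.cuda.is_available()
    device = torch.device("cuda" if use_cuda else "cpu")
    loader = DataLoader(
        CpuOnlyDataset(),
        batch_size=batch_size,
        num_workers=num_workers,
        pin_memory=use_cuda,  # pinned host memory allows async copies
    )
    shapes = []
    for images, indices in loader:
        # The transfer happens in the main process, once per batch.
        images = images.to(device, non_blocking=True)
        # Assumption: GPU-side work (e.g. a batched model forward pass)
        # would go here instead of inside __getitem__.
        shapes.append(tuple(images.shape))
    return shapes


if __name__ == "__main__":
    # The guard matters if the 'spawn' start method is used, because
    # spawned workers re-import the main module.
    print(run_batches(num_workers=2))
```

With this layout the default fork start method can stay in place, since the workers only produce CPU tensors.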