Multiprocessing not working on Pytorch on MacBook

Megh_Bhalerao · May 11, 2020, 8:46am

I am training a model on my local MacBook. My dataloader looks something like this:

data_train_source = MNISTSourceTrain("path1", "path2")
source_train_loader = DataLoader(data_train_source, batch_size=64, shuffle=True, num_workers=0)

When I set num_workers=0, the program runs. But for any value of num_workers > 0, I get the following error:

RuntimeError: 
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.
        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

However, when I run the program with num_workers=2 or 4, or any other number greater than 0, on another OS such as Ubuntu Linux, it works just fine.

Can someone please explain why this happens? Or does anyone have suggestions for how to stop it?

Thanks

albanD · May 11, 2020, 2:37pm

Hi,

As mentioned in the error, does your program have the proper if __name__ == '__main__': guard?

Megh_Bhalerao · May 11, 2020, 2:57pm

Hi @albanD
Yes, it works after inserting the if __name__ == '__main__': guard. But I am curious why it does not work otherwise. Also, what does num_workers=0 mean? How can the number of processes be zero when fetching batches from the data loader?

albanD · May 11, 2020, 3:01pm

This is a quirk of Python’s multiprocessing, I’m afraid: https://stackoverflow.com/questions/20360686/compulsory-usage-of-if-name-main-in-windows-while-using-multiprocessi
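
To illustrate the quirk, here is a minimal standard-library sketch. It is not code from this thread, and load_item and fetch_all are hypothetical names standing in for a dataset and a loader. With the "spawn" start method, each worker is a fresh interpreter that re-imports the main module, so any unguarded top-level code that starts processes runs again inside every worker and triggers the bootstrapping error quoted above. As far as I know, macOS has defaulted to "spawn" since Python 3.8, while Linux has traditionally defaulted to "fork", which would explain why the same script ran on Ubuntu. The sketch also mirrors what the PyTorch documentation says about num_workers=0: no worker processes are created and the data is loaded in the main process.

```python
import multiprocessing as mp


def load_item(index):
    # Hypothetical stand-in for a Dataset's __getitem__.
    return index * index


def fetch_all(indices, num_workers):
    """Fetch items in the calling process or in spawned worker processes."""
    if num_workers == 0:
        # Analogue of num_workers=0: no child processes are started;
        # the calling process does all the loading itself.
        return [load_item(i) for i in indices]
    # "spawn" launches fresh interpreters that re-import this module, so
    # unguarded top-level code would run again inside every worker.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=num_workers) as pool:
        return pool.map(load_item, indices)


if __name__ == "__main__":
    # Only the original process passes this check. The re-imported copies
    # in the workers see a different __name__ and skip this block.
    print("default start method:", mp.get_start_method())
    print(fetch_all(range(5), num_workers=0))
    print(fetch_all(range(5), num_workers=2))
```

Moving the last three lines out of the guard and running the file as a script should reproduce the RuntimeError on any platform, since the sketch forces "spawn" explicitly.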