I am setting the multiprocessing start method to “spawn”, using:
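(The exact snippet got dropped from the post; it is presumably something along these lines, using the standard torch.multiprocessing.set_start_method API:)

```python
import torch.multiprocessing as mp

# "spawn" starts each child process with a fresh Python interpreter
# instead of fork()ing the parent process.
if __name__ == "__main__":
    mp.set_start_method("spawn")
```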
and then using a DataLoader.
However, when I increase num_workers for the dataloader, memory usage bloats.
This makes sense: I think each dataloader worker is now a spawned process (each with its
own fresh Python interpreter), so every worker rebuilds its own copy of the dataset and
surrounding state instead of sharing the parent's memory copy-on-write the way fork would, etc.
However, as you can imagine, this is not what I wanted. I am doing Hogwild! by referencing
the example from here: https://github.com/pytorch/examples/blob/master/mnist_hogwild/train.py
I want the start method to be “spawn” for the Hogwild worker processes, but not for the dataloader workers. Is there a way to use the good old “fork” for the dataloader workers and “spawn” only for the Hogwild processes, so that a large num_workers does not bloat the memory?
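In case it helps, here is roughly what I am hoping for (a sketch, not working code from my project: I am assuming that the DataLoader's multiprocessing_context argument and an explicit torch.multiprocessing.get_context("spawn") context can be mixed this way, and the TensorDataset is just a stand-in for my real dataset):

```python
import torch
import torch.multiprocessing as mp
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset standing in for the real one (assumption: any map-style
# dataset behaves the same here).
dataset = TensorDataset(torch.randn(100, 4), torch.randint(0, 2, (100,)))

# Fork just the loader workers, regardless of the global start method
# ("fork" assumes a Unix-like OS).
loader = DataLoader(
    dataset,
    batch_size=10,
    num_workers=2,
    multiprocessing_context="fork",
)


def train(model):
    # Hogwild training loop would go here, updating the shared model
    # in place without locks.
    pass


if __name__ == "__main__":
    model = torch.nn.Linear(4, 2)
    model.share_memory()  # share parameters across the Hogwild processes

    # Spawn only the Hogwild processes via an explicit context.
    ctx = mp.get_context("spawn")
    workers = [ctx.Process(target=train, args=(model,)) for _ in range(2)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
```

The idea being that the loader workers would inherit the dataset via fork's copy-on-write pages, while the Hogwild processes still get clean spawned interpreters as in the linked example.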
Or if I am understanding the problem incorrectly, let me know!
Any suggestions are highly appreciated.
Thank you in advance!