Does DataLoader num_workers relate to GPU memory?

You would have to profile your code and check how long the data loading takes. For example, loading a batch might already be faster than a training iteration, in which case the loading time is hidden behind the model computation and speeding it up further (e.g. by increasing num_workers) would not yield any performance improvement. On the other hand, you might see a data loading bottleneck that your system cannot reduce any further, e.g. because of the limited read speed of your SSD.
Generally, I would recommend profiling the code to see where the bottlenecks are and then optimizing those parts.
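
As a rough illustration of what such profiling could look like, here is a minimal sketch that separately accumulates the time spent waiting for the next batch and the time spent in the training step. The names `profile_loop`, `fake_loader` and `fake_step` are hypothetical and not part of PyTorch; the stand-ins just sleep so the snippet runs without a GPU. In real code you would pass your own `DataLoader` and a function that runs one training iteration.

```python
import time


def profile_loop(loader, step_fn, max_batches=50):
    """Return (seconds spent waiting for data, seconds spent in step_fn)."""
    data_time = 0.0
    step_time = 0.0
    it = iter(loader)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(it)  # blocks until the next batch is available
        except StopIteration:
            break
        t1 = time.perf_counter()
        # On a GPU, CUDA kernels run asynchronously, so step_fn should call
        # torch.cuda.synchronize() before returning to get meaningful timings.
        step_fn(batch)
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
    return data_time, step_time


# Hypothetical stand-ins: replace with your DataLoader and training step.
def fake_loader(n_batches=5, load_seconds=0.02):
    for i in range(n_batches):
        time.sleep(load_seconds)  # simulates slow disk reads / preprocessing
        yield i


def fake_step(batch, step_seconds=0.005):
    time.sleep(step_seconds)  # simulates forward/backward/optimizer step


data_time, step_time = profile_loop(fake_loader(), fake_step)
print(f"waiting for data: {data_time:.3f}s, training step: {step_time:.3f}s")
```

If the time spent waiting for data is small compared to the step time, the loading is already hidden and more workers are unlikely to help. If it dominates, try changing num_workers and measure again to see whether the wait actually shrinks on your system.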

EDIT: also explained in this answer from your double post.