Hello,
I have access to a server where I can use one GPU at a time, with 40 GB of memory. Since each of my experiments only takes about 7 GB, I sometimes run several experiments with different hyperparameters at the same time. However, I have noticed that when I do this, the experiments run much more slowly, even though both the CPU and the GPU still have plenty of free resources.
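To make the setup concrete, here is a minimal sketch of launching several runs on the same GPU at once. The script name `train.py`, the `--lr` flag, and the learning-rate values are placeholders for illustration only, not the actual experiment code.

```python
import os
import subprocess
import sys


def build_jobs(learning_rates, gpu_index=0, script="train.py"):
    """Return one (command, environment) pair per hyperparameter value.

    `script` and the `--lr` flag are hypothetical placeholders.
    """
    jobs = []
    for lr in learning_rates:
        env = dict(os.environ)
        # Every process sees only the same single GPU.
        env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
        cmd = [sys.executable, script, "--lr", str(lr)]
        jobs.append((cmd, env))
    return jobs


def launch(jobs):
    """Start all jobs concurrently and wait for them; returns exit codes."""
    procs = [subprocess.Popen(cmd, env=env) for cmd, env in jobs]
    return [p.wait() for p in procs]


if __name__ == "__main__":
    jobs = build_jobs([1e-3, 1e-4, 1e-5])
    for cmd, env in jobs:
        print(" ".join(cmd), "on GPU", env["CUDA_VISIBLE_DEVICES"])
    # launch(jobs)  # only meaningful once a real training script exists
```

Each process here shares the same device, so the roughly 7 GB per run fits in memory, but nothing in this sketch isolates compute between the processes.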
My guess is that some resource is not fully isolated when I run multiple processes on the same GPU. Is there a way to fully isolate processes on the same GPU so that I can take advantage of the free memory?
PS: I know that SLURM may be able to do something like this by scheduling a shared GPU among different users, but since I am not root, I would like to know whether there is a simpler solution or some other alternative.
Best Regards,