What’s your PyTorch version? It should accept a single GPU. How is it even possible that it uses the last two GPUs if you specify device_ids=[0,1]?
If you run your script with CUDA_VISIBLE_DEVICES=2,3, it will always execute on the last two GPUs, not on the first ones, so I can’t see how that helps in this case. CUDA_VISIBLE_DEVICES=0,1 would make more sense.
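To illustrate the remapping, here’s a minimal sketch (the module and script name are just placeholders):

```python
# Launched as: CUDA_VISIBLE_DEVICES=2,3 python script.py
# Only two devices are visible inside the process, renumbered 0 and 1,
# so device_ids=[0, 1] actually targets physical GPUs 2 and 3.
import torch
import torch.nn as nn

print(torch.cuda.device_count())  # prints 2

model = nn.DataParallel(nn.Linear(10, 10).cuda(), device_ids=[0, 1])
```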
@Seungyoung_Park In my experience, it’s usually nvidia-smi whose numbering is reversed relative to everything else.
For example, on my machine the numbering from PyTorch agrees with the numbering of the deviceQuery NVIDIA sample (and of any CUDA program, for that matter), while nvidia-smi is the only tool that gives a different numbering.
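You can check the ordering PyTorch sees with a couple of lines like these:

```python
import torch

# Enumerate the devices in the order the CUDA runtime (and thus PyTorch)
# sees them; compare this list against deviceQuery and nvidia-smi.
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
```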
Thanks a lot for your answer.
My PyTorch version is 0.1.8.
There may be a numbering problem with the GPU devices, but it doesn’t affect our usage.
My problem was about how to allocate GPUs, and now everything is fine.
I have installed the NVIDIA CUDA 9.0 toolkit with cuDNN on my Ubuntu machine.
I have installed PyTorch, and when I try to check for GPU usage by running the code below, I get this error:
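Any first CUDA tensor allocation goes through this lazy-init path; the line below is a hypothetical stand-in for the original code:

```python
import torch

# Hypothetical example: the first CUDA allocation triggers torch.cuda's
# lazy initialization (_lazy_new) and fails with the out-of-memory error below.
x = torch.cuda.FloatTensor(1)
```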
~/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/cuda/__init__.py in _lazy_new(cls, *args, **kwargs)
    385     # We need this method only for lazy init, so we can remove it
    386     del _CudaBase.__new__
--> 387     return super(_CudaBase, cls).__new__(cls, *args, **kwargs)
    388
    389

RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518244421288/work/torch/lib/THC/generic/THCStorage.cu:58
I think PyTorch is not communicating with the NVIDIA GPU. Please advise.
I wouldn’t recommend the first approach, since you would have to make sure these lines of code are executed before any other library that might grab the GPU is imported. If some script imports PyTorch first and these lines are executed afterwards, they won’t have any effect anymore.
The second approach makes sure the devices are masked before the Python script even starts; see the sketch below.
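A minimal sketch of both approaches (the device ids and script name are just examples):

```python
# Approach 1: mask the devices from Python. These lines must run before
# anything initializes CUDA, so put them at the very top of the entry script.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

import torch
print(torch.cuda.device_count())  # 2: only the masked devices are visible

# Approach 2: mask the devices before the interpreter even starts, e.g.:
#   CUDA_VISIBLE_DEVICES=0,1 python train.py
```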