I am working on a machine with 2 Titan GPUs and using `torchvision.models.resnet50`
for feature extraction.
`para_model = torch.nn.DataParallel(pretrained_model)`
is used, and I expect both of my GPUs to work together. But when I check with `nvidia-smi`,
I find that only gpu0 is running at full load while gpu1's memory usage stays at zero. However, when I test in a Python shell (also with the same pretrained network), it shows that both graphics cards are working well.
I am wondering if there are some options I should set other than just calling `DataParallel`.
UPDATE:
I made a simplified test script (only the pretrained model processing random CUDA tensors) and found it works as expected. So I guess it might be an issue with `multiprocessing`,
as I start a process which loads data and passes numpy arrays to the main process via a `Queue`.
Is there any way to prove this assumption?
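One way I could test this would be to isolate the Queue pipeline: keep the producer process and `Queue` exactly as in the real script, but swap in dummy payloads and a stubbed-out model, and separately run the real model on in-process random tensors (which I already did above). If the GPUs only misbehave when the Queue-fed path is used, the data pipeline is the culprit. A stdlib-only skeleton of the Queue side (plain lists stand in for the numpy batches, which are placeholders):

```python
import multiprocessing as mp

def producer(queue, n_batches):
    # Stands in for the data-loading process from the question.
    for i in range(n_batches):
        queue.put([i] * 4)   # placeholder for a numpy batch
    queue.put(None)          # sentinel: no more data

def main():
    queue = mp.Queue(maxsize=2)
    proc = mp.Process(target=producer, args=(queue, 3))
    proc.start()
    batches = []
    while True:
        batch = queue.get()
        if batch is None:
            break
        # Real script would do: tensor = torch.from_numpy(batch).cuda()
        batches.append(batch)
    proc.join()
    return batches

if __name__ == "__main__":
    print(main())  # [[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2]]
```

Keeping everything else identical and flipping only the data source between the Queue and in-process tensors should show which half of the pipeline stalls gpu1.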