How to install PyTorch so that it can use GPUs

I went ahead and installed the new .whl that @smth put out (changing to cu80 since I have CUDA 8.0), and the install seemed to go well. I can train a net, but I am wondering whether I need to do anything else to make sure it uses my GPUs. When I run nvidia-smi I do not see any activity. How can I force it to train on my GPUs? Thanks.

EDIT:

I use net.cuda() and this seems to work. However, when I try to run my script I get this error:

Traceback (most recent call last):
  File "test.py", line 266, in <module>
    yEst = net.forward_prop(currentBatchData)
  File "test.py", line 126, in forward_prop
    x = F.max_pool2d(F.relu(self.conv1(x)), (2,2))
  File "/data/venv/pytorch/local/lib/python2.7/site-packages/torch/nn/modules/module.py", line 210, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/venv/pytorch/local/lib/python2.7/site-packages/torch/nn/modules/conv.py", line 235, in forward
    self.padding, self.dilation, self.groups)
  File "/data/venv/pytorch/local/lib/python2.7/site-packages/torch/nn/functional.py", line 37, in conv2d
    return f(input, weight, bias) if bias is not None else f(input, weight)
  File "/data/venv/pytorch/local/lib/python2.7/site-packages/torch/nn/_functions/conv.py", line 33, in forward
    output = self._update_output(input, weight, bias)
  File "/data/venv/pytorch/local/lib/python2.7/site-packages/torch/nn/_functions/conv.py", line 88, in _update_output
    return self._thnn('update_output', input, weight, bias)
  File "/data/venv/pytorch/local/lib/python2.7/site-packages/torch/nn/_functions/conv.py", line 147, in _thnn
    return impl[fn_name](self, self._bufs[0], input, weight, *args)
  File "/data/venv/pytorch/local/lib/python2.7/site-packages/torch/nn/_functions/conv.py", line 219, in call_update_output
    bias, *args)
TypeError: FloatSpatialConvolutionMM_updateOutput received an invalid combination of arguments - got (int, torch.FloatTensor, torch.FloatTensor, torch.cuda.FloatTensor, torch.cuda.FloatTensor, torch.FloatTensor, torch.FloatTensor, long, long, int, int, int, int), but expected (int state, torch.FloatTensor input, torch.FloatTensor output, torch.FloatTensor weight, [torch.FloatTensor bias or None], torch.FloatTensor finput, torch.FloatTensor fgradInput, int kW, int kH, int dW, int dH, int padW, int padH)

(If I do not call net.cuda(), my script works fine on the CPU.)

Looks like your input hasn't been sent to the GPU yet.
Just to be sure, did you call something akin to input = Variable(input.cuda()) before forwarding it through the net?
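
For illustration, a minimal sketch of that pattern; the model here is a stand-in, but any nn.Module behaves the same way:

    import torch
    import torch.nn as nn
    from torch.autograd import Variable

    # stand-in model; any nn.Module works the same way
    net = nn.Sequential(nn.Conv2d(1, 8, 3), nn.ReLU())
    net.cuda()                         # move all parameters to the GPU

    batch = torch.randn(4, 1, 28, 28)  # CPU tensor, e.g. fresh from a loader
    input = Variable(batch.cuda())     # send the data to the GPU as well
    output = net(input)                # every tensor involved is now CUDA

Feeding CPU tensors to a CUDA model is exactly what produces the torch.FloatTensor vs. torch.cuda.FloatTensor mismatch in the TypeError above.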


@NgPDat OK, just to be sure: all I did was run myDNN.cuda() after the net was created but before the training loop, and in the forward prop I have myDNN.forward(input.cuda()). However, this still gives me the same error…

Never mind, I think it works now: I had forgotten to do the same for the labels as well. Thanks!
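
To spell out the fix, a sketch of a full training step where both the inputs and the labels are moved to the GPU (the model and loss here are illustrative):

    import torch
    import torch.nn as nn
    from torch.autograd import Variable

    net = nn.Linear(10, 2).cuda()            # model on the GPU
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

    inputs = torch.randn(4, 10)              # CPU tensors from a loader
    labels = torch.LongTensor([0, 1, 0, 1])

    inputs = Variable(inputs.cuda())         # data to the GPU...
    labels = Variable(labels.cuda())         # ...and the targets too

    optimizer.zero_grad()
    loss = criterion(net(inputs), labels)
    loss.backward()
    optimizer.step()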

Hi,

I installed PyTorch via the Docker image pytorch-cudnnv6 on a VM, following https://github.com/pytorch/pytorch#installation.

Then I tried to translate a test text using the pretrained model onmt_model_en_fr_b1M published on https://github.com/OpenNMT/OpenNMT-py with the command:

python translate.py -model ../onmt-model/onmt_model_en_fr_b1M-261c69a7.pt -src ../test.txt -output ../test.tok

but it failed with the following error:

Traceback (most recent call last):
  File "translate.py", line 116, in <module>
    main()
  File "translate.py", line 55, in main
    translator = onmt.Translator(opt)
  File "/root/OpenNMT-py/onmt/Translator.py", line 11, in __init__
    checkpoint = torch.load(opt.model)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 222, in load
    return _load(f, map_location, pickle_module)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 355, in _load
    return legacy_load(f)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 300, in legacy_load
    obj = restore_location(obj, location)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 85, in default_restore_location
    result = fn(storage, location)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/serialization.py", line 67, in _cuda_deserialize
    return obj.cuda(device_id)
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/_utils.py", line 56, in _cuda
    with torch.cuda.device(device):
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/cuda/__init__.py", line 136, in __enter__
    _lazy_init()
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/cuda/__init__.py", line 96, in _lazy_init
    _check_driver()
  File "/opt/conda/envs/pytorch-py35/lib/python3.5/site-packages/torch/cuda/__init__.py", line 70, in _check_driver
    http://www.nvidia.com/Download/index.aspx""")
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx

It looks like there is no GPU support. That's true: I'm using a VM, so I don't have a GPU. Is this right?

My question is: how can I run the OpenNMT commands on the CPU in this environment so as to avoid the error?

I then tried compiling and installing PyTorch from source without CUDA support on a VM, again following https://github.com/pytorch/pytorch#installation, and executed the same command as above. This time I got the error below:

Traceback (most recent call last):
  File "translate.py", line 123, in <module>
    main()
  File "translate.py", line 56, in main
    translator = onmt.Translator(opt)
  File "/root/OpenNMT-py/onmt/Translator.py", line 12, in __init__
    checkpoint = torch.load(opt.model)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 229, in load
    return _load(f, map_location, pickle_module)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 362, in _load
    return legacy_load(f)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 307, in legacy_load
    obj = restore_location(obj, location)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 85, in default_restore_location
    result = fn(storage, location)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 67, in _cuda_deserialize
    return obj.cuda(device_id)
  File "/root/anaconda3/lib/python3.6/site-packages/torch/_utils.py", line 57, in _cuda
    with torch.cuda.device(device):
  File "/root/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 129, in __enter__
    _lazy_init()
  File "/root/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 89, in _lazy_init
    _check_driver()
  File "/root/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 56, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Could anyone help? Thanks a lot!

@lifengd I think that example requires a GPU, or has a command-line flag controlling whether one is required.
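
For what it's worth, the failure in both tracebacks happens inside torch.load, which by default tries to restore tensors to the device they were saved from. A common workaround for loading a GPU-trained checkpoint on a CPU-only machine is to pass map_location; a sketch, using the checkpoint path from the command above:

    import torch

    # keep every storage on the CPU instead of calling .cuda() on it
    checkpoint = torch.load(
        '../onmt-model/onmt_model_en_fr_b1M-261c69a7.pt',
        map_location=lambda storage, loc: storage)

Since translate.py does not expose a flag for this, applying it here would mean editing the torch.load call in onmt/Translator.py.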

@smth

Hi Smth,

Do you mean the Python edition of OpenNMT requires a GPU, and won't work with CPU only on a VM?

And it seems there is no such control flag in the command help:

root@1bf56383b1ca:~/OpenNMT-py# python translate.py --help
usage: translate.py [-h] -model MODEL -src SRC [-tgt TGT] [-output OUTPUT]
                    [-beam_size BEAM_SIZE] [-batch_size BATCH_SIZE]
                    [-max_sent_length MAX_SENT_LENGTH] [-replace_unk]
                    [-verbose] [-n_best N_BEST] [-gpu GPU]

translate.py

optional arguments:
  -h, --help            show this help message and exit
  -model MODEL          Path to model .pt file
  -src SRC              Source sequence to decode (one line per sequence)
  -tgt TGT              True target sequence (optional)
  -output OUTPUT        Path to output the predictions (each line will be the
                        decoded sequence)
  -beam_size BEAM_SIZE  Beam size
  -batch_size BATCH_SIZE
                        Batch size
  -max_sent_length MAX_SENT_LENGTH
                        Maximum sentence length.
  -replace_unk          Replace the generated UNK tokens with the source token
                        that had the highest attention weight. If phrase_table
                        is provided, it will lookup the identified source
                        token and give the corresponding target token. If it
                        is not provided (or the identified source token does
                        not exist in the table) then it will copy the source
                        token
  -verbose              Print scores and predictions for each sentence
  -n_best N_BEST        If verbose is set, will output the n_best decoded
                        sentences
  -gpu GPU              Device to run on

Thanks!