I get this error when I put my laptop to sleep in the middle of training. The script stops with:
THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/generated/../THCReduceAll.cuh line=334 error=4 : unspecified launch failure
Traceback (most recent call last):
File "trytry.py", line 111, in <module>
loss = network.loss(prediction, label_batch) + 10*torch.mean(cheat_amount)
File "trytry.py", line 73, in loss
union = 1e-5 + prediction.sum() + label.sum()
File "/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py", line 437, in sum
return Sum(dim)(self)
File "/usr/local/lib/python3.5/dist-packages/torch/autograd/_functions/reduce.py", line 16, in forward
return input.new((fn(),))
RuntimeError: cuda runtime error (4) : unspecified launch failure at /b/wheel/pytorch-src/torch/lib/THC/generated/../THCReduceAll.cuh:334
Afterward, running the script again fails as soon as the model is moved to the GPU:
THCudaCheck FAIL file=/b/wheel/pytorch-src/torch/lib/THC/THCGeneral.c line=66 error=30 : unknown error
Traceback (most recent call last):
File "trytry.py", line 77, in <module>
network = Net()
File "trytry.py", line 57, in __init__
self.squeezenet = models.squeezenet1_1(pretrained=True).features.cuda()
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 147, in cuda
return self._apply(lambda t: t.cuda(device_id))
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 118, in _apply
module._apply(fn)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 124, in _apply
param.data = fn(param.data)
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 147, in <lambda>
return self._apply(lambda t: t.cuda(device_id))
File "/usr/local/lib/python3.5/dist-packages/torch/_utils.py", line 65, in _cuda
return new_type(self.size()).copy_(self, async)
File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 272, in __new__
_lazy_init()
File "/usr/local/lib/python3.5/dist-packages/torch/cuda/__init__.py", line 85, in _lazy_init
torch._C._cuda_init()
RuntimeError: cuda runtime error (30) : unknown error at /b/wheel/pytorch-src/torch/lib/THC/THCGeneral.c:66
A reboot fixed the problem. This is with CUDA 8.0 and an NVIDIA 1060.
I have an HP-1000 laptop without a GPU. How can I handle this error?
THCudaCheck FAIL file=torch/csrc/cuda/Module.cpp line=32 error=30 : unknown error
Traceback (most recent call last):
File "train_nli.py", line 62, in <module>
torch.cuda.set_device(params.gpu_id)
File "/home/farhatullah/anaconda3/lib/python3.6/site-packages/torch/cuda/__init__.py", line 262, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (30) : unknown error at torch/csrc/cuda/Module.cpp:32
I am actually training the InferSent sentence embedding model. Is there a version of this code that runs on the CPU? https://github.com/facebookresearch/InferSent/blob/master/train_nli.py
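For what it's worth, one common way to avoid error 30 on a machine with no usable GPU is to pick the device at runtime and skip the CUDA calls when it is not there. The snippet below is only a rough sketch, not something taken from train_nli.py: it assumes PyTorch 0.4 or later (for torch.device and .to()), and pick_device and the tiny Linear model are made-up names for illustration. The same check might also help after a suspend/resume, since it tests that CUDA really initialises instead of trusting is_available() alone.

```python
import torch


def pick_device(gpu_id=0):
    """Return a CUDA device if one is actually usable, otherwise the CPU.

    pick_device is a hypothetical helper, not part of PyTorch or InferSent.
    """
    if torch.cuda.is_available():
        try:
            device = torch.device("cuda", gpu_id)
            # Moving a tiny tensor forces CUDA initialisation, so a broken
            # driver state surfaces here rather than deep inside training.
            torch.zeros(1).to(device)
            return device
        except RuntimeError as err:
            print("CUDA is present but unusable, falling back to CPU:", err)
    return torch.device("cpu")


device = pick_device()

# Stand-in model and batch; a real script would move its own network and
# data the same way instead of calling .cuda() unconditionally.
model = torch.nn.Linear(4, 2).to(device)
batch = torch.randn(3, 4).to(device)
output = model(batch)
print(device, output.shape)
```

In a script that calls torch.cuda.set_device(...) or .cuda() directly, those calls would need to be guarded or replaced with .to(device) in the same way. Training on the CPU will be much slower, so a smaller batch size or dataset may be needed.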