Runtime error occurs when using .cuda(1)

Hi all,

I'm trying to use PyTorch on the 2nd GPU:

```python
a = torch.ones(1).cuda(1)
b = torch.ones(1).cuda(1)
c = torch.cat((a, b), 0)
```

Then this error comes up:

```
RuntimeError: cuda runtime error (77) : an illegal memory access was encountered at /data/users/soumith/miniconda2/conda-bld/pytorch-0.1.7_1485444530918/work/torch/lib/THC/generic/THCTensorCopy.c:65
```

How can I fix this?

In addition, how do I set different learning rates for different layers?
I think using

```python
for param_group in optimizer.state_dict()['param_groups']:
    param_group['lr'] = lr
```

can only set the learning rate for the whole model.

Yes, we’re aware of the bug in `cat`. It will be fixed over the weekend.

About the second question, see the per-parameter options section of the optim docs.
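
For example, something like this (a minimal sketch following the docs; the two-part `Net` with `base` and `classifier` submodules is hypothetical, just for illustration):

```python
import torch.nn as nn
import torch.optim as optim

class Net(nn.Module):
    """Hypothetical two-part model, just for illustration."""
    def __init__(self):
        super(Net, self).__init__()
        self.base = nn.Linear(10, 10)
        self.classifier = nn.Linear(10, 2)

model = Net()

# Per-parameter options: each dict defines a parameter group with its
# own settings; anything not given in a group falls back to the keyword
# defaults (lr=1e-2, momentum=0.9 here).
optimizer = optim.SGD([
    {'params': model.base.parameters()},                     # default lr=1e-2
    {'params': model.classifier.parameters(), 'lr': 1e-3},   # its own lr
], lr=1e-2, momentum=0.9)
```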


@apaszke Thank you so much!

I encountered the same problem. I have just updated to the latest version, but the error still arises. Has it been fixed? If not, is there any workaround?

What's more, this error arises only when I use GPU 1, 2, or 3 on my PC. No error arises if I use GPU 0. Not sure whether this is related to the issue: torch.cat puts result on current GPU rather than GPU of inputs.

For now, it seems that I can work around it by using GPU 0.
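
If the linked issue is right, the behavior would look something like this (a sketch, assuming GPU 0 is the current device):

```python
import torch

# On GPU 0, the current device matches the inputs, so cat succeeds:
a = torch.ones(1).cuda(0)
b = torch.ones(1).cuda(0)
c = torch.cat((a, b), 0)  # result lands on GPU 0, same as the inputs

# On GPU 1, the inputs live on device 1 but, per the linked issue, cat
# allocates its result on the current device (0), which is what triggers
# the illegal memory access.
```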

A temporary workaround is to wrap the `torch.cat` calls in `with torch.cuda.device_of(tensor):`, where `tensor` can be e.g. the first element of the concatenated sequence. A fix is waiting in this PR.
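
Applied to the snippet from the original post, the workaround would look like this:

```python
import torch

a = torch.ones(1).cuda(1)
b = torch.ones(1).cuda(1)

# device_of switches the current device to the one holding `a`, so the
# result of cat is allocated on the same GPU as its inputs.
with torch.cuda.device_of(a):
    c = torch.cat((a, b), 0)
```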
