triple_chevron_launcher::launch error on a multi-GPU system

Hi,

I’m trying to run code on a multi-GPU system (4 GPUs). One Python program is already running on the first two GPUs, and I’m running another on the remaining two.
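For readers reproducing this setup: the post does not say how the two jobs are kept on separate GPUs. One common way is the `CUDA_VISIBLE_DEVICES` environment variable, set before the process initializes CUDA. The sketch below assumes that approach; the helper name `env_for_gpus` and the script name `train.py` are hypothetical.

```python
import os

def env_for_gpus(gpu_ids, base_env=None):
    """Return a copy of the environment restricted to the given GPU indices.

    Hypothetical helper: it only builds the environment, it does not launch anything.
    """
    env = dict(os.environ if base_env is None else base_env)
    # CUDA reads this variable when the process first initializes the driver.
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env

# Build an environment in which only physical GPUs 2 and 3 are visible.
second_job_env = env_for_gpus([2, 3])
print(second_job_env["CUDA_VISIBLE_DEVICES"])

# Hypothetical launch of the second job (script name is an assumption):
# import subprocess
# subprocess.run(["python", "train.py"], env=second_job_env)
```

Inside a process started this way, the two visible GPUs are renumbered, so they appear as `cuda:0` and `cuda:1` rather than `cuda:2` and `cuda:3`.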

The following line raises an error:

loss = loss[ignore_mask].mean()
RuntimeError: after cudaLaunch in triple_chevron_launcher::launch(): too many resources requested for launch

However, when I run the same code on a single-GPU system, this error doesn’t occur. I’m using Python 3.6.7 [GCC 7.3.0] on Linux with PyTorch 0.4.1.
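A possible workaround to try, assuming the failing kernel is the one launched by the boolean-mask indexing in `loss[ignore_mask]` (the post does not confirm this): compute the same masked mean with elementwise arithmetic instead of indexing. This is a minimal sketch, not a confirmed fix; the function name `masked_mean` and the sample tensors are made up for illustration.

```python
import torch

def masked_mean(loss, mask):
    """Mean of `loss` over positions where `mask` is nonzero, without mask indexing."""
    # Cast the mask to the loss dtype so it can be used as 0/1 weights.
    weights = mask.to(loss.dtype)
    return (loss * weights).sum() / weights.sum()

# Illustrative data only; in the original code these come from the model.
loss = torch.tensor([1.0, 2.0, 3.0, 4.0])
ignore_mask = torch.tensor([1, 0, 1, 0], dtype=torch.uint8)

print(masked_mean(loss, ignore_mask))
```

Two differences from `loss[ignore_mask].mean()` are worth keeping in mind: a NaN or inf at a masked-out position still propagates into the result here, because it is multiplied by zero rather than dropped, and an all-zero mask gives a division by zero.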


The same happens to me, and I have no idea why. Any comments would be appreciated.
RuntimeError: after cudaLaunch in triple_chevron_launcher::launch(): too many resources requested for launch