RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generated/../THCReduceAll.cuh:317 during evaluation with InceptionV3

    outputs = model(inputs)
    # for nets that have multiple outputs such as inception
    if isinstance(outputs, tuple):
        loss = sum(criterion(o, labels) for o in outputs)
    else:
        loss = criterion(outputs, labels)

    # backward + optimize only if in training phase
    if phase == 'train':
        _, preds = torch.max(outputs[0].data, 1)
    else:
        _, preds = torch.max(outputs.data, 1)
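The tuple-vs-tensor branching above can be collapsed into one helper. A minimal sketch with plain-Python stand-ins for the tensors, since the dispatch logic itself does not depend on torch (the helper name `main_output` is hypothetical, not from the original code):

    # Inception-style models may return a tuple (main_logits, aux_logits) in
    # training mode but a single tensor in eval mode, so downstream code
    # should normalize to the primary output before computing predictions.

    def main_output(outputs):
        """Return the primary logits whether the model yielded a tuple or not."""
        return outputs[0] if isinstance(outputs, tuple) else outputs

    print(main_output(("main", "aux")))  # -> main
    print(main_output("main"))           # -> main

With this helper, `_, preds = torch.max(main_output(outputs).data, 1)` works in both phases.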

I’m using InceptionV3 from PyTorch and everything works fine until the evaluation phase. Training finishes, but when I run the evaluation phase I get this error on _, preds = torch.max(outputs.data, 1).

Traceback (most recent call last):
  File "", line 212, in <module>
  File "", line 199, in main
  File "", line 146, in train_model
  File "/home/luiscosta/PycharmProjects/wsi_preprocessing/oncofinder_preprocessing/lib/python3.6/site-packages/torch/", line 66, in __repr__
    return torch._tensor_str._str(self)
  File "/home/luiscosta/PycharmProjects/wsi_preprocessing/oncofinder_preprocessing/lib/python3.6/site-packages/torch/", line 277, in _str
    tensor_str = _tensor_str(self, indent)
  File "/home/luiscosta/PycharmProjects/wsi_preprocessing/oncofinder_preprocessing/lib/python3.6/site-packages/torch/", line 195, in _tensor_str
    formatter = _Formatter(get_summarized_data(self) if summarize else self)
  File "/home/luiscosta/PycharmProjects/wsi_preprocessing/oncofinder_preprocessing/lib/python3.6/site-packages/torch/", line 84, in __init__
    nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) &
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generated/../THCTensorMathCompare.cuh:82

Any idea what’s wrong?

This usually indicates a CUDA-side indexing problem, e.g. a target label outside the valid range [0, num_classes - 1] being passed to the loss. Note that CUDA errors are reported asynchronously, so the Python line in your traceback is often not the real culprit. Can you run again with CUDA_LAUNCH_BLOCKING=1 and report the error you get then?
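In the meantime, a cheap way to rule out the most common cause is to validate the targets on the CPU before the loss call. A hedged sketch (the helper `check_labels` and the class count are illustrative, not from your code):

    # Device-side assert (59) in a classification loss is very often a target
    # label outside [0, num_classes - 1]. Checking targets on the CPU first
    # surfaces the offending positions with a readable error.

    def check_labels(labels, num_classes):
        """Return the positions of any out-of-range labels (hypothetical helper)."""
        return [i for i, y in enumerate(labels) if not 0 <= y < num_classes]

    # Example: class id 10 is invalid for a 10-class head (valid ids are 0..9).
    print(check_labels([3, 9, 10, 2], num_classes=10))  # -> [2]

If this prints any indices for your dataset (`labels.tolist()` on a torch tensor), the fix is in the label encoding, not the model.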