I ran the model on a single GPU with multithreading. The details of the error are as follows:
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:145: operator(): block: [1103,0,0], thread: [94,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
../aten/src/ATen/native/cuda/ScatterGatherKernel.cu:145: operator(): block: [1103,0,0], thread: [95,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
Traceback (most recent call last):
  File "/home/cy/data/envs/nnunetv2/bin/nnUNetv2_train", line 33, in <module>
    sys.exit(load_entry_point('nnunetv2-cy', 'console_scripts', 'nnUNetv2_train')())
  File "/data/cy/projects/nnUNetV2/nnunetv2/run/run_training.py", line 258, in run_training_entry
    run_training(args.dataset_name_or_id, args.configuration, args.fold, args.tr, args.p, args.pretrained_weights,
  File "/data/cy/projects/nnUNetV2/nnunetv2/run/run_training.py", line 201, in run_training
    nnunet_trainer.run_training()
  File "/data/cy/projects/nnUNetV2/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 1285, in run_training
    train_outputs.append(self.train_step(next(self.dataloader_train)))
  File "/data/cy/projects/nnUNetV2/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py", line 886, in train_step
    output = self.network(data)
  File "/home/cy/data/envs/nnunetv2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/cy/projects/nnUNetV2/nnunetv2/network_architecture/e2unet.py", line 62, in forward
    return self.decoder(skips)
  File "/home/cy/data/envs/nnunetv2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/cy/projects/nnUNetV2/nnunetv2/network_architecture/custom_module/refine_decoder.py", line 104, in forward
    x = self.feature_refine[s](x)
  File "/home/cy/data/envs/nnunetv2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/data/cy/projects/nnUNetV2/nnunetv2/network_architecture/custom_module/feature_refine/modules/deformable_refine.py", line 119, in forward
    updated_x = updated_x.scatter_(dim=2, index=linear_pos, src=x_sampled)
RuntimeError: CUDA error: device-side assert triggered
I have no idea what is causing the problem: I have checked the dimension sizes of all three tensors (updated_x, linear_pos, x_sampled) and the indices in linear_pos, and nothing seems to be wrong.
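For reference, the conditions that scatter_ places on its arguments can be checked explicitly right before the call. The following is only a minimal sketch: check_scatter_args is a hypothetical helper (not part of PyTorch or nnU-Net), and the tensor shapes in the demo are made up for illustration rather than taken from deformable_refine.py. It tests what the failing assertion tests (every index value must lie in [0, target.size(dim))) plus the shape rules from the scatter_ documentation.

```python
import torch


def check_scatter_args(target, index, src, dim):
    """Return a list of problems that would make target.scatter_(dim, index, src) invalid.

    Hypothetical debugging helper; an empty list means no problem was found.
    """
    problems = []
    if index.dtype != torch.int64:
        problems.append(f"index dtype is {index.dtype}, expected torch.int64")
    if not (target.dim() == index.dim() == src.dim()):
        problems.append("target, index and src must have the same number of dimensions")
        return problems
    for d in range(index.dim()):
        # index may not be larger than src along any dimension
        if index.size(d) > src.size(d):
            problems.append(f"index.size({d})={index.size(d)} > src.size({d})={src.size(d)}")
        # index may not be larger than target along any dimension except dim
        if d != dim and index.size(d) > target.size(d):
            problems.append(f"index.size({d})={index.size(d)} > target.size({d})={target.size(d)}")
    if index.numel() > 0:
        lo, hi = int(index.min()), int(index.max())
        # this is the condition behind the "index out of bounds" device assertion
        if lo < 0 or hi >= target.size(dim):
            problems.append(
                f"index values span [{lo}, {hi}] but must lie in [0, {target.size(dim) - 1}]"
            )
    return problems


if __name__ == "__main__":
    # Made-up shapes: the value 5 is out of range because target.size(2) == 5.
    updated_x = torch.zeros(2, 3, 5)
    x_sampled = torch.ones(2, 3, 4)
    linear_pos = torch.tensor([0, 1, 2, 5]).expand(2, 3, 4)
    for problem in check_scatter_args(updated_x, linear_pos, x_sampled, dim=2):
        print(problem)
```

Calling such a check on every training step, immediately before the scatter_ line, would show whether a particular batch produces an out-of-range value in linear_pos even though the shapes look correct. Because CUDA errors are reported asynchronously, running with the environment variable CUDA_LAUNCH_BLOCKING=1 may also help confirm that the scatter_ call is really the one that fails.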