U-Net for multi-class segmentation

Clara_75 · April 25, 2019, 10:40am

Hi,
I am new to PyTorch and also to deep learning. I am trying to produce a segmentation with four classes (background and 3 objects).
I took the U-Net model from an existing implementation online.
I do not understand the error I get: does it come from the model, or from the use of CUDA?

RuntimeError Traceback (most recent call last)
in
11 l = torch.nn.functional.cross_entropy(prediction, y_train_batch)
12 my_optimizer.zero_grad()
---> 13 l.backward()
14 my_optimizer.step()
15 train_loss += l

/usr/local/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
91 products. Defaults to False.
92 """
---> 93 torch.autograd.backward(self, gradient, retain_graph, create_graph)
94
95 def register_hook(self, hook):

/usr/local/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
88 Variable._execution_engine.run_backward(
89 tensors, grad_tensors, retain_graph, create_graph,
---> 90 allow_unreachable=True) # allow_unreachable flag
91
92

RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THCUNN/generic/Threshold.cu:67

Could someone please give me some pointers?

AnBucquet · April 25, 2019, 11:47am

Hi,
Can you run your script with CUDA_LAUNCH_BLOCKING=1 python your_script.py to get a more accurate stack trace?

Besides, can you give us more information about your problem, such as the labels you use?
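
In a notebook there is no command line to prefix, so one option is to set the variable from the very first cell, before CUDA is initialized. This is only a minimal sketch and is not specific to any hosting platform; the safe assumption is that the cell runs before torch is imported in that kernel.

```python
import os

# Make CUDA kernel launches synchronous so the Python stack trace points
# at the call that actually failed. This must run before CUDA is
# initialized, so put it in the first cell, ahead of "import torch".
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# Sanity check that the variable is visible to this kernel process.
print(os.environ.get("CUDA_LAUNCH_BLOCKING"))
```

If the kernel has already used the GPU, restart it first; changing the variable afterwards may have no effect.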

Clara_75 · April 25, 2019, 8:49pm

Hi,
Thank you for your answer, and sorry for the delay. I am working from a Jupyter notebook on FloydHub, so I set CUDA_LAUNCH_BLOCKING=1 in the terminal before running the notebook (I guess that is how to do it?). I now get a different error:

RuntimeError Traceback (most recent call last)
in
11 l = torch.nn.functional.cross_entropy(prediction, y_train_batch)
12 my_optimizer.zero_grad()
---> 13 l.backward()
14 my_optimizer.step()
15 train_loss += l

/usr/local/lib/python3.6/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
91 products. Defaults to False.
92 """
---> 93 torch.autograd.backward(self, gradient, retain_graph, create_graph)
94
95 def register_hook(self, hook):

/usr/local/lib/python3.6/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
88 Variable._execution_engine.run_backward(
89 tensors, grad_tensors, retain_graph, create_graph,
---> 90 allow_unreachable=True) # allow_unreachable flag
91
92

RuntimeError: CuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Also, to explain what I am doing: I am trying to implement in PyTorch a U-Net segmentation protocol that already existed in TensorFlow. In the TensorFlow code, the segmentation results were provided as 4 binary mask images. Since the cross_entropy function in PyTorch does not allow a multi-channel target, I multiplied each mask by a factor (1, 100, 200, 255) and then summed them to obtain a single grayscale image. Is this a correct way to proceed?

x_train_batch shape: torch.Size([10, 1, 256, 256]), dtype: torch.float32
y_train_batch shape: torch.Size([10, 256, 256]), dtype: torch.int64
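
One thing worth checking with this encoding: with an int64 target of shape [N, H, W], torch.nn.functional.cross_entropy treats each pixel value as a class index, so it expects values in 0..C-1 (0 to 3 for four classes) and a prediction of shape [N, C, H, W]. Gray levels such as 100, 200 or 255 fall outside that range, which is a common cause of device-side asserts. The sketch below shows two possible ways to build an index target. The tensors are random stand-ins, not the real data, and it assumes the four masks do not overlap.

```python
import torch
import torch.nn.functional as F

num_classes = 4  # background + 3 objects

# Hypothetical stand-in for the four binary masks, shape [N, C, H, W],
# with exactly one mask set to 1 at every pixel.
labels = torch.randint(0, num_classes, (2, 8, 8))
masks = F.one_hot(labels, num_classes).permute(0, 3, 1, 2)

# Option A: build the target directly from the binary masks.
target_a = masks.argmax(dim=1)  # shape [N, H, W], values 0..3

# Option B: remap an existing gray-level image (1, 100, 200, 255) to 0..3.
gray_levels = [1, 100, 200, 255]
factors = torch.tensor(gray_levels).view(1, num_classes, 1, 1)
gray = (masks * factors).sum(dim=1)  # the encoding described above
target_b = torch.zeros_like(gray)
for class_idx, level in enumerate(gray_levels):
    target_b[gray == level] = class_idx

# Stand-in for the U-Net output: one channel of raw scores per class.
logits = torch.randn(2, num_classes, 8, 8)
loss = F.cross_entropy(logits, target_a)
print(target_a.min().item(), target_a.max().item(), loss.item())
```

Running the same loss call on the CPU with the original labels is another way to check, since the CPU path usually reports an out-of-range target more explicitly than the CUDA assert does.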
