Max_unpooling3d_forward_kernel failed with error code 0

After receiving a memory error, I ran my training loop with CUDA_LAUNCH_BLOCKING=1 and now receive this rather esoteric error: max_unpooling3d_forward_kernel failed with error code 0. What does error code 0 stand for? Also, this only happens after 3 successful forward and backward passes through the network.

Since I changed a number of things in my training loop since the last working version, without touching the network itself, a hint on where to look for the bug would really help.

This is surprising indeed.
Are you running out of memory? Does reducing the batch size help?
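One quick way to check how close you are to the limit is to print the allocator statistics right before the failing iteration. A minimal sketch (the helper name `report_gpu_memory` is just for illustration; it guards against CPU-only machines):

```python
import torch

def report_gpu_memory() -> str:
    # torch.cuda.memory_allocated / memory_reserved report the caching
    # allocator's usage in bytes; guard so this also runs without a GPU.
    if not torch.cuda.is_available():
        return "no CUDA device available"
    alloc = torch.cuda.memory_allocated() / 1024**2
    reserved = torch.cuda.memory_reserved() / 1024**2
    return f"allocated: {alloc:.1f} MiB, reserved: {reserved:.1f} MiB"

print(report_gpu_memory())
```

If the allocated memory grows every iteration, something in the loop is holding onto tensors (e.g. accumulating a loss without `.item()`).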

When I had low memory issues in the past I would get a much clearer “out of memory” error.

@ptrblck any idea of what could be causing this?

What do you mean by “memory error”?
Was it an out of memory error or some memory access violation?
Could you post the complete stack trace and, if possible, a code snippet to reproduce this issue, so that we can debug it?

Hi,

How does the backpropagation of the max unpooling layer work?

The nn.MaxUnpool2d layer will route the gradients back to the input positions selected by the pooling indices, as seen here:

import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, return_indices=True)

x = torch.randn(1, 1, 4, 4, requires_grad=True)
print(x)
# tensor([[[[-0.0680, -0.7748, -0.1858,  0.4980],
#           [-1.9051, -1.8637, -0.3653,  1.4009],
#           [-0.5979,  0.5564,  1.4532,  0.9443],
#           [ 0.2864, -0.3723, -0.9621,  1.0871]]]], requires_grad=True)

act, idx = pool(x)
print(act)
# tensor([[[[-0.0680,  1.4009],
#           [ 0.5564,  1.4532]]]], grad_fn=<MaxPool2DWithIndicesBackward0>)
print(idx)
# tensor([[[[ 0,  7],
#           [ 9, 10]]]])

act.retain_grad()
unpool = nn.MaxUnpool2d(2)

out = unpool(act, idx)
print(out)
# tensor([[[[-0.0680,  0.0000,  0.0000,  0.0000],
#           [ 0.0000,  0.0000,  0.0000,  1.4009],
#           [ 0.0000,  0.5564,  1.4532,  0.0000],
#           [ 0.0000,  0.0000,  0.0000,  0.0000]]]],
#        grad_fn=<MaxUnpool2DBackward0>)

out.mean().backward()
print(act.grad)
# tensor([[[[0.0625, 0.0625],
#           [0.0625, 0.0625]]]])
print(x.grad)
# tensor([[[[0.0625, 0.0000, 0.0000, 0.0000],
#           [0.0000, 0.0000, 0.0000, 0.0625],
#           [0.0000, 0.0625, 0.0625, 0.0000],
#           [0.0000, 0.0000, 0.0000, 0.0000]]]])

Thank you very much.