Module in a for loop gives a RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

I combined a standard UNET module with my own custom function in my forward model. For some reason I can’t pass the data as a batch tensor, so I have to feed the samples one by one in a for loop.

____________________________________________
optvars = [{'params': custom_var, 'lr': lr_custom}]
optimizer1 = optim.Adam(optvars)
optimizer2 = optim.Adam(UNET.parameters(), lr=lr_UNET)

loss_total = 0  # running sum of the per-sample losses
for a in range(batch_size):
    input_data = batch[a, :, :]
    temp_data = custom_function(input_data, custom_var)
    output_data = UNET(temp_data)
    loss_total = loss_total + loss(output_data)

loss_total.backward()  # <----- where the error occurs
custom_var.retain_grad()  # <----- I think I need something like this for UNET parameters...
optimizer1.step()
optimizer2.step()
____________________________________

I get this error:
RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

It works when batch_size=1, but gives an error when batch_size>=2.
Also, when I don’t use the UNET module in the for loop (only custom_function), it works.
I think it has something to do with calling the module more than once before backpropagation, but I’m not sure what the problem is.
Can anyone help, please?

Thanks!

Which PyTorch version are you using with which CUDA/cuDNN releases?
This error can be raised if cuDNN cannot allocate a workspace because memory usage is already too high.
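One way to check whether memory is the culprit is to print the GPU memory counters inside the loop and right before `backward()`. The helper below is only a sketch: `report_cuda_memory` is a hypothetical name, and it relies on the standard `torch.cuda.memory_allocated`, `torch.cuda.memory_reserved` and `torch.cuda.get_device_properties` calls.

```python
import torch

def report_cuda_memory(tag=""):
    """Print and return (allocated, reserved, total) in MiB for the current GPU.

    Returns None when no CUDA device is available.
    """
    if not torch.cuda.is_available():
        return None
    mib = 2 ** 20
    dev = torch.cuda.current_device()
    allocated = torch.cuda.memory_allocated(dev) / mib  # memory held by live tensors
    reserved = torch.cuda.memory_reserved(dev) / mib    # memory held by the caching allocator
    total = torch.cuda.get_device_properties(dev).total_memory / mib
    print(f"{tag} allocated={allocated:.0f} MiB "
          f"reserved={reserved:.0f} MiB total={total:.0f} MiB")
    return allocated, reserved, total

# Example: call it once per iteration and once before loss_total.backward().
stats = report_cuda_memory("before backward:")
```

If the allocated value climbs with every iteration and approaches the total, the graphs kept alive for the summed loss are probably what leaves no room for the cuDNN workspace.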

I’m using PyTorch 1.9.1 + cu111.
It does look like a memory allocation shortage, although the error message doesn’t say so.
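If memory is the limit, one possible workaround is to call `backward()` on each sample's loss inside the loop instead of on the summed loss. Gradients accumulate in `.grad`, so the result matches the gradient of the sum, but each sample's graph is freed right away. The sketch below is only an illustration: `net` (a single `Conv2d`), `custom_function`, the squared-output loss and the tensor sizes are hypothetical stand-ins for the real UNET, custom function and loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical stand-ins for UNET, custom_var and custom_function.
net = nn.Conv2d(1, 1, kernel_size=3, padding=1).to(device)
custom_var = torch.ones(1, device=device, requires_grad=True)

def custom_function(x, var):
    return x * var

optimizer1 = torch.optim.Adam([{"params": [custom_var], "lr": 1e-3}])
optimizer2 = torch.optim.Adam(net.parameters(), lr=1e-3)

batch = torch.randn(4, 16, 16, device=device)  # (batch_size, H, W)

optimizer1.zero_grad()
optimizer2.zero_grad()
loss_total = 0.0  # plain float, kept only for logging
for a in range(batch.shape[0]):
    # Add batch and channel dimensions: (H, W) -> (1, 1, H, W).
    input_data = batch[a, :, :][None, None]
    temp_data = custom_function(input_data, custom_var)
    output_data = net(temp_data)
    loss = output_data.pow(2).mean()  # placeholder loss
    loss.backward()  # frees this sample's graph; gradients accumulate in .grad
    loss_total += loss.item()

# One optimizer step using the gradients accumulated over all samples.
optimizer1.step()
optimizer2.step()
```

With this pattern peak memory should stay close to what a single sample needs, which matches the observation that batch_size=1 works.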