Memory and speed issue (nn.conv3d and functional conv3d)

Hi, I am running into some weird behavior with the convolution layer (nn.Conv3d and F.conv3d), and I haven't been able to find sufficient information about it yet.

I found two strange things, and I have written small code snippets below to reproduce them.

The first one is the following (F is torch.nn.functional):

import torch
import torch.nn.functional as F

with torch.no_grad():
    test = torch.ones(1, 1, 200, 1200, 1900).cuda().float()
    weight = torch.ones(1, 1, 40, 40, 100).cuda().float()
    answer = F.conv3d(test, weight)
    print(answer.size())

# the exact same block a second time
with torch.no_grad():
    test = torch.ones(1, 1, 200, 1200, 1900).cuda().float()
    weight = torch.ones(1, 1, 40, 40, 100).cuda().float()
    answer = F.conv3d(test, weight)
    print(answer.size())

If I execute this code (the two identical blocks), the first block prints its result, torch.Size([1, 1, 161, 1161, 1801]), in about 5 seconds. However, the second print never appears, even after several minutes pass; the output looks like below. Also, up to the first print, GPU memory usage was about 28% (3 GB / 11 GB), and after the first print it grows to 57% (6.3 GB / 11 GB).

(base) D:\codes>python util.py (deleted private path)
torch.Size([1, 1, 161, 1161, 1801])
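
For reference, the output shapes themselves follow from the usual valid-convolution formula (out = in − kernel + 1 per dimension, with stride 1 and no padding). A quick back-of-the-envelope in plain Python (my own sketch, not part of the original post) gives the shapes and the size of a float32 output tensor for each kernel used in this thread:

    # output length of a 1-D valid convolution (stride 1, no padding)
    def conv_out(i, k, stride=1, pad=0):
        return (i + 2 * pad - k) // stride + 1

    in_shape = (200, 1200, 1900)
    for kernel in [(40, 40, 100), (40, 40, 40), (3, 3, 3)]:
        out = tuple(conv_out(i, k) for i, k in zip(in_shape, kernel))
        elems = out[0] * out[1] * out[2]
        # 4 bytes per float32 element, reported in GiB
        print(kernel, out, f"{elems * 4 / 2**30:.2f} GiB")

Note this counts only the output tensor (about 1.25 GiB for the 40×40×100 kernel); it does not include the input (~1.7 GiB) or whatever workspace the chosen convolution algorithm allocates on top.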

.

The second issue is about kernel size… Here is the snippet.

import torch
import torch.nn.functional as F

with torch.no_grad():
    test = torch.ones(1, 1, 200, 1200, 1900).cuda().float()
    weight = torch.ones(1, 1, 40, 40, 40).cuda().float() / 100
    weight2 = torch.ones(1, 1, 3, 3, 3).cuda().float() / 100

    answer1 = F.conv3d(test, weight)   # large kernel: works
    answer2 = F.conv3d(test, weight2)  # small kernel: CUDA out of memory
    print(answer1.size())
    print(answer2.size())

If I only compute answer1, it consumes 3.2 GB of GPU memory. However, if I also compute answer2, I get a CUDA out of memory error saying it tried to allocate 7.56 GiB. I don't understand where this difference comes from.

Also, I think I need a model that can process inputs of this huge size, which is why I test with 200 × 1200 × 1900.
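
While investigating, it may help to measure the peak CUDA memory around a single conv3d call, so the large-kernel and small-kernel cases can be compared directly. Here is a small sketch (my own; `peak_conv_memory` is a hypothetical helper name, and it degrades to a no-op without a GPU):

    import torch
    import torch.nn.functional as F

    def peak_conv_memory(in_shape, kernel_shape):
        """Return peak allocated CUDA memory (GiB) for one conv3d call."""
        if not torch.cuda.is_available():
            return None  # nothing to measure on CPU
        torch.cuda.empty_cache()
        torch.cuda.reset_peak_memory_stats()
        with torch.no_grad():
            x = torch.ones(1, 1, *in_shape, device="cuda")
            w = torch.ones(1, 1, *kernel_shape, device="cuda")
            F.conv3d(x, w)
        return torch.cuda.max_memory_allocated() / 2**30

    # Small shapes so this runs anywhere; scale up to reproduce the issue.
    peak = peak_conv_memory((20, 120, 190), (3, 3, 3))
    print(peak)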

.

Please give me an idea. Thanks.

Hi,

I think the problem in both cases is that these correspond to very large convolutions.
Also, there are known issues (https://github.com/pytorch/pytorch/issues/32370) with cuDNN conv3d, so you might want to try disabling it with torch.backends.cudnn.enabled = False, or try benchmark mode, which lets cuDNN discover better algorithms, with torch.backends.cudnn.benchmark = True.
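
For anyone landing here later, the two toggles can be applied like this (a minimal sketch; the tiny shapes are just so it runs anywhere, and the CUDA guard is my addition):

    import torch
    import torch.nn.functional as F

    # Option 1: skip cuDNN entirely and use PyTorch's native conv path.
    torch.backends.cudnn.enabled = False
    # Option 2 (instead): keep cuDNN but let it time several algorithms
    # on the first call and cache the fastest one:
    # torch.backends.cudnn.benchmark = True

    x = torch.ones(1, 1, 8, 8, 8)
    w = torch.ones(1, 1, 3, 3, 3)
    if torch.cuda.is_available():  # the original snippets assumed a GPU
        x, w = x.cuda(), w.cuda()
    with torch.no_grad():
        y = F.conv3d(x, w)
    print(y.shape)  # torch.Size([1, 1, 6, 6, 6])

Note that benchmark mode only pays off when the input shapes stay constant across calls; otherwise the algorithm search reruns for every new shape.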


Thanks!!!

There are still a few things I don't fully understand… Maybe I can test some more situations and organize the results.

Thank you!