Conv3d, High GPU Memory Usage

Hi,

I am trying to replicate the results of the C3D model, but I found that it occupies much more GPU memory than I estimated.

Here is the model:

### self.features_frame
### (norm_layer is a normalization-layer constructor defined elsewhere, not shown here)
self.features_frame = [
            ### part 1
            nn.Conv3d(3, 64, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(64),  
            nn.MaxPool3d(kernel_size=(1, 2, 2), stride=(1, 2, 2)),
            nn.ReLU(True),

            ### part 2
            nn.Conv3d(64, 128, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(128),  
            nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2)),
            nn.ReLU(True),

            ### part 3
            nn.Conv3d(128, 256, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(256), nn.ReLU(True),
            nn.Conv3d(256, 256, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(256), 
            nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2)),
            nn.ReLU(True),

            ### part 4
            nn.Conv3d(256, 256, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(256), nn.ReLU(True),
            nn.Conv3d(256, 256, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(256), 
            nn.MaxPool3d(kernel_size=(2, 2, 2), stride=(2, 2, 2)),
            nn.ReLU(True),

            ### part 5
            nn.Conv3d(256, 256, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(256), nn.ReLU(True),
            nn.Conv3d(256, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(512), 
            nn.MaxPool3d(kernel_size=(2, 7, 7), stride=(2, 2, 2)),
            nn.ReLU(True),
        ]   

self.features_frame = nn.Sequential(*self.features_frame)

### self.classifier
self.classifier = [ 
                   nn.Linear(512, 128),
                   norm_layer(128),  nn.ReLU(True),
                   nn.Linear(128, 10)
                  ]   
self.classifier = nn.Sequential(*self.classifier)
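For reference, here is a minimal self-contained sketch of the model above that checks the input of size (1, 3, 16, 112, 112) flows through the feature extractor down to a 512-dimensional vector. It assumes `norm_layer` is `nn.BatchNorm3d` for the conv stack (the post never shows its definition) and `nn.BatchNorm1d` in the classifier, and the `block` helper is just a hypothetical convenience for this sketch:

```python
import torch
import torch.nn as nn

norm_layer = nn.BatchNorm3d  # assumption: the post does not show norm_layer's definition

def block(cin, cout, pool=None):
    """One Conv3d + norm (+ optional MaxPool3d) + ReLU group, mirroring the post."""
    layers = [nn.Conv3d(cin, cout, kernel_size=3, padding=1), norm_layer(cout)]
    if pool is not None:
        layers.append(nn.MaxPool3d(kernel_size=pool[0], stride=pool[1]))
    layers.append(nn.ReLU(True))
    return layers

features = nn.Sequential(
    *block(3, 64, ((1, 2, 2), (1, 2, 2))),                        # part 1
    *block(64, 128, ((2, 2, 2), (2, 2, 2))),                      # part 2
    *block(128, 256), *block(256, 256, ((2, 2, 2), (2, 2, 2))),   # part 3
    *block(256, 256), *block(256, 256, ((2, 2, 2), (2, 2, 2))),   # part 4
    *block(256, 256), *block(256, 512, ((2, 7, 7), (2, 2, 2))),   # part 5
)
classifier = nn.Sequential(
    nn.Linear(512, 128), nn.BatchNorm1d(128), nn.ReLU(True), nn.Linear(128, 10)
)

# eval() is needed here: batch norm cannot compute batch statistics
# from a single sample per channel in training mode.
features.eval()
classifier.eval()

x = torch.randn(1, 3, 16, 112, 112)
feat = features(x)                         # shape (1, 512, 1, 1, 1)
out = classifier(feat.view(feat.size(0), -1))  # shape (1, 10)
print(feat.shape, out.shape)
```

The spatial size shrinks 16x112x112 → 16x56x56 → 8x28x28 → 4x14x14 → 2x7x7, and the final (2, 7, 7) pool reduces that to 1x1x1, so the flattened feature vector matches the classifier's 512 inputs.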

I am using PyTorch 0.2 with a batch size of 1. The input size is (1, 3, 16, 112, 112), and the model occupies 1035MB of GPU memory.

However, if I just modify the number of channels in the Conv3d layers in part 5 from 256 to 512:

            ### part 5
            nn.Conv3d(256, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(512), nn.ReLU(True),
            nn.Conv3d(512, 512, kernel_size=(3, 3, 3), padding=(1, 1, 1)),
            norm_layer(512), 
            nn.MaxPool3d(kernel_size=(2, 7, 7), stride=(2, 2, 2)),
            nn.ReLU(True),

it occupies 10013MB of GPU memory, which is almost ten times larger than 1035MB.

I have read the previous questions about GPU memory, but I still have no idea why it happens in this case. From my calculation, the modified network should occupy at most 4 or 5 times more memory than the original, not 10 times.
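As a sanity check on that estimate, the conv activation sizes can be tallied by hand in pure Python (float32, batch size 1; this deliberately ignores parameters, gradients, and cuDNN workspace):

```python
# Rough tally of conv-output activation memory for both variants.
# Ignores weights, gradients, and any cuDNN workspace.

def floats(c, d, h, w):
    return c * d * h * w

# (channels, depth, height, width) of each conv output, following the pools:
# 16x112x112 -> 16x56x56 -> 8x28x28 -> 4x14x14 -> 2x7x7
shared = (
    floats(64, 16, 112, 112)       # part 1
    + floats(128, 16, 56, 56)      # part 2
    + 2 * floats(256, 8, 28, 28)   # part 3
    + 2 * floats(256, 4, 14, 14)   # part 4
)
part5_original = floats(256, 2, 7, 7) + floats(512, 2, 7, 7)
part5_modified = floats(512, 2, 7, 7) + floats(512, 2, 7, 7)

mb = lambda n: n * 4 / 1024 ** 2  # float32 elements -> MiB
print("original :", round(mb(shared + part5_original), 1), "MiB")  # ~87.6 MiB
print("modified :", round(mb(shared + part5_modified), 1), "MiB")  # ~87.7 MiB
```

Because part 5 operates on tiny 2x7x7 feature maps, the activation totals are almost identical between the two variants, so the 10x jump cannot come from activations alone; something algorithm-dependent (such as cuDNN's convolution workspace) has to account for it.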

I would appreciate it if someone could help. Thanks.

Maybe try disabling/enabling cudnn.benchmark mode? Depending on the algorithm cuDNN selects for the specific input size, it might require a lot of workspace memory. Apart from that, I do not have other ideas at the moment.
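In code, that toggle is a single flag:

```python
import torch

# When True, cuDNN benchmarks the available convolution algorithms for the
# input shapes it sees and caches the fastest one. Different algorithms trade
# speed for (sometimes large) workspace memory, which is why flipping this
# flag can change peak GPU usage dramatically for the same model.
torch.backends.cudnn.benchmark = True  # try both True and False and compare
```

Note that benchmarking re-runs whenever the input shape changes, so it pays off mainly for fixed-size inputs like the (1, 3, 16, 112, 112) clips here.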

@fmassa, thanks!
After enabling cudnn.benchmark, memory usage is back to normal.