What happens if I replace the last FC layer with a Conv layer?

I am working on a regression problem that takes the global correlation of brain volumes into account, so I have to feed the whole image (size 172x220x156) into the network.
When I use the FC layer:
self.fc = nn.Linear(30720, 256)  # out_features = num_colEls = 256
I get a CUDA out-of-memory error.
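
For scale, here is a rough back-of-the-envelope sketch of how much memory the weights of that FC layer would take on their own, assuming float32 parameters. It counts only the weights and bias; gradients, optimizer state, and the activations of the earlier layers are not included, so the true footprint is larger.

```python
# Parameter count of nn.Linear(30720, 256): weight matrix plus bias vector.
in_features = 30720
out_features = 256
n_params = in_features * out_features + out_features

# Assumption: float32 parameters, i.e. 4 bytes each.
bytes_fp32 = n_params * 4
mib = bytes_fp32 / 2**20

print(n_params)        # 7864576
print(round(mib, 1))   # 30.0
```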

I can replace the FC layer with a Conv layer whose kernel size exactly matches the spatial size of its input:
self.lastConv = nn.Conv3d(512, 256, kernel_size=(4, 4, 3), stride=1)
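
To make the comparison concrete, here is a minimal sketch checking whether such a Conv layer computes the same thing as an FC layer on the flattened input. The feature-map shape (2, 512, 4, 4, 3) is an assumption inferred from the kernel size and the 1x1x1 output; it gives 24576 flattened features rather than the 30720 used in the FC layer above, so the real shapes may differ. The variable names are only illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Assumed feature map: (batch, channels, D, H, W)
x = torch.randn(2, 512, 4, 4, 3)

conv = nn.Conv3d(512, 256, kernel_size=(4, 4, 3), stride=1)
fc = nn.Linear(512 * 4 * 4 * 3, 256)

# Copy the conv weights into the linear layer:
# (256, 512, 4, 4, 3) -> (256, 24576), same channel-major order as x.flatten(1)
with torch.no_grad():
    fc.weight.copy_(conv.weight.reshape(256, -1))
    fc.bias.copy_(conv.bias)

conv_out = conv(x).flatten(1)  # (2, 256, 1, 1, 1) -> (2, 256)
fc_out = fc(x.flatten(1))      # (2, 24576) -> (2, 256)

n_conv = sum(p.numel() for p in conv.parameters())
n_fc = sum(p.numel() for p in fc.parameters())
max_diff = (conv_out - fc_out).abs().max().item()

print(n_conv == n_fc)   # both layers hold the same number of parameters
print(max_diff)         # difference is at floating-point rounding level
```

Under these assumptions the two layers have the same parameter count and produce the same outputs up to rounding, so the swap by itself would not be expected to change memory use or what the layer can learn.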

In terms of learning, generalization, and so on, how effective is it to replace the FC layer with a Conv layer?

With the Conv layer, the final output shape is:
torch.Size([2, 256, 1, 1, 1])
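
If a regression head expects a (batch, features) tensor, the trailing singleton dimensions can be flattened away. A minimal sketch, using a zero tensor as a stand-in for the real Conv output:

```python
import torch

# Stand-in for the Conv layer output: (batch, channels, 1, 1, 1)
out = torch.zeros(2, 256, 1, 1, 1)

# Keep the batch dimension and merge everything after it.
flat = torch.flatten(out, start_dim=1)

print(flat.shape)  # torch.Size([2, 256])
```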