When using conv3d, a large amount of video memory is occupied

You could estimate the memory usage by calculating the size of the parameters, forward activations, gradients, etc., as described here. You could also use the dispatch mode as described here.
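
As a rough illustration of the manual estimate, here is a minimal sketch in plain Python for a single conv3d layer. The helper names (`conv3d_output_shape`, `estimate_conv3d_memory`) and the example shapes are made up for this post and are not part of PyTorch. It assumes float32 tensors (4 bytes per element) and only counts the weights, their gradients, the input, the output activation, and the output gradient. It ignores things like the cuDNN workspace, optimizer state, and the caching allocator, so the real usage will most likely be higher.

```python
from math import prod


def conv3d_output_shape(in_shape, out_channels, kernel,
                        stride=(1, 1, 1), padding=(0, 0, 0), dilation=(1, 1, 1)):
    """Output shape (N, C_out, D, H, W) using the usual convolution size formula."""
    n, _, d, h, w = in_shape
    spatial = []
    for size, k, s, p, dil in zip((d, h, w), kernel, stride, padding, dilation):
        spatial.append((size + 2 * p - dil * (k - 1) - 1) // s + 1)
    return (n, out_channels, *spatial)


def estimate_conv3d_memory(in_shape, out_channels, kernel,
                           stride=(1, 1, 1), padding=(0, 0, 0), dilation=(1, 1, 1),
                           bias=True, bytes_per_elem=4):
    """Rough byte counts for one conv3d layer (hypothetical helper, float32 by default)."""
    in_channels = in_shape[1]
    # Weight has shape (C_out, C_in, kD, kH, kW); bias has C_out elements.
    n_params = out_channels * in_channels * prod(kernel)
    if bias:
        n_params += out_channels
    out_shape = conv3d_output_shape(in_shape, out_channels, kernel,
                                    stride, padding, dilation)
    n_in = prod(in_shape)
    n_out = prod(out_shape)
    return {
        "output_shape": out_shape,
        "params_bytes": n_params * bytes_per_elem,
        "param_grads_bytes": n_params * bytes_per_elem,  # one gradient per parameter
        "input_bytes": n_in * bytes_per_elem,            # kept for the backward pass
        "output_bytes": n_out * bytes_per_elem,          # forward activation
        "output_grad_bytes": n_out * bytes_per_elem,     # gradient w.r.t. the output
    }


# Example with assumed shapes: one clip of 16 frames at 112x112, 3 -> 64 channels.
est = estimate_conv3d_memory((1, 3, 16, 112, 112), out_channels=64,
                             kernel=(3, 3, 3), padding=(1, 1, 1))
total = sum(v for k, v in est.items() if k != "output_shape")
print(est["output_shape"], f"{total / 2**20:.1f} MiB")
```

For conv3d the activation terms usually dominate, since the output has one value per channel for every voxel of every sample, while the weights only scale with the kernel size and channel counts. Summing these numbers over all layers gives a lower bound you can compare against what the GPU actually reports.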