I have implemented a PixelCNN model for 3-dimensional (volumetric) images. For an input size of 14x14x14, the architecture is as follows:
----------------------------------------------------------------
Layer (type)           Output Shape         Param #
================================================================
Conv3d-1               [-1, 2, 14, 14, 14]       36
Conv3d-2               [-1, 2, 14, 14, 14]        4
Conv3d-3               [-1, 2, 14, 14, 14]       12
Conv3d-4               [-1, 2, 14, 14, 14]        4
Conv3d-5               [-1, 2, 14, 14, 14]        4
MaskedConv3d_h-6       [-1, 2, 14, 14, 14]        4
Activation-7           [-1, 2, 14, 14, 14]        0
Activation-8           [-1, 2, 14, 14, 14]        0
Activation-9           [-1, 2, 14, 14, 14]        0
Conv3d-10              [-1, 2, 14, 14, 14]        4
Conv3d-11              [-1, 2, 15, 14, 14]       72
Conv3d-12              [-1, 2, 14, 15, 14]       24
Conv3d-13              [-1, 2, 14, 14, 14]        4
Conv3d-14              [-1, 2, 14, 14, 14]        4
Conv3d-15              [-1, 2, 14, 14, 14]        4
Conv3d-16              [-1, 2, 14, 14, 15]        8
Activation-17          [-1, 2, 14, 14, 14]        0
Activation-18          [-1, 2, 14, 14, 14]        0
Activation-19          [-1, 2, 14, 14, 14]        0
Conv3d-20              [-1, 2, 14, 14, 14]        4
StackedConvolution-21  [[-1, 2, 14, 14, 14], [-1, 2, 14, 14, 14], [-1, 2, 14, 14, 14]]        0
Conv3d-22              [-1, 2, 15, 14, 14]       72
Conv3d-23              [-1, 2, 14, 15, 14]       24
Conv3d-24              [-1, 2, 14, 14, 14]        4
Conv3d-25              [-1, 2, 14, 14, 14]        4
Conv3d-26              [-1, 2, 14, 14, 14]        4
Conv3d-27              [-1, 2, 14, 14, 15]        8
Activation-28          [-1, 2, 14, 14, 14]        0
Activation-29          [-1, 2, 14, 14, 14]        0
Activation-30          [-1, 2, 14, 14, 14]        0
Conv3d-31              [-1, 2, 14, 14, 14]        4
StackedConvolution-32  [[-1, 2, 14, 14, 14], [-1, 2, 14, 14, 14], [-1, 2, 14, 14, 14]]        0
Conv3d-33              [-1, 2, 15, 14, 14]       72
Conv3d-34              [-1, 2, 14, 15, 14]       24
Conv3d-35              [-1, 2, 14, 14, 14]        4
Conv3d-36              [-1, 2, 14, 14, 14]        4
Conv3d-37              [-1, 2, 14, 14, 14]        4
Conv3d-38              [-1, 2, 14, 14, 15]        8
Activation-39          [-1, 2, 14, 14, 14]        0
Activation-40          [-1, 2, 14, 14, 14]        0
Activation-41          [-1, 2, 14, 14, 14]        0
Conv3d-42              [-1, 2, 14, 14, 14]        4
StackedConvolution-43  [[-1, 2, 14, 14, 14], [-1, 2, 14, 14, 14], [-1, 2, 14, 14, 14]]        0
Conv3d-44              [-1, 2, 14, 14, 14]        6
BatchNorm3d-45         [-1, 2, 14, 14, 14]        4
Activation-46          [-1, 2, 14, 14, 14]        0
Dropout-47             [-1, 2, 14, 14, 14]        0
Conv3d-48              [-1, 3, 14, 14, 14]        6
================================================================
Total params: 444
Trainable params: 444
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.02
Forward/backward pass size (MB): 14832.59
Params size (MB): 0.00
Estimated Total Size (MB): 14832.61
----------------------------------------------------------------
What really confuses me is the forward/backward pass size. The model trains well on an input this small, but does 14832 MB make sense at all? I have only 444 trainable parameters…
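As a rough sanity check on that figure, the per-sample activation count can be added up by hand from the output shapes in the summary. The sketch below assumes float32 activations (4 bytes each), a batch size of 1, and a factor of 2 to cover forward and backward values; none of that is stated in the summary itself. The second half tries one unconfirmed hypothesis for where 14832.59 might come from: that the summary tool multiplies together every entry of the nested shape list reported for the StackedConvolution rows (including the -1 batch placeholder) and keeps the running total in a 32-bit integer that wraps around. The helper name wrap_int32 is hypothetical.

```python
# Element counts per sample, read off the "Output Shape" column of the summary.
VOXELS = 14 * 14 * 14                  # 2744 voxels per channel

plain_rows = 35 * 2 * VOXELS           # 35 rows shaped [-1, 2, 14, 14, 14]
padded_rows = 9 * 2 * 15 * 14 * 14     # 9 rows with one spatial dim equal to 15
final_row = 3 * VOXELS                 # Conv3d-48, shaped [-1, 3, 14, 14, 14]
single_output_elems = plain_rows + padded_rows + final_row

# Each of the 3 StackedConvolution rows returns 3 tensors of 2 * VOXELS elements.
stacked_elems = 3 * 3 * 2 * VOXELS

BYTES_PER_FLOAT32 = 4                  # assumption: float32 activations
MB = 1024 ** 2

# Hand estimate: every output counted once, doubled for forward + backward.
naive_mb = 2 * (single_output_elems + stacked_elems) * BYTES_PER_FLOAT32 / MB
print(f"hand estimate: {naive_mb:.2f} MB per sample")


def wrap_int32(n):
    """Hypothetical helper: wrap a Python int the way a signed 32-bit int would."""
    return (n + 2 ** 31) % 2 ** 32 - 2 ** 31


# Hypothesis (an assumption, not confirmed): for a nested shape list, the tool
# takes the product over all 15 entries, i.e. (-1 * 2 * 14 * 14 * 14) ** 3,
# and accumulates in int32. Rows with a single shape contribute -elements
# because of the -1 batch placeholder.
nested_product = wrap_int32((-1 * 2 * VOXELS) ** 3)
running_total = wrap_int32(3 * nested_product - single_output_elems)
hypothesis_mb = abs(2 * running_total * BYTES_PER_FLOAT32 / MB)
print(f"wraparound hypothesis: {hypothesis_mb:.2f} MB")
```

Under these assumptions the hand estimate is about 2.31 MB per sample, while the wraparound hypothesis gives 14832.59 MB, the same value the summary reports. That would point to a reporting artifact tied to the rows with multiple outputs rather than to real memory use, though I have not checked this against the tool's source.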