Why use DxHxW for 3d input data instead of HxWxD?

NightRain · November 25, 2020, 1:57pm

When passing a 3D array to Conv3d, the expected input shape is NxCxDxHxW. Why not use NxCxHxWxD?

When the size is the same along all 3 axes, the choice only amounts to transposing the input anyway.
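To make the transposition point concrete, here is a minimal sketch of feeding a volume stored as HxWxD into Conv3d. The volume sizes, channel counts, and the variable names (`volume_hwd`, `conv_hwd`) are assumptions chosen for illustration. It also checks that convolving in HWD order with the kernel axes permuted the same way gives the same result, up to a permutation of the output axes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical volume stored in (H, W, D) order; the sizes are arbitrary.
H, W, D = 6, 5, 4
volume_hwd = torch.randn(H, W, D)

# Conv3d expects (N, C, D, H, W): move depth to the front,
# then add batch and channel dimensions.
x = volume_hwd.permute(2, 0, 1)[None, None]  # shape (1, 1, D, H, W)

conv = nn.Conv3d(in_channels=1, out_channels=2, kernel_size=3, padding=1)
out_dhw = conv(x)  # shape (1, 2, D, H, W)

# Second layer applied directly to the (H, W, D) volume. Its kernel is the
# first layer's kernel with the spatial axes permuted from (D, H, W) to (H, W, D).
conv_hwd = nn.Conv3d(in_channels=1, out_channels=2, kernel_size=3, padding=1)
with torch.no_grad():
    conv_hwd.weight.copy_(conv.weight.permute(0, 1, 3, 4, 2))
    conv_hwd.bias.copy_(conv.bias)
out_hwd = conv_hwd(volume_hwd[None, None])  # shape (1, 2, H, W, D)

# Bring the second output back to (N, C, D, H, W) and compare.
same = torch.allclose(out_dhw, out_hwd.permute(0, 1, 4, 2, 3), atol=1e-5)
print(out_dhw.shape, out_hwd.shape, same)
```

So Conv3d itself does not care which physical axis is called "depth"; the name only fixes the order in which the kernel_size, stride, and padding tuples are read.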

ptrblck · November 27, 2020, 7:07am

What would be the advantage of NCHWD?

NightRain · November 27, 2020, 9:04am

It’s probably more intuitive since volumes are usually in the format HWD. In TF they simply specify (dim_1, dim_2, dim_3), so I can’t see the reason for using DHW…

ptrblck · November 28, 2020, 5:29am

Could you explain more about the “usual” format? I.e. does a specific library return volumes as HWD or is there any other convention? I’m currently seeing it more as CHW (PIL) vs. HWC (OpenCV), which might be annoying sometimes but both memory layouts have certain advantages. I’m not sure what the advantage of HWD vs. DHW would be. E.g. is there any expected performance gain using your layout?
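As a rough illustration of what "memory layout" means here (a sketch with made-up sizes, not a benchmark): `permute` only changes the strides of a view, while `torch.channels_last_3d` is the memory format PyTorch provides for storing an NCDHW-shaped tensor with the channel dimension innermost.

```python
import torch

# (N, C, D, H, W) tensor, contiguous by default; the sizes are arbitrary.
x = torch.randn(2, 2, 4, 6, 5)
print(x.stride())

# Reordering to (N, C, H, W, D) with permute returns a view: no data is copied,
# only the strides change, so the result is no longer contiguous.
x_hwd = x.permute(0, 1, 3, 4, 2)
print(x_hwd.shape, x_hwd.stride(), x_hwd.is_contiguous())

# .contiguous() materializes the new order in memory (this does copy).
x_hwd_copy = x_hwd.contiguous()

# channels_last_3d keeps the logical shape (N, C, D, H, W) but stores the
# data with the channel dimension varying fastest in memory.
x_cl = x.contiguous(memory_format=torch.channels_last_3d)
print(x_cl.shape, x_cl.stride())
```

Whether one layout is faster than another depends on the backend and hardware, so it would have to be measured for the specific use case.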