RuntimeError: Given groups=1, weight of size [16, 3, 3, 3, 3], expected input[2, 128, 128, 128, 3] to have 3 channels, but got 128 channels instead

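Your input is laid out channels-last, as [batch_size, depth, height, width, channels], while PyTorch's nn.Conv3d expects channels-first input. That is why the layer reads dim 1 (size 128) as the channel count instead of the 3 it was built for. Permute the tensor to [batch_size, channels, depth, height, width] via: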

x = x.permute(0, 4, 1, 2, 3).contiguous()

and it should work.
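
If you want to check the fix in isolation, here is a minimal sketch. It assumes the layer is an nn.Conv3d with 3 input channels, 16 output channels and a 3x3x3 kernel (inferred from the weight size in the error message), and it uses a small 8x8x8 volume instead of 128x128x128 just to keep it fast. The variable names are made up for the example.

```python
import torch
import torch.nn as nn

# Assumed layer, matching the weight size [16, 3, 3, 3, 3] from the error message.
conv = nn.Conv3d(in_channels=3, out_channels=16, kernel_size=3)

# Input in the problematic layout: [batch, depth, height, width, channels].
# A small spatial size is used here purely for speed.
x = torch.randn(2, 8, 8, 8, 3)

# Move the channel dim to position 1: [batch, channels, depth, height, width].
x = x.permute(0, 4, 1, 2, 3).contiguous()

out = conv(x)
print(x.shape)    # torch.Size([2, 3, 8, 8, 8])
print(out.shape)  # torch.Size([2, 16, 6, 6, 6])
```

Printing x.shape right before the first conv layer in your own model is a quick way to confirm that dim 1 is now 3.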