RuntimeError: Given transposed=1, weight of size [192, 64, 4, 4], expected input[8, 252, 258, 258] to have 192 channels, but got 252 channels instead

Double-check your padding approach: it changes the channel dimension, while I would guess you want to change the spatial size instead.
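For reference, here is a minimal sketch of how the length of the pad tuple decides which dimensions F.pad touches. The tensor sizes and variable names are made up for illustration and are not taken from your model. F.pad reads the tuple starting from the last dimension, so four values affect only width and height, while six values also reach the channel dimension.

```python
import torch
import torch.nn.functional as F

# Hypothetical small tensor in [N, C, H, W] layout (not the shapes from the question)
x = torch.zeros(2, 21, 8, 8)

# Four values: (left, right, top, bottom) -> only W and H change
spatial = F.pad(x, (1, 1, 1, 1))
print(spatial.shape)  # torch.Size([2, 21, 10, 10])

# Six values: the last pair (front, back) applies to the channel dimension
channels = F.pad(x, (0, 0, 0, 0, 0, 21))
print(channels.shape)  # torch.Size([2, 42, 8, 8])

# Negative values crop instead of pad (default constant mode)
cropped = F.pad(x, (0, 0, 0, 0, 0, -5))
print(cropped.shape)  # torch.Size([2, 16, 8, 8])
```

If your pad tuple has six entries, that would explain why the channel counts change while the spatial size stays at 258 x 258.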
Shapes before applying F.pad:

print(x1.shape, x2_1.shape, x2_2.shape, x3_1.shape, x3_2.shape, x4.shape)
# torch.Size([8, 128, 258, 258]) torch.Size([8, 42, 258, 258]) torch.Size([8, 42, 258, 258]) torch.Size([8, 21, 258, 258]) torch.Size([8, 21, 258, 258]) torch.Size([8, 42, 258, 258])

After:

print(x1.shape, x2_1.shape, x2_2.shape, x3_1.shape, x3_2.shape, x4.shape)
# torch.Size([8, 42, 258, 258]) torch.Size([8, 42, 258, 258]) torch.Size([8, 42, 258, 258]) torch.Size([8, 42, 258, 258]) torch.Size([8, 42, 258, 258]) torch.Size([8, 42, 258, 258])

If six 42-channel tensors are what you want, set deconv1_input_channels = 6 * (inception_out_channels // 3) so that it matches the 252 channels the layer actually receives, and it should work.
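As a quick sanity check of that arithmetic, here is a small sketch. The value 128 for inception_out_channels is an assumption inferred from the 128 channels x1 had before padding, and num_branches and branch_channels are hypothetical helper names; the count of six comes from the six tensors printed above.

```python
# Assumption: inception_out_channels is 128 (inferred from x1's shape before padding)
inception_out_channels = 128

# Each branch ends up with 128 // 3 = 42 channels after padding
branch_channels = inception_out_channels // 3

# Six tensors: x1, x2_1, x2_2, x3_1, x3_2, x4
num_branches = 6

deconv1_input_channels = num_branches * branch_channels
print(deconv1_input_channels)  # 252, the channel count reported in the error
```

Under that assumption the result matches the 252 channels in the RuntimeError, whereas the layer was built expecting 192.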