Concatenating an unequal number of channels

I want to implement U-Net from the original article.
There is a copy-and-crop operation, where 512 channels need to be concatenated with 1024 channels.

I have torch.Size([2, 1024, 32, 32]) after upsampling the middle layer, and torch.Size([2, 512, 32, 32]) after the last encoder layer.

After the concat operation I get: torch.Size([2, 1536, 32, 32])
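The shape arithmetic of the channel-wise concat can be checked in isolation; a minimal sketch with hypothetical stand-in tensors:

```python
import torch

# Hypothetical tensors standing in for the two feature maps
up = torch.randn(2, 1024, 32, 32)   # upsampled middle layer
skip = torch.randn(2, 512, 32, 32)  # last encoder layer (after copy-and-crop)

# Channel-wise concatenation happens along dim=1, so channel counts add up
merged = torch.cat([up, skip], dim=1)
print(merged.shape)  # torch.Size([2, 1536, 32, 32])
```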

Hence, when I pass this input to my decoder layer, I get:
RuntimeError: Given groups=1, weight of size [1024, 1024, 3, 3], expected input[2, 1536, 32, 32] to have 1024 channels, but got 1536 channels instead

What should I do?
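For comparison, in the original U-Net the up-convolution itself halves the channel count (1024 → 512) before the concat, so the following 3×3 convolutions receive 512 + 512 = 1024 channels rather than 1536. A minimal sketch of such a decoder stage, assuming same-padding convolutions (the module and layer names are my own, not from the paper):

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """One U-Net decoder stage: up-conv halves channels, then concat with skip."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # ConvTranspose2d halves the channel count: in_ch -> out_ch (e.g. 1024 -> 512)
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        # After concat with the skip connection we are back to 2 * out_ch channels
        self.conv = nn.Sequential(
            nn.Conv2d(out_ch * 2, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x, skip):
        x = self.up(x)                   # [2, 1024, 16, 16] -> [2, 512, 32, 32]
        x = torch.cat([x, skip], dim=1)  # -> [2, 1024, 32, 32]
        return self.conv(x)              # -> [2, 512, 32, 32]

x = torch.randn(2, 1024, 16, 16)
skip = torch.randn(2, 512, 32, 32)
out = UpBlock(1024, 512)(x, skip)
print(out.shape)  # torch.Size([2, 512, 32, 32])
```

With this layout the decoder's 3×3 convolutions can keep weights of size [512, 1024, 3, 3] and never see 1536 input channels.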

I changed the input sizes of the layers. Training now starts, but I get:
RuntimeError: CUDA out of memory. Tried to allocate 400.00 MiB (GPU 0; 11.17 GiB total capacity; 10.24 GiB already allocated; 262.81 MiB free; 10.49 GiB reserved in total by PyTorch)

Is it normal that the 12 GB of Colab GPU memory is not enough? Even 16 GB is not enough!
OK, I reduced the input size and the batch size, and now the model trains.
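Besides shrinking the input and batch size, another common way to cut peak memory is gradient checkpointing, which recomputes activations during the backward pass instead of storing them. A minimal sketch, assuming a hypothetical stand-in block (not the poster's actual model):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# A stand-in for one heavy U-Net block
block = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
)

x = torch.randn(2, 64, 32, 32, requires_grad=True)

# Activations inside `block` are not stored; they are recomputed in backward,
# trading extra forward compute for lower peak memory.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
print(x.grad.shape)  # torch.Size([2, 64, 32, 32])
```

Mixed-precision training (torch.amp) is another option that roughly halves activation memory on a GPU.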