CPU and GPU memory

Thanks for your reply!

I have one more question. During backpropagation, the intermediate outputs have to be copied back to the GPU. For the U-Net architecture, does backpropagation compute gradients for every layer, using the (forward) input, the intermediate outputs, and the final output? If so, moving the outputs to the CPU doesn't sound reasonable to me, because the GPU still has to handle all of those forward outputs during backpropagation…
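To make the question concrete, here is a toy sketch in plain Python of how I picture the offloading. It is only a simulation under my own assumptions: there is no real GPU, each "layer" is just y = w * x, and the `gpu` and `cpu` dicts are hypothetical stand-ins for device and host memory. It counts how many saved forward inputs sit on the "GPU" at once during the backward pass.

```python
# Toy simulation only: no real GPU or framework involved.
# `gpu` and `cpu` are hypothetical dicts standing in for device/host memory.

def forward(x, weights, gpu, cpu):
    """Run the chain y = w * x, offloading each layer's saved input to `cpu`."""
    for i, w in enumerate(weights):
        gpu[i] = x            # layer i's input is needed later for dL/dw_i
        y = w * x
        cpu[i] = gpu.pop(i)   # offload: drop the "GPU" copy right away
        x = y
    return x

def backward(grad_out, weights, gpu, cpu):
    """Walk the layers in reverse, fetching one saved input at a time."""
    grads = [0.0] * len(weights)
    peak = 0                  # most saved inputs held on the "GPU" at once
    for i in reversed(range(len(weights))):
        gpu[i] = cpu.pop(i)               # copy back only what this layer needs
        peak = max(peak, len(gpu))
        grads[i] = grad_out * gpu[i]      # dL/dw_i = dL/dy_i * x_i
        grad_out = grad_out * weights[i]  # dL/dx_i = dL/dy_i * w_i
        del gpu[i]                        # finished with it, free it again
    return grads, grad_out, peak

weights = [2.0, 3.0, 4.0]
gpu, cpu = {}, {}
out = forward(1.0, weights, gpu, cpu)
grads, grad_in, peak = backward(1.0, weights, gpu, cpu)
print(out, grads, grad_in, peak)
```

In this toy version every layer's forward input is still used in the backward pass, but only one of them is resident at a time. Whether a real framework behaves this way for a U-Net is exactly what I am asking.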