Hi, I am applying a U-Net iteratively and trying to get the gradient with respect to the very first input. However, GPU memory usage is not increasing significantly: it does not seem to scale linearly with the number of unrolled steps (every extra unrolled step only adds ~10 MB of GPU memory). Does this make sense, or might there be a bug?
input x → Unet → x_1 → Unet → x_2 → Unet → … → x_n-1 → Unet → x_n → loss
The gradient is computed with respect to the initial input x.
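
For reference, here is a minimal sketch of the setup described above and of one way to check it. It assumes PyTorch; `TinyNet` is a hypothetical stand-in for the actual U-Net, and the input shape, the loss, and the step counts are made up for illustration. It unrolls the network, backpropagates to the first input, and reports peak GPU memory per run (or `None` on CPU).

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the U-Net: output has the same shape as the input."""

    def __init__(self, channels=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.body(x)


def unrolled_input_grad(net, x, n_steps):
    """Apply net n_steps times; return (d loss / d x, peak GPU bytes or None)."""
    # Make the first input a leaf tensor that records gradients.
    x = x.detach().clone().requires_grad_(True)
    on_gpu = x.is_cuda
    if on_gpu:
        torch.cuda.reset_peak_memory_stats(x.device)

    out = x
    for _ in range(n_steps):
        # No .detach() and no torch.no_grad() here, so the graph spans all steps.
        out = net(out)

    loss = out.pow(2).mean()  # placeholder loss, an assumption
    loss.backward()

    peak = torch.cuda.max_memory_allocated(x.device) if on_gpu else None
    return x.grad, peak


device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)
net = TinyNet().to(device)
x0 = torch.randn(1, 3, 32, 32, device=device)

for n in (1, 2, 4, 8):
    grad, peak = unrolled_input_grad(net, x0, n)
    print(f"steps={n} grad_abs_sum={grad.abs().sum().item():.4e} peak_bytes={peak}")
```

If the peak grows with the number of steps and the gradient on the first input is populated, the graph is being kept across the unrolled steps. The per-step increment is then roughly the size of the activations saved for backward in one pass, which depends on the batch size, spatial resolution, and channel widths.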