Loss function, optimizer, and accuracy metric for a neural network that predicts the depth map of an image

I am trying to recreate a U-Net for predicting depth from an image, and I don't know which loss function, optimizer, and accuracy metric are best to choose. Could you advise, or point me to where this is covered in more detail? If I posted this in the wrong category, I apologize.

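U-Net sounds like a good choice. For the loss, you could use something like L1 or L2 between the predicted and ground-truth depth maps, together with the Adam optimizer. These settings seem to work well in general, so they are a reasonable starting point.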
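
In case it helps, here is a minimal sketch of that setup in PyTorch. The tiny two-layer model is only a hypothetical stand-in for your U-Net, the random tensors stand in for real image/depth pairs, and the learning rate of 1e-3 is an assumed starting value rather than a tuned one.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a U-Net: any module that maps
# (N, 3, H, W) RGB images to (N, 1, H, W) depth maps fits here.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=3, padding=1),
)

criterion = nn.L1Loss()  # L1 loss; use nn.MSELoss() instead for L2
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed learning rate

# Random data standing in for one batch of images and ground-truth depth maps.
images = torch.rand(4, 3, 32, 32)
depths = torch.rand(4, 1, 32, 32)

losses = []
for step in range(20):
    optimizer.zero_grad()            # clear gradients from the previous step
    pred = model(images)             # predicted depth, shape (4, 1, 32, 32)
    loss = criterion(pred, depths)   # mean absolute error per pixel
    loss.backward()                  # backpropagate
    optimizer.step()                 # Adam parameter update
    losses.append(loss.item())
```

Switching between L1 and L2 only requires changing the `criterion` line, so it is easy to train with both and compare the results on your own data.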

