Loss function, optimizer, and accuracy metric for a neural network that predicts the depth map of an image

Which loss function, optimizer, and accuracy metric should I use?
I am trying to recreate a U-Net for predicting depth from a picture, and I don’t know which options are best. Please tell me, or point me to where this is covered in more detail! If I posted this in the wrong category, I apologize.

U-Net sounds like a good choice. You could use something like an L1 or L2 loss with the Adam optimizer. These settings tend to work well in general.
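As a rough sketch of that setup, assuming PyTorch: the tiny `model` below is only a hypothetical stand-in for your U-Net, and the tensors are random placeholders rather than real depth data. The learning rate and step count are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a U-Net: any module mapping
# (N, 3, H, W) images to (N, 1, H, W) depth maps fits here.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=3, padding=1),
)

criterion = nn.L1Loss()  # swap in nn.MSELoss() for an L2 loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumed value

# Random placeholder batch; replace with your own images and depth maps.
images = torch.rand(4, 3, 32, 32)
target_depth = torch.rand(4, 1, 32, 32)

losses = []
for step in range(20):
    optimizer.zero_grad()                        # clear gradients from the previous step
    pred_depth = model(images)                   # forward pass
    loss = criterion(pred_depth, target_depth)   # mean absolute error per pixel
    loss.backward()                              # backpropagate
    optimizer.step()                             # Adam parameter update
    losses.append(loss.item())
```

Since depth prediction is a regression problem, the L1 value itself (mean absolute error) can also serve as a simple number to track on a validation set.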

If you want to find studies or comparisons of SOTA models, you can look at:

or