I’m seeing the loss go to inf and the predictions all become nan. I tried shrinking the learning rate way down but that made no difference.
It turns out I needed to normalize my RGB values to [0, 1]. With the raw pixel values as input, the gradients were apparently far too large, which would explain why lowering the learning rate alone didn't help.
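
For anyone hitting the same thing, here is a minimal sketch of that normalization. It assumes the images are 8-bit RGB arrays (values 0 to 255) held in NumPy, which may not match your pipeline; `normalize_rgb` and the random batch are hypothetical names and data used only for illustration.

```python
import numpy as np

def normalize_rgb(images):
    """Scale 8-bit RGB values (assumed range [0, 255]) to floats in [0, 1]."""
    # Convert to float first so the division is not done in integer arithmetic.
    return np.asarray(images, dtype=np.float32) / 255.0

# Hypothetical batch: 4 random 32x32 RGB images stored as uint8.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

scaled = normalize_rgb(raw)
print(scaled.dtype, scaled.min(), scaled.max())
```

Doing this once before the data reaches the model keeps the inputs on a scale of roughly 1 instead of a few hundred, which is usually enough to stop the loss from blowing up in a case like this.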