I just built a facial keypoint detection dataset. The network takes a greyscale image as input and outputs 72 keypoints on the face.
I loaded up a pretrained resnet34 and trained the model on this data.
The loss was all over the place. What am I doing wrong here, and how can I make the model work better?
Try scaling the problem down first: see whether the model can overfit a small data sample (e.g. just 10 samples) while you play around with the hyperparameters.
Once your model can overfit it, you could try to scale it up again.
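Here is a minimal sketch of that sanity check in PyTorch, not your actual setup. Everything in it is an assumption for illustration: the random tensors stand in for your 10 images and labels, the tiny CNN stands in for your resnet34, each keypoint is assumed to be an (x, y) pair (so 144 outputs), and the `overfit` helper, the MSE loss, the Adam optimizer and the learning rate are just one plausible choice.

```python
import torch
from torch import nn

torch.manual_seed(0)

NUM_KEYPOINTS = 72   # from the question
NUM_SAMPLES = 10     # tiny subset we want the model to memorize

# Assumption: random tensors stand in for real greyscale images and labels.
images = torch.rand(NUM_SAMPLES, 1, 32, 32)
# Assumption: each keypoint is an (x, y) pair scaled to [0, 1].
targets = torch.rand(NUM_SAMPLES, NUM_KEYPOINTS * 2)

# Assumption: a tiny CNN as a stand-in for the pretrained resnet34.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1),  # 1x32x32 -> 8x16x16
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, NUM_KEYPOINTS * 2),
)


def overfit(model, images, targets, steps=300, lr=1e-3):
    """Train repeatedly on the same small batch and record the loss per step."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    model.train()
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


losses = overfit(model, images, targets)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

If the loss on these few samples does not drop towards zero, the problem is likely in the pipeline (label scaling, loss, learning rate, how the outputs are matched to the targets) rather than in the amount of data. Once it does drop, swap in your real samples and model and repeat the check before scaling up.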