Losses become NaN during training. How can I debug and fix this?

First, it's a good idea to find out why you're getting NaNs in your landmarks tensor in the first place.
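One way to do this is to check the tensor for NaNs before it goes into the loss. A minimal sketch (the `landmarks` tensor here is an illustrative stand-in for your real batch):

```python
import torch

# Hypothetical landmarks batch; the NaN stands in for a bad sample.
landmarks = torch.tensor([[10.0, 20.0], [float("nan"), 5.0]])

# Does any entry contain a NaN, and which rows are affected?
has_nan = torch.isnan(landmarks).any()
nan_rows = torch.isnan(landmarks).any(dim=1).nonzero(as_tuple=True)[0]

print(has_nan)   # tensor(True)
print(nan_rows)  # tensor([1])

# During training, PyTorch's anomaly detection can pinpoint the
# backward op that first produced a NaN:
# torch.autograd.set_detect_anomaly(True)
```

Once you know which samples produce NaNs, you can trace them back to the data loading or preprocessing step that created them.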

Second, there might be an issue with how the normalization is being done. Since landmarks are coordinate pairs on an image, dividing by the maximum landmark value is not a suitable way to normalize them — each landmark should be normalized by the dimensions of its own image.

As I mentioned in the reply below:

The idea is to do something along the lines of:

  • For each pair of landmark coordinates, find the image it belongs to
  • Look up that image’s height and width
  • Divide the coordinate along the vertical axis (the 1st coordinate here) by the height — so you’re normalizing it with respect to height
  • Divide the coordinate along the horizontal axis (the 2nd coordinate) by the width — so you’re normalizing it with respect to width
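The steps above can be sketched like this (names such as `image_sizes` are illustrative, and I’m assuming the 1st coordinate runs along the image height):

```python
import torch

# Illustrative batch: 2 images, 3 landmarks each.
# landmarks[..., 0] = vertical (row) coordinate,
# landmarks[..., 1] = horizontal (column) coordinate.
landmarks = torch.tensor([
    [[120.0, 60.0], [240.0, 300.0], [0.0, 480.0]],  # image 0
    [[50.0, 25.0], [100.0, 75.0], [200.0, 50.0]],   # image 1
])

# (height, width) of the image each set of landmarks belongs to.
image_sizes = torch.tensor([
    [480.0, 640.0],  # image 0
    [200.0, 100.0],  # image 1
])

# Broadcast per image: row coords / height, column coords / width.
normalized = landmarks / image_sizes[:, None, :]

# Every value now lies in [0, 1], as if each image were 1x1.
assert normalized.min() >= 0 and normalized.max() <= 1
```

Because each landmark is divided by its own image’s dimensions, the normalized values stay bounded regardless of how image sizes vary across the batch.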

Doing this is analogous to answering the question:

if the height and width of my image were both 1, where would my landmarks be in the image?
