Nan Loss coming after some time

richard · December 28, 2017, 1:38am

You could use a normalization layer. Alternatively, you can try dividing your data by some constant first (perhaps the max value of your data?). The idea is to keep the values small enough that they don't cause really large gradients, which can eventually turn the loss into NaN.
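Here is a rough sketch of the "divide by a constant" idea in plain Python, in case it helps. The helper names (`scale_by_max`, `standardize`) and the sample values are made up for illustration. `standardize` is only a stand-in for the kind of rescaling a normalization layer applies to its inputs, not the layer itself.

```python
import statistics


def scale_by_max(values):
    """Divide every value by the largest absolute value, so results lie in [-1, 1]."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)  # all zeros: nothing to scale
    return [v / max_abs for v in values]


def standardize(values, eps=1e-8):
    """Shift to zero mean and scale to (roughly) unit standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    # eps guards against division by zero when all values are identical
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical raw inputs with a large range
raw = [1200.0, 45000.0, 300.0, 98000.0]

scaled = scale_by_max(raw)     # every value now within [-1, 1]
normalized = standardize(raw)  # zero mean, roughly unit variance
```

The same arithmetic carries over to tensors in whatever framework you are using. Compute the constant (or the mean and std) on the training set only and reuse it at validation and test time.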