Loss becomes 'nan' after some iterations

I am trying to build an autoencoder whose encoder and decoder are nested TreeLSTMs. During training, the loss becomes 'nan' after some iterations. I tried decreasing the learning rate, gradient clipping, and data normalization, but it still becomes 'nan'. What could be wrong?

This is the error message: RuntimeError: Function 'MulBackward0' returned nan values in its 0th output.

I suspect some of your labels or training data might contain NaNs. If your data is in a CSV file, remove the NaNs before training:

data = data.dropna()

My data is in the format used in pytorch-tree-lstm/example_usage.py (master branch of unbounce/pytorch-tree-lstm on GitHub); see the tree in that code. How can I check whether it contains NaNs?

Given the following code snippet,

h, c = model(

I suppose you can iterate through the dataset's data['features'] to find which features contribute NaN values and print them out.
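
For a nested tree like the one in the linked example, the check has to recurse through the children. Here is a minimal sketch, assuming each node is a plain dict with 'features', 'labels' and 'children' keys holding Python numbers. The helper name find_non_finite and the toy values are made up for illustration and are not part of pytorch-tree-lstm.

```python
import math

def find_non_finite(node, path="root"):
    """Return (path, key, value) for every NaN or infinite number in the tree."""
    problems = []
    for key in ("features", "labels"):
        for value in node.get(key, []):
            if not math.isfinite(value):
                problems.append((path, key, value))
    # Recurse into the children, extending the path so the node can be located.
    for i, child in enumerate(node.get("children", [])):
        problems.extend(find_non_finite(child, f"{path}.children[{i}]"))
    return problems

# Toy tree (made-up values) with one NaN planted in a grandchild node.
tree = {
    "features": [1.0, 0.0], "labels": [1], "children": [
        {"features": [0.0, 1.0], "labels": [0], "children": []},
        {"features": [0.0, 0.0], "labels": [0], "children": [
            {"features": [float("nan"), 0.0], "labels": [0], "children": []},
        ]},
    ],
}

for path, key, value in find_non_finite(tree):
    print(f"{path}: non-finite value {value} in '{key}'")
```

The sketch flags infinities as well as NaNs, because operations such as inf - inf or 0 * inf also produce NaN later in the forward pass.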

I did, but there is no NaN element in data['features'].

That's okay. For debugging purposes, did you try including a regularisation layer or term?

I didn’t. Are there any other reasons for nan output?

Depending on your model or your training setup, there can be a few reasons.
If your model is deep enough, it might suffer from vanishing gradients if you don't apply regularisation, especially if you're using sigmoid/tanh as activations.
It can also be that your learning rate was too high and the gradients became infinite, so your model diverged.
These are typical scenarios; a few tricks should help, such as choosing a smaller learning rate or applying regularisation.
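
To illustrate the learning-rate point, here is a toy sketch in plain Python that minimises f(x) = x**2 with gradient descent. It is not the TreeLSTM model, and the function name run_gradient_descent is made up for illustration. Each update multiplies x by (1 - 2 * lr), so the iterates shrink when that factor has magnitude below 1 and grow without bound otherwise.

```python
import math

def run_gradient_descent(lr, steps=2000, x0=1.0):
    """Minimise f(x) = x**2 with plain gradient descent; return the loss history."""
    x = x0
    losses = []
    for _ in range(steps):
        loss = x * x
        losses.append(loss)
        if math.isnan(loss):
            break  # once the loss is NaN it never recovers
        grad = 2.0 * x       # derivative of x**2
        x = x - lr * grad    # each step multiplies x by (1 - 2 * lr)
    return losses

stable = run_gradient_descent(lr=0.1)    # |1 - 2*lr| = 0.8 < 1, so x shrinks
unstable = run_gradient_descent(lr=1.5)  # |1 - 2*lr| = 2 > 1, so |x| doubles each step

print("lr=0.1 final loss:", stable[-1])
print("lr=1.5 final loss:", unstable[-1], "after", len(unstable), "iterations")
```

With the larger learning rate the loss first overflows to inf, and a later update computes inf - inf, which is where the NaN appears. This mirrors the "fine for a while, then nan" pattern, although in a real model the cause may be different.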

My training set is 3000 style and content tensors. When I reduce it to 100 tensors and train only on those, I don't get 'nan' in my loss, but with the larger training set I do. Could there be another problem?