Char RNN on tinyShakespeare dataset - Erratic loss

I’m training a char RNN with PyTorch, and the loss I see is erratic. I compared my code with Andrej Karpathy’s famous 100-line gist (min-char-rnn). He displays a smoothed loss rather than the raw per-iteration loss; when I print the raw loss from his code, it looks just as erratic as mine.

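For reference, the gist keeps an exponential moving average of the loss instead of plotting the raw value for each batch. Below is a minimal PyTorch-flavoured sketch of that same smoothing applied to my setup; `model`, `optimizer`, and `loader` are placeholders for my actual training objects, not code from the gist:

```python
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
smooth_loss = None  # exponential moving average of the raw loss

for step, (inputs, targets) in enumerate(loader):   # `loader` yields (input, target) char batches
    logits = model(inputs)                          # `model` is the char RNN being trained
    loss = criterion(logits.view(-1, logits.size(-1)), targets.view(-1))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Same smoothing rule as Karpathy's gist: 0.999 * old + 0.001 * new
    raw = loss.item()
    smooth_loss = raw if smooth_loss is None else 0.999 * smooth_loss + 0.001 * raw

    if step % 100 == 0:
        print(f"step {step}: raw loss {raw:.4f}, smoothed loss {smooth_loss:.4f}")
```
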
  1. Do we need a smoothed loss, or should the raw loss itself be decreasing steadily?
  2. Why is the loss not reducing?
  3. Is this behaviour specific to char RNNs?

Any help from the community would be appreciated.