LSTM and gradient clipping


What is the best value for the max_norm of gradient clipping?
From what I have seen, people usually use 1, 3 or 10.
I also read on a website that for an LSTM language model a max_norm of 0.25 gives better results.
In my case I am applying an LSTM to time series. What would be the best value?
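
To make clear what I mean by max_norm, here is a minimal sketch of clipping by global norm in plain Python. It assumes the usual semantics (as I understand them, the same as PyTorch's torch.nn.utils.clip_grad_norm_): if the joint L2 norm of all gradients exceeds max_norm, every gradient is scaled down by the same factor. The function name clip_by_global_norm and the toy gradient values are hypothetical and only for illustration.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Rescale gradient vectors so their joint L2 norm is at most max_norm.

    grads is a list of flat lists of floats, one per parameter tensor.
    Returns (clipped_grads, total_norm), where total_norm is the norm
    measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # The scale factor is capped at 1, so gradients that are already
    # small enough are left unchanged.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, total_norm


# Hypothetical gradients for two parameter tensors (joint L2 norm = 13).
grads = [[3.0, 4.0], [12.0]]
for max_norm in (0.25, 1.0, 10.0, 100.0):
    clipped, norm_before = clip_by_global_norm(grads, max_norm)
    norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))
    print(f"max_norm={max_norm}: before={norm_before:.2f}, after={norm_after:.2f}")
```

The norm before clipping is returned as well, so it could be logged during training to see how often a given max_norm would actually take effect on my data.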


Could anyone help?
Thanks in advance.