Getting NaN in the softmax layer

It is not a batch size problem. The weight updates are unstable, and the gradient explodes as it backpropagates. I already tried gradient clipping, but it didn't help. Is there any other way to solve this problem?
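
One thing that may be worth ruling out is overflow inside the softmax itself, since that produces NaN in the forward pass and gradient clipping cannot help with it. The snippet below is only a minimal sketch, assuming NumPy and made-up logit values, not the actual model code. The names `naive_softmax` and `stable_softmax` are hypothetical helpers written for illustration.

```python
import numpy as np

def naive_softmax(logits):
    # Direct definition: exp() overflows to inf for large logits,
    # and inf / inf then evaluates to NaN.
    exps = np.exp(logits)
    return exps / np.sum(exps, axis=-1, keepdims=True)

def stable_softmax(logits):
    # Subtracting the row-wise max does not change the result mathematically,
    # but it keeps every exponent <= 0, so exp() cannot overflow.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# Hypothetical large activations, chosen only to trigger the overflow.
logits = np.array([[1000.0, 1001.0, 1002.0]])

# Silence the overflow/invalid warnings so the NaN result can be inspected.
with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_softmax(logits)

stable = stable_softmax(logits)

print("naive has NaN:", bool(np.isnan(naive).any()))
print("stable is finite:", bool(np.isfinite(stable).all()))
```

If the logits feeding the softmax are already NaN or inf, the shift will not fix that, so it may also help to check the activations of the layer just before the softmax.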