Minor bug fix in clip_grad_norm_ for sparse matrix

There seems to be a minor bug in the clip_grad_norm_ function when it is applied to sparse gradients, in file: torch/nn/utils/clip_grad.py
The cause is that p.grad.data is a sparse tensor while clip_coef is a Tensor.
The mul_ function throws an error when you try to multiply these two different types.
It is easy to fix by simply converting clip_coef to a plain Python float: clip_coef = float(clip_coef).
A float can then be multiplied with both dense and sparse tensors.
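A minimal sketch of the pattern and the fix (variable names mirror torch/nn/utils/clip_grad.py, but this is a standalone illustration, not the library source; the sparse gradient here is constructed by hand rather than produced by an Embedding layer):

```python
import torch

# A sparse gradient, like those produced by nn.Embedding(..., sparse=True).
grad = torch.randn(5, 3).to_sparse()

max_norm = 1.0
# 2-norm over the non-zero entries of the sparse gradient.
total_norm = grad.coalesce().values().norm(2)
clip_coef = max_norm / (total_norm + 1e-6)  # a 0-dim Tensor, not a float

# In affected versions, sparse_tensor.mul_(tensor) raised a type error.
# Casting to a plain Python float avoids it:
clip_coef = float(clip_coef)
if clip_coef < 1:
    grad.mul_(clip_coef)  # in-place scaling now works for sparse tensors too
```

After the cast, the same multiplication path works whether p.grad.data is dense or sparse.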
Hope this helps people who are using a sparse matrix for the embedding layer.