My parameters are split into groups that use different optimizers and learning rates. The groups are quite different, so I want to clip their gradients separately with
clip_grad_norm_. I put the parameter groups into lists and passed them to
clip_grad_norm_, the same way you set a different learning rate per group, but this does not seem to work for gradient clipping.
The documentation says the argument needs to be
an iterable of Tensors or a single Tensor that will have gradients normalized. How can I do this with a list of parameter groups?
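For reference, here is a minimal sketch of what I am trying to achieve: clipping each group with its own max norm by calling clip_grad_norm_ once per group, which matches the documented "iterable of Tensors" signature. The encoder/decoder modules and the max-norm values are just placeholders for my real setup.

```python
import torch
import torch.nn as nn
from torch.nn.utils import clip_grad_norm_

torch.manual_seed(0)

# Two placeholder parameter groups standing in for my real ones.
encoder = nn.Linear(10, 10)
decoder = nn.Linear(10, 1)

# A dummy forward/backward pass to populate .grad on all parameters.
x = torch.randn(4, 10)
loss = decoder(encoder(x)).sum()
loss.backward()

# One clip_grad_norm_ call per group, each with its own threshold.
# Each call receives an iterable of Tensors, as the docs require.
clip_grad_norm_(encoder.parameters(), max_norm=1.0)
clip_grad_norm_(decoder.parameters(), max_norm=0.5)
```

Is calling clip_grad_norm_ in a loop over the groups like this the intended way, or is there a way to pass all the groups in a single call?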