How to implement gradient clipping in FSDP2 (fully_shard)

FullyShardedDataParallel implements the method clip_grad_norm_, but what would be the equivalent for FSDP2 (fully_shard)? If there is no such method, how could it be implemented?

Thank you