Is there a feature like sharegradinput in Torch?

Hi all,
My model needs a little more than 12GB of memory. Is there a feature like sharegradinput in Torch?

I think PyTorch already performs this kind of optimization in its memory allocation.
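
If the model still does not fit, one thing that might help is activation checkpointing via `torch.utils.checkpoint`. To be clear, this is not the same thing as sharegradinput: instead of sharing gradient buffers, it drops intermediate activations in the forward pass and recomputes them during backward, trading extra compute for lower memory. Below is a minimal sketch; the `Net` class, its layer sizes, and the batch shape are made up for illustration, and `use_reentrant=False` assumes a reasonably recent PyTorch release.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class Net(nn.Module):
    """Hypothetical toy model; the layer sizes are illustrative only."""

    def __init__(self):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
        self.block2 = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
        self.head = nn.Linear(64, 10)

    def forward(self, x):
        # Activations inside each checkpointed block are not stored;
        # they are recomputed when backward() reaches that block.
        x = checkpoint(self.block1, x, use_reentrant=False)
        x = checkpoint(self.block2, x, use_reentrant=False)
        return self.head(x)


model = Net()
x = torch.randn(8, 64)  # made-up batch of 8 samples
loss = model(x).sum()
loss.backward()  # gradients are computed as usual
```

How much memory this saves depends on the model; it helps most when the checkpointed blocks hold many large intermediate activations.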