Sparse tensor and distributed functions (all_reduce, all_gather)

vince62s · May 25, 2018, 6:44am

Hello,

Is there any plan to support sparse tensors in distributed mode?

It would be really helpful for the multi-GPU transformer implementation.

Cheers,
Vincent