Hi,

First of all, thanks for the great effort in `torch.distributed`. I found it very useful for my project.

Is there any plan to support `gatherv`, `scatterv`, `igather`, `all_gatherv`, etc.?

We plan to add collectives as they are needed in many projects. What is your need for the `gatherv` and `scatterv` routines?

I am exploring whether PyTorch can be used as a quick way to write portable code involving distributed matrix multiplications.

For example, consider the case where the world size is 4 (`torch.distributed.get_world_size()` returns 4). A 1000 x 10 matrix can then be represented as a 250 x 10 matrix on each rank. I needed `reduce`, `all_gather`, and `all_reduce` for various types of matrix multiplication involving such "thin and tall" matrices. However, if the number of rows is not divisible by the world size, e.g. when the data matrix is 1001 x 10, I need `all_gatherv` in place of `all_gather`, since the per-rank chunks are no longer equal in size.
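To make the use case concrete, here is a minimal sketch of the uneven row partition that motivates `all_gatherv`. The helper `split_rows` is hypothetical, written just for illustration; it is not part of `torch.distributed`. It computes the per-rank row counts that a `gatherv`-style collective would take as its counts argument:

```python
def split_rows(n_rows, world_size):
    """Per-rank row counts when n_rows may not divide evenly.

    These counts are exactly what a gatherv-style collective needs;
    with plain all_gather every rank must contribute the same shape.
    (Hypothetical helper for illustration only.)
    """
    base, rem = divmod(n_rows, world_size)
    # The first `rem` ranks hold one extra row each.
    return [base + (1 if r < rem else 0) for r in range(world_size)]

print(split_rows(1000, 4))  # [250, 250, 250, 250] -- equal chunks, all_gather works
print(split_rows(1001, 4))  # [251, 250, 250, 250] -- unequal, needs all_gatherv
```

A common workaround with plain `all_gather` is to pad every rank's chunk up to the maximum count, gather, and then trim the padding on each receiver, but a native `all_gatherv` would avoid the extra copy and the bookkeeping.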