I’d like to use mpi4py with PyTorch. While I can use the numpy interface for CPU tensors, that doesn’t work for GPU tensors. Getting GPU tensors into mpi4py would be a nice stopgap while we wait for the distributed interface to be polished.
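For reference, this is the CPU-tensor workaround I mean, as a minimal sketch (assuming mpi4py and torch are installed, launched with e.g. `mpiexec -n 2 python demo.py`):

```python
from mpi4py import MPI
import torch

comm = MPI.COMM_WORLD

if comm.Get_rank() == 0:
    t = torch.arange(4, dtype=torch.float32)
    # t.numpy() shares memory with the tensor and exposes the buffer
    # interface, so mpi4py's fast (uppercase) Send can use it directly.
    comm.Send(t.numpy(), dest=1, tag=0)
elif comm.Get_rank() == 1:
    t = torch.empty(4, dtype=torch.float32)
    comm.Recv(t.numpy(), source=0, tag=0)  # fails for a CUDA tensor: no .numpy()
    print(t)
```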
Thanks a lot for your help (and for developing PyTorch),
Séb
I saw that torch.distributed uses THDPTensorDesc, but I guess I can’t access it from Python for now. Do you have a suggestion for the best alternative?
Do you really need a PyBuffer? We attempted to implement the buffer interface, but it differs slightly across Python versions and is impossible to support without thousands of lines of code.
Can’t you work with the data pointer and the size of the tensor?
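Something along these lines (just a sketch of the two accessors I mean):

```python
import torch

t = torch.ones(8, dtype=torch.float32)  # the same calls work for t.cuda()
addr = t.data_ptr()                     # integer address of the first element
nbytes = t.numel() * t.element_size()   # total size of the data in bytes
```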
The advantage of the buffer interface is that mpi4py can consume it directly, essentially making the whole of MPI available. I don’t know of a way to hand mpi4py a raw address and length. Maybe with custom MPI datatypes, but I’d need to investigate that.
Since I can afford a hacky solution (I only need send/recv), I’ll try that while waiting for THDP. In any case, I’ll keep this thread up to date.
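For the record, here is the kind of hack I have in mind. It rests on two assumptions I haven’t verified everywhere: the underlying MPI build is CUDA-aware (so it accepts device pointers), and `MPI.memory.fromaddress` is available to wrap a raw address as a buffer (recent mpi4py has it; older versions may not). A sketch, not an official API for GPU tensors:

```python
from mpi4py import MPI
import torch

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

def as_mpi_buffer(t):
    # Wrap the tensor's (device) memory as an MPI buffer, with no copy.
    return MPI.memory.fromaddress(t.data_ptr(), t.numel() * t.element_size())

if rank == 0:
    t = torch.ones(4, device="cuda")
    torch.cuda.synchronize()  # make sure the data has actually been written
    comm.Send([as_mpi_buffer(t), MPI.FLOAT], dest=1, tag=0)
elif rank == 1:
    t = torch.empty(4, device="cuda")
    comm.Recv([as_mpi_buffer(t), MPI.FLOAT], source=0, tag=0)
    print(t)
```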
torch.distributed’s MPI operations are very limited (and require building PyTorch from source), while mpi4py supports many more operations and seems like an excellent library.
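For comparison, the torch.distributed equivalent looks roughly like this (a sketch, assuming PyTorch was built from source against an MPI implementation so the `mpi` backend is available):

```python
import torch
import torch.distributed as dist

# With the MPI backend, rank and world size come from the MPI launcher.
dist.init_process_group(backend="mpi")

if dist.get_rank() == 0:
    dist.send(torch.ones(4), dst=1)
elif dist.get_rank() == 1:
    t = torch.zeros(4)
    dist.recv(t, src=0)
    print(t)
```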