How to use a different number of GPUs on 2 nodes?

How do I train on two machines, one with 4 GPUs and one with 8 GPUs? The documentation for torch elastic run (torchrun (Elastic Launch) — PyTorch 1.10.0 documentation) states:

" 4. This module only supports homogeneous LOCAL_WORLD_SIZE. That is, it is assumed that all nodes run the same number of local workers (per role)."


cc @Kiuk_Chung

As of today, DDP does not officially support jobs that use a different number of GPUs on different machines. One workaround is to use the CUDA_VISIBLE_DEVICES environment variable on the 8-GPU machine to split its GPUs into two sets of 4 and launch two elastic agent processes there, so that every participant runs the same number of local workers.
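A minimal sketch of that workaround, treating the 8-GPU machine as two "virtual nodes" of 4 GPUs each so all three agents report a homogeneous local worker count. The rendezvous endpoint `node0:29400`, the job id `job1`, and `train.py` are placeholders you would replace with your own values:

```shell
# On the 4-GPU machine (one agent, using all 4 GPUs):
torchrun --nnodes=3 --nproc_per_node=4 \
    --rdzv_id=job1 --rdzv_backend=c10d --rdzv_endpoint=node0:29400 \
    train.py

# On the 8-GPU machine, first agent only sees GPUs 0-3:
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --nnodes=3 --nproc_per_node=4 \
    --rdzv_id=job1 --rdzv_backend=c10d --rdzv_endpoint=node0:29400 \
    train.py &

# Second agent on the same machine only sees GPUs 4-7:
CUDA_VISIBLE_DEVICES=4,5,6,7 torchrun --nnodes=3 --nproc_per_node=4 \
    --rdzv_id=job1 --rdzv_backend=c10d --rdzv_endpoint=node0:29400 \
    train.py
```

From torchrun's perspective this is a homogeneous 3-node job with LOCAL_WORLD_SIZE=4 (12 workers total); each agent's workers enumerate only the GPUs that CUDA_VISIBLE_DEVICES exposes to it, as device 0 through 3.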
