How would I go about training multiple models in parallel, across multiple GPUs, when the models share a common subset of parameters? Any tips?