Models for Model Parallelism training

Hi everyone,

First, I’m not sure whether this question belongs here. I’m sorry if it isn’t relevant to this forum.

I would like to practice training deep learning models with model parallelism or tensor parallelism. I have mostly worked in the computer vision domain, so it would be helpful if you could recommend any CV application, pretrained model, or relevant example for practicing model parallelism or tensor parallelism.

While searching on Google, I noticed that many models are available for practicing this in the NLP domain.



I don’t know if there are “stable” model sharding APIs for CNNs, but you could take a look at e.g. this experimental interface. You are also right that a lot of model sharding approaches are used for LLMs, e.g. in the Megatron repository.
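In case it helps to see the core idea before picking a model, here is a minimal sketch of column-wise tensor parallelism for a single linear layer. It is only an illustration under a few assumptions: plain NumPy arrays stand in for per-device shards, the function name `column_parallel_linear` is made up for this post, and it is not the API of the experimental interface or of Megatron. The same idea carries over to CNNs in principle, since a 1x1 convolution is a matrix multiply over channels, so splitting the weight columns corresponds to splitting the output channels.

```python
import numpy as np


def column_parallel_linear(x, weight, num_shards):
    """Compute x @ weight by splitting weight's output columns into shards.

    Hypothetical helper for illustration; in a real setup each shard
    would live on its own device instead of being a plain NumPy array.
    """
    # Split the weight matrix along the output-feature axis.
    shards = np.array_split(weight, num_shards, axis=1)
    # Every "device" sees the full input and produces a slice of the output.
    partial_outputs = [x @ w for w in shards]
    # Gathering the slices along the feature axis rebuilds the full output.
    return np.concatenate(partial_outputs, axis=1)


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))       # batch of 4 samples, 8 input features
weight = rng.standard_normal((8, 6))  # linear layer mapping 8 -> 6 features

full = x @ weight                                          # unsharded reference
sharded = column_parallel_linear(x, weight, num_shards=2)  # two shards

# Compare the sharded result against the unsharded reference.
print(np.allclose(full, sharded))
```

The point of splitting along the output axis is that no shard needs the others' weights during the forward pass; the only communication is the final gather of the output slices.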