If I use data parallel, will submodules also be data parallel? For example, I define a model A and use it directly inside another model B. When I wrap B in nn.DataParallel, will A also run in data parallel?
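nn.DataParallel replicates the module you pass to it, together with all of its registered submodules, on each of your GPUs. So if A is a member of B (e.g. assigned as an attribute in B's `__init__`), wrapping B is enough: A is replicated along with B and both should run in parallel.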
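Here is a minimal sketch to check this yourself. The classes `A` and `B`, the layer sizes, and the batch size are made up for illustration; only `nn.DataParallel` and the standard `nn.Module` API are used. The print inside `A.forward` should show which device each replica runs on and how many samples it received. On a machine without multiple GPUs, it just runs once on a single device.

```python
import torch
import torch.nn as nn

class A(nn.Module):  # hypothetical inner model
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        # Each replica of A only sees its own slice of the batch.
        print(f"A.forward on {x.device}, batch slice size {x.size(0)}")
        return self.fc(x)

class B(nn.Module):  # hypothetical outer model that uses A directly
    def __init__(self):
        super().__init__()
        self.a = A()  # registered submodule, so it is replicated with B
        self.head = nn.Linear(4, 2)

    def forward(self, x):
        return self.head(self.a(x))

# Fall back to CPU so the sketch also runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Only B is wrapped; A is not wrapped separately.
model = nn.DataParallel(B().to(device))

x = torch.randn(16, 8, device=device)
out = model(x)  # outputs are gathered back into one batch
print(out.shape)
```

Note that the original (unwrapped) B is still reachable as `model.module`, and A as `model.module.a`.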