I agree. DataParallel assumes that each GPU can take at least one data sample on its own.
I did apply DataParallel to the parent module, without assigning any module to a specific GPU… I only found out that module2 had been assigned to cuda:3 after encountering the error…
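
For anyone else hitting this, here is a rough sketch of why the "at least one sample per GPU" assumption matters. It is plain Python with no torch, and it only imitates the chunk-style split that I believe DataParallel uses along the batch dimension. The helper name `chunk_sizes` is made up for illustration, and the exact splitting rule is an assumption, not something taken from the PyTorch source.

```python
import math


def chunk_sizes(batch_size, num_gpus):
    """Return the per-GPU batch sizes for a chunk-style split.

    Hypothetical helper: assumes each chunk holds ceil(batch_size / num_gpus)
    samples, with a smaller last chunk, and that no empty chunks are created.
    """
    if batch_size <= 0 or num_gpus <= 0:
        return []
    per_gpu = math.ceil(batch_size / num_gpus)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(per_gpu, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


# Compare a full batch with batches that do not divide evenly over 4 GPUs.
for batch in (8, 5, 2):
    print(batch, chunk_sizes(batch, 4))
```

Under that assumption, a batch of 2 on 4 GPUs produces only two chunks, so the last two devices get no input at all. If some part of the model has already ended up on one of those devices (cuda:3 in my case), that seems like a plausible way to run into a device mismatch.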