Allocating Required Memory for Each GPU in Data Parallelism Automatically

In my program I use three pretrained encoders, and I train my model with DataParallel. I want to use three GPUs, each of which has a different amount of free memory. I set up DataParallel as follows:

import torch
import torch.nn as nn

dev = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.DataParallel(model, device_ids=[0, 1, 2], output_device=0)  # output_device takes a single device index
model = model.to(dev)

Unfortunately, the program does not distribute the required memory across the three devices automatically; it tries to allocate more than one of the devices can hold, even though the other two devices have enough free space. It raises an out-of-memory error on one device while most of the memory on the other two GPUs is free. How can I fix that?
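
For reference, this is a minimal sketch of how I check the free memory on each GPU before training (assuming a recent PyTorch version that provides torch.cuda.mem_get_info; the device indices are just the three cards from the snippet above):

import torch

# Print free / total memory for every visible GPU (values returned in bytes).
# torch.cuda.mem_get_info wraps cudaMemGetInfo and is available in recent PyTorch releases.
for i in range(torch.cuda.device_count()):
    free, total = torch.cuda.mem_get_info(i)
    print(f"cuda:{i}: {free / 1024**3:.2f} GiB free of {total / 1024**3:.2f} GiB")

This is how I can see that two of the GPUs still have plenty of free memory at the moment the third one runs out.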