DataParallel or DistributedDataParallel workaround

Hi,

For a complex-valued model, DistributedDataParallel does not work, since complex tensor support is not there yet as of 06/23. DataParallel throws me something like

RuntimeError: Input type (CUDAComplexFloatType) and weight type (torch.cuda.FloatTensor) should be the same

even though my first layer is a complex Conv2d. Assuming a single server, is there some workaround that would let me train on multiple GPUs?
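
In case it helps, here is a minimal sketch of roughly what my setup looks like. ComplexNet is just a stand-in for my actual model, and I am using a complex-dtype nn.Conv2d here in place of my real first layer:

import torch
import torch.nn as nn

class ComplexNet(nn.Module):
    def __init__(self):
        super().__init__()
        # complex64 weights via the dtype factory kwarg
        self.conv = nn.Conv2d(1, 8, kernel_size=3, dtype=torch.complex64)

    def forward(self, x):
        return self.conv(x)

# wrap in DataParallel on a single multi-GPU machine
model = nn.DataParallel(ComplexNet().cuda())

# complex-valued input batch
x = torch.randn(16, 1, 32, 32, dtype=torch.complex64, device="cuda")

out = model(x)  # this forward pass is where I hit the RuntimeError above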