Hi,
I’m trying to use torch.nn.DataParallel, and I get this runtime error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!
My model uses some custom parameters (some are trainable, some are constant).
When I initialize the module, I put some of them on the GPU with .cuda(), while others are created with torch.nn.Parameter. The model was written before I tried to use DataParallel, so I suspect that wrapping it fails because some of the tensors are pinned to a fixed GPU.
Can I get some advice on how to initialize my model properly so that it works with DataParallel?
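To make the setup concrete, here is a minimal sketch of the pattern I mean. It is not my actual model: the class name MyModel, the attribute names weight and scale, and the shapes are all made up for illustration. The commented-out line is the kind of .cuda() call I currently have in __init__, and register_buffer is what I understand to be the usual alternative for constants, though I am not sure it is the right fix for my case.

```python
import torch
import torch.nn as nn


class MyModel(nn.Module):
    """Hypothetical stand-in for the model; names and shapes are made up."""

    def __init__(self, dim=4):
        super().__init__()
        # Trainable parameter: nn.Parameter is registered on the module,
        # so model.to(device) and DataParallel replicate it per GPU.
        self.weight = nn.Parameter(torch.randn(dim, dim))
        # Problematic pattern: a plain tensor attribute created with .cuda()
        # stays on one GPU and is not moved along with the module:
        #   self.scale = torch.ones(dim).cuda()
        # Assumed alternative: register the constant as a buffer. It is not
        # trained, but it follows the module across devices.
        self.register_buffer("scale", torch.ones(dim))

    def forward(self, x):
        # All tensors used here live on the same device as the input chunk.
        return (x @ self.weight) * self.scale


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MyModel().to(device)
if torch.cuda.device_count() > 1:
    # DataParallel splits the batch along dim 0 and replicates the module.
    model = nn.DataParallel(model)

out = model(torch.randn(8, 4, device=device))
```

The idea is that nothing in __init__ names a specific device, so the single .to(device) call and the DataParallel wrapper decide where every tensor ends up.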
Thanks for reading and have a nice day!