My particular use case is to train a model with some modules on CPU and some on GPU.
I wrap my model with
model = nn.DataParallel(model)
instead of model = nn.DataParallel(model).cuda()
since I don’t want the entire model on GPU.
But doing so results in the following error in the forward pass of the model:
TypeError: Broadcast function not implemented for CPU tensors
If I use the model without nn.DataParallel, it works just fine.
I’m pretty sure DataParallel is for GPUs.
Edit: Found an example. Apparently you should be able to get it to run on both CPU and GPU.
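
For what it's worth, here is a minimal sketch of one way to mix CPU and GPU modules. The idea is to wrap only the GPU submodule in nn.DataParallel rather than the whole model, since (as the error message suggests) the broadcast step expects the wrapped parameters to live on a GPU. The class name HybridModel, the attribute names cpu_part and gpu_part, and the layer sizes are all made up for illustration, and this is not necessarily the example linked above. The sketch falls back to plain CPU execution when no GPU is available.

```python
import torch
import torch.nn as nn


class HybridModel(nn.Module):
    """Hypothetical model: cpu_part stays on the CPU, gpu_part runs on GPU(s)."""

    def __init__(self):
        super().__init__()
        self.use_cuda = torch.cuda.is_available()
        # This submodule is never moved to the GPU.
        self.cpu_part = nn.Linear(16, 32)
        gpu_part = nn.Sequential(nn.ReLU(), nn.Linear(32, 4))
        if self.use_cuda:
            # Wrap only the GPU submodule. Its parameters are moved to the
            # GPU first, so DataParallel can replicate them across devices.
            gpu_part = nn.DataParallel(gpu_part.cuda())
        self.gpu_part = gpu_part

    def forward(self, x):
        h = self.cpu_part(x)      # computed on the CPU
        if self.use_cuda:
            h = h.cuda()          # hand the activations off to the GPU
        out = self.gpu_part(h)    # split across GPUs if several are present
        return out.cpu()          # bring the result back to the CPU


model = HybridModel()
x = torch.randn(8, 16)            # input batch stays on the CPU
out = model(x)
out.sum().backward()              # gradients flow through both parts
print(out.shape)
```

Note that the outer model is not wrapped in nn.DataParallel at all, so only the inner gpu_part is replicated. The CPU-to-GPU copy in forward adds transfer overhead on every step, so whether this pays off depends on how expensive the CPU modules are.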