Hi,
what is the most elegant way to force a tensor to always stay on the CPU? I have a SparseLinear layer that won’t fit on my GPU, so I’d like that part of the net to stay on the CPU even when the rest of my model lives on the GPU.
Currently I’m using a rather ugly hack: simply replacing the cuda() method of the Tensor [1].
And one more question: why don’t the cuda() / cpu() methods of a Module call the same methods on their children? E.g. I thought that calling model.cuda() would call model.sparse_layer.cuda() (which would move the result of the sparse dot product to the GPU), but that’s not the case, since self._apply() only touches parameters and buffers. Is calling model.apply(lambda t: t.cuda()) the solution here, or shouldn’t I call cuda() on all the children?
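For context, here is a minimal sketch of the kind of mixed-device setup I mean, written with a plain nn.Linear standing in for the sparse layer (MixedDeviceNet and its layer names are hypothetical, just for illustration): the CPU-bound layer is kept on the CPU, the rest of the net is moved to the GPU explicitly, and the forward pass shuttles the activations across.

```python
import torch
import torch.nn as nn

class MixedDeviceNet(nn.Module):
    def __init__(self, n_in, n_hidden, n_out, device):
        super().__init__()
        self.device = device
        # This layer deliberately stays on the CPU (stand-in for SparseLinear).
        self.cpu_layer = nn.Linear(n_in, n_hidden)
        # The rest of the net is moved to the target device explicitly.
        self.gpu_layers = nn.Sequential(
            nn.ReLU(),
            nn.Linear(n_hidden, n_out),
        ).to(device)

    def forward(self, x):
        h = self.cpu_layer(x)   # computed on the CPU
        h = h.to(self.device)   # move only the activations across
        return self.gpu_layers(h)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = MixedDeviceNet(8, 16, 4, device)
out = net(torch.randn(2, 8))
print(out.shape)  # torch.Size([2, 4])
```

The point is that the device boundary lives in forward(), so a blanket model.cuda() must not recurse into cpu_layer.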
[1]
import types

def force_cpu_(tensor):
    # Replace this tensor's cuda() with a bound no-op that returns the
    # tensor itself, so it stays on the CPU even if .cuda() is called on it.
    tensor.cuda = types.MethodType(lambda self, *args, **kwargs: self,
                                   tensor)
    return tensor