I wrote a custom network that uses custom RNN cells that I also wrote. Everything works on CPU and I now try to get it to run on a GPU for faster training.
I'm registering all weights, as well as a weight mask (with `requires_grad=False`), as Parameters in my RNN cells, so they get moved to the GPU by a single `.to(device)` call on my network. This seems to work. However, in the forward pass of my network I use three constructs that cause some trouble. (The code is too large to paste here, so I'll explain as well as I can.)
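For reference, the registration pattern looks roughly like this; it's a minimal sketch, and the names (`MaskedRNNCell`, `weight_ih`, `weight_hh`, `weight_mask`) are placeholders, not my actual code:

```python
import torch
import torch.nn as nn

class MaskedRNNCell(nn.Module):
    """Hypothetical minimal cell showing the registration pattern."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        # Trainable weights, registered as Parameters as usual.
        self.weight_ih = nn.Parameter(torch.randn(hidden_size, input_size))
        self.weight_hh = nn.Parameter(torch.randn(hidden_size, hidden_size))
        # Fixed zero-one mask: a Parameter with requires_grad=False still
        # follows the module in .to(device) but never receives gradients.
        self.weight_mask = nn.Parameter(
            torch.ones(hidden_size, hidden_size), requires_grad=False
        )

    def forward(self, x, h):
        # Apply the mask to the recurrent weights before the update.
        masked_hh = self.weight_hh * self.weight_mask
        return torch.tanh(x @ self.weight_ih.t() + h @ masked_hh.t())
```

With this, `model.to(device)` moves the mask along with the trainable weights. (As an aside, `register_buffer` exists for exactly this kind of non-trainable state.)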
- A Python list `contexbuffer`, to which I append hidden states that I have computed and need to access again later in the same forward pass.
- A torch tensor `output`, initialized with zeros, that I fill with the final outputs after all hidden states are computed.
- Torch tensors `state_mask`: zero-one masks, of which at least one is generated per point in the sequence and passed together with the input and hidden state to the RNN cell to compute the next hidden state.
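To make the three constructs concrete, a stripped-down sketch of the forward pass (all names here, including `run_sequence` and the `cell` callable, are placeholders for illustration only):

```python
import torch

def run_sequence(cell, inputs, hidden_size):
    """Placeholder sketch of the forward pass described above."""
    seq_len, batch, _ = inputs.shape
    context_buffer = []  # hidden states kept for reuse later in the same pass
    # Both tensors below land on the CPU by default, regardless of where
    # `inputs` lives:
    output = torch.zeros(seq_len, batch, hidden_size)
    h = torch.zeros(batch, hidden_size)
    for t in range(seq_len):
        # One zero-one mask per point in the sequence, also created on the CPU:
        state_mask = (torch.rand(batch, hidden_size) > 0.5).float()
        h = cell(inputs[t], h, state_mask)
        context_buffer.append(h)
        output[t] = h
    return output
```

Because `torch.zeros` and `torch.rand` default to the CPU, none of these tensors follow the network when it is moved with `.to(device)`.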
I tried moving the hidden states from the context buffer, as well as the state masks, to the GPU right before the computation in the RNN cell. This works, but it feels far from ideal. However, I didn't manage to get `output` onto the GPU; I always get

`Expected object of backend CPU but got backend CUDA for argument #2 'target'`

when calculating the loss after calling `.to(device)` on the target.
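Concretely, the situation looks like this (the `criterion` and the shapes are stand-ins for my actual setup): `output` is built on the CPU inside the forward pass, while the target has been moved, so the loss call sees two different backends:

```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()  # stand-in for the actual loss function
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

output = torch.zeros(10, 5)              # built inside forward(), lands on CPU
target = torch.randn(10, 5).to(device)   # target moved with .to(device)

if output.device != target.device:
    # On a CUDA machine this branch is taken; calling
    # criterion(output, target) here raises the backend error quoted above.
    print("device mismatch:", output.device, "vs", target.device)
else:
    loss = criterion(output, target)
    print("loss:", loss.item())
```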
Is there some way to initialize these lists/tensors so that they are moved to the GPU together with the rest of the network?