Torch.nn.lstm lstm layer error in GPU

ptrblck · April 9, 2020, 9:02am

Thanks for the code!

I got a device mismatch error, since hidden_state and cell_state are both created on the CPU inside forward, even if you push the model to the GPU.
Could you try creating them on the same device as the input:

    def forward(self, X):
        X = self.embedding(X)
        
        trans_X = X.transpose(0, 1)  # reshape to [sequence length, batch size, input_size]
        
        # Create the initial states directly on the input's device
        hidden_state = torch.zeros(1, len(X), self.hidden_size, device=X.device)
        cell_state = torch.zeros(1, len(X), self.hidden_size, device=X.device)
...
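
For reference, here is a minimal self-contained sketch of the same idea. The class name TinyLSTMClassifier, the layer sizes, and the final linear layer are made up for illustration and are not taken from your code; only the way the initial states are created matters.

```python
import torch
import torch.nn as nn


class TinyLSTMClassifier(nn.Module):
    # Hypothetical model: names and sizes are illustrative assumptions.
    def __init__(self, vocab_size=100, embed_size=8, hidden_size=16, num_classes=2):
        super().__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size)  # expects [seq, batch, input]
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, X):
        X = self.embedding(X)        # [batch, seq, embed]
        trans_X = X.transpose(0, 1)  # [seq, batch, embed]
        batch_size = X.size(0)
        # Create the states on whatever device the input lives on,
        # so the module works unchanged on CPU and GPU.
        h0 = torch.zeros(1, batch_size, self.hidden_size, device=X.device)
        c0 = torch.zeros(1, batch_size, self.hidden_size, device=X.device)
        outputs, _ = self.lstm(trans_X, (h0, c0))
        return self.fc(outputs[-1])  # logits from the last time step


# Falls back to the CPU if no GPU is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TinyLSTMClassifier().to(device)
tokens = torch.randint(0, 100, (4, 5), device=device)  # [batch=4, seq=5]
logits = model(tokens)
```

Passing device=X.device to torch.zeros allocates the tensor on the target device directly, which avoids creating it on the CPU first and copying it over.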

Also, the right_count calculation will raise another device mismatch.
You would have to call .cpu() on the result of torch.argmax, whereas your code calls it on the sum:

    right_count = torch.sum(Y_batch.cpu() == torch.argmax(y_pred, 1).long().cpu()).item()
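
As an alternative sketch, you could also keep the comparison on a single device and only transfer the final scalar via .item(). The tensors below are made-up example values, not data from your script:

```python
import torch

# Falls back to the CPU if no GPU is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Made-up logits for 3 samples and 2 classes, plus their targets.
y_pred = torch.tensor([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]], device=device)
Y_batch = torch.tensor([1, 0, 0])  # deliberately left on the CPU

predicted = torch.argmax(y_pred, dim=1)  # already int64, same device as y_pred
# Move the targets to the predictions' device before comparing;
# .item() then copies only the resulting Python number to the host.
right_count = (predicted == Y_batch.to(predicted.device)).sum().item()
```

Both operands of == have to live on the same device. Whether you move everything to the CPU or keep it on the GPU is mostly a matter of taste here.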

After fixing these issues, the code runs fine.

Let me know if that helps.