@ptrblck That did the trick, thanks! Yes, I changed that earlier and the problem hasn’t appeared anymore.
-
I’ve tried both MSELoss and BCEWithLogitsLoss, which you suggested earlier, as loss functions, but the loss values all seem to be more or less the same - do you have any suggestions on what other loss functions might work better for a multi-label problem like this (each key independently pressed or not)?
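For reference, this is a minimal sketch of the multi-label setup with BCEWithLogitsLoss - the batch size and the number of keys (88, as on a piano) are assumptions for illustration:

```python
import torch
import torch.nn as nn

batch_size, num_keys = 32, 88  # hypothetical sizes

logits = torch.randn(batch_size, num_keys)  # raw model outputs, no sigmoid applied
targets = torch.randint(0, 2, (batch_size, num_keys)).float()  # 0/1 key states

# BCEWithLogitsLoss applies the sigmoid internally and treats each key
# as an independent binary classification (a multi-label setup).
criterion = nn.BCEWithLogitsLoss()
loss = criterion(logits, targets)
```

Note that the model should output raw logits here; applying a sigmoid before this loss would apply it twice.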
-
Since I need to output 0s (key not pressed) and 1s (key pressed), can I just apply a threshold to the final output like this?
outputs = torch.where(outputs > 0.5, torch.ones(1, requires_grad=True, device=device), torch.zeros(1, requires_grad=True, device=device))
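As a side note, the comparison itself is not differentiable, so thresholding only makes sense at inference time (the loss during training should see the raw outputs). A simpler sketch of the same idea, assuming `outputs` holds probabilities after a sigmoid:

```python
import torch

outputs = torch.tensor([0.1, 0.7, 0.4, 0.9])  # hypothetical sigmoid probabilities

# The comparison returns a bool tensor; .float() maps it back to 0. / 1.
preds = (outputs > 0.5).float()
# preds -> tensor([0., 1., 0., 1.])
```

With BCEWithLogitsLoss the model outputs logits rather than probabilities, in which case the equivalent threshold is `outputs > 0.0`.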
-
What is the correct way to train everything on the GPU?
device = torch.device('cuda') # Default CUDA device
I’ve followed the docs and moved the model with .to(device) and created any new tensors directly on the GPU (torch.randn(…, device=device)). However, the task manager shows GPU utilization between 0 and 1 % during training. Even though it does run faster than without .to(device), the speed decreases as training goes on - and the GPU usage drops to 0 %. Any idea why?
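For what it’s worth, the usual pattern is to move the model’s parameters once and then move every batch to the same device inside the training loop - a minimal sketch with assumed shapes (10 input features, 88 keys):

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hypothetical model and data shapes for illustration.
model = nn.Linear(10, 88).to(device)  # move parameters to the GPU once
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters())

inputs = torch.randn(32, 10)
targets = torch.randint(0, 2, (32, 88)).float()

# Inside the loop, move each batch to the same device as the model.
x, y = inputs.to(device), targets.to(device)
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```

Also note that the Windows Task Manager shows the "3D" engine by default, which stays near 0 % for CUDA workloads; switching the graph to the "Cuda" engine, or running `nvidia-smi`, gives a more accurate utilization reading.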