I’ve tried training a simple MNIST classifier on mps. When running the exact same code on the CPU I get an accuracy of 98%, but on mps I get 0.00%.
Does anyone have an idea why that is?
That is surprising for sure.
Do you have a small code sample to reproduce this issue by any chance?
0% is very strange, since even random guessing over the 10 MNIST classes should give roughly 10%.
I reproduced your issue here:
The bug seems to happen in:
accuracy = correct_prediction.float().mean()
since the underlying data clearly has a mean different from 0.0:
correct_predictions tensor([ True, True, True, ..., True, False, True], device='mps:0')
This looks related to the other .float() bug that @albanD identified yesterday and created a ticket for:
In the meantime, you can work around it by using:
accuracy = float(correct_prediction.sum())/len(correct_prediction)
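To check whether your own setup is affected, here is a minimal sketch that puts the workaround next to the original .float().mean() call and a CPU reference. The tensor values and the helper name accuracy_via_count are made up for illustration, and the script falls back to the CPU when mps is not available, so treat it as a diagnostic aid rather than a definitive test.

```python
import torch

# Use mps when it is available; otherwise fall back to the CPU so the sketch still runs.
has_mps = hasattr(torch.backends, "mps") and torch.backends.mps.is_available()
device = torch.device("mps" if has_mps else "cpu")


def accuracy_via_count(correct_prediction: torch.Tensor) -> float:
    """Workaround from this thread: count True entries instead of casting to float."""
    return float(correct_prediction.sum()) / len(correct_prediction)


# Hypothetical stand-in for the comparison result computed in the evaluate function.
correct_prediction = torch.tensor([True, True, False, True], device=device)

# If the three numbers disagree on mps, the float cast on that device is the likely culprit.
print("count-based:   ", accuracy_via_count(correct_prediction))
print("float().mean():", correct_prediction.float().mean().item())
print("CPU reference: ", correct_prediction.cpu().float().mean().item())
```

On a build that contains the fix, all three values should agree.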
Thank you so much! I also found that the bug is in the evaluate function, since the training loss is going down.
We sent a fix for that. Tomorrow's nightly build should include it.
A similar issue is found when executing the sample code here: Quickstart — PyTorch Tutorials 2.0.1+cu117 documentation
Specifically, in the test() function, this line:
correct += (pred.argmax(1) == y).type(torch.float).sum().item()
When device = 'mps', it always results in 10% accuracy. When device = 'cpu', the accuracy is as expected.
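One thing that may be worth trying, assuming the float cast is again the culprit (not confirmed for this case): sum the boolean matches directly, and cross-check the count on the CPU. The helper names count_correct and count_correct_cpu and the tiny pred/y tensors below are hypothetical and only stand in for a batch inside test().

```python
import torch


def count_correct(pred: torch.Tensor, y: torch.Tensor) -> int:
    # Sum the boolean matches directly, without casting to torch.float first.
    return (pred.argmax(1) == y).sum().item()


def count_correct_cpu(pred: torch.Tensor, y: torch.Tensor) -> int:
    # Same count, but computed entirely on the CPU as a reference.
    return (pred.cpu().argmax(1) == y.cpu()).sum().item()


# Use mps when it is available; otherwise fall back to the CPU so the sketch still runs.
has_mps = hasattr(torch.backends, "mps") and torch.backends.mps.is_available()
device = torch.device("mps" if has_mps else "cpu")

# Hypothetical batch: three samples, two classes.
pred = torch.tensor([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]], device=device)
y = torch.tensor([1, 0, 0], device=device)

print("on device:", count_correct(pred, y))
print("on CPU:   ", count_correct_cpu(pred, y))
```

If the two counts differ on mps, that would point at the device computation rather than at the model or the data.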