Hi, I've been testing various NLP models on MPS and have observed some strange behavior. It seems that a bidirectional LSTM does not work correctly on MPS.
See the code below:
import torch
import torch.nn as nn

print("Torch version: ", torch.__version__)
torch.manual_seed(123)

embeddings = torch.rand(3, 1, 2)  # (L, B, E): sequence length, batch, embedding size
model = nn.LSTM(input_size=2, hidden_size=2, bidirectional=True)

# Forward pass on the CPU
print(model(embeddings)[0])

# Move the same model (same weights) to MPS and repeat the forward pass
model.to("mps")
print(model(embeddings.to("mps"))[0])
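To make the comparison numeric rather than visual, here is a minimal sketch of the same experiment that keeps a CPU copy of the model and reports the largest element-wise difference. The tolerance `atol=1e-5` and the fallback to CPU when MPS is unavailable are my own assumptions, not part of the original script.

```python
import copy

import torch
import torch.nn as nn

torch.manual_seed(123)
embeddings = torch.rand(3, 1, 2)  # (L, B, E)
cpu_model = nn.LSTM(input_size=2, hidden_size=2, bidirectional=True)

# Assumption: fall back to CPU so the script still runs without an MPS device.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Deep-copy so both models hold identical weights on their own devices.
dev_model = copy.deepcopy(cpu_model).to(device)

with torch.no_grad():
    cpu_out = cpu_model(embeddings)[0]
    dev_out = dev_model(embeddings.to(device))[0].cpu()

max_abs_diff = (cpu_out - dev_out).abs().max().item()
# Assumption: 1e-5 is a reasonable absolute tolerance for float32 here.
matches = torch.allclose(cpu_out, dev_out, atol=1e-5)
print(f"device={device}, max abs diff={max_abs_diff:.6f}, allclose={matches}")
```

If the two backends agree, `allclose` should be `True` and the maximum difference should be on the order of float32 rounding error.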
The results are below: the first tensor is from the CPU, the second from MPS. Why do the CPU results differ from the MPS results?
Am I missing something?
I’m testing on a MacBook Pro with an M1 Pro (10-core CPU, 16-core GPU) and 32 GB of RAM.
Torch version: 2.0.0.dev20230205
tensor([[[ 0.0054, 0.2338, -0.1167, 0.2263]],
[[ 0.0200, 0.3305, -0.0967, 0.2075]],
[[ 0.0422, 0.3740, -0.0590, 0.1553]]], grad_fn=<CatBackward0>)
tensor([[[0.2325, 0.2447, 0.4149, 0.4394]],
[[0.3524, 0.3702, 0.3516, 0.3785]],
[[0.4139, 0.4346, 0.2306, 0.2531]]], device='mps:0',
grad_fn=<CatBackward0>)
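To put a number on the mismatch, the two printouts can be compared directly. This is only a rough sketch: the values are copied by hand from the output above (rounded to four decimals), and the variable names are my own.

```python
# Values copied from the printed CPU and MPS outputs, flattened row by row.
cpu = [0.0054, 0.2338, -0.1167, 0.2263,
       0.0200, 0.3305, -0.0967, 0.2075,
       0.0422, 0.3740, -0.0590, 0.1553]
mps = [0.2325, 0.2447, 0.4149, 0.4394,
       0.3524, 0.3702, 0.3516, 0.3785,
       0.4139, 0.4346, 0.2306, 0.2531]

# Element-wise absolute differences between the two backends.
diffs = [abs(c - m) for c, m in zip(cpu, mps)]
max_diff = max(diffs)
print(f"max abs diff: {max_diff:.4f}")  # prints "max abs diff: 0.5316"
```

A gap of this size is far larger than anything float32 rounding would explain, so this does not look like ordinary numerical noise.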