PyTorch MPS on Apple M1: "Placeholder storage has not been allocated on MPS device!" even though the model and input appear to be on the device

I’ve noticed that a few folks have had this problem. I’m running the nightly build of PyTorch 2.4.0.dev20240326 and am trying to use my Mac’s GPU to speed up training. I’m encountering the dreaded “RuntimeError: Placeholder storage has not been allocated on MPS device!”. I think I understand that this happens when “the things needed for the computation aren’t properly loaded onto the GPU”.

If I call `torch.backends.mps.is_available()`, I get `True`.

I’m generally confused about how and when I’m supposed to “move” things to the device to accelerate training, but per this answer I tried the following from a breakpoint:

```
ipdb>  state
tensor([0., 3., 0.])
ipdb>  state.to(torch.device("mps"))
tensor([0., 3., 0.], device='mps:0')
ipdb>  self.net.to(torch.device("mps"))
Policy_Network(
  (shared_net): Sequential(
    (0): Linear(in_features=3, out_features=4, bias=True)
    (1): Tanh()
    (2): Linear(in_features=4, out_features=4, bias=True)
    (3): Tanh()
  )
  (output_net): Sequential(
    (0): Linear(in_features=4, out_features=2, bias=True)
    (1): Softmax(dim=-1)
  )
)
ipdb>  self.net.state_dict()
OrderedDict({'shared_net.0.weight': tensor([[ 0.2975, -0.2548, -0.1119],
        [ 0.2710, -0.5435,  0.3462],
        [-0.1188,  0.2937,  0.0803],
        [-0.0707,  0.1601,  0.0285]], device='mps:0'), 'shared_net.0.bias': tensor([ 0.2109, -0.2250, -0.0421, -0.0520], device='mps:0'), 'shared_net.2.weight': tensor([[ 0.0725, -0.0020,  0.4371,  0.1556],
        [-0.1862, -0.3020, -0.0838, -0.2157],
        [-0.1602,  0.0239,  0.2981,  0.2718],
        [-0.4888,  0.3100,  0.1397,  0.4743]], device='mps:0'), 'shared_net.2.bias': tensor([ 0.3300, -0.4556, -0.4754, -0.2412], device='mps:0'), 'output_net.0.weight': tensor([[ 0.4391, -0.0833,  0.2140, -0.2324],
        [ 0.4906, -0.2115,  0.3750,  0.0059]], device='mps:0'), 'output_net.0.bias': tensor([-0.2634,  0.2570], device='mps:0')})
ipdb>  action_probs = self.net(state)
*** RuntimeError: Placeholder storage has not been allocated on MPS device!
```

The error is still there. This doesn't make sense to me, as it seems that both the model and the data (`state`) are on the MPS device. The network (as you can probably see) is a simple `torch.nn.Module`. Does anyone have any advice as to why this is happening? I feel like [this](https://stackoverflow.com/questions/78168447/placeholder-storage-has-not-been-allocated-on-mps-device) might be relevant, but I haven't identified any other parameters...
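One thing worth checking against the session above: as far as I understand, `nn.Module.to` moves a module's parameters in place, whereas `Tensor.to` returns a new tensor and leaves the original untouched. If that applies here, `state` would still be on the CPU when it is passed to the network, since the result of `state.to(...)` was never assigned. Below is a minimal sketch of that difference, not my actual code: `PolicyNetwork` is a hypothetical reconstruction from the printed module structure, and the CPU fallback is only there so the snippet runs on machines without MPS.

```python
import torch
import torch.nn as nn

# Fall back to the CPU so the sketch also runs where MPS is unavailable.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")


class PolicyNetwork(nn.Module):
    """Hypothetical reconstruction based on the printed module structure."""

    def __init__(self):
        super().__init__()
        self.shared_net = nn.Sequential(
            nn.Linear(3, 4), nn.Tanh(), nn.Linear(4, 4), nn.Tanh()
        )
        self.output_net = nn.Sequential(nn.Linear(4, 2), nn.Softmax(dim=-1))

    def forward(self, x):
        return self.output_net(self.shared_net(x))


net = PolicyNetwork()
state = torch.tensor([0.0, 3.0, 0.0])

net.to(device)    # nn.Module.to moves the parameters in place
state.to(device)  # Tensor.to returns a new tensor; the result is discarded here
original_device = state.device.type  # `state` itself has not moved

state = state.to(device)  # reassign so the input lives on the same device
action_probs = net(state)
```

With the reassignment, the input and the parameters share a device before the forward pass; `state.device` and `next(net.parameters()).device` can be printed at the breakpoint to confirm.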