Calling tensor.detach().cpu().numpy() turns 1.0000 (torch.float32) into 1.0000001 (NumPy float32). Is there an explanation for this? Is this behavior specific to MPS? And is clipping the NumPy array after detaching the only way to mitigate it?
action from actor tensor([[1.0000, 0.2277]], device='mps:0', grad_fn=<AddBackward0>)
action after detach [[1.0000001 0.22771475]]
Comparing the printed string representations is not a proper way to compare two tensors.
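You are showing a different number of digits in each side of your comparison: the tensor print is rounded to 4 decimal places (1.0000, 0.2277), while the NumPy print shows about 8 significant digits (1.0000001, 0.22771475). You should prove that the values differ beyond fp32 numerical error.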
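One way to check this is to compare the values themselves instead of their printed forms. The snippet below is a minimal sketch, not your actual code: the tensor `x` is a hypothetical stand-in for the actor output, built on CPU from the float32 value one step above 1.0, and the variable names are assumptions. On an MPS machine the same comparison should apply after `.cpu()`.

```python
import numpy as np
import torch

# Hypothetical stand-in for the actor output (assumption, not the original model):
# 1 + 2**-23 is the smallest float32 value greater than 1.0.
x = torch.tensor([[1.0 + 2.0**-23, 0.22771475]], dtype=torch.float32)
arr = x.detach().cpu().numpy()

# The two prints use different default precisions, so the strings differ
# even when the underlying values do not.
print(x)
print(arr)

# Compare values, not strings: exact equality, then the largest absolute difference.
same_values = torch.equal(x, torch.from_numpy(arr))
max_abs_diff = float(np.abs(arr - x.numpy()).max())
print("identical values:", same_values, "max abs diff:", max_abs_diff)

# Show the digits that the default tensor print hides.
first = x[0, 0].item()
print(f"4 decimals: {first:.4f}   10 decimals: {first:.10f}")
```

If the comparison reports identical values, the conversion did not change anything and the tensor already held a value slightly above 1.0. In that case the bound has to be enforced explicitly, for example with torch.clamp before converting or np.clip afterwards.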