Hi,

I recently ran into a reproducibility issue: the same PyTorch model outputs very different results on different machines, even though the **random seed is fixed**. The absolute value of the difference is **quite large**.

After tracking down the issue, it appears that `torch.Tensor.exponential_()` is not deterministic across machines on CUDA, even with a fixed random seed.

The following is an MWE. The PyTorch version is 1.13.0.

```
import torch
"""
The following two tensors reproduce the inconsistency issue.
"""
logits = torch.zeros(100, 13, 97, dtype=torch.float64, device='cuda:0')
# logits = torch.zeros(1000, 1000, dtype=torch.float64, device='cuda:0')
"""
The shape MATTERS: the following shapes do NOT reproduce the issue.
"""
# logits = torch.zeros(100, 10, 97, dtype=torch.float64, device='cuda:0')
# logits = torch.zeros(100, 97, dtype=torch.float64, device='cuda:0')
torch.manual_seed(0)
# g_cuda = torch.Generator(device='cuda:0')
# g_cuda.manual_seed(0)
sample = torch.empty_like(logits).exponential_()
# sample[0] is deterministic across different machines
print(sample[0].sum())
# However, sample[-1] is different on different machines
print(sample[-1].sum())
```

On an RTX 3090 GPU, the output is

```
tensor(1246.9592, device='cuda:0', dtype=torch.float64)
tensor(1304.7903, device='cuda:0', dtype=torch.float64)
```

On a A5000 GPU, the output is

```
tensor(1246.9592, device='cuda:0', dtype=torch.float64)
tensor(1229.9033, device='cuda:0', dtype=torch.float64)
```

Can someone advise on this? Is this potentially a bug?
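In the meantime, here is a possible workaround sketch (an assumption on my part, not a confirmed fix): draw the uniform samples on the CPU with a seeded `torch.Generator`, apply the inverse CDF of Exp(1) by hand, and only then move the tensor to the GPU. The CPU RNG has generally been bitwise-reproducible for a fixed seed and PyTorch version, so this sidesteps the CUDA kernel entirely, at the cost of a host-to-device copy.

```
import torch

# Seeded CPU generator; the CPU RNG stream is typically reproducible
# for a fixed seed and PyTorch version.
g_cpu = torch.Generator()
g_cpu.manual_seed(0)

# Draw U ~ Uniform(0, 1) on the CPU.
u = torch.rand(100, 13, 97, dtype=torch.float64, generator=g_cpu)

# Inverse-CDF transform: if U ~ Uniform(0, 1), then -log(1 - U) ~ Exp(1).
# log1p(-u) computes log(1 - u) in a numerically stable way.
sample = -torch.log1p(-u)

# Move to the target device afterwards if needed, e.g.:
# sample = sample.to('cuda:0')

print(sample.mean())  # should be close to 1.0 for Exp(1)
```

Note this changes which RNG stream is consumed, so the drawn values will differ from `exponential_()` on CUDA with the same seed; the point is only that they should be identical across machines.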