My manual-seed test for matching random numbers passes on my local card and on one particular V100 card that I could access. So I wonder whether this behavior from PyTorch is by design or a bug?
You should get consistent random numbers if you use the same seed, PyTorch version, and CUDA version, even when running on a different physical GPU. For example:
python -c "import torch; torch.manual_seed(1); print(torch.randn(1, device='cuda'))"
The CPU and GPU random number generators are different and will produce different streams of numbers even from the same seed. The PyTorch CPU generator is also distinct from the NumPy generator.
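To make that concrete, here's a minimal sketch (assuming a CUDA device is available) showing that a single manual_seed call still yields separate CPU, CUDA, and NumPy streams:

import numpy as np
import torch

# torch.manual_seed seeds both the CPU generator and every CUDA device,
# but each generator keeps its own state, so their streams still differ.
torch.manual_seed(1)
print(torch.randn(3))                 # CPU stream
print(torch.randn(3, device='cuda'))  # CUDA stream: different values

# NumPy maintains its own generator; seeding PyTorch does not seed NumPy.
np.random.seed(1)
print(np.random.randn(3))             # differs from both PyTorch streams

Each print call here should be reproducible across runs and machines (given matching versions), but the three streams will never match each other.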