Hi Jeff!
I believe that you are correct. Great detective work! I wasn’t able to
drill down through all the layers to get to the actual implementation.
Yes. (It turns out that you were right about this all along …)
This strikes me as a big problem. As a general rule – and this has
been known since before I was born (Yikes!) – it’s a bad idea to use
one pseudo-random-number generator to seed another.
Yes. This strikes me as a big problem. By today’s standards, `2**32` is
a “small” number. One could easily exhaust such a “small” set of
pseudo-random variates with – by today’s standards – a medium-scale
computation (that I could run on my cell phone …).
You can use `torch.random.get_rng_state()` to get the state of
pytorch’s (cpu) global generator. It’s a `ByteTensor` of length `5056`,
which is to say the state occupies 632 eight-byte words. (That is close
to the 624 32-bit words of state in a 32-bit mersenne-twister generator,
which I’m pretty sure is what pytorch uses on the cpu.)
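
As a rough point of comparison, Python's standard-library `random` module is also built on a 32-bit Mersenne Twister, so it can serve as a stand-in for seeing how much state such a generator carries. This is only a sketch with the stdlib generator, not pytorch's, and the seed value is arbitrary.

```python
import random

# Stand-in generator: Python's stdlib Mersenne Twister, NOT pytorch's.
rng = random.Random(12345)  # arbitrary illustrative seed

# getstate() returns (version, internal_state, gauss_next).
version, internal, gauss_next = rng.getstate()

# internal_state holds the 624 state words plus one position index.
n_entries = len(internal)
state_bits = (n_entries - 1) * 32

print(f"entries in internal state: {n_entries}")
print(f"bits of generator state:   {state_bits}")
print(f"bits in a one-word seed:   32")
```

The gap between the two bit counts is the point of the paragraph above: the generator's state is hundreds of times wider than a single 32-bit word.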
So `.exponential_()` (and bernoulli, etc.) take that 632-word state
and pump it through a bottleneck of just a single word. You’ve thrown
away almost all of the state that made the mersenne twister a good
generator and you can’t get that state back (even if the generator
you seed with that single word is a good generator). This really goes
against everything we know (and that we’ve learned the hard way)
about good practice when using pseudo-random generators.
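
To make the bottleneck concrete, here is a minimal sketch using Python's stdlib `random` module as a stand-in for both generators. It is not pytorch's implementation; the helper name `child_stream` and the 8-bit bottleneck are assumptions chosen so the effect shows up quickly (the case discussed here is 32 bits).

```python
import random

# Illustrative bottleneck width. The discussion concerns 32 bits; 8 bits
# is used here only so the effect is visible with few samples.
BOTTLENECK_BITS = 8

def child_stream(parent_seed, n=5):
    """Seed a child generator with one small word drawn from a parent."""
    parent = random.Random(parent_seed)           # large-state parent generator
    word = parent.getrandbits(BOTTLENECK_BITS)    # squeeze through the bottleneck
    child = random.Random(word)                   # child sees only that word
    return tuple(child.random() for _ in range(n))

# 10,000 differently seeded parents ...
n_parents = 10_000
streams = {child_stream(seed) for seed in range(n_parents)}

# ... can produce at most 2**BOTTLENECK_BITS distinct child streams.
print(f"parents: {n_parents}, distinct child streams: {len(streams)}")
```

However many distinct parent states you start from, the number of distinct child streams is capped by the width of the word passed between the generators.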
Part of what makes these kinds of bugs so pernicious is that they
neither catch your eye nor show up in quick tests. In your
case, you had to look for collisions of random permutations derived
from `multinomial()` when generating tens of thousands of samples.
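
A back-of-the-envelope birthday estimate shows why collisions only appear at that scale. The sketch below assumes each sample is driven by an independent, uniformly distributed 32-bit seed, which is a simplification of whatever pytorch actually does; the function name `collision_probability` is hypothetical.

```python
import math

def collision_probability(n_samples, n_seeds=2**32):
    """Approximate chance that at least two of n_samples share a seed.

    Uses the standard birthday approximation 1 - exp(-n(n-1) / (2N)),
    assuming seeds are independent and uniform over n_seeds values.
    """
    expected_pairs = n_samples * (n_samples - 1) / (2 * n_seeds)
    return 1.0 - math.exp(-expected_pairs)

for n in (1_000, 10_000, 50_000, 100_000):
    print(f"{n:>7} samples: P(collision) ~ {collision_probability(n):.4f}")
```

Under these assumptions a quick test with a thousand samples will almost never see a repeat, while a run of tens of thousands of samples sees one with substantial probability.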
I agree completely. I understand that the need for speed is a competing
consideration, but not at the expense of correctness. Although I’m not
a fan of seeding one generator with another, there are legitimate use
cases. But if you’re going to do so, you have to be much more careful.
(And squeezing your generator through a 32-bit “entropy bottleneck”
doesn’t meet the test of adequate care.)
Thanks very much for finding this.
Best.
K. Frank