Bug in torch.multinomial -- generated distribution is modestly incorrect [Edit: This is a regression and appears to be due to an analogous bug in Tensor.exponential_().]

Also, here is a small bug that will hopefully go away when the issue is fixed: exponential_kernel() tries to draw a 32-bit or 64-bit seed depending on the tensor's dtype, but vslNewStream's seed parameter is a 32-bit unsigned integer (MKL_UINT), so the upper 32 bits of the 64-bit seed are silently discarded. This explains why you get the same results for float32 and double tensors:

...
    int64_t seed;
    {
      // See Note [Acquire lock when using random generators]
      std::lock_guard<std::mutex> lock(generator->mutex_);
      if (self.scalar_type() == at::kDouble)
        seed = generator->random64();
      else
        seed = generator->random();
    }
...
          if constexpr (std::is_same<scalar_t, double>::value) {
            vslNewStream(&stream, VSL_BRNG_MCG31, seed);
            vslSkipAheadStream(stream, begin);
            vdRngExponential(VSL_RNG_METHOD_EXPONENTIAL_ICDF, stream, len,
              (double *)(sample_ptr + begin), eps, 1./lambda);
            vslDeleteStream(&stream);
          } else {
            vslNewStream(&stream, VSL_BRNG_MCG31, seed);
            vslSkipAheadStream(stream, begin);
            vsRngExponential(VSL_RNG_METHOD_EXPONENTIAL_ICDF, stream, len,
              (float *) (sample_ptr + begin), eps, 1./lambda);
            vslDeleteStream(&stream);
          }

Here are the docs for vslNewStream - https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-c/2023-2/vslnewstream.html