Using SobolEngine with CUDA?

Hi,

Is it possible to generate quasirandom numbers with SobolEngine directly on the GPU, i.e. without a memory copy between host and device, in PyTorch 1.8.0?

For example, I tried the following code:

import torch
a = torch.empty([1], dtype=torch.float, device=torch.device('cuda'))
soboleng = torch.quasirandom.SobolEngine(dimension=1)
soboleng.draw(1, out=a)

Then I got the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\zz_ro\Anaconda3\envs\pytorch\lib\site-packages\torch\quasirandom.py", line 91, in draw
    out.resize_as_(result).copy_(result)
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #2 'the_template'

I guess this is because the tensor a is allocated on the GPU, while the draw function generates its temporary result on the CPU. The error is then raised on the line that resizes out and copies result into it (out.resize_as_(result).copy_(result)), since result is a CPU tensor and out is a CUDA tensor. What I am trying to do is run the quasirandom number generator directly on the GPU and store the result in a CUDA tensor. Is this possible?
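
For reference, here is a minimal sketch of the fallback I would like to avoid. It assumes that draw() always builds its result on the CPU, which is only my reading of the traceback and not something I have confirmed in the documentation. The names device and cpu_sample are just for illustration, and the sketch falls back to the CPU when no GPU is present so that it runs anywhere.

```python
import torch

# Assumption: SobolEngine.draw() produces its result on the CPU,
# as the traceback above suggests.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

soboleng = torch.quasirandom.SobolEngine(dimension=1)

# Draw one 1-dimensional point; the result has shape (1, 1).
cpu_sample = soboleng.draw(1)

# Explicit copy to the target device. On a CUDA machine this is
# exactly the host-to-device transfer I am hoping to avoid.
a = cpu_sample.to(device)

print(a.shape, a.device)
```

This works without the out= argument, but every call to draw() is followed by a transfer from host to device when device is a GPU.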