Random Number on GPU

Is there any way to sample random numbers directly on the GPU, avoiding sampling on the CPU and then transferring, such as:

torch.randn(10, 10).cuda()  # REALLY SLOW

Maybe add to the torch.distributions package a way to specify that sampling is performed on the GPU, because the source code of these methods always uses torch.Tensor but never torch.cuda.Tensor.

Or add torch.cuda.FloatTensor(10,10).randn_

This would be really useful.
Thanks

Hi,

You can use methods such as uniform_() or normal_() on CUDA tensors to generate random numbers directly on the GPU.
Keep in mind that torch.rand(10, 10) is equivalent to torch.FloatTensor(10, 10).uniform_() and torch.randn(10, 10) is equivalent to torch.FloatTensor(10, 10).normal_().
So torch.cuda.FloatTensor(10, 10).normal_() will do what you want.
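
For example, a minimal sketch of that suggestion (the 10x10 shape is just for illustration):

torch.cuda.FloatTensor(10, 10).normal_()   # standard-normal samples, generated on the GPU
torch.cuda.FloatTensor(10, 10).uniform_()  # uniform samples in [0, 1), generated on the GPU

Both allocate the tensor on the GPU and fill it in place, so no CPU-to-GPU copy is involved.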

Thanks, that was the method I was looking for but could not find in the docs.

I actually think you should be more consistent with the names. In this case:

randn(*sizes) -> samples a tensor of the given shape from a standard Gaussian distribution (mean 0, stdev 1).

normal(mu_vec, std_vec) -> returns a Tensor of random numbers drawn from separate normal distributions whose mean and standard deviation are given (directly from your doc).

So there is no way to guess that normal_ will do what randn does as an in-place Tensor method. Maybe add randn_ to do what randn does and keep normal_ for what normal does: one in-place (GPU or CPU) and one returning a CPU tensor.
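
For what it's worth, a short sketch of how the current names behave (shapes and values are just illustrative):

torch.randn(2, 3)                             # (2, 3) tensor of i.i.d. samples from N(0, 1)
torch.normal(torch.zeros(3), torch.ones(3))   # one sample per (mean, std) pair
torch.empty(2, 3).normal_(mean=0.0, std=1.0)  # in-place fill, works on CPU and CUDA tensors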

Thank you anyway.

Hi, is there a way to sample from a categorical distribution (with some specified probability distribution) on the GPU? For example, on the CPU I can do:

c = torch.distributions.Categorical(probs=torch.tensor([1.0, 1.0, 2.0]))
s = c.sample(sample_shape=(10,10))
s = s.cuda()

Is there a way to avoid the cuda copy in this case?

Hi,

I think the output will be on the same device as the input. Simply initialize with a cuda tensor.
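
A minimal sketch of that, assuming a single default CUDA device:

probs = torch.tensor([1.0, 1.0, 2.0], device='cuda')  # probabilities created directly on the GPU
c = torch.distributions.Categorical(probs=probs)
s = c.sample(sample_shape=(10, 10))                    # s is already on the GPU, no copy needed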

I am currently using the suggested code (see below)

import torch


def create_h(n, z, generator, dtype, device):
    """
    Creates a Tensor (z x n x 1) where each entry ~ N(0,1). Automatically detects the
    precision 64 bits or 32 bits.
    -------------
    :param n:   int
                Dimension of the Space where the Polytope Lives
    :param z:   int
                Padding Parameter
    :param generator:
    :param device:  String, default = cpu
                    Hardware used to make the computations and allocate the result.
                    If equal to cpu then the CPUs are used for computing the inverse.
                    If equal to cuda then the a GPU is used for computing the inverse.
    -------------
    :return:    Torch Tensor
                Tensor (z x n x 1) where each entry ~ N(0,1)

    """
    if '64' in str(dtype):
        if 'cuda' in device:
            h = torch.cuda.DoubleTensor(n, z).normal_(generator=generator)
        elif 'cpu' == device:
            h = torch.DoubleTensor(n, z).normal_(generator=generator)
    elif '32' in str(dtype):
        if 'cuda' in device:
            h = torch.cuda.FloatTensor(n, z).normal_(generator=generator)
        elif 'cpu' == device:
            h = torch.FloatTensor(n, z).normal_(generator=generator)
    elif '16' in str(dtype):
        if 'cuda' in device:
            h = torch.cuda.HalfTensor(n, z).normal_(generator=generator)
        elif 'cpu' == device:
            h = torch.HalfTensor(n, z).normal_(generator=generator)

    return h

I am refactoring the code, and I got a warning for using this syntax:

/tmp/ipykernel_115130/1586572147.py:34: UserWarning:
  The torch.cuda.*DtypeTensor constructors are no longer recommended. It's best to use methods such
  as torch.tensor(data, dtype=*, device='cuda') to create tensors. (Triggered internally at
  ../torch/csrc/tensor/python_tensor.cpp:83.)

Any idea of how to keep the same behavior while using correct practices?

I’m a bit confused; the default factory function should support that: torch.randn(size, dtype=dtype, device=device, generator=generator)
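
For reference, a minimal sketch of create_h rewritten with that factory function (keeping the original signature; it assumes the generator lives on the same device as the output, e.g. torch.Generator(device='cuda') when device is 'cuda'):

import torch

def create_h(n, z, generator, dtype, device):
    # (n x z) tensor with entries ~ N(0, 1), in the requested dtype and on the requested device
    return torch.randn(n, z, dtype=dtype, device=device, generator=generator)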

Yes, thank you. It was some old code I used a long time ago for some reason. It works perfectly now.
