Is there any way to sample random numbers directly on the GPU, instead of sampling on the CPU and then transferring the result, as in:
torch.randn(10, 10).cuda()  # REALLY SLOW
Maybe the torch.distributions package could get a way to request that sampling be performed on the GPU, because the source code of these methods always uses torch.Tensor and never torch.cuda.Tensor.
You can call in-place methods such as uniform_() or normal_() on CUDA tensors to generate random numbers directly on the GPU.
Keep in mind that torch.rand(10, 10) is equivalent to torch.FloatTensor(10, 10).uniform_() and torch.randn(10, 10) is equivalent to torch.FloatTensor(10, 10).normal_().
So torch.cuda.FloatTensor(10, 10).normal_() will do what you want.
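For readers on more recent PyTorch releases, the same idea can be written with the device keyword instead of the typed constructors. This is only a minimal sketch: the variable names are made up for illustration, and the CPU fallback is there just so the snippet also runs on a machine without a GPU.

```python
import torch

# Fall back to the CPU so the sketch still runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Allocate and fill directly on the target device; no host-to-device copy.
x = torch.randn(10, 10, device=device)

# In-place variants: allocate uninitialised memory on the device, then fill it.
y = torch.empty(10, 10, device=device).normal_()   # standard normal samples
u = torch.empty(10, 10, device=device).uniform_()  # uniform samples on the unit interval
```

Both forms avoid the CPU sampling plus .cuda() copy from the original question.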
I actually think the names should be more consistent. In this case:
randn(mu, std) -> samples a tensor of a given shape from a Gaussian distribution with mean mu and standard deviation std.
normal(mu_vec, std_vec) -> "Returns a Tensor of random numbers drawn from separate normal distributions whose mean and standard deviation are given" (directly from your docs).
So nothing suggests that the in-place Tensor method normal_ does what randn does. Maybe add randn_ to do what randn does and keep normal_ for what normal does: one in-place version (GPU or CPU) and one that returns a CPU tensor.
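To make the naming discussion concrete, here is a small side-by-side sketch of the three calls as they behave in current PyTorch, as far as I know: torch.randn takes only a shape and always samples N(0, 1), torch.normal takes the distribution parameters, and Tensor.normal_ fills an existing tensor in place. The tensor names and the parameter values are arbitrary and chosen only for illustration.

```python
import torch

torch.manual_seed(0)

# torch.randn takes only a shape; the samples are standard normal.
a = torch.randn(2, 3)

# torch.normal takes the parameters: here one mean per element and a shared std.
mean = torch.tensor([0.0, 10.0, 100.0])
b = torch.normal(mean=mean, std=1.0)

# Tensor.normal_ overwrites an existing tensor in place, using scalar mean and std.
c = torch.empty(2, 3).normal_(mean=5.0, std=0.1)
```

With the default arguments, normal_() uses mean 0 and std 1, which is why it can stand in for randn on a tensor that already lives on the GPU.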
I am currently using the suggested code (see below)
def create_h(n,z,generator,dtype,device):
    """
    Creates a Tensor (z x n x 1) where each entry ~ N(0,1). Automatically detects the
    precision 64 bits or 32 bits.
    -------------
    :param n: int
        Dimension of the Space where the Polytope Lives
    :param z: int
        Padding Parameter
    :param generator:
    :param device: String, default = cpu
        Hardware used to make the computations and allocate the result.
        If equal to cpu then the CPUs are used for computing the inverse.
        If equal to cuda then a GPU is used for computing the inverse.
    -------------
    :return: Torch Tensor
        Tensor (z x n x 1) where each entry ~ N(0,1)
    """
    if '64' in str(dtype):
        if 'cuda' in device:
            h = torch.cuda.DoubleTensor(n, z).normal_(generator=generator)
        elif 'cpu' == device:
            h = torch.DoubleTensor(n, z).normal_(generator=generator)
    elif '32' in str(dtype):
        if 'cuda' in device:
            h = torch.cuda.FloatTensor(n, z).normal_(generator=generator)
        elif 'cpu' == device:
            h = torch.FloatTensor(n, z).normal_(generator=generator)
    elif '16' in str(dtype):
        if 'cuda' in device:
            h = torch.cuda.HalfTensor(n, z).normal_(generator=generator)
        elif 'cpu' == device:
            h = torch.HalfTensor(n, z).normal_(generator=generator)
    return h
I am refactoring the code, and I got a warning for using this syntax:
/tmp/ipykernel_115130/1586572147.py:34: UserWarning:
The torch.cuda.*DtypeTensor constructors are no longer recommended. It's best to use methods such
as torch.tensor(data, dtype=*, device='cuda') to create tensors. (Triggered internally at
../torch/csrc/tensor/python_tensor.cpp:83.)
Any idea how to keep the same behavior while following the recommended practices?
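One possible refactor, offered as a sketch rather than a definitive answer: pass dtype and device straight to torch.randn, which removes the string matching on the dtype and the typed constructors that trigger the warning. The default argument values and the usage lines below are assumptions added for illustration. Note that the code above allocates an (n, z) tensor even though its docstring mentions (z x n x 1), and this sketch keeps the (n, z) shape.

```python
import torch


def create_h(n, z, generator=None, dtype=torch.float32, device="cpu"):
    """Return an (n x z) tensor with entries ~ N(0, 1), created directly on `device`."""
    return torch.randn(n, z, generator=generator, dtype=dtype, device=device)


# Hypothetical usage; falls back to the CPU when no GPU is present.
device = "cuda" if torch.cuda.is_available() else "cpu"

# The generator has to live on the same device as the tensor being sampled.
gen = torch.Generator(device=device).manual_seed(0)
h = create_h(3, 4, generator=gen, dtype=torch.float64, device=device)

# Re-seeding an identical generator reproduces the same samples.
gen_again = torch.Generator(device=device).manual_seed(0)
h_again = create_h(3, 4, generator=gen_again, dtype=torch.float64, device=device)
```

If you would rather keep the in-place call from the original code, torch.empty(n, z, dtype=dtype, device=device).normal_(generator=generator) should be a close equivalent.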