Is there any way I can sample random numbers directly on the GPU, instead of sampling on the CPU and then transferring, such as:
torch.randn(10, 10).cuda()  # REALLY SLOW
Maybe add to the torch.distributions package a way to specify that sampling is performed on the GPU, because the source code of these methods always uses torch.tensor rather than torch.cuda.tensor.
Or add torch.cuda.FloatTensor(10, 10).randn_()
This would be really useful.
You can use methods such as
normal_() on CUDA tensors to generate random numbers directly on the GPU.
Keep in mind that
torch.rand(10, 10) is equivalent to
torch.FloatTensor(10, 10).uniform_() and
torch.randn(10, 10) is equivalent to
torch.FloatTensor(10, 10).normal_(). So
torch.cuda.FloatTensor(10, 10).normal_() will do what you want.
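For reference, here is a minimal sketch of the in-place approach described above. It assumes a PyTorch version in which factory functions accept a device= keyword (0.4 or later); on older versions the torch.cuda.FloatTensor(10, 10).normal_() form is the one to use. The CPU fallback is only there so the snippet also runs on a machine without CUDA.

```python
import torch

# Assumption: fall back to the CPU so the sketch still runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Allocate uninitialized memory on the device, then fill it in place.
x = torch.empty(10, 10, device=device).normal_()    # standard normal samples
u = torch.empty(10, 10, device=device).uniform_()   # uniform samples on [0, 1)

# Factory functions can also sample on the target device directly.
y = torch.randn(10, 10, device=device)

print(x.device, u.device, y.device)
```

In all three cases the numbers are generated on the chosen device, so there is no host-to-device copy as there is with torch.randn(10, 10).cuda().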
Thanks, that was the method I was looking for but could not find in the docs.
I actually think you should be more consistent with the names. In this case:
randn(mu, std) -> samples a tensor of the given shape from a Gaussian distribution with mean mu and standard deviation std
normal(mu_vec, std_vec) -> Returns a Tensor of random numbers drawn from separate normal distributions whose mean and standard deviation are given (directly from your docs).
So there is no way to guess that normal_ does, as an in-place Tensor method, what randn does. Maybe add randn_ to do what randn does and keep normal_ for what normal does: one in-place (GPU, CPU) and one returning a CPU tensor.
Thank you anyway.
Hi, is there a way to sample from a categorical distribution (with a specified probability vector) on the GPU? For example, on the CPU I can do:
c = torch.distributions.Categorical(probs=torch.tensor([1.0, 1.0, 2.0]))
s = c.sample(sample_shape=(10,10))
s = s.cuda()
Is there a way to avoid the CUDA copy in this case?
I think the output will be on the same device as the input. Simply initialize the distribution with a CUDA tensor.
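A minimal sketch of that suggestion, reusing the probabilities from the question. It assumes a PyTorch version in which torch.tensor accepts a device= keyword; the CPU fallback is only there so the snippet also runs without a GPU.

```python
import torch
from torch.distributions import Categorical

# Assumption: fall back to the CPU so the sketch still runs without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create the probabilities on the target device. Unnormalized weights are
# accepted; Categorical normalizes probs along the last dimension.
probs = torch.tensor([1.0, 1.0, 2.0], device=device)
c = Categorical(probs=probs)

# The samples are class indices (0, 1 or 2) living on the same device as probs.
s = c.sample(sample_shape=(10, 10))

print(s.device, s.shape)
```

Since s is already on the device that holds probs, the extra s.cuda() call from the question is not needed.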