I need to create a random vector from a specific distribution on each iteration and run it on the GPU with CUDA. I know I can call x.cuda() on it, but that seems rather slow based on my experiments. Is there a better way to do this so that the tensor starts off on the GPU, or something of that sort? This matters especially because I'm doing it every iteration.
Ideally, if there is a general way to make any vector/tensor I create go directly to CUDA, that would be very nice.
Note that I want to make sure this does not break my dataloader. For example, I am aware that:
For now, use the pattern torch.cuda.*Tensor(*shape).inplace_sampling_method_here_(). The available in-place sampling methods are listed here: http://pytorch.org/docs/master/torch.html#in-place-random-sampling. Note that these are basic building-block distributions; you can combine their results to generate samples from more complex distributions, e.g. a multivariate Gaussian.
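A minimal sketch of this pattern (assuming a reasonably recent PyTorch; it falls back to CPU tensors when no GPU is present). The `gpu_normal` helper name and the example mean/covariance values are mine, not from the thread:

```python
import torch

use_cuda = torch.cuda.is_available()

def gpu_normal(*shape):
    # Allocate directly on the GPU (when available) and fill in place,
    # avoiding a separate host-to-device copy via .cuda().
    t = torch.cuda.FloatTensor(*shape) if use_cuda else torch.FloatTensor(*shape)
    return t.normal_(mean=0.0, std=1.0)

x = gpu_normal(3, 4)  # standard-normal samples, shape (3, 4)

# Building block -> more complex distribution: a multivariate Gaussian,
# sampled as mu + L @ z, where L is a Cholesky factor of the covariance.
mu = torch.tensor([1.0, -1.0])
cov = torch.tensor([[2.0, 0.5], [0.5, 1.0]])
L = torch.linalg.cholesky(cov)
if use_cuda:
    mu, L = mu.cuda(), L.cuda()
z = gpu_normal(2, 1000)            # 1000 standard-normal pairs
samples = mu.unsqueeze(1) + L @ z  # correlated samples, shape (2, 1000)
```

Since the random values are written in place into a tensor that already lives on the device, no host-to-device transfer happens inside the loop.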
In the next version, tensor factory methods will accept a dtype argument, so you will be able to write things like torch.randn(3, 4, dtype=torch.cuda.double).