Torch Dataset: Generate examples directly on GPU

I’m training a network which takes triples of indices as inputs: (u,i,j).

The indices “u” and “i” are predefined, while “j” is drawn uniformly at random from a large set.

I know how I could pre-store “u” and “i” on the GPU (as done, for example, in: How to put datasets created by torchvision.datasets in GPU in one operation?). However, it would be prohibitively expensive to pre-generate and store all possible samples of “j” on the GPU. Moreover, I’d like “j” to vary from epoch to epoch, so generating it randomly each time would be much better.

For now, my approach is to generate “j” on the fly (at random), store it in a CPU tensor, and then transfer that tensor to the GPU. Is there some way to generate “j” directly on the GPU, without the transfer?
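To make the question concrete, here is a minimal sketch of the current approach. The sizes (`num_samples`, `num_items`) are placeholder values, not from my actual setup; the fallback to CPU is only so the snippet runs on machines without a GPU:

```python
import torch

num_samples = 1024       # hypothetical batch size
num_items = 100_000      # hypothetical size of the set "j" is drawn from

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Current approach: draw "j" on the CPU, then copy it to the GPU.
j_cpu = torch.randint(0, num_items, (num_samples,))
j = j_cpu.to(device)  # host-to-device transfer on every batch
```

The host-to-device copy on the last line is the step I would like to eliminate.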