torch.stack and device

If your fatoms list contains CPU tensors, then your example is the way to go.
Alternatively, you could push the tensors onto the GPU before calling torch.stack, which is what my example shows.
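
For reference, here is a minimal sketch of the two orderings, not a copy of either example mentioned above. The contents of fatoms are made up for illustration (three small CPU tensors of equal shape), and the code falls back to the CPU when no GPU is available. Note that torch.stack expects all of its inputs to be on the same device.

```python
import torch

# Fall back to the CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the fatoms list: three CPU tensors of equal shape.
fatoms = [torch.randn(4) for _ in range(3)]

# Option 1: stack on the CPU, then move the stacked result in one transfer.
stacked_then_moved = torch.stack(fatoms).to(device)

# Option 2: move each tensor to the target device first, then stack there.
moved_then_stacked = torch.stack([t.to(device) for t in fatoms])
```

Both options should give the same tensor on the target device. The first does a single host-to-device copy of the stacked result, while the second copies each element separately, so the first is usually the simpler choice when the list starts out on the CPU.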