Why should `at::cuda::philox::unpack(philox_args)` be called from kernels?

Looking at PyTorch’s random number generation API, it seems that `at::cuda::philox::unpack(philox_args)` is only ever called from kernels (`__global__` or `__device__` functions), yet it is also marked `__host__`. Would it matter if it were called only once on the host instead? If not, why not? The function appears to be pure.

I’m not interested in potential performance gains by freeing a few bytes; rather, I’m trying to understand why.

Is it because `*arg.seed_.ptr` can only point to GPU memory? If so, why is the function marked as `__host__`?
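
To make the question concrete, here is a minimal host-only sketch of how I understand the function to behave. `PhiloxStateSketch` and `unpack_sketch` are hypothetical stand-ins that I wrote for illustration, not PyTorch's actual code; the field names follow the `seed_.ptr` access mentioned above, and the layout and the `captured_` flag are my assumptions. In the real code the function would carry `__host__ __device__` qualifiers, and the pointers would presumably refer to device memory.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for the Philox state (layout is an assumption).
struct PhiloxStateSketch {
  union Payload {
    uint64_t val;  // seed/offset stored directly by value
    int64_t* ptr;  // or a pointer to where the value lives
  };
  Payload seed_{};
  Payload offset_{};
  uint32_t offset_intragraph_ = 0;
  bool captured_ = false;  // assumed flag selecting which union member is live
};

// Sketch of the unpack step: returns (seed, offset).
inline std::tuple<uint64_t, uint64_t> unpack_sketch(PhiloxStateSketch arg) {
  if (arg.captured_) {
    // Pointer branch: the values are only known by dereferencing. If these
    // pointers refer to GPU memory, this line cannot run on the host.
    assert(arg.seed_.ptr != nullptr && arg.offset_.ptr != nullptr);
    return std::make_tuple(
        static_cast<uint64_t>(*arg.seed_.ptr),
        static_cast<uint64_t>(*arg.offset_.ptr + arg.offset_intragraph_));
  }
  // Value branch: nothing is dereferenced, so the call is safe anywhere.
  return std::make_tuple(arg.seed_.val, arg.offset_.val);
}
```

If this reading is right, the value branch could run on the host without any problem, while the pointer branch is the part I suspect has to run inside a kernel.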