I found a call to at::cuda::cumsum_out in the Multinomial.cu CUDA source.
But I can't find its definition anywhere in the torch repo.
OK, I got it: its definition is in RegisterCUDA.cpp, and it is just a wrapper around at::native::cumsum.
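For anyone else who lands here, this is a rough sketch of the wrapper pattern described above. It is a toy model and not the actual PyTorch code: the `toy` namespace, the `Tensor` alias, and the two-argument signature are all made up for illustration, and the real functions take different arguments. As far as I can tell, RegisterCUDA.cpp is generated during the build rather than checked in, which would explain why searching the repo source finds nothing.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for at::Tensor; NOT the real ATen type.
using Tensor = std::vector<long>;

namespace toy {
namespace native {
// Plays the role of the hand-written implementation
// (what at::native::cumsum is in the real code base).
inline Tensor& cumsum_out(Tensor& out, const Tensor& self) {
    out.resize(self.size());
    long running = 0;
    for (std::size_t i = 0; i < self.size(); ++i) {
        running += self[i];   // inclusive prefix sum
        out[i] = running;
    }
    return out;
}
} // namespace native

namespace cuda {
// Plays the role of the wrapper (what at::cuda::cumsum_out is):
// it has no logic of its own and only forwards the call.
inline Tensor& cumsum_out(Tensor& out, const Tensor& self) {
    return native::cumsum_out(out, self);
}
} // namespace cuda
} // namespace toy
```

Calling `toy::cuda::cumsum_out(out, in)` ends up in `toy::native::cumsum_out`, so the function you see at the call site and the function that does the work live in different files and namespaces.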