In regard to your questions about multinomial() (but not trying to comment
on the larger use case that generate() is presumably a part of):
Let’s say that your two largest probs are rather close together (for example, 0.25 and 0.26). Using argmax() would always give you the index of 0.26,
ignoring, in a sense, that 0.25 is almost the same. On the other hand, using multinomial() will give you the index of 0.26 26% of the time and the index
of 0.25 25% of the time, respecting the fact that the two values are quite close
to one another. (This may or may not be the behavior you want, but it does
make sense.)
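
To make the difference concrete, here is a minimal sketch in plain Python. The probs values are made up for illustration, and random.choices() is used only as a stand-in for what multinomial() does (drawing an index in proportion to its probability); it is not the PyTorch call itself.

```python
import random
from collections import Counter

# Hypothetical probabilities; the two largest (0.26 and 0.25) are nearly tied.
probs = [0.26, 0.25, 0.20, 0.15, 0.14]

def argmax_index(p):
    """Return the index of the largest probability (always the same answer)."""
    return max(range(len(p)), key=lambda i: p[i])

def sample_index(p, rng):
    """Draw one index in proportion to p (a stand-in for multinomial())."""
    return rng.choices(range(len(p)), weights=p, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
n_draws = 100_000

argmax_counts = Counter(argmax_index(probs) for _ in range(n_draws))
sample_counts = Counter(sample_index(probs, rng) for _ in range(n_draws))

# Fraction of draws that landed on each index.
argmax_freqs = [argmax_counts[i] / n_draws for i in range(len(probs))]
sample_freqs = [sample_counts[i] / n_draws for i in range(len(probs))]

print("argmax freqs: ", argmax_freqs)
print("sampled freqs:", sample_freqs)
```

With argmax, index 0 is picked every single time and index 1 never is, even though their probabilities differ by only 0.01. With sampling, the observed frequencies should come out close to probs itself.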
You could sample from some other probability distribution, and something else
might better fit your use case. (Using argmax(), for example, amounts to
sampling from the distribution that gives you one specific index 100% of the
time.) However, as noted above, multinomial() samples index values according
to probs, which could well be what you want.