Intuition Behind Categorical Distribution Return Shape

If you use logits to represent a categorical distribution over a batch and then draw N samples, what is the intuition behind the return shape being Sample Shape x Batch Size? Wouldn't it make more sense to have Observations (Batch Size) x Samples Drawn?

import torch
from torch.distributions import Categorical

logitz = torch.randn(100, 15)  # batch of 100, 15 classes, as unnormalized log-probabilities
m = Categorical(logits=logitz)  # use the logits keyword; the first positional argument is probs
the_samples = m.sample((200,))  # sample_shape must be a tuple or torch.Size
the_samples.shape  # torch.Size([200, 100])
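
For context, torch.distributions follows the sample_shape + batch_shape + event_shape convention, so freshly drawn samples are prepended as the leading dimension. If a batch-first layout is preferred, a minimal sketch, reusing the_samples from above, is just a transpose:

batch_first = the_samples.t()  # equivalently the_samples.transpose(0, 1)
batch_first.shape  # torch.Size([100, 200]), i.e. batch_size x num_samples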