Question about using torch.distributions to create a negative binomial mixture distribution

Hi all, I want to mix two negative binomial distributions whose parameters each have size (n, m), so there are 2*n*m component distributions in total. I intend to implement this with Categorical and MixtureSameFamily.

However, it seems that I cannot directly create this mixture distribution from these two classes. My code is:

    def forward(
            self, u: torch.Tensor, v: torch.Tensor,
            b: torch.Tensor, l: torch.Tensor # l is sequencing depth
    ) -> NBMixture:
        scale = F.softplus(self.scale_lin[b])
        logit_mu1 = scale * (u @ v.t()) + self.bias1[b]
        logit_mu2 = scale * (u @ v.t()) + self.bias2[b]

        mu1 = F.softmax(logit_mu1, dim=1)
        mu2 = F.softmax(logit_mu2, dim=1)

        log_theta = self.log_theta[b]

        total_count = torch.stack([log_theta.exp(), log_theta.exp()])
        total_logits = torch.stack([(mu1*l + EPS).log() - log_theta, (mu2*l + EPS).log() - log_theta])
        # Create Negative Binomial distributions
        nb1 = NegativeBinomial(total_count, logits=total_logits)
        # Define mixture weights
        mixture_weights = torch.distributions.Bernoulli(mu1).sample()
        mixture_logits = torch.stack([torch.distributions.Bernoulli(mu1).log_prob(mixture_weights),
                                    torch.distributions.Bernoulli(mu1).log_prob(1 - mixture_weights)])
        # mixture_logits = mixture_logits.reshape(mixture_logits.shape[1], mixture_logits.shape[2], 2)
        log_prob_mix = D.Categorical(logits = mixture_logits)
        # log_prob_mix = D.Categorical(torch.FloatTensor([0.6,0.4]))
        # Create the MixtureSameFamily distribution
        mixture_distribution = MixtureSameFamily(log_prob_mix, nb1)

        return mixture_distribution

When I run this, I get a dimension-mismatch error. Could anyone please help me? Thanks.
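
For reference, here is a minimal standalone sketch of the shapes that, as far as I understand the documentation, MixtureSameFamily expects: the component index has to be the rightmost batch dimension of the component distribution, and the Categorical needs its logits in shape (n, m, 2). The sizes n, m, k and the random parameters below are made-up placeholders, not values from my model.

```python
import torch
from torch.distributions import Categorical, MixtureSameFamily, NegativeBinomial

torch.manual_seed(0)
n, m, k = 4, 3, 2  # hypothetical sizes: n x m entries, k = 2 mixture components

# Component parameters carry the component index on the LAST dimension,
# i.e. shape (n, m, k) rather than (k, n, m).
total_count = torch.rand(n, m, k) + 0.5   # placeholder positive dispersion
nb_logits = torch.randn(n, m, k)          # placeholder NB logits
components = NegativeBinomial(total_count=total_count, logits=nb_logits)

# Mixing distribution: batch shape (n, m) with k categories each.
mix_logits = torch.randn(n, m, k)         # placeholder mixture logits
mixing = Categorical(logits=mix_logits)

# Requires components.batch_shape[:-1] == mixing.batch_shape
# and components.batch_shape[-1] == number of categories.
mixture = MixtureSameFamily(mixing, components)

# Evaluate the mixture log-likelihood on count data of shape (n, m).
x = torch.randint(0, 10, (n, m)).float()
log_prob = mixture.log_prob(x)
print(mixture.batch_shape, log_prob.shape)
```

In my forward() above, torch.stack uses its default dim=0, which gives tensors of shape (2, n, m), so stacking with dim=-1 may be what is missing, but I am not sure that is the only problem.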