Fast way to generate sample neural networks from a per-weight distribution

I have the distribution of each weight (and bias) of a neural network. What is a fast way to generate sample neural networks? Specifically, each distribution is assumed to be Gaussian, and the means and variances are stored in two dictionaries.

I have a small ResNet with about 400,000 parameters (weights and biases). The fastest approach I have so far takes about 1 minute to generate a single sample network on a GPU. Any ideas how to speed it up?

My code is below. It iterates over the individual weights and biases, sampling one scalar Gaussian random variable at a time.

I also tried generating a whole vector at a time (the same size as param) using torch.distributions.MultivariateNormal with a diagonal covariance matrix built via diag(), but I got an out-of-memory error.
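The out-of-memory error is expected: a MultivariateNormal with a diag() covariance materializes a dense n×n matrix, which for n ≈ 400,000 is far beyond GPU memory. For what it's worth, torch.distributions.Normal accepts tensor-valued loc/scale and samples every element independently in one call, so no covariance matrix is ever built. A minimal sketch (the sizes and values here are illustrative):

```python
import torch

n = 400_000
mean = torch.zeros(n)
std = torch.full((n,), 0.1)

# Normal broadcasts over tensor parameters and samples element-wise,
# so memory stays O(n) instead of the O(n^2) needed for diag() + MultivariateNormal.
dist = torch.distributions.Normal(mean, std)
sample = dist.sample()  # one draw per element, shape (n,)
```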

def generate_net(dic1, dic2, sample_net):
    # dic1: dictionary storing the mean of each parameter
    # dic2: dictionary storing the second moment E[w^2] of each parameter
    #       (the subtraction below computes the variance from it)
    # sample_net: model whose parameters are overwritten with a sample
    for name, param in sample_net.named_parameters():
        param1 = dic1[name]  # mean
        param2 = dic2[name]  # second moment
        variance_final = param2 - torch.mul(param1, param1)  # Var = E[w^2] - E[w]^2
        param1 = param1.view(-1, 1)  # flatten to an n x 1 vector
        variance_final = variance_final.view(-1, 1)
        flat = param.data.view(-1, 1)  # .data avoids autograd errors; view shares storage
        indx = 0
        for wt1, var in zip(param1, variance_final):
            # damp_factor is assumed to be defined in the enclosing scope
            rv = torch.distributions.Normal(wt1, torch.sqrt(var) * damp_factor)
            flat[indx, 0] = rv.sample()
            indx += 1  # this increment was missing, so only index 0 was ever written
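For reference, the per-scalar Python loop can be removed entirely: torch.normal draws a whole parameter tensor in a single vectorized call, which should bring the runtime down from minutes to milliseconds. A hedged sketch of a vectorized replacement, assuming (as the subtraction in my code implies) that dic2 holds second moments E[w^2], and with damp_factor shown here as a placeholder for the value used above:

```python
import torch

damp_factor = 1.0  # placeholder; assumed defined elsewhere in the original code


def generate_net(dic1, dic2, sample_net):
    # dic1: per-parameter means; dic2: per-parameter second moments E[w^2]
    with torch.no_grad():  # allow in-place writes into leaf parameters
        for name, param in sample_net.named_parameters():
            mean = dic1[name]
            variance = dic2[name] - dic1[name] ** 2  # Var = E[w^2] - E[w]^2
            std = variance.clamp(min=0).sqrt() * damp_factor  # clamp guards rounding error
            # one vectorized draw per parameter tensor, no Python loop
            param.copy_(torch.normal(mean, std))
```

This keeps the same element-wise independence as the scalar loop; the only change is that each parameter tensor is sampled in one kernel launch instead of one launch per scalar.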