Exponential Moving Average (EMA) of model weights does not work with a dynamic network architecture

I implemented an Exponential Moving Average (EMA) of the model weights. It works with a static network architecture, but not with dynamic ones, e.g. Stochastic Depth. My experiments show no weight update at all in the dynamic case.
Thank you in advance.

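    import copy
    model_ema = copy.deepcopy(model)
    for param in model_ema.parameters():
        param.detach_()  # loop body was cut off in the post; detaching the EMA copy from autograd is assumed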

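    def update_ema_variables(model, ema_model, alpha, global_step):
        # Warm-up: behave like a plain running average during the first steps
        alpha = min(1 - 1 / (global_step + 1), alpha)
        for ema_param, param in zip(ema_model.parameters(), model.parameters()):
            # ema = alpha * ema + (1 - alpha) * param
            # (keyword form; the positional add_(scalar, tensor) overload is deprecated)
            ema_param.data.mul_(alpha).add_(param.data, alpha=1 - alpha)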
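
For anyone trying to reproduce this, here is a minimal, self-contained sketch of a check. It is not a confirmed diagnosis of the Stochastic Depth case. It uses a variant of the update that pairs parameters by name through `named_parameters()` rather than by position with `zip`, since `zip` silently stops at the shorter iterator and assumes both models yield parameters in the same order. The name `update_ema_by_name`, the toy `nn.Sequential` model, and the dummy loss are all hypothetical stand-ins, not part of my actual training code.

```python
import copy

import torch
from torch import nn


def update_ema_by_name(model, ema_model, alpha, global_step):
    """EMA update that pairs parameters by name instead of by position."""
    # Same warm-up as above: behave like a running average early on.
    alpha = min(1 - 1 / (global_step + 1), alpha)
    ema_params = dict(ema_model.named_parameters())
    with torch.no_grad():
        for name, param in model.named_parameters():
            # ema = alpha * ema + (1 - alpha) * param
            ema_params[name].mul_(alpha).add_(param, alpha=1 - alpha)


# Toy stand-in for the real network (hypothetical, for illustration only).
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
ema_model = copy.deepcopy(model)
for p in ema_model.parameters():
    p.requires_grad_(False)  # the EMA copy is never trained directly

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
before = {n: p.clone() for n, p in ema_model.named_parameters()}

for global_step in range(1, 4):
    loss = model(torch.randn(16, 4)).pow(2).mean()  # dummy loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    update_ema_by_name(model, ema_model, alpha=0.99, global_step=global_step)

# Largest absolute change of each EMA tensor since the copy was made.
ema_change = {
    n: (p - before[n]).abs().max().item()
    for n, p in ema_model.named_parameters()
}
```

If every entry of `ema_change` is zero on the real model, the update is probably not reaching the EMA copy at all, for example because `global_step` or the model passed in is not the one being trained. Note also that `parameters()` does not include buffers such as BatchNorm running statistics, so those are not averaged by either version.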