Data2Vec: Implement Teacher Update

Hi everyone,

I just stumbled upon the new Data2Vec paper from Meta AI.

In Section 3.3, the authors describe how the parameters of the so-called teacher model are updated. If I got it right, the parameters get updated according to
W_t = (1-k) * W_t + k * W_s,
with W_t being the teacher model’s parameters, W_s being the student model’s parameters and k being a constant factor.

I tried to reimplement this parameter update. To do so, I have to multiply all parameters of the teacher model (W_t) by the constant factor (1-k). The only solution I found was the following (from this post here):

for name, param in state_dict.items():
    # Transform the parameter as required.
    transformed_param = param * 0.9

    # Update the parameter.
    state_dict[name].copy_(transformed_param)

However, this looks highly inefficient to me, as it is not a vectorized operation. Is there a faster way to do this?

Thanks for your help! :)

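Assuming your model is net, you can iterate over its parameters directly, so the state_dict is no longer required:

import torch

# Disable autograd tracking so the in-place update is not recorded.
with torch.no_grad():
    for cur_param in net.parameters():
        # Scale each tensor in place; 0.1 is an example factor, use (1 - k) for the teacher.
        cur_param.mul_(0.1)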
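The loop above only scales the parameters. For the full update from the question, W_t = (1-k) * W_t + k * W_s, a minimal sketch could look like the following. The function name `ema_update` and the toy `nn.Linear` models are made up for illustration, and it assumes the teacher and student share the same architecture so that their parameters line up one to one. The Python loop runs once per tensor, not once per element, and each `lerp_` call is itself a vectorized in-place operation.

```python
import copy

import torch
from torch import nn


@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, k: float) -> None:
    """Hypothetical helper: in-place W_t = (1 - k) * W_t + k * W_s."""
    for w_t, w_s in zip(teacher.parameters(), student.parameters()):
        # lerp_(end, weight) computes self + weight * (end - self),
        # which equals (1 - weight) * self + weight * end.
        w_t.lerp_(w_s, k)


# Toy usage: the teacher starts as a frozen copy of the student.
student = nn.Linear(4, 2)
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)

ema_update(teacher, student, k=0.1)
```

Note that `parameters()` does not include buffers (for example BatchNorm running statistics), so if your model has any, you would need to decide separately how to handle them.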